Using the GPU backend in h2o.xgboost in a rocker-based Docker container


#1

Hi guys,

Cross-posted from Stack Overflow: https://stackoverflow.com/questions/54373156/using-the-gpu-backend-in-h2o-xgboost-in-a-rocker-based-docker-container

I’ve been trying to get GPU support working for xgboost via h2o in a rocker-based Docker container, with little success. Progress so far: GitHub, Docker Hub

I have installed CUDA + nvidia-docker on the host machine and CUDA (9.0 - 9.2) in the container. I’m running the container with the following:

nvidia-docker run -d -p 8787:8787 -e USER=tidyverse-gpu -e PASSWORD=tidyverse-gpu --name tidyverse-gpu seabbs/tidyverse-gpu

Base XGBoost works with GPU support in both R and Python, and nvidia-smi returns usage stats when run inside the container. When the GPU backend is enabled in h2o.xgboost, however, the following error is returned:

Illegal argument(s) for XGBoost model: XGBoost_model_R_1548450637489_3.  Details: ERRR on field: _backend: GPU backend (gpu_id: 0) is not functional. Check CUDA_PATH and/or GPU installation.

Initially I had not set CUDA_PATH in the Dockerfile, but adding it has had no effect. It is visible from R inside the container:

Sys.getenv("CUDA_PATH")
[1] "/usr/local/cuda"
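
Since the error message points at CUDA_PATH, the same check can be run from a shell inside the container. This is only a minimal sketch: `check_cuda_path` is a hypothetical helper name, and the `lib64/stubs` location is an assumption about a typical CUDA layout rather than something confirmed for this image.

```shell
# Hypothetical helper: report whether CUDA_PATH (or the directory passed
# as $1) exists, and whether a stubs folder sits beneath it.
check_cuda_path() {
  ccp_dir="${1:-${CUDA_PATH:-}}"
  if [ -z "$ccp_dir" ]; then
    echo "CUDA_PATH is not set"
    return 1
  fi
  if [ ! -d "$ccp_dir" ]; then
    echo "CUDA_PATH=$ccp_dir is not a directory"
    return 1
  fi
  echo "CUDA_PATH=$ccp_dir exists"
  # Assumed location of the stub libraries; adjust for your install.
  if [ -d "$ccp_dir/lib64/stubs" ]; then
    echo "note: $ccp_dir/lib64/stubs is present"
  fi
  return 0
}
```

Calling `check_cuda_path` with no argument uses the CUDA_PATH environment variable; `check_cuda_path /usr/local/cuda` checks a specific directory.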

The h2o startup logs show no issue with the xgboost module (that I can see). I’ve also tried rolling back to CUDA 8.0, but that fails in the latest rocker containers because the gcc version they use is not supported by xgboost.

Any help would be much appreciated as I don’t have a clue :slight_smile:


#2

Just noting here that this Q is linked to this previous one: Tips for installing CUDA into a rocker docker container


#3

It looks like this has been fixed via a suggestion from Erin LeDell. The solution was to remove the stubs folder from the CUDA installation inside the Docker container. I'm not sure whether this will have any additional impact on other use cases.

Thanks for the help in getting this sorted, Noam.
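
For anyone landing here, a minimal sketch of that fix as a shell step (for example, run during the image build). The thread does not give the exact path, so this assumes the stubs folder is at `lib64/stubs` under the CUDA root (commonly `/usr/local/cuda`); `remove_cuda_stubs` is a hypothetical helper name.

```shell
# Hypothetical helper: delete the stubs folder under a CUDA root.
# $1 is the CUDA root; falls back to CUDA_PATH, then /usr/local/cuda.
remove_cuda_stubs() {
  rcs_root="${1:-${CUDA_PATH:-/usr/local/cuda}}"
  rcs_stubs="$rcs_root/lib64/stubs"   # assumed location, check your install
  if [ -d "$rcs_stubs" ]; then
    rm -rf "$rcs_stubs"
    echo "removed $rcs_stubs"
  else
    echo "no stubs folder at $rcs_stubs, nothing to do"
  fi
}
```

Running `remove_cuda_stubs` with no argument acts on CUDA_PATH; pass a directory explicitly to target a different CUDA install. Anything that links against the stub libraries at build time may need them kept, so it is worth testing other workloads afterwards.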