How-to install Ollama on Ubuntu with a vGPU on the ITS Private Cloud
Run large language models (LLMs) with Ollama on Ubuntu Linux, using vGPU resources powered by the U of T ITS Private Cloud infrastructure. This guide uses VSS CLI commands throughout.
We have identified a bug in NVIDIA driver version 525.147.05 when used alongside Ubuntu kernel 5.15.0-105-generic or newer. To ensure this tutorial runs smoothly, we recommend staying on kernel version 5.15.0-102-generic. For further details, refer to the following link: Bug Report
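The kernel constraint above can be checked from inside the guest; a minimal sketch using only standard tools (`uname`, `sort -V`), with the pinned version taken from the note above:

```shell
# Warn if the running kernel is newer than the last known-good version
# from the note above (5.15.0-102-generic). sort -V compares version
# strings, so 5.15.0-105 correctly sorts after 5.15.0-102.
pinned="5.15.0-102-generic"
running="$(uname -r)"
oldest="$(printf '%s\n' "$pinned" "$running" | sort -V | head -n1)"
if [ "$oldest" = "$pinned" ] && [ "$running" != "$pinned" ]; then
  echo "WARNING: kernel $running is newer than $pinned; see the bug report above"
fi
```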
Ollama is an LLM serving platform written in Go. It makes LLMs based on the Llama standard easy to run behind an API.
Table of Contents
Steps
Virtual Machine Deployment
Download the deployment file `ubuntu-llm-ollama.yaml` and update the following attributes:
- `machine.folder`: target logical folder. List available folders with `vss-cli compute folder ls`.
- `metadata.client`: your department client.
- `metadata.inform`: email address for automated notifications.
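For illustration, the three attributes map into the deployment file roughly like this; a hedged sketch only — the attribute names come from this guide, while the layout and placeholder values are assumptions:

```yaml
# Illustrative fragment of the deployment file. Only the three attribute
# names are from this guide; the values here are placeholders.
machine:
  folder: <FOLDER>            # pick from: vss-cli compute folder ls
metadata:
  client: <DEPARTMENT_CLIENT> # your department client
  inform: you@utoronto.ca     # address for automated notifications
```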
Deploy your file as follows:

```shell
vss-cli --wait compute vm mk from-file ubuntu-llm-ollama.yaml
```

Add a 16 GB virtual GPU, specifically the `16q` profile. For more information on the profile used, check the following document: How-to Request a Virtual GPU.

```shell
vss-cli compute vm set <VM_ID> gpu mk --profile 16q
```

Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.
Power on the virtual machine:

```shell
vss-cli compute vm set ubuntu-llm state on
```
NVIDIA Driver and Licensing
The currently supported driver is `nvidia-linux-grid-535_535.183.01_amd64.deb`.
Log in to the server via SSH. Note that the username may change if you further customized the VM with `cloud-init`:

```shell
ssh -p 2222 vss-admin@XXX.XXX.XXX.XXX
```

Download the NVIDIA drivers from VSKEY-STOR:

```shell
scp {vss-user}@vskey-stor.eis.utoronto.ca:/ut-vss-lib/nvidia-grid-vsphere-7.0-535.183.04-535.183.01-538.67/Guest_Drivers/nvidia-linux-grid-535_535.183.01_amd64.deb /tmp/
```
Install the drivers as a privileged user:

```shell
apt install dkms nvtop
apt install /tmp/nvidia-linux-grid-535_535.183.01_amd64.deb
```

Create the `ClientConfigToken` directory:

```shell
mkdir /etc/nvidia/ClientConfigToken
```

Create the NVIDIA token file:

```shell
echo -n -e $(vmware-rpctool "info-get guestinfo.ut.vss.nvidia_token") > /etc/nvidia/ClientConfigToken/client_configuration_token_12-05-2023-11-26-05.tok
```

Set permissions on the NVIDIA token:

```shell
chmod 744 /etc/nvidia/ClientConfigToken/client_configuration_token_12-05-2023-11-26-05.tok
```

Set `FeatureType` to `2` for "NVIDIA RTX Virtual Workstation" in `/etc/nvidia/gridd.conf` with the following command:

```shell
sed -i 's/FeatureType=0/FeatureType=2/g' /etc/nvidia/gridd.conf
```

Restart the `nvidia-gridd` service to pick up the new license token:

```shell
systemctl restart nvidia-gridd
```

Check for any error or successful activation:

```shell
journalctl -u nvidia-gridd
```

Output:
```
Dec 13 11:23:20 ubu-llm systemd[1]: Stopped NVIDIA Grid Daemon.
Dec 13 11:23:20 ubu-llm systemd[1]: Starting NVIDIA Grid Daemon...
Dec 13 11:23:20 ubu-llm systemd[1]: Started NVIDIA Grid Daemon.
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Started (2017)
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: vGPU Software package (0)
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Ignore service provider and node-locked licensing
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: NLS initialized
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Acquiring license. (Info: vss-nvidia-ls.eis.utoronto.ca; NVIDIA RTX Virtual Workstation)
Dec 13 11:23:22 ubu-llm nvidia-gridd[2017]: License acquired successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 16:23:22 GMT)
Dec 13 14:59:24 ubu-llm nvidia-gridd[2017]: License renewed successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 19:59:23 GMT)
```

Verify GPU status with `nvidia-smi`:

```
user@test:~$ nvidia-smi
Wed Dec 13 14:18:27 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.147.05    Driver Version: 525.147.05    CUDA Version: 12.0   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GRID P6-16Q          On  | 00000000:02:00.0 Off |                  N/A |
| N/A   N/A    P8    N/A /  N/A |   6426MiB / 16384MiB |      0%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      3458      C   ...ld/cuda/bin/ollama-runner     6426MiB |
+-----------------------------------------------------------------------------+
```

You can also monitor GPU usage in the console with `nvtop`.
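A successful activation can also be detected non-interactively by searching the journal for the "License acquired successfully" message shown above; a minimal sketch, where a canned sample line stands in for live `journalctl` output:

```shell
# Grep for a successful vGPU license acquisition. A sample line is used
# here so the check can be demonstrated without a GPU host; on the VM,
# pipe the real journal instead:
#   journalctl -u nvidia-gridd | grep -q 'License acquired successfully'
log='Dec 13 11:23:22 ubu-llm nvidia-gridd[2017]: License acquired successfully.'
if printf '%s\n' "$log" | grep -q 'License acquired successfully'; then
  echo "vGPU license: OK"
else
  echo "vGPU license: NOT acquired"
fi
```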
Install the Ollama service
The following installation is done as the user `vss-admin`.
Prerequisites
Download the Anaconda package:

```shell
curl https://repo.anaconda.com/archive/Anaconda3-2023.07-2-Linux-x86_64.sh --output anaconda.sh
```

Install the Anaconda package:

```shell
bash anaconda.sh
```
Install and configure Ollama service
Download and install `ollama`:

```shell
curl https://ollama.ai/install.sh | sh
```

For testing purposes, open a second terminal and run the following commands:

```shell
ollama pull llama2
ollama run llama2
```

You can now test by asking questions. When you are done, exit the model by typing `Ctrl+D` and close the terminal.
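Beyond the interactive prompt, the Ollama service also listens on a local REST API (port 11434 by default). A minimal sketch of a non-streaming request; the prompt text is just an example:

```shell
# JSON body for Ollama's /api/generate endpoint. "stream": false asks
# for a single JSON response instead of a token stream.
payload='{"model": "llama2", "prompt": "Why is the sky blue?", "stream": false}'
echo "$payload"
# On the VM, send it with curl:
#   curl http://localhost:11434/api/generate -d "$payload"
```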
Install & configure Ollama Web UI
Prerequisites
Download and install `nvm`:

```shell
curl https://raw.githubusercontent.com/creationix/nvm/master/install.sh | bash
```

Load the environment or execute the command below:

```shell
source ~/.bashrc
```

Install `nodejs`:

```shell
nvm install v21.4.0
nvm use v21.4.0
```

Install Python 3:

```shell
sudo apt install python3
```
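Before moving on, you can confirm the prerequisites above are on the PATH; a small sketch using only POSIX `command -v`:

```shell
# Report any of the web UI build prerequisites that are not installed;
# prints nothing when everything is present.
for cmd in node npm python3; do
  command -v "$cmd" >/dev/null 2>&1 || echo "missing: $cmd"
done
```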
Install and Configure Ollama Web UI
Download and install `ollama-webui`:

```shell
cd /home/vss-admin
git clone https://github.com/ollama-webui/ollama-webui.git && cd ollama-webui/
```

Create the ollama-webui environment file `.env`:

```shell
cp -RPp .env.example .env
```

Install libraries and build the ollama-webui project:

```shell
npm i
npm run build
```

Install `python3-pip` and `python3-venv`:

```shell
apt install python3-pip python3-venv
```

Create a virtual environment to isolate dependencies:

```shell
cd ./backend
python3 -m venv venv && . venv/bin/activate
```

Install libraries and run the Ollama backend:

```shell
pip install -r requirements.txt
sh start.sh
```

Test the UI application by opening a browser to your server's IP address on port 8080:

```
http://XXX.XXX.XXX.XXX:8080
```

Select a model (llama2) and ask a question.
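`sh start.sh` runs the backend in the foreground, so it stops when your SSH session ends. The systemd articles linked in the references suggest running it as a service instead; a hedged sketch only — the unit name, paths, and environment variable here are assumptions, not part of this guide:

```ini
# /etc/systemd/system/ollama-webui.service (illustrative only)
[Unit]
Description=Ollama Web UI backend
After=network-online.target

[Service]
User=vss-admin
WorkingDirectory=/home/vss-admin/ollama-webui/backend
# Environment variables can be set per service (see the systemd
# references at the end of this guide)
Environment=PORT=8080
ExecStart=/bin/sh start.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with standard systemd usage, e.g. `sudo systemctl enable --now ollama-webui`.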
Troubleshoot
How do I verify that the NVIDIA license has been installed properly?
Solution: run the following command as root:

```shell
journalctl -fu nvidia-gridd
```

Output:

```
Dec 13 11:23:20 ubu-llm systemd[1]: Stopped NVIDIA Grid Daemon.
Dec 13 11:23:20 ubu-llm systemd[1]: Starting NVIDIA Grid Daemon...
Dec 13 11:23:20 ubu-llm systemd[1]: Started NVIDIA Grid Daemon.
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Started (2017)
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: vGPU Software package (0)
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Ignore service provider and node-locked licensing
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: NLS initialized
Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Acquiring license. (Info: vss-nvidia-ls.eis.utoronto.ca; NVIDIA RTX Virtual Workstation)
Dec 13 11:23:22 ubu-llm nvidia-gridd[2017]: License acquired successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 16:23:22 GMT)
Dec 13 14:59:24 ubu-llm nvidia-gridd[2017]: License renewed successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 19:59:23 GMT)
```
Reference links
VSS Cloud documentation
Anaconda
nvidia-smi commands
Ollama
Ollama-webui
nvtop
Linux
https://www.baeldung.com/linux/systemd-services-environment-variables
https://www.alibabacloud.com/blog/how-to-set-environment-variables-in-a-systemd-service_598533