How-to deploy PrivateGPT in Ubuntu with vGPU on the ITS Private Cloud
In December 2023, we announced the launch of virtual GPU capabilities on the ITS Private Cloud, as detailed in our blog post (Introducing Virtual GPUs for Virtual Machines), and we are now working on practical examples that harness the power, affordability, security and privacy of the ITS Private Cloud to run Large Language Models (LLMs).
This How-To focuses on deploying a virtual machine running Ubuntu with a 16GB vGPU using the vss-cli to host PrivateGPT, an open-source Artificial Intelligence project that allows you to ask questions about your documents using the power of LLMs, without data leaving the runtime environment.
Virtual Machine Deployment
Download the vss-cli configuration spec and update the following attributes (an illustrative fragment follows this list):
machine.folder: target logical folder. List available folders with vss-cli compute folder ls
metadata.client: your department client
metadata.inform: email address for automated notifications
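All values in the fragment below are placeholders; only the attributes named above are shown:

machine:
  folder: group-v1234           # placeholder: pick a folder from vss-cli compute folder ls
metadata:
  client: DEPT-CLIENT           # placeholder: your department client
  inform:
    - your.name@utoronto.ca     # placeholder: address for automated notifications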
Deploy with the following command:
vss-cli --wait compute vm mk from-file ubuntu-llm-privategpt.yaml
Add a 16GB virtual GPU, specifically the 16q profile. For more information on the profile used, check the following document: How-to Request a Virtual GPU
vss-cli compute vm set ubuntu-llm gpu mk --profile 16q
Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.
Power on the virtual machine:
vss-cli compute vm set ubuntu-llm state on
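Optionally, confirm the power state afterwards; the exact subcommand below is an assumption, so verify it against vss-cli compute vm get --help:

vss-cli compute vm get ubuntu-llm state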
NVIDIA Driver and Licensing
The currently supported driver is nvidia-linux-grid-535_535.183.01_amd64.deb.
Log in to the server via ssh. Note that the username may change if you further customized the VM with cloud-init.
Download the NVIDIA drivers from VKSEY-STOR:
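The exact transfer command is not reproduced here; a sketch using scp, where the username and remote path are placeholders for the values provided with your VKSEY-STOR account:

# placeholders: USER and the remote path depend on your VKSEY-STOR account
scp USER@VKSEY-STOR:/path/to/nvidia-linux-grid-535_535.183.01_amd64.deb .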
Install the drivers as a privileged user:
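A minimal sketch, assuming the .deb package sits in the current directory; apt resolves any dependencies the package declares:

sudo apt install ./nvidia-linux-grid-535_535.183.01_amd64.deb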
Create the ClientConfigToken directory:
Create the NVIDIA token file:
Set permissions on the NVIDIA token (the three steps are sketched together below):
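This sketch follows NVIDIA's standard vGPU licensing layout; the client configuration token is generated on the licensing portal, and its filename here is a placeholder:

# directory where nvidia-gridd looks for client configuration tokens
sudo mkdir -p /etc/nvidia/ClientConfigToken
# copy the token obtained from the licensing portal (filename is a placeholder)
sudo cp client_configuration_token.tok /etc/nvidia/ClientConfigToken/
# make the token world-readable, writable only by root
sudo chmod 744 /etc/nvidia/ClientConfigToken/client_configuration_token.tok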
Set the FeatureType to 2 for "NVIDIA RTX Virtual Workstation" in /etc/nvidia/gridd.conf.
Restart the nvidia-gridd service to pick up the new license token.
Check the logs for any error or successful license activation (the three steps are sketched together below).
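The sketch assumes the stock gridd.conf shipped with the GRID driver; the sed expression and the log check are illustrative rather than the exact original commands:

# enable the "NVIDIA RTX Virtual Workstation" feature
sudo sed -i 's/^FeatureType=.*/FeatureType=2/' /etc/nvidia/gridd.conf
# restart the licensing daemon so it picks up the token and FeatureType
sudo systemctl restart nvidia-gridd
# look for errors or a successful license acquisition
sudo journalctl -u nvidia-gridd --no-pager | grep -iE 'license|error'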
Verify GPU status with nvidia-smi. You can also monitor GPU usage in the console with nvtop.
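For example (nvtop is available from the Ubuntu repositories if it is not already installed):

# one-time snapshot of GPU, driver and licensing state
nvidia-smi
# interactive, htop-style GPU monitor
sudo apt install -y nvtop
nvtop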
Install PrivateGPT
Dependencies
Log in to the server via ssh. Note that the username may change if you further customized the VM with cloud-init.
Install OS dependencies:
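The original package list is not reproduced above; the set below is a typical assumption for building Python modules and llama-cpp-python on Ubuntu, so adjust it to the actual list:

sudo apt update
# illustrative build prerequisites; not necessarily the original list
sudo apt install -y build-essential git curl cmake pkg-config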
Install Python 3.11, either from source or via ppa:deadsnakes/ppa:
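A sketch of the PPA route; the extra python3.11-* packages are assumptions that make the later virtual environment and native builds work:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.11 python3.11-venv python3.11-dev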
Install the NVIDIA CUDA Toolkit, needed later to recompile llama-cpp-python:
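One possible route is Ubuntu's packaged toolkit; whether this or NVIDIA's own CUDA repository is appropriate depends on the CUDA version you need, so treat the package choice as an assumption:

# provides nvcc, used when rebuilding llama-cpp-python with GPU support
sudo apt install -y nvidia-cuda-toolkit
nvcc --version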
Install PrivateGPT
Clone the source repository:
Create and activate a virtual environment (both steps are sketched below):
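The sketch assumes the upstream PrivateGPT repository and an in-tree virtual environment; directory and environment names are illustrative:

git clone https://github.com/imartinez/privateGPT.git
cd privateGPT
# create and activate a Python 3.11 virtual environment
python3.11 -m venv .venv
source .venv/bin/activate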
Install poetry to get all Python dependencies installed:
Update pip and poetry, then install the PrivateGPT dependencies:
Install llama-cpp-python (these steps are sketched together below):
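The sketch runs inside the activated virtual environment; the poetry options reflect PrivateGPT releases from around the time of writing and may differ in newer versions, so check the project's installation documentation:

# install and update the packaging tools
pip install --upgrade pip poetry
# install PrivateGPT dependencies with the local LLM and web UI groups
# (group/extra names vary between PrivateGPT versions)
poetry install --with ui,local
# llama-cpp-python is pulled in by the step above; the next section rebuilds it with GPU support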
Enable GPU support
Export the following environment variables:
Reinstall llama-cpp-python:
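A sketch of the GPU-enabled rebuild, assuming a llama-cpp-python release that still uses the LLAMA_CUBLAS CMake flag (newer releases renamed it, for example to GGML_CUDA):

# tell pip/CMake to build llama-cpp-python against CUDA
export CMAKE_ARGS="-DLLAMA_CUBLAS=on"
export FORCE_CMAKE=1
# force a from-source rebuild inside the virtual environment
pip install --force-reinstall --no-cache-dir llama-cpp-python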
Run PrivateGPT
Run python3.11 -m private_gpt to start.
Open a web browser to the IP address assigned, on port 8001:
http://XXX.XXX.XXX.XXX:8001
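As a quick check before opening the browser, confirm from the VM that PrivateGPT is listening on port 8001:

curl -I http://localhost:8001
# or inspect the listening sockets
ss -ltn | grep 8001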
Upload a few documents and start asking questions:
Related articles