How-to deploy PrivateGPT in Ubuntu with vGPU on the ITS Private Cloud

In December 2023, we announced the launch of virtual GPU capabilities on the ITS Private Cloud, as detailed in our blog post ( Introducing Virtual GPUs for Virtual Machines ). We are now working on practical examples that harness the power, affordability, security, and privacy of the ITS Private Cloud to run Large Language Models (LLMs).

This How-To focuses on deploying a virtual machine running Ubuntu with a 16 GB vGPU, using the vss-cli, to host PrivateGPT, an open-source Artificial Intelligence project that allows you to ask questions about documents using the power of LLMs, without data leaving the runtime environment.

 

Virtual Machine Deployment

  1. Download the vss-cli configuration spec (ubuntu-llm-privategpt.yaml) and update the following attributes:

    1. machine.folder: target logical folder. List available folders with vss-cli compute folder ls.

    2. metadata.client: your department client.

    3. metadata.inform: email address for automated notifications.

  2. Deploy with the following command:

    vss-cli --wait compute vm mk from-file ubuntu-llm-privategpt.yaml
  3. Add a 16 GB virtual GPU, specifically the 16q profile. For more information on the profile used, check the following document: How-to Request a Virtual GPU

    vss-cli compute vm set ubuntu-llm gpu mk --profile 16q
  4. Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.

  5. Power on the virtual machine:

    vss-cli compute vm set ubuntu-llm state on
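The deployment steps above can be combined into a single sketch script. The VM name ubuntu-llm, the run wrapper, and the DRY_RUN switch are illustrative assumptions for this guide; set DRY_RUN=0 on a workstation with a configured vss-cli to actually execute the commands.

```shell
#!/usr/bin/env bash
# Sketch of the full deployment flow. DRY_RUN=1 (default) only prints the
# commands; set DRY_RUN=0 to execute them with a configured vss-cli.
set -euo pipefail

DRY_RUN="${DRY_RUN:-1}"
VM_NAME="ubuntu-llm"                      # assumed VM name from the YAML spec
SPEC_FILE="ubuntu-llm-privategpt.yaml"
GPU_PROFILE="16q"

run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# 1. Deploy the VM from the configuration spec and wait for completion
run vss-cli --wait compute vm mk from-file "$SPEC_FILE"
# 2. Attach a 16 GB vGPU using the 16q profile
run vss-cli compute vm set "$VM_NAME" gpu mk --profile "$GPU_PROFILE"
# 3. Power on the VM
run vss-cli compute vm set "$VM_NAME" state on
```

Running the script with the default DRY_RUN=1 first is a cheap way to review the exact vss-cli invocations before submitting them.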

NVIDIA Driver and Licensing


The currently supported driver is nvidia-linux-grid-535_535.183.01_amd64.deb.

  1. Log in to the server via ssh. Note that the username may change if you further customized the VM with cloud-init:

    ssh -p 2222 vss-admin@XXX.XXX.XXX.XXX
  2. Download the NVIDIA drivers from VSKEY-STOR:

    scp {vss-user}@vskey-stor.eis.utoronto.ca:/ut-vss-lib/nvidia-grid-vsphere-7.0-535.183.04-535.183.01-538.67/Guest_Drivers/nvidia-linux-grid-535_535.183.01_amd64.deb /tmp/
  3. Install the drivers as privileged user:

    apt install dkms nvtop
    apt install /tmp/nvidia-linux-grid-535_535.183.01_amd64.deb
  4. Create the ClientConfigToken directory:

    mkdir /etc/nvidia/ClientConfigToken
  5. Create the NVIDIA token file:

    echo -n -e $(vmware-rpctool "info-get guestinfo.ut.vss.nvidia_token") > /etc/nvidia/ClientConfigToken/client_configuration_token_12-05-2023-11-26-05.tok
  6. Set permissions to the NVIDIA token:

    chmod 744 /etc/nvidia/ClientConfigToken/client_configuration_token_12-05-2023-11-26-05.tok
  7. Set the FeatureType to 2 for “NVIDIA RTX Virtual Workstation” in /etc/nvidia/gridd.conf with the following command:

    sed -i 's/FeatureType=0/FeatureType=2/g' /etc/nvidia/gridd.conf
  8. Restart nvidia-gridd service to pick up the new license token:

    systemctl restart nvidia-gridd
  9. Check the journal for errors or successful activation:

    journalctl -u nvidia-gridd

    output:

    Dec 13 11:23:20 ubu-llm systemd[1]: Stopped NVIDIA Grid Daemon.
    Dec 13 11:23:20 ubu-llm systemd[1]: Starting NVIDIA Grid Daemon...
    Dec 13 11:23:20 ubu-llm systemd[1]: Started NVIDIA Grid Daemon.
    Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Started (2017)
    Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: vGPU Software package (0)
    Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Ignore service provider and node-locked licensing
    Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: NLS initialized
    Dec 13 11:23:20 ubu-llm nvidia-gridd[2017]: Acquiring license. (Info: vss-nvidia-ls.eis.utoronto.ca; NVIDIA RTX Virtual Workstation)
    Dec 13 11:23:22 ubu-llm nvidia-gridd[2017]: License acquired successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 16:23:22 GMT)
    Dec 13 14:59:24 ubu-llm nvidia-gridd[2017]: License renewed successfully. (Info: vss-nvidia-ls.eis.utoronto.ca, NVIDIA RTX Virtual Workstation; Expiry: 2023-12-14 19:59:23 GMT)
  10. Verify GPU status with nvidia-smi:

    user@test:~$ nvidia-smi
    Wed Dec 13 14:18:27 2023
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 525.147.05   Driver Version: 525.147.05   CUDA Version: 12.0     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  GRID P6-16Q          On  | 00000000:02:00.0 Off |                  N/A |
    | N/A   N/A    P8    N/A /  N/A |   6426MiB / 16384MiB |      0%      Default |
    |                               |                      |             Disabled |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |    0   N/A  N/A      3458      C   ...ld/cuda/bin/ollama-runner     6426MiB |
    +-----------------------------------------------------------------------------+
  11. You can also monitor GPU usage in the console with nvtop:
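As a quick scripted sanity check, the nvidia-gridd journal output shown above can be grepped for the license success message. The check_license helper below is an illustrative sketch for this guide, not part of the NVIDIA tooling.

```shell
# Reads nvidia-gridd log lines on stdin and reports whether a vGPU license
# was acquired or renewed (helper name is an assumption for this guide).
check_license() {
  if grep -Eq 'License (acquired|renewed) successfully'; then
    echo licensed
  else
    echo unlicensed
  fi
}

# Usage on the VM:
#   journalctl -u nvidia-gridd --no-pager | check_license
```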

Install PrivateGPT

Dependencies

  1. Log in to the server via ssh. Note that the username may change if you further customized the VM with cloud-init:

    ssh -p 2222 vss-admin@XXX.XXX.XXX.XXX
  2. Install OS dependencies:

    sudo apt install build-essential cmake
  3. Install Python 3.11, either from source or via the deadsnakes PPA:

    sudo add-apt-repository ppa:deadsnakes/ppa
    sudo apt update
    sudo apt install python3.11-full python3.11-dev -y
  4. Install the NVIDIA CUDA Toolkit, needed to recompile llama-cpp-python later:

    wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
    sudo dpkg -i cuda-keyring_1.1-1_all.deb
    sudo add-apt-repository contrib
    sudo apt-get update
    sudo apt-get -y install cuda-toolkit-12-3
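Once the toolkit is installed, you can confirm the CUDA compiler is present and extract its release number. The cuda_release helper is an illustrative sketch for this guide; nvcc lives under /usr/local/cuda/bin, which is added to PATH in the "Enable GPU support" section below.

```shell
# Extracts the release number from `nvcc --version` output read on stdin
# (helper name is an assumption for this guide).
cuda_release() {
  sed -n 's/.*release \([0-9.]*\),.*/\1/p'
}

# Usage on the VM, once /usr/local/cuda/bin is on PATH:
#   nvcc --version | cuda_release
```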

Install PrivateGPT

  1. Clone the source repository:

    git clone https://github.com/imartinez/privateGPT
  2. Create and activate virtual environment:

    cd privateGPT
    python3.11 --version
    python3.11 -m venv .venv && source .venv/bin/activate
  3. Install poetry, which will install all Python dependencies:

    curl -sSL https://install.python-poetry.org | python3 -
  4. Update pip and poetry, then install the PrivateGPT dependencies:

    pip install --upgrade pip poetry
    poetry install --with ui,local
    ./scripts/setup
  5. Install llama-cpp-python

    pip install llama-cpp-python

Enable GPU support

  1. Export the following environment variables:

    export CUDA_HOME=/usr/local/cuda
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
    export PATH=$PATH:$CUDA_HOME/bin
  2. Reinstall llama-cpp-python:

    CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
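To confirm the rebuild actually linked against CUDA, you can inspect the shared library that llama-cpp-python installed. The links_cublas helper and the library path in the usage comment are illustrative assumptions for this guide.

```shell
# Reads `ldd <library>` output on stdin and reports whether the library
# links against cuBLAS (helper name is an assumption for this guide).
links_cublas() {
  if grep -q 'libcublas'; then echo gpu-enabled; else echo cpu-only; fi
}

# Usage on the VM (adjust the path to your virtual environment):
#   ldd .venv/lib/python3.11/site-packages/llama_cpp/libllama.so | links_cublas
```

If the result is cpu-only, re-run the reinstall step above and check that the CMAKE_ARGS variable was actually picked up during the build.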

 

Run PrivateGPT

  1. Activate the virtual environment and run python3.11 -m private_gpt to start:

    cd privateGPT && source .venv/bin/activate
    python3.11 -m private_gpt
  2. Open a web browser to the assigned IP address on port 8001: http://XXX.XXX.XXX.XXX:8001

  3. Upload a few documents and start asking questions.

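Before opening the browser, you can wait for the service to come up from the command line. The wait_for_ui helper below is an illustrative sketch; the retry count and polling interval are assumptions.

```shell
# Polls the PrivateGPT UI until it answers or the retry budget runs out
# (helper name, retry count, and interval are assumptions for this guide).
wait_for_ui() {
  local url="$1" tries="${2:-30}" i
  for i in $(seq "$tries"); do
    if curl -fsS -o /dev/null "$url"; then echo up; return 0; fi
    sleep 1
  done
  echo down
  return 1
}

# Usage:
#   wait_for_ui http://XXX.XXX.XXX.XXX:8001 && echo "PrivateGPT is ready"
```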

 

