How-to deploy Tabby for AI-powered Code Assistants on the ITS Private Cloud

Introduction

Tabby is an open-source, self-hosted AI coding assistant that offers an alternative to GitHub Copilot. It is designed to help developers write code more efficiently and effectively. As a self-hosted alternative, Tabby lets developers run it on their own hardware and retain complete control over their data and privacy.

This how-to focuses on deploying, with the vss-cli, an Ubuntu virtual machine to host Tabby with the following specs:

  • 8 vCPUs.

  • 200 GB SSD storage.

  • 16 GB memory, reserved.

  • 16 GB vGPU.

Instructions

Virtual Machine deployment

  1. Download the ubuntu-tabby.yaml deployment file and update the following attributes:

    1. machine.folder: target logical folder. List available folders with vss-cli compute folder ls

    2. metadata.client: your department client.

    3. metadata.inform: email address for automated notifications.

  2. Deploy your file as follows:

    vss-cli --wait compute vm mk from-file ubuntu-tabby.yaml
  3. Disable secure boot as a workaround for nvidia-gridd issues:

    vss-cli compute vm set <VM_ID> secure-boot --off
  4. Add a 16 GB virtual GPU, specifically the 16q profile. For more information on the profile used, see the document How-to Request a Virtual GPU.

    vss-cli compute vm set <VM_ID> gpu mk --profile 16q
  5. Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.

  6. Power on the virtual machine:

    vss-cli compute vm set <VM_ID> state on
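The attributes in step 1 map onto the deployment file roughly as follows. This is only an illustrative sketch, not the full ubuntu-tabby.yaml: the values are placeholders, and any key not named in step 1 is an assumption.

```yaml
# Illustrative fragment of ubuntu-tabby.yaml -- values are placeholders.
machine:
  folder: '<target-logical-folder>'   # list candidates with: vss-cli compute folder ls
metadata:
  client: '<department-client>'       # your department client
  inform: 'you@utoronto.ca'           # address for automated notifications
```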

Docker

  1. Add the docker gpg key:

    # Add Docker's official GPG key:
    sudo apt-get update
    sudo apt-get install ca-certificates curl
    sudo install -m 0755 -d /etc/apt/keyrings
    sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
    sudo chmod a+r /etc/apt/keyrings/docker.asc
  2. Add the repository to Apt sources:

    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
      $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
      sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
    sudo apt-get update
  3. Install Docker:

    sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
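Before moving on, it is worth confirming the installation with Docker's standard smoke test. This check is not part of the original steps, just a common sanity check:

```shell
# Print the installed client version.
sudo docker --version
# Pull and run the tiny hello-world image; it prints a greeting and exits.
sudo docker run --rm hello-world
```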

NVIDIA Container Toolkit

  1. Configure the production repository

    curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
      sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
      && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
      sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
      sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  2. Update the package list and install the NVIDIA Container Toolkit packages:

    sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
  3. Configure the container runtime by using the nvidia-ctk command:

    sudo nvidia-ctk runtime configure --runtime=docker
  4. Restart Docker:

    sudo systemctl restart docker
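As a sanity check (not part of the original steps), you can confirm that containers can see the vGPU by running nvidia-smi from a CUDA base image. The image tag below is an assumption; any recent CUDA base image works:

```shell
# Should print the same GPU table as running nvidia-smi on the host.
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```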

Tabby

  1. Create the following docker-compose.yml file:

    version: '3.5'
    services:
      tabby:
        restart: always
        image: registry.tabbyml.com/tabbyml/tabby
        command: serve --model StarCoder-1B --chat-model Qwen2-1.5B-Instruct --device cuda
        volumes:
          - "$HOME/.tabby:/data"
        ports:
          - 8080:8080
        deploy:
          resources:
            reservations:
              devices:
                - driver: nvidia
                  count: 1
                  capabilities: [gpu]
  2. Run docker compose up -d to deploy:

    docker compose up -d
  3. After a few minutes of pulling the image and fetching the models, check the logs to confirm the server is up:

    docker compose logs -f
  4. Open a browser to the VM's IP address on port 8080 and create an admin account.

  5. Once that is completed, you will be able to access your newly deployed instance.


  6. Go to Settings/General and update the Endpoint URL to match the IP address. Eventually this will be the FQDN assigned to the server.

  7. Use the new endpoint URL and Token to configure either your IntelliJ or VS Code IDE.

  8. Celebrate!
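Beyond the IDE plugins, the deployment can be smoke-tested directly against Tabby's HTTP completion API. The sketch below builds a request for the /v1/completions endpoint; the host address and token are placeholders for the values from your deployment, and the payload shape should be verified against your server's API docs.

```python
import json
import urllib.request

# Placeholder endpoint -- substitute your VM's IP address (or FQDN) and port.
TABBY_URL = "http://<VM_IP>:8080/v1/completions"


def build_completion_request(prefix, language="python", token=None):
    """Build a POST request for Tabby's code-completion endpoint.

    The payload shape ({"language": ..., "segments": {"prefix": ...}})
    follows Tabby's completion API; check it against your server's docs.
    """
    payload = {"language": language, "segments": {"prefix": prefix}}
    headers = {"Content-Type": "application/json"}
    if token:
        # The Token from Settings/General, sent as a bearer credential.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        TABBY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Sending the request (requires a reachable Tabby server):
#   with urllib.request.urlopen(build_completion_request("def add(a, b):")) as r:
#       print(json.load(r))
```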



University of Toronto - Since 1827