How-to deploy Danswer for AI-powered Virtual Assistants on the ITS Private Cloud

Introduction

Danswer is an AI assistant that can be connected to many sources, such as Atlassian Confluence, SharePoint, Slack, websites, files, MS Teams and many more. It retrieves documents from those sources, generates embeddings and stores them locally. When a query comes in, it performs a semantic similarity search and passes only the most relevant results to the LLM instead of full documents, which reduces the noise the model sees (see System Overview).

You can use the ITS Private Cloud GPU offering to deploy both Danswer and a local LLM server such as Ollama or vLLM, so that your data never leaves U of T.

Alternatively, you can deploy Danswer on the ITS Private Cloud and use a remote LLM inference service such as Azure OpenAI, ChatGPT or Claude. In that case, only the most relevant results for each query leave our infrastructure.

This How-To focuses on using the vss-cli to deploy an Ubuntu virtual machine that hosts Danswer, sized to hold ~1000 indexed documents, with the following specs:

8 vCPUs.
500 GB SSD storage.
16 GB memory reserved.
16 GB vGPU.

Instructions

Virtual Machine deployment

Download the ubuntu-danswer.yaml file and update the following attributes:
1. machine.folder: target logical folder. List available folders with vss-cli compute folder ls.
2. metadata.client: your department client.
3. metadata.inform: email address for automated notifications.

Deploy the virtual machine from your file as follows:
```
vss-cli --wait compute vm mk from-file ubuntu-danswer.yaml
```

(Optional) If you plan to use Ollama, add a 16 GB virtual GPU using the 16q profile. For more information on the profile, see How-to Request a Virtual GPU.
```
vss-cli compute vm set <VM_ID> gpu mk --profile 16q
```
Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.

Power on the virtual machine:
```
vss-cli compute vm set <VM_ID> state on
```

Docker

Add Docker's official GPG key:
```
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
```

Add the repository to Apt sources:

```
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```

Install Docker:

```
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```
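
Before moving on, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch rather than part of the official Docker instructions; `require_cmds` is a hypothetical helper name introduced here for illustration.

```shell
# require_cmds CMD...: report every command that is not on the PATH.
# Hypothetical helper for illustration; returns 0 only if all are found.
require_cmds() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "not found: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the tools this guide relies on, then the Compose plugin.
# require_cmds docker git curl && sudo docker compose version
```

If any command is reported as not found, re-run the corresponding install step before continuing.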

(Optional) `ollama`

Follow steps 2 and 3 of How-to install Ollama in Ubuntu with vGPU on the ITS Private Cloud.

Danswer

Clone the Danswer repo:

```
git clone https://github.com/danswer-ai/danswer.git
```

Go to danswer/deployment/docker_compose:
```
cd danswer/deployment/docker_compose
```

Configure Danswer by creating a file in danswer/deployment/docker_compose/.env with the following contents:

```
# Configures basic email/password based login
AUTH_TYPE="basic"

# Rephrasing the query into different languages to improve search recall
MULTILINGUAL_QUERY_EXPANSION="English,Spanish"

# Set a cheaper/faster LLM for the flows that are easier (such as translating the query etc.)
FAST_GEN_AI_MODEL_VERSION="gpt-3.5-turbo"

# Setting more verbose logging
LOG_LEVEL="debug"
LOG_ALL_MODEL_INTERACTIONS="true"

DISABLE_TELEMETRY="true"
```
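
A stray character in the .env file is easy to miss, so a quick format check before building can save a rebuild. The sketch below only checks that every line is blank, a comment, or in the plain KEY=value form used above; it does not validate the values themselves, and `lint_env` is a hypothetical helper name, not a Danswer or Docker command.

```shell
# lint_env FILE: flag lines that are not blank, a comment, or KEY=value.
# Hypothetical helper for illustration; it does not validate the values.
lint_env() {
  if [ ! -r "$1" ]; then
    echo "cannot read: $1" >&2
    return 2
  fi
  # grep -v inverts the match and -n prefixes each offending line with its number.
  if grep -nEv '^[[:space:]]*(#.*)?$|^[A-Za-z_][A-Za-z0-9_]*=' "$1" >&2; then
    return 1   # at least one unexpected line was printed
  fi
  return 0
}

# Example, run from danswer/deployment/docker_compose:
# lint_env .env && echo ".env looks well-formed"
```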

More information about configuration settings can be found in the Danswer configuration guide: https://docs.danswer.dev/configuration_guide

Build and start the containers:
```
docker compose -f docker-compose.dev.yml -p danswer-stack up -d --build --force-recreate
```

Danswer will now be running at http://{ip-address}:3000, where {ip-address} is the address from the confirmation email.
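
The containers can take a while to become ready after the first build, so the page may not load immediately. As a rough sketch, the loop below polls the web UI with curl until it answers; `wait_for_http` is a hypothetical helper name, and the default of 30 tries spaced 10 seconds apart is an arbitrary assumption rather than a documented startup time.

```shell
# wait_for_http URL [TRIES] [DELAY]: poll URL until it answers or TRIES run out.
# Hypothetical helper; the defaults (30 tries, 10 s apart) are arbitrary.
wait_for_http() {
  url="$1"; tries="${2:-30}"; delay="${3:-10}"
  while [ "$tries" -gt 0 ]; do
    # -f: treat HTTP errors as failure, -s: quiet, -o /dev/null: discard the body
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    tries=$((tries - 1))
    [ "$tries" -gt 0 ] && sleep "$delay"
  done
  return 1
}

# Example, run on the VM itself:
# wait_for_http http://localhost:3000 && echo "Danswer is up"
```

If the check keeps failing, `docker compose -f docker-compose.dev.yml -p danswer-stack ps` shows which containers are running.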

To stop the stack:

```
docker compose -f docker-compose.dev.yml -p danswer-stack down
```

(Optional) If you are using Ollama on the same instance, configure the LLM provider in Danswer with the following settings:
1. Display name: ollama
2. Provider Name: ollama
3. [Optional] API Base: http://host.docker.internal:11434
4. Model Names:
  1. llama3
  2. phi3
5. Default model: llama3
6. Fast Model: phi3
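
The model names entered above need to match models that Ollama has already pulled. One way to check, sketched below, is to query Ollama's /api/tags endpoint, which lists the local models as JSON, and look for each name. `has_model` is a hypothetical helper name, and the example assumes Ollama is listening on its default port 11434 on the VM.

```shell
# has_model NAME: read the JSON from Ollama's /api/tags on stdin and succeed
# if a model called NAME (with any tag, e.g. "llama3:latest") is listed.
# Hypothetical helper; a plain grep is used so no JSON tooling is required.
has_model() {
  grep -Eq "\"name\"[[:space:]]*:[[:space:]]*\"$1(:[^\"]*)?\""
}

# Example, run on the VM (assumes Ollama listens on its default port 11434):
# for m in llama3 phi3; do
#   curl -fs http://localhost:11434/api/tags | has_model "$m" \
#     || echo "missing: $m (try: ollama pull $m)"
# done
```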

(Optional) If you are using remote inference such as OpenAI or Azure OpenAI, refer to the official Danswer docs: https://docs.danswer.dev/quickstart#generative-ai-api-key

Create your first connector: https://docs.danswer.dev/connectors/overview
