How-to deploy Danswer for AI-powered Virtual Assistants on the ITS Private Cloud
Table of Contents
- 1 Introduction
- 2 Instructions
- 2.1 Virtual Machine deployment
- 2.2 Docker
- 2.3 (Optional) ollama
- 2.4 Danswer
- 3 Related articles
Introduction
Danswer is an AI assistant that can be connected to many sources, such as Atlassian Confluence, SharePoint, Slack, the web, files, MS Teams and many more. It retrieves documents, generates embeddings and stores them locally. When a query comes in, a semantic similarity search is performed and only the most relevant results are passed to the LLM instead of full documents, which reduces noise to the model (System Overview).
You can use the ITS Private Cloud GPU offering to deploy both Danswer and a local LLM server such as Ollama or vLLM, so that your data never leaves U of T.
Alternatively, you can deploy Danswer on the ITS Private Cloud and use a remote LLM inference service such as Azure OpenAI, ChatGPT or Claude. In that case, only the most relevant results, rather than full documents, leave our infrastructure.
This How-To focuses on using the vss-cli to deploy an Ubuntu virtual machine that hosts Danswer and holds ~1000 indexed documents, with the following specs:
- 8 vCPUs.
- 500 GB SSD storage.
- 16 GB memory reserved.
- 16 GB vGPU.
Instructions
Virtual Machine deployment
Download ubuntu-danswer.yaml and update the following attributes:
- machine.folder: target logical folder. List available folders with vss-cli compute folder ls
- metadata.client: your department client.
- metadata.inform: email address for automated notifications.
Deploy your file as follows:
vss-cli --wait compute vm mk from-file ubuntu-danswer.yaml
Disable secure boot as a workaround for nvidia-gridd issues:
vss-cli compute vm set <VM_ID> secure-boot --off
(Optional) If you plan to use Ollama, add a 16 GB virtual GPU, specifically the 16q profile. For more information on the profile used, see How-to Request a Virtual GPU:
vss-cli compute vm set <VM_ID> gpu mk --profile 16q
Once the VM has been deployed, a confirmation email will be sent with the assigned IP address and credentials.
Power on the virtual machine:
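The power-on command is not shown in this step. The following is a minimal sketch, assuming that `state on` is the vss-cli subcommand for changing the power state (confirm with `vss-cli compute vm set --help`) and that `<VM_ID>` is replaced with the ID of the VM deployed above. The `run` helper and the `DRY_RUN` variable are illustrative additions, not part of vss-cli: by default the command is only printed for review.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Placeholder: export VM_ID with the ID (or name) of your VM before running.
VM_ID="${VM_ID:-<VM_ID>}"

# Assumption: "state on" powers on the VM.
run vss-cli compute vm set "$VM_ID" state on
```

Run it with `DRY_RUN=0` and `VM_ID` set once the printed command looks right.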
Docker
Add the Docker GPG key:
Add the repository to Apt sources:
Install Docker:
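The commands for the three Docker steps above are not shown. The sketch below follows the standard apt-based installation published in the Docker documentation for Ubuntu, and it may differ from the commands originally used in this article. The `run` helper, the `DRY_RUN` variable and the `docker_repo_line` function are illustrative names: by default the script only prints what it would do.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# 1. Add the Docker GPG key.
run sudo apt-get update
run sudo apt-get install -y ca-certificates curl
run sudo install -m 0755 -d /etc/apt/keyrings
run sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
run sudo chmod a+r /etc/apt/keyrings/docker.asc

# 2. Add the repository to Apt sources.
# Builds the sources entry from a CPU architecture ($1) and an Ubuntu codename ($2).
docker_repo_line() {
  echo "deb [arch=$1 signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $2 stable"
}
if [ "$DRY_RUN" = "1" ]; then
  echo "+ write /etc/apt/sources.list.d/docker.list: $(docker_repo_line '<arch>' '<codename>')"
else
  docker_repo_line "$(dpkg --print-architecture)" "$(. /etc/os-release && echo "$VERSION_CODENAME")" \
    | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
fi
run sudo apt-get update

# 3. Install Docker Engine together with the Compose plugin.
run sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```

Setting `DRY_RUN=0` executes the steps on the VM. The Compose plugin installed in the last step provides the `docker compose` command.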
(Optional) ollama
Follow steps 2 and 3 of How-to install Ollama in Ubuntu with vGPU on the ITS Private Cloud
Danswer
Clone the Danswer repo:
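The clone command is not shown. As a sketch, assuming the repository is hosted at `https://github.com/danswer-ai/danswer` (verify the current location before cloning), the clone and the change into the compose directory used in the following steps could look like this. The `run` helper and `DRY_RUN` variable are illustrative and only print the commands by default.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumption: repository URL; adjust if the project has moved.
run git clone https://github.com/danswer-ai/danswer.git

# Directory that contains the Docker Compose files.
run cd danswer/deployment/docker_compose
```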
Go to danswer/deployment/docker_compose:
Configure Danswer by creating the file danswer/deployment/docker_compose/.env with the following contents:
More information about configuration settings can be found here: Configuring Danswer - Danswer Documentation
Build the containers:
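The build command is not shown. A minimal sketch, assuming the current directory is `danswer/deployment/docker_compose`, that the compose file is named `docker-compose.dev.yml` and that the project is named `danswer-stack` (both names are taken from the Danswer quickstart and should be checked against your checkout). The `run` helper and `DRY_RUN` variable are illustrative.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumptions: compose file and project names.
COMPOSE_FILE="docker-compose.dev.yml"
PROJECT="danswer-stack"

# Build the images and start the stack in the background.
run docker compose -f "$COMPOSE_FILE" -p "$PROJECT" up -d --build --force-recreate
```

The first build can take a while, since the images are built locally and the embedding models are downloaded.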
Danswer will now be running on http://{ip-address}:3000.
To stop the stack:
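The stop command is not shown. As a sketch, assuming the stack was started from `danswer/deployment/docker_compose` with the compose file `docker-compose.dev.yml` and the project name `danswer-stack` (adjust both if you used different values), it could be stopped as follows. The `run` helper and `DRY_RUN` variable are illustrative.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumptions: same compose file and project name used to start the stack.
COMPOSE_FILE="docker-compose.dev.yml"
PROJECT="danswer-stack"

# Stop the containers but keep them, together with their volumes.
run docker compose -f "$COMPOSE_FILE" -p "$PROJECT" stop
```

Replacing `stop` with `down` would also remove the containers, and `down -v` would additionally delete the volumes that hold the indexed data.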
(Optional) If you are using Ollama on the same instance, use the following settings:
- Display name: ollama
- Provider Name: ollama
- [Optional] API Base: http://host.docker.internal:11434
- Model Names: llama3, phi3
- Default model: llama3
- Fast Model: phi3
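The settings above assume that both models are already available in Ollama. A hedged sketch for pulling them and checking that the Ollama API answers on the VM, assuming Ollama listens on its default port 11434 on the host. The `run` helper, `DRY_RUN` and `OLLAMA_URL` are illustrative names.

```shell
# Illustrative dry-run helper: prints the command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

# Assumption: Ollama is reachable on the VM at its default port.
OLLAMA_URL="http://localhost:11434"

# Pull the models named in the Danswer LLM settings.
for model in llama3 phi3; do
  run ollama pull "$model"
done

# List the locally available models through the Ollama HTTP API.
run curl -s "$OLLAMA_URL/api/tags"
```

Both model names should appear in the JSON returned by the last command before they are entered in Danswer.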
(Optional) If you are using a remote inference service such as OpenAI or Azure OpenAI, refer to the official danswer.ai docs: Quickstart - Danswer Documentation
Create your first connector (Connector Overview - Danswer Documentation)
Related articles