LLM Proxy Server

What is it?

AAQ uses the LiteLLM Proxy Server to manage LLM calls, which lets you use any LiteLLM-supported model, including self-hosted ones.

The proxy server runs as a separate Docker container and reads its configuration from a config.yaml file, where you set the model name and endpoint to use for each LLM task.
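To make the setup concrete, a minimal Docker Compose service for the proxy might look like the sketch below. The image tag, port, and file paths here are illustrative assumptions rather than our actual deployment values; see the Docker Compose Setup for the real service definition.

    # Hypothetical Docker Compose service for the LiteLLM proxy.
    # Image tag, port, and paths are assumptions for illustration.
    services:
      litellm_proxy:
        image: ghcr.io/berriai/litellm:main-stable
        command: ["--config", "/app/config.yaml", "--port", "4000"]
        ports:
          - "4000:4000"
        volumes:
          # Mount the config described on this page into the container.
          - ./litellm_proxy_config.yaml:/app/config.yaml
        environment:
          # Forwarded so that "os.environ/OPENAI_API_KEY" in the config resolves.
          - OPENAI_API_KEY=${OPENAI_API_KEY}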

Example config

An example litellm_proxy_config.yaml file is shown below. In our backend code, we refer to each model by its custom, task-specific model_name (e.g. "generate-response"); which actual LLM each call is routed to is configured here.

model_list:
  - model_name: embeddings
    litellm_params:
      model: text-embedding-ada-002
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: default
    litellm_params:
      model: gpt-4-0125-preview
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: generate-response
    litellm_params:
      model: gpt-4-0125-preview
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: detect-language
    litellm_params:
      model: gpt-3.5-turbo-1106
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: translate
    litellm_params:
      model: gpt-3.5-turbo-1106
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: paraphrase
    litellm_params:
      model: gpt-3.5-turbo-1106
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: safety
    litellm_params:
      model: gpt-3.5-turbo-1106
      api_key: "os.environ/OPENAI_API_KEY"
  - model_name: alignscore
    litellm_params:
      model: gpt-3.5-turbo-1106
      api_key: "os.environ/OPENAI_API_KEY"
litellm_settings:
  num_retries: 3
  request_timeout: 100
  telemetry: False
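Each entry above is just a routing rule, so any task can instead be pointed at a self-hosted model. As a hypothetical sketch, routing the paraphrase task to a model served behind an OpenAI-compatible endpoint (e.g. vLLM) could look like the following; the model name, URL, and key are placeholders:

    # Hypothetical entry: route one task to a self-hosted, OpenAI-compatible server.
    # The "openai/" prefix tells LiteLLM to speak the OpenAI-compatible protocol.
      - model_name: paraphrase
        litellm_params:
          model: openai/my-local-model
          api_base: "http://my-vllm-server:8000/v1"
          api_key: "not-needed"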

See the Contributing Setup and Docker Compose Setup for how this service is run in our stack.

Monitoring with Langfuse

You can log all inputs and outputs of the LiteLLM Proxy server via Langfuse.

  1. Add Langfuse as a success callback in litellm_proxy_config.yaml (the resulting litellm_settings block, merged with the existing settings, is shown after these steps):

    litellm_settings:
      success_callback: ["langfuse"]
    
  2. Include the Langfuse credentials as environment variables in your deployment environment. If you are using Docker Compose, add the following to your deployment/docker-compose/.env file:

    LANGFUSE_PUBLIC_KEY=pk-...
    LANGFUSE_SECRET_KEY=sk-...
    
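With the example config above, the resulting litellm_settings block would then read:

    litellm_settings:
      num_retries: 3
      request_timeout: 100
      telemetry: False
      success_callback: ["langfuse"]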
