
# Running validation

Currently, validation is available only for retrieval, i.e. the `POST /embeddings-search` endpoint.

To evaluate the performance of your model (along with your own configurations and guardrails), run the validation test(s) in `core_backend/validation`.

## Retrieval (`/embeddings-search`) validation

We evaluate the "performance" of retrieval by computing "Top-K Accuracy": the proportion of queries for which the best-matching content appears among the top K retrieved contents.
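
For illustration, here is a minimal sketch of how Top-K Accuracy can be computed from per-query retrieval results (a hypothetical helper, not the test module's actual implementation):

```python
from typing import Sequence

def top_k_accuracy(
    true_labels: Sequence[int],
    retrieved_labels: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Fraction of queries whose best-match label is in the top k retrieved labels."""
    hits = sum(
        true in retrieved[:k]
        for true, retrieved in zip(true_labels, retrieved_labels)
    )
    return hits / len(true_labels)

# The best match is in the top 2 for two of the three queries below.
print(top_k_accuracy([0, 1, 2], [[0, 3], [2, 1], [3, 4]], k=2))  # ~0.67
```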

### Preparing the data

The test assumes the validation data contains a single label representing the best matching content, rather than a ranked list of all relevant content.

Example validation data will look like this:

| query               | label |
| ------------------- | ----- |
| "How?"              | 0     |
| "When?"             | 1     |
| "What year was it?" | 1     |
| "May I?"            | 2     |

Example content data will look like this:

| content_text  | label |
| ------------- | ----- |
| "Here's how." | 0     |
| "It was 2024."| 1     |
| "Yes"         | 2     |

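As a sketch, data in this shape can be prepared with pandas, assuming the test reads tabular files such as CSVs (the file names below are placeholders; pass your actual paths and column names on the command line, as shown further down):

```python
import pandas as pd

# Placeholder file names; point --validation_data_path and --content_data_path
# at your actual files when running the test.
validation_df = pd.DataFrame(
    {
        "query": ["How?", "When?", "What year was it?", "May I?"],
        "label": [0, 1, 1, 2],
    }
)
content_df = pd.DataFrame(
    {
        "content_text": ["Here's how.", "It was 2024.", "Yes"],
        "label": [0, 1, 2],
    }
)
validation_df.to_csv("validation_data.csv", index=False)
content_df.to_csv("content_data.csv", index=False)
```
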
### Setting up

1. Create a new Python environment:

    ```shell
    conda create -n "aaq-validate" python=3.10
    ```

    You can also copy the existing `aaq-core` environment.

2. Install the requirements. This assumes you are in the project root `aaq-core`:

    ```shell
    conda activate aaq-validate
    pip install -r core_backend/requirements.txt
    pip install -r core_backend/validation/requirements.txt
    ```

3. Set the environment variables. (A quick way to sanity-check them is sketched after this list.)

    1. You must export the required environment variables. They are defined with default values in `core_backend/validation/validation.env`. To ensure that these variables are set every time you activate `aaq-validate`, you can run the following command for each variable:

        ```shell
        conda env config vars set <VARIABLE>=<VALUE>
        ```

    2. For the optional ones, check the defaults in `core_backend/app/configs/app_config.py` and modify them as per your requirements. For example:

        ```shell
        conda env config vars set LITELLM_MODEL_EMBEDDING=<...>
        ```

    3. If you are using an external LLM endpoint, e.g. OpenAI, make sure to export the API key variable as well:

        ```shell
        conda env config vars set OPENAI_API_KEY=<your OpenAI API key>
        ```
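
To confirm that the required variables are actually visible in the activated environment, here is a minimal Python sketch. The variable names below are illustrative; the authoritative list lives in `core_backend/validation/validation.env`.

```python
import os

# Illustrative names only; check validation.env for the actual required set.
REQUIRED_VARS = ["LITELLM_MODEL_EMBEDDING", "OPENAI_API_KEY"]

missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
print("All required environment variables are set.")
```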

### Running retrieval validation

In the project root `aaq-core`, run the following command (perform any necessary authentication steps first, e.g. AWS login):

```shell
cd aaq-core

python -m pytest core_backend/validation/validate_retrieval.py \
    --validation_data_path <path> \
    --content_data_path <path> \
    --validation_data_question_col <name> \
    --validation_data_label_col <name> \
    --content_data_label_col <name> \
    --content_data_text_col <name> \
    --notification_topic <topic ARN, if using AWS SNS> \
    --aws_profile <aws SSO profile name, if required> \
    -n auto -s
```

`-n auto` enables multiprocessing to speed up the test, and `-s` ensures that logging from the test module is shown on your stdout.

For details of the command-line arguments, see the "Custom options" section of the output of the following command:
```shell
python -m pytest core_backend/validation/validate_retrieval.py --help
```