@manueldeprada
Last active November 15, 2023 17:20

Running QAFactEval in 2023

First, you need to create a conda or venv environment. This conda.yml file lists the required packages:

name: QAFactEval
channels:
- conda-forge
dependencies:
- python=3.8.18=hd12c33a_0_cpython
- spacy=2.2.4 
- spacy-model-en_core_web_sm=2.2.5
- gdown=4.7.1
- pysocks=1.7.1
- pip:
  - qafacteval==0.10

Micromamba is a fast, self-contained C++ conda runtime. You can create the environment and set up QAFactEval with these commands:

micromamba env create -f conda.yml -y
micromamba activate QAFactEval
git clone https://github.com/salesforce/QAFactEval.git
cd QAFactEval
./download_models.sh

Ready for some fact evaluations!

echo '{"document": {"text": "This is a source document"}, "claim": "This is a summary"}' > input.jsonl
python run.py --model_folder models --fname input.jsonl --outfname out.jsonl --cuda_device -1
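
To score more than one pair, the input file can be generated from a script instead of echo. This is a minimal sketch that reuses the record layout from the echo command above. It assumes run.py reads one JSON object per line (as the .jsonl extension suggests); the helper name write_inputs and the second example pair are made up for illustration.

```python
import json


def write_inputs(pairs, path="input.jsonl"):
    """Write (document, summary) pairs, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for document, summary in pairs:
            # Same keys as the echo example: document.text and claim.
            record = {"document": {"text": document}, "claim": summary}
            f.write(json.dumps(record) + "\n")


# Hypothetical example data; replace with your own documents and summaries.
pairs = [
    ("This is a source document", "This is a summary"),
    ("The cat sat on the mat.", "A cat was on a mat."),
]
write_inputs(pairs)
```

The resulting input.jsonl can then be passed to run.py with the same --fname argument as above.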

You can try to use your GPU by removing the "--cuda_device -1" argument, but most GPUs won't work, since this code is tied to torch 1.6 (which lacks CUDA kernels for modern architectures). If needed, you could try updating torch.
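
Before dropping the CPU flag, it may help to check whether the installed torch can actually launch a kernel on your GPU. The sketch below is not part of QAFactEval. The helper name pick_cuda_device is hypothetical, and it assumes that 0 selects the first GPU and that an unsupported architecture shows up as a RuntimeError on the first CUDA operation, which is the usual symptom.

```python
def pick_cuda_device():
    """Return 0 if a small CUDA op succeeds, else -1 (the CPU value used above)."""
    try:
        import torch
    except ImportError:
        # torch is not installed in this environment, so fall back to CPU.
        return -1
    if not torch.cuda.is_available():
        return -1
    try:
        # A tiny op forces a real kernel launch; GPUs that this torch build
        # has no kernels for typically raise RuntimeError at this point.
        (torch.zeros(1, device="cuda") + 1).cpu()
    except RuntimeError:
        return -1
    return 0


print(pick_cuda_device())
```

If this prints -1, keep "--cuda_device -1" in the run.py command.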

Anyway, the models are quite small, so running on CPU is practical. That command will create an out.jsonl file that looks like:

{"document": {"text": "This is a source document"}, "claim": "This is a summary", "metrics": {"qa-eval": {"f1": 0.0, "is_answered": 1.0, "em": 0.0}}, "qa_pairs": [[{"question": {"question_id": "dc50bcdddb09fd6e2772d349a1e8dd58", "question": "What is this?", "answer": "a summary", "sent_start": 0, "sent_end": 17, "answer_start": 8, "answer_end": 17}, "prediction": {"prediction_id": "cfcd208495d565ef66e7dff9f98764da", "prediction": "a source document", "probability": 0.735170068387259, "null_probability": 2.769950940990251e-05, "start": 8, "end": 25, "f1": 0, "is_answered": 1.0, "em": 0}}]], "qa_pairs_nonfiltered": [[{"question_id": "dc50bcdddb09fd6e2772d349a1e8dd58", "question": "What is this?", "answer": "a summary", "sent_start": 0, "sent_end": 17, "answer_start": 8, "answer_end": 17}]], "qa_summary": [[{"prediction_id": "cfcd208495d565ef66e7dff9f98764da", "prediction": "a summary", "probability": 0.6332314891401608, "null_probability": 2.6775376205867233e-05, "start": 8, "end": 17}]]}
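
The scores sit under metrics → qa-eval in each output record. A short sketch for pulling them out, assuming every line of out.jsonl has the same layout as the sample above (the helper name read_scores is made up for illustration):

```python
import json
import os


def read_scores(path="out.jsonl"):
    """Collect the claim and its qa-eval metrics from each output line."""
    rows = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            qa_eval = record["metrics"]["qa-eval"]
            rows.append({
                "claim": record["claim"],
                "f1": qa_eval["f1"],
                "em": qa_eval["em"],
                "is_answered": qa_eval["is_answered"],
            })
    return rows


# Only print when an output file is actually present.
if os.path.exists("out.jsonl"):
    for row in read_scores():
        print(row)
```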
