First, you need to create a conda or venv environment. This conda.yml file lists the packages:
name: QAFactEval
channels:
  - conda-forge
dependencies:
  - python=3.8.18=hd12c33a_0_cpython
  - spacy=2.2.4
  - spacy-model-en_core_web_sm=2.2.5
  - gdown=4.7.1
  - pysocks=1.7.1
  - pip:
    - qafacteval==0.10
Micromamba is a fast, self-contained C++ conda runtime. You can create the environment and set up QAFactEval with this script:
micromamba env create -f conda.yml -y
micromamba activate QAFactEval
git clone https://github.com/salesforce/QAFactEval.git
cd QAFactEval
./download_models.sh
Ready for some fact evaluations!
echo '{"document": {"text": "This is a source document"}, "claim": "This is a summary"}' > input.jsonl
python run.py --model_folder models --fname input.jsonl --outfname out.jsonl --cuda_device -1
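To score more than one pair, the input file can be generated rather than typed by hand. This is a minimal sketch that assumes run.py reads one JSON object per line in the same shape as the echo command above. The `write_input` helper and the second example pair are hypothetical and only there for illustration.

```python
import json

# Hypothetical example data: (source document, summary) pairs to evaluate.
pairs = [
    ("This is a source document", "This is a summary"),
    ("The cat sat on the mat.", "A cat was sitting on a mat."),
]


def write_input(pairs, path="input.jsonl"):
    """Write one JSON object per line, mirroring the shape of the echo example."""
    with open(path, "w", encoding="utf-8") as f:
        for document, claim in pairs:
            record = {"document": {"text": document}, "claim": claim}
            f.write(json.dumps(record) + "\n")


write_input(pairs)
```

The resulting input.jsonl can then be passed to run.py through --fname exactly as in the command above.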
You can try to use your GPU by removing the "--cuda_device -1" argument, but most GPUs won't work, since this code is tied to torch 1.6 (which lacks CUDA kernels for modern architectures). If needed, you could try updating torch.
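Before dropping the flag, it may help to see what the installed torch reports about your GPU. This is a rough sketch using standard torch.cuda calls. The `describe_gpu` helper is a hypothetical name. Note that a device being reported as available does not guarantee that this torch build ships kernels for it, so treat the output as a hint only.

```python
def describe_gpu():
    """Return a short description of what the installed torch can see."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: no usable CUDA device, keep --cuda_device -1"
    # Compute capability is a (major, minor) tuple, e.g. (7, 5).
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    return f"torch {torch.__version__}: {name}, compute capability {major}.{minor}"


print(describe_gpu())
```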
Anyway, the models are quite small. That command will create an out.jsonl file that looks like:
{"document": {"text": "This is a source document"}, "claim": "This is a summary", "metrics": {"qa-eval": {"f1": 0.0, "is_answered": 1.0, "em": 0.0}}, "qa_pairs": [[{"question": {"question_id": "dc50bcdddb09fd6e2772d349a1e8dd58", "question": "What is this?", "answer": "a summary", "sent_start": 0, "sent_end": 17, "answer_start": 8, "answer_end": 17}, "prediction": {"prediction_id": "cfcd208495d565ef66e7dff9f98764da", "prediction": "a source document", "probability": 0.735170068387259, "null_probability": 2.769950940990251e-05, "start": 8, "end": 25, "f1": 0, "is_answered": 1.0, "em": 0}}]], "qa_pairs_nonfiltered": [[{"question_id": "dc50bcdddb09fd6e2772d349a1e8dd58", "question": "What is this?", "answer": "a summary", "sent_start": 0, "sent_end": 17, "answer_start": 8, "answer_end": 17}]], "qa_summary": [[{"prediction_id": "cfcd208495d565ef66e7dff9f98764da", "prediction": "a summary", "probability": 0.6332314891401608, "null_probability": 2.6775376205867233e-05, "start": 8, "end": 17}]]}
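The output is easier to inspect once it is parsed. The sketch below pulls the summary-level scores and the question/answer pairs out of out.jsonl. The field names are taken from the sample output above, while the helper names `read_scores` and `iter_qa` and the reading of each field (expected answer from the summary, prediction from the source document) are my assumptions.

```python
import json
import os


def read_records(path="out.jsonl"):
    """Yield one parsed JSON record per non-empty line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def read_scores(path="out.jsonl"):
    """Yield (claim, f1, em, is_answered) for each evaluated summary."""
    for record in read_records(path):
        qa = record["metrics"]["qa-eval"]
        yield record["claim"], qa["f1"], qa["em"], qa["is_answered"]


def iter_qa(record):
    """Yield (question, expected answer, predicted answer) for one record."""
    for group in record["qa_pairs"]:
        for pair in group:
            question = pair["question"]
            yield question["question"], question["answer"], pair["prediction"]["prediction"]


# Only runs when an output file is actually present.
if os.path.exists("out.jsonl"):
    for claim, f1, em, is_answered in read_scores():
        print(f"{claim!r}: f1={f1} em={em} is_answered={is_answered}")
```

For the sample above, `iter_qa` would give the question "What is this?" with the expected answer "a summary" and the predicted answer "a source document", which is why f1 and em are both 0.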