This directory implements the DSTC7 Ubuntu response ranking task, allowing users to reproduce the results of the PolyAI contextual encoder on this task.
See also:
- Download Ubuntu_st1_test and Ubuntu_st1_ground_truth from the dataset page.
You should now have a directory containing ubuntu_responses_subtask_1.tsv and
ubuntu_test_subtask_1.json.
- Run the evaluation script
python dstc7/evaluate_encoder.py \
--examples_json ubuntu_test_subtask_1.json \
--labels_tsv ubuntu_responses_subtask_1.tsv \
--encoder http://models.poly-ai.com/ubuntu_convert/v1/model.tar.gz
This should give the final output:
Recall@1 = 0.712
Recall@10 = 0.931
Recall@50 = 0.986
MRR = 0.788
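
For readers unfamiliar with the reported metrics, here is a minimal sketch of how Recall@k and MRR are conventionally computed from the rank of the correct response in each test example. This is not the code in dstc7/evaluate_encoder.py: the function name `ranking_metrics` and its `ranks` input are hypothetical, chosen only for illustration, and the script itself may compute these values differently.

```python
from typing import Dict, Sequence


def ranking_metrics(
    ranks: Sequence[int], ks: Sequence[int] = (1, 10, 50)
) -> Dict[str, float]:
    """Compute Recall@k and MRR from 1-based ranks of the correct response.

    `ranks` is a hypothetical input: ranks[i] is the position of the true
    response for example i after sorting its candidates by model score
    (1 means the model ranked the true response first).
    """
    if not ranks:
        raise ValueError("ranks must contain at least one example")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")

    n = len(ranks)
    # Recall@k: fraction of examples whose true response is in the top k.
    metrics = {f"Recall@{k}": sum(r <= k for r in ranks) / n for k in ks}
    # MRR: mean of the reciprocal rank of the true response.
    metrics["MRR"] = sum(1.0 / r for r in ranks) / n
    return metrics


if __name__ == "__main__":
    # Toy ranks for four examples, not real evaluation data.
    for name, value in ranking_metrics([1, 2, 11, 100]).items():
        print(f"{name} = {value:.3f}")
```

Under this definition, Recall@1 is the fraction of test examples where the encoder scores the correct response highest, and MRR rewards placing the correct response near the top even when it is not ranked first.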