| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| LLAMA_Harsha_8_B_ORDP_10k | 35.54 | 71.15 | 55.39 | 37.96 | 50.01 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 26.77 | ± | 2.78 |
|  |  | acc_norm | 27.17 | ± | 2.80 |
| agieval_logiqa_en | 0 | acc | 31.34 | ± | 1.82 |

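For reference, the Average column in each summary table is the unweighted mean of the four suite scores, rounded to two decimals; a quick check in Python against the row above:

```python
# Check the Average column: it is the unweighted mean of the four suite scores.
scores = {"AGIEval": 35.54, "GPT4All": 71.15, "TruthfulQA": 55.39, "Bigbench": 37.96}
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 50.01, matching the LLAMA_Harsha_8_B_ORDP_10k row
```
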
| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| Phi-3-mini-4k-instruct | 44.44 | 71.88 | 57.77 | 41.90 | 54.00 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 29.13 | ± | 2.86 |
|  |  | acc_norm | 28.74 | ± | 2.85 |
| agieval_logiqa_en | 0 | acc | 42.86 | ± | 1.94 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| dolphin-2.8-mistral-7b-v02 | 38.99 | 72.22 | 51.96 | 40.41 | 50.90 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 21.65 | ± | 2.59 |
|  |  | acc_norm | 20.47 | ± | 2.54 |
| agieval_logiqa_en | 0 | acc | 35.79 | ± | 1.88 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| distilabeled-Marcoro14-7B-slerp | 45.38 | 76.48 | 65.68 | 48.18 | 58.93 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 27.56 | ± | 2.81 |
|  |  | acc_norm | 25.98 | ± | 2.76 |
| agieval_logiqa_en | 0 | acc | 39.17 | ± | 1.91 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| openchat-3.5-1210 | 42.62 | 72.84 | 53.21 | 43.88 | 53.14 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 22.44 | ± | 2.62 |
|  |  | acc_norm | 24.41 | ± | 2.70 |
| agieval_logiqa_en | 0 | acc | 41.17 | ± | 1.93 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| openchat_3.5 | 42.67 | 72.92 | 47.27 | 42.51 | 51.34 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 24.02 | ± | 2.69 |
|  |  | acc_norm | 24.80 | ± | 2.72 |
| agieval_logiqa_en | 0 | acc | 38.86 | ± | 1.91 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| zephyr-7b-beta | 37.33 | 71.83 | 55.10 | 39.70 | 50.99 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 21.26 | ± | 2.57 |
|  |  | acc_norm | 20.47 | ± | 2.54 |
| agieval_logiqa_en | 0 | acc | 33.33 | ± | 1.85 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| MistralTrix-v1 | 44.98 | 76.62 | 71.44 | 47.17 | 60.05 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 25.59 | ± | 2.74 |
|  |  | acc_norm | 24.80 | ± | 2.72 |
| agieval_logiqa_en | 0 | acc | 37.48 | ± | 1.90 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| Mistral-7B-Instruct-v0.2 | 38.50 | 71.64 | 66.82 | 42.29 | 54.81 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 23.62 | ± | 2.67 |
|  |  | acc_norm | 22.05 | ± | 2.61 |
| agieval_logiqa_en | 0 | acc | 36.10 | ± | 1.88 |

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| dolphin-2.2.1-mistral-7b | 38.64 | 72.24 | 54.09 | 39.22 | 51.05 |

| Task | Version | Metric | Value |  | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 23.23 | ± | 2.65 |
|  |  | acc_norm | 21.26 | ± | 2.57 |
| agieval_logiqa_en | 0 | acc | 35.48 | ± | 1.88 |
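
The per-task tables follow the output format of EleutherAI's lm-evaluation-harness. Below is a minimal sketch of re-running the two AGIEval tasks shown above with that harness's Python API (v0.4+); the repo id is a placeholder, and the exact harness version and task names used for these tables are our assumptions, so treat it as illustrative rather than the original evaluation command:

```python
# Minimal sketch, assuming lm-evaluation-harness v0.4+ and its registered
# AGIEval task names. MODEL_ID is a hypothetical placeholder: substitute the
# actual Hugging Face repo id of the model being evaluated.
import lm_eval

MODEL_ID = "your-org/LLAMA_Harsha_8_B_ORDP_10k"  # hypothetical repo id

results = lm_eval.simple_evaluate(
    model="hf",                           # Hugging Face transformers backend
    model_args=f"pretrained={MODEL_ID}",
    tasks=["agieval_aqua_rat", "agieval_logiqa_en"],
)

# results["results"] maps each task to its metrics (acc, acc_norm, and their
# stderrs), which, multiplied by 100, correspond to the Value and Stderr
# columns in the tables above.
for task, metrics in results["results"].items():
    print(task, metrics)
```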