![[ICO]](/icons/blank.gif) | Name | Last modified | Size | Description |
|---|---|---|---|---|
![[ ]](/icons/unknown.gif) | knowledge_toxicity.json | 09-Jan-2024 17:38 | 9.6K | |
![[ ]](/icons/unknown.gif) | reasoning_apps_metrics.json | 09-Jan-2024 17:38 | 12K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_apps_metrics.json | 09-Jan-2024 17:38 | 12K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_bbq_metrics.json | 09-Jan-2024 17:38 | 12K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_disinformation_metrics.json | 09-Jan-2024 17:38 | 18K | |
![[ ]](/icons/unknown.gif) | calibration_accuracy.json | 09-Jan-2024 17:38 | 22K | |
![[ ]](/icons/unknown.gif) | knowledge_bias.json | 09-Jan-2024 17:38 | 23K | |
![[ ]](/icons/unknown.gif) | question_answering_toxicity.json | 09-Jan-2024 17:38 | 23K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_copyright_metrics.json | 09-Jan-2024 17:38 | 24K | |
![[ ]](/icons/unknown.gif) | knowledge_efficiency.json | 09-Jan-2024 17:38 | 28K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_toxicity.json | 09-Jan-2024 17:38 | 28K | |
![[ ]](/icons/unknown.gif) | knowledge_calibration.json | 09-Jan-2024 17:38 | 28K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_calibration.json | 09-Jan-2024 17:38 | 28K | |
![[ ]](/icons/unknown.gif) | knowledge_fairness.json | 09-Jan-2024 17:38 | 32K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_fairness.json | 09-Jan-2024 17:38 | 32K | |
![[ ]](/icons/unknown.gif) | knowledge_robustness.json | 09-Jan-2024 17:38 | 32K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_robustness.json | 09-Jan-2024 17:38 | 33K | |
![[ ]](/icons/unknown.gif) | knowledge_accuracy.json | 09-Jan-2024 17:38 | 35K | |
![[ ]](/icons/unknown.gif) | question_answering_efficiency.json | 09-Jan-2024 17:38 | 40K | |
![[ ]](/icons/unknown.gif) | question_answering_calibration.json | 09-Jan-2024 17:38 | 45K | |
![[ ]](/icons/unknown.gif) | core_scenarios_toxicity.json | 09-Jan-2024 17:38 | 45K | |
![[ ]](/icons/unknown.gif) | reasoning_accuracy.json | 09-Jan-2024 17:38 | 49K | |
![[ ]](/icons/unknown.gif) | reasoning_efficiency.json | 09-Jan-2024 17:38 | 49K | |
![[ ]](/icons/unknown.gif) | question_answering_accuracy.json | 09-Jan-2024 17:38 | 50K | |
![[ ]](/icons/unknown.gif) | question_answering_fairness.json | 09-Jan-2024 17:38 | 51K | |
![[ ]](/icons/unknown.gif) | question_answering_robustness.json | 09-Jan-2024 17:38 | 52K | |
![[ ]](/icons/unknown.gif) | gsm_gsm_.json | 14-Feb-2024 14:13 | 53K | |
![[ ]](/icons/unknown.gif) | med_qa_med_qa_.json | 14-Feb-2024 14:13 | 54K | |
![[ ]](/icons/unknown.gif) | core_scenarios_calibration.json | 09-Jan-2024 17:38 | 55K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench_subset:proa.json | 14-Feb-2024 14:13 | 55K | |
![[ ]](/icons/unknown.gif) | narrative_qa_narrative_qa_.json | 14-Feb-2024 14:13 | 55K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench_subset:abercrombie.json | 14-Feb-2024 14:13 | 56K | |
![[ ]](/icons/unknown.gif) | openbookqa_openbookqa_.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14_source_language:cs,target_language:en.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14_source_language:fr,target_language:en.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14_source_language:ru,target_language:en.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14_source_language:de,target_language:en.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | natural_qa_closedbook_natural_qa_closedbook_mode:closedbook.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14_source_language:hi,target_language:en.json | 14-Feb-2024 14:13 | 57K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu_subject:abstract_algebra.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu_subject:college_chemistry.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu_subject:computer_security.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu_subject:us_foreign_policy.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu_subject:econometrics.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench_subset:international_citizenship_questions.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench_subset:function_of_decision_section.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench_subset:corporate_lobbying.json | 14-Feb-2024 14:13 | 58K | |
![[ ]](/icons/unknown.gif) | natural_qa_openbook_longans_natural_qa_openbook_longans_mode:openbook_longans.json | 14-Feb-2024 14:13 | 59K | |
![[ ]](/icons/unknown.gif) | core_scenarios_summarization_metrics.json | 09-Jan-2024 17:38 | 62K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:algebra,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 65K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:number_theory,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 65K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:precalculus,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 65K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:prealgebra,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 65K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:geometry,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 66K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:intermediate_algebra,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 67K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought_subject:counting_and_probability,level:1,use_official_examples:False,use_chain_of_thought:True.json | 14-Feb-2024 14:13 | 67K | |
![[ ]](/icons/unknown.gif) | core_scenarios_fairness.json | 09-Jan-2024 17:38 | 68K | |
![[ ]](/icons/unknown.gif) | core_scenarios_robustness.json | 09-Jan-2024 17:38 | 68K | |
![[ ]](/icons/unknown.gif) | question_answering_bias.json | 09-Jan-2024 17:38 | 83K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_accuracy.json | 09-Jan-2024 17:38 | 92K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_bias.json | 09-Jan-2024 17:38 | 105K | |
![[ ]](/icons/unknown.gif) | calibration_calibration_detailed.json | 09-Jan-2024 17:38 | 134K | |
![[ ]](/icons/unknown.gif) | wmt_14_wmt_14.json | 14-Feb-2024 14:13 | 139K | |
![[ ]](/icons/unknown.gif) | legalbench_legalbench.json | 14-Feb-2024 14:13 | 154K | |
![[ ]](/icons/unknown.gif) | knowledge_general_information.json | 09-Jan-2024 17:38 | 161K | |
![[ ]](/icons/unknown.gif) | core_scenarios_accuracy.json | 14-Feb-2024 14:13 | 174K | |
![[ ]](/icons/unknown.gif) | core_scenarios_efficiency.json | 14-Feb-2024 14:13 | 176K | |
![[ ]](/icons/unknown.gif) | mmlu_mmlu.json | 14-Feb-2024 14:13 | 177K | |
![[ ]](/icons/unknown.gif) | core_scenarios_bias.json | 09-Jan-2024 17:38 | 179K | |
![[ ]](/icons/unknown.gif) | question_answering_general_information.json | 09-Jan-2024 17:38 | 246K | |
![[ ]](/icons/unknown.gif) | reasoning_general_information.json | 09-Jan-2024 17:38 | 269K | |
![[ ]](/icons/unknown.gif) | math_chain_of_thought_math_chain_of_thought.json | 14-Feb-2024 14:13 | 280K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_efficiency_detailed.json | 09-Jan-2024 17:38 | 512K | |
![[ ]](/icons/unknown.gif) | targeted_evaluations_general_information.json | 09-Jan-2024 17:38 | 644K | |
![[ ]](/icons/unknown.gif) | core_scenarios_general_information.json | 14-Feb-2024 14:13 | 811K | |