Index of /helm/benchmark_output/releases/latest/groups/json

[ICO]  Name  Last modified  Size  Description

[DIR]  Parent Directory  -
[   ]  instruction_following_instruction_following_metrics.json  13-Feb-2024 16:23  90K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf.json  13-Feb-2024 16:23  22K
[   ]  grammar_grammar.json  13-Feb-2024 16:23  18K
[   ]  open_assistant_open_assistant.json  13-Feb-2024 16:23  15K
[   ]  vicuna_vicuna.json  13-Feb-2024 16:23  13K
[   ]  self_instruct_self_instruct.json  13-Feb-2024 16:23  13K
[   ]  koala_koala.json  13-Feb-2024 16:23  12K
[   ]  grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:mturk.json  13-Feb-2024 16:23  9.6K
[   ]  grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:claude.json  13-Feb-2024 16:23  9.5K
[   ]  grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:scale.json  13-Feb-2024 16:23  9.4K
[   ]  grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:gpt4.json  13-Feb-2024 16:23  9.4K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:claude.json  13-Feb-2024 16:23  9.3K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:scale.json  13-Feb-2024 16:23  9.2K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:mturk.json  13-Feb-2024 16:23  9.2K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:gpt4.json  13-Feb-2024 16:23  9.2K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:claude.json  13-Feb-2024 16:23  9.2K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:scale.json  13-Feb-2024 16:23  9.1K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:mturk.json  13-Feb-2024 16:23  9.1K
[   ]  anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:gpt4.json  13-Feb-2024 16:23  9.1K
[   ]  open_assistant_open_assistant_language:en,evaluator:mturk.json  13-Feb-2024 16:23  9.0K
[   ]  open_assistant_open_assistant_language:en,evaluator:scale.json  13-Feb-2024 16:23  8.8K
[   ]  open_assistant_open_assistant_language:en,evaluator:claude.json  13-Feb-2024 16:23  8.8K
[   ]  open_assistant_open_assistant_language:en,evaluator:gpt4.json  13-Feb-2024 16:23  8.8K
[   ]  vicuna_vicuna_category:all,evaluator:scale.json  13-Feb-2024 16:23  8.3K
[   ]  vicuna_vicuna_category:all,evaluator:mturk.json  13-Feb-2024 16:23  8.3K
[   ]  vicuna_vicuna_category:all,evaluator:gpt4.json  13-Feb-2024 16:23  8.2K
[   ]  vicuna_vicuna_category:all,evaluator:claude.json  13-Feb-2024 16:23  8.2K
[   ]  self_instruct_self_instruct_evaluator:mturk.json  13-Feb-2024 16:23  8.2K
[   ]  koala_koala_evaluator:mturk.json  13-Feb-2024 16:23  8.1K
[   ]  koala_koala_evaluator:claude.json  13-Feb-2024 16:23  8.1K
[   ]  self_instruct_self_instruct_evaluator:claude.json  13-Feb-2024 16:23  8.0K
[   ]  koala_koala_evaluator:gpt4.json  13-Feb-2024 16:23  8.0K
[   ]  self_instruct_self_instruct_evaluator:scale.json  13-Feb-2024 16:23  8.0K
[   ]  koala_koala_evaluator:scale.json  13-Feb-2024 16:23  8.0K
[   ]  self_instruct_self_instruct_evaluator:gpt4.json  13-Feb-2024 16:23  7.9K

Apache/2.2.15 (CentOS) Server at nlp.stanford.edu Port 443