Index of /helm/benchmark_output/releases/latest/groups/json

Name	Last modified	Size

instruction_following_instruction_following_metrics.json	13-Feb-2024 16:23	90K
anthropic_hh_rlhf_anthropic_hh_rlhf.json	13-Feb-2024 16:23	22K
grammar_grammar.json	13-Feb-2024 16:23	18K
open_assistant_open_assistant.json	13-Feb-2024 16:23	15K
vicuna_vicuna.json	13-Feb-2024 16:23	13K
self_instruct_self_instruct.json	13-Feb-2024 16:23	13K
koala_koala.json	13-Feb-2024 16:23	12K
grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:mturk.json	13-Feb-2024 16:23	9.6K
grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:claude.json	13-Feb-2024 16:23	9.5K
grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:scale.json	13-Feb-2024 16:23	9.4K
grammar_grammar_path:src_helm_benchmark_scenarios_best_chatgpt_prompts.yaml,tags:,evaluator:gpt4.json	13-Feb-2024 16:23	9.4K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:claude.json	13-Feb-2024 16:23	9.3K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:scale.json	13-Feb-2024 16:23	9.2K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:mturk.json	13-Feb-2024 16:23	9.2K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:red_team,evaluator:gpt4.json	13-Feb-2024 16:23	9.2K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:claude.json	13-Feb-2024 16:23	9.2K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:scale.json	13-Feb-2024 16:23	9.1K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:mturk.json	13-Feb-2024 16:23	9.1K
anthropic_hh_rlhf_anthropic_hh_rlhf_subset:hh,evaluator:gpt4.json	13-Feb-2024 16:23	9.1K
open_assistant_open_assistant_language:en,evaluator:mturk.json	13-Feb-2024 16:23	9.0K
open_assistant_open_assistant_language:en,evaluator:scale.json	13-Feb-2024 16:23	8.8K
open_assistant_open_assistant_language:en,evaluator:claude.json	13-Feb-2024 16:23	8.8K
open_assistant_open_assistant_language:en,evaluator:gpt4.json	13-Feb-2024 16:23	8.8K
vicuna_vicuna_category:all,evaluator:scale.json	13-Feb-2024 16:23	8.3K
vicuna_vicuna_category:all,evaluator:mturk.json	13-Feb-2024 16:23	8.3K
vicuna_vicuna_category:all,evaluator:gpt4.json	13-Feb-2024 16:23	8.2K
vicuna_vicuna_category:all,evaluator:claude.json	13-Feb-2024 16:23	8.2K
self_instruct_self_instruct_evaluator:mturk.json	13-Feb-2024 16:23	8.2K
koala_koala_evaluator:mturk.json	13-Feb-2024 16:23	8.1K
koala_koala_evaluator:claude.json	13-Feb-2024 16:23	8.1K
self_instruct_self_instruct_evaluator:claude.json	13-Feb-2024 16:23	8.0K
koala_koala_evaluator:gpt4.json	13-Feb-2024 16:23	8.0K
self_instruct_self_instruct_evaluator:scale.json	13-Feb-2024 16:23	8.0K
koala_koala_evaluator:scale.json	13-Feb-2024 16:23	8.0K
self_instruct_self_instruct_evaluator:gpt4.json	13-Feb-2024 16:23	7.9K

Apache/2.2.15 (CentOS) Server at nlp.stanford.edu Port 443