This website is experimental.

Provider    Model             Total  Reasoning (4)  History (4)  Humanities (5)  Science (5)  Coding (5)  Creativity (4)  Opinion (4)  Math (5)  Philosophy (5)  Engineering (5)
OpenAI      o3-pro            1266   1185           1143         1115            980          1104        1055            1125         1034      1135            1122
OpenAI      GPT-5-mini        1212   1060           1047         1077            1172         1032        1109            1063         1103      1115            1131
OpenAI      GPT-5             1161   1111           1102         1181            1092         1088        1197            1119         1029      1046            1142
OpenAI      GPT-OSS           1160   1010           1082         1051            1052         1062        995             1086         1076      1073            1202
OpenAI      o3                1150   1200           1151         1256            1113         953         1100            1240         1027      1057            1115
OpenAI      GPT-5-nano        1087   1022           1072         1010            1012         1062        1016            1034         1079      1034            1086
DeepSeek    R1                1058   991            1059         996             1090         873         1071            1098         1011      1079            992
DeepSeek    V3-1-thinking     1031   1034           1053         1052            1021         986         964             982          978       1141            922
Qwen        Qwen-3-thinking   980    1051           894          1053            994          1037        938             1011         1112      962             962
MoonshotAI  kimi-k2           973    845            1008         1010            1048         944         963             919          929       1093            987
DeepSeek    V3-1              965    911            1009         1052            1000         871         960             1005         1011      1029            998
Anthropic   Claude-opus-4-1   947    1062           966          948             901          1191        963             933          1000      1010            895
xAI         Grok-4            897    946            982          912             896          1021        895             837          913       768             829
OpenAI      o4-mini           878    981            953          962             1025         946         1069            1086         1017      930             1038
Google      Gemini-2.5-flash  871    816            858          880             916          985         924             874          919       907             888
Google      Gemini-2.5-pro    855    919            871          806             900          1025        938             798          931       804             866
Anthropic   Claude-sonnet-4   791    959            891          858             850          959         935             950          879       891             908
Qwen        Qwen-3-coder      718    900            858          779             941          861         907             841          954       927             917