This website is experimental.

Provider   Model            Total  History (4)  Humanities (5)  Opinion (4)  Creativity (4)  Coding (5)  Science (5)  Philosophy (5)  Math (5)  Reasoning (4)  Engineering (5)
OpenAI     GPT-5.1          1,081  1114         1050            1039         1025            962         1072         1002            903       1031           1075
Moonshot   Kimi-k2.5        1,181  1098         1124            1021         1076            1072        995          1063            1066      1082           1032
OpenAI     GPT-5.2          1,148  1064         1017            1017         1139            1078        1168         1047            1013      1071           1013
OpenAI     GPT-OSS          1,079  1041         967             1148         981             1041        992          1072            1058      1064           1216
Google     Gemini-3-Pro     994    1016         1038            1002         1017            1016        1027         970             1030      998            1025
Z          GLM-5            871    1012         912             942          957             1009        871          933             932       988            953
DeepSeek   V3-2-thinking    969    1008         1019            921          1074            964         935          1015            969       875            956
DeepSeek   R1               1,013  1006         1028            1124         1030            891         1070         1048            995       966            846
Qwen       Qwen-3-thinking  1,005  982          993             1055         883             994         1083         977             1033      1019           971
Moonshot   Kimi-k2          865    980          904             926          1025            927         1044         920             920       947            976
Z          GLM-4.7          894    972          991             912          934             1003        945          932             1023      968            984
Anthropic  Claude-Opus-4-5  1,102  964          987             1030         1002            997         1007         1129            1053      1013           1068
OpenAI     o4-mini          913    955          934             976          982             996         964          918             975       955            931
Google     Gemini-3-Flash   1,044  904          1067            973          910             1056        890          1012            1049      1049           1066
xAI        Grok-4-1-fast    842    884          968             915          963             997         935          961             981       974            888
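
The rows above appear to be ordered by the History column rather than by Total. As a minimal sketch for re-ranking by any other column, the Python below parses whitespace-separated rows and sorts them. It assumes the eleven numbers in each row map, in order, onto the header columns from Total to Engineering. The names `COLUMNS`, `parse_row`, and `rank_by` are hypothetical helpers for illustration, not part of the site.

```python
# Hypothetical helpers: parse leaderboard rows and re-rank by a chosen column.
# Assumption: the last 11 whitespace-separated tokens of each row are the
# scores, in the same order as the header (Total, then the ten categories).
COLUMNS = ["Total", "History", "Humanities", "Opinion", "Creativity", "Coding",
           "Science", "Philosophy", "Math", "Reasoning", "Engineering"]


def parse_row(line):
    """Split one row into provider, model, and integer scores."""
    tokens = line.split()
    provider = tokens[0]
    # Strip thousands separators such as "1,081" before converting.
    scores = [int(t.replace(",", "")) for t in tokens[-len(COLUMNS):]]
    model = " ".join(tokens[1:-len(COLUMNS)])
    return {"Provider": provider, "Model": model, **dict(zip(COLUMNS, scores))}


def rank_by(rows, column):
    """Return rows sorted by the given column, highest score first."""
    return sorted(rows, key=lambda r: r[column], reverse=True)


# Three rows copied from the table above.
sample = [
    "OpenAI GPT-5.1 1,081 1114 1050 1039 1025 962 1072 1002 903 1031 1075",
    "Moonshot Kimi-k2.5 1,181 1098 1124 1021 1076 1072 995 1063 1066 1082 1032",
    "OpenAI GPT-5.2 1,148 1064 1017 1017 1139 1078 1168 1047 1013 1071 1013",
]
rows = [parse_row(line) for line in sample]

for r in rank_by(rows, "Total"):
    print(r["Model"], r["Total"])
```

Passing a different column name, for example `rank_by(rows, "Coding")`, reorders the same rows by that category.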