Jim's Page

 

Chess IQ Test

Compiled by Jim Monaghan

 

 Download the IQ Test

 

I. Tactical IQ Test for Chess Programs

This is a test suite from the master section of Livshits' book "Test Your Chess IQ". There are 360 positions that are carefully balanced with medium to hard examples.

The test as presented here is intended to estimate the tactical strength of chess engines. The test duration is one hour (360 positions at 10 seconds each). Run the test on a chess engine at 10 seconds per position, note the percentage score achieved, and compare it with the table below:

Percent   IQ Elo   Correct
Correct           Solutions

  100      2764      360
   90      2644      324
   80      2524      288
   70      2404      252
   60      2284      216
   50      2164      180
   40      2044      144
   30      1924      108
   20      1804       72
   10      1684       36
    0      1564        0

The floor for the test is 1564, the ceiling is 2764. Note that each percentage point correct earns 12 "IQ Elo" points. So the rating formula is:

IQ Elo = 1564 + (% correct x 12)
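
As a quick check of the formula, here is a minimal Python sketch. The function names `iq_elo` and `iq_elo_from_solved` are made up for illustration; the constants 1564, 12, and 360 come from the text above. Rounding to a whole number is an assumption.

```python
FLOOR = 1564             # IQ Elo at 0% correct
POINTS_PER_PERCENT = 12  # each percentage point earns 12 IQ Elo
TOTAL_POSITIONS = 360


def iq_elo(percent_correct):
    """IQ Elo for a percentage score between 0 and 100."""
    return FLOOR + POINTS_PER_PERCENT * percent_correct


def iq_elo_from_solved(solved, total=TOTAL_POSITIONS):
    """IQ Elo from a count of solved positions, rounded to a whole number."""
    # 12 points per percent is 1200 points across the whole suite.
    return round(FLOOR + 100 * POINTS_PER_PERCENT * solved / total)


print(iq_elo(50))               # 2164, the 50% row of the table
print(iq_elo_from_solved(288))  # 2524, the 80% row of the table
```

Working from the solved count rather than a rounded percentage avoids small rounding differences in the result.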

The test generates a tactical IQ rating for an engine on whatever hardware it is tested on. Strategic factors and endgame knowledge are not considered in the test and are in no way measured. The ratings are not comparable to "real" rating lists. It's just for fun, not too serious.

II. Comparison of Ply Depth versus Rating Performance

Working with average ply depth instead of time makes the table below more universally applicable to different programs and hardware. Again there is nothing absolute about the rating numbers -- it's the differences that are important. By pumping the new IQ test suite through Yace 0.99.56 at ascending ply depths, I came up with the following table:

Cel 1.3 GHz/256 (32 MB HT)

Avg Depth   Found        Percent        IQ Elo      Rating Gain
(Plies)   (Total=360)    Correct      (Max=2764)   (Difference)

 5           159          44.17%          2094          ---
 6           206          57.22%          2251          157
 7           241          66.94%          2367          116
 8           271          75.28%          2467          100
 9           288          80.00%          2524           57
10           308          85.56%          2591           67
11           315          87.50%          2614           23

Notice in the ply depth table above that Yace solved 315/360 at 11 ply. Even after 15 minutes, at however many ply that reaches, a ceiling is definitely reached. Piercing this barrier will take some unique type of pruning or extensions, or a lot more speed.

The tree really explodes at 12 ply on a lot of these positions, and searching it would take a huge amount of time. I would expect very little gain in performance anyway.

Knowing (or assuming) that the relationship between ply depth and rating performance is a logarithmic function, I wonder if a mathematician can extrapolate this table working with columns 1 and 5. Hopefully there is enough of a trend. The rating gain going from ply 8 to ply 9 is either a little low at 57, or the gain from ply 9 to ply 10 is a little high at 67, or both. The chart could perhaps step a little better, but I'm reporting it exactly as it came out. Maybe the table can be smoothed out for projection purposes. What would the expected gain be going from ply 11 to ply 12, ply 12 to ply 13, etc.? Is there a ceiling? In a theoretical sense, I guess there isn't a ceiling -- but practically, when you consider the time spent to reach higher plies, there might as well be. Interesting...
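
One hedged way to try the extrapolation asked about above is a least-squares fit of IQ Elo against the logarithm of depth, taking the logarithmic relationship as an assumption rather than a fact. The sketch below uses columns 1 and 4 of the table (fitting the ratings rather than the gains is a choice made here for simplicity). The helper names are hypothetical. The fit knows nothing about the 2764 maximum or about the plateau seen at 11 ply, so its projections are only illustrative.

```python
import math

# Columns 1 and 4 of the ply depth table above.
DEPTHS = [5, 6, 7, 8, 9, 10, 11]
RATINGS = [2094, 2251, 2367, 2467, 2524, 2591, 2614]


def fit_log_curve(depths, ratings):
    """Least-squares fit of rating = a + b * ln(depth); returns (a, b)."""
    xs = [math.log(d) for d in depths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ratings) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ratings))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, depth):
    """Rating the fitted curve gives at a given depth."""
    return a + b * math.log(depth)


a, b = fit_log_curve(DEPTHS, RATINGS)
for depth in (12, 13):
    gain = predict(a, b, depth) - predict(a, b, depth - 1)
    print(f"ply {depth - 1} to {depth}: projected gain {gain:.0f}")
```

Under this model the gain from ply d-1 to ply d is b * ln(d / (d - 1)), which keeps shrinking as d grows but never reaches zero. That is consistent with the guess that there is no theoretical ceiling, only a practical one.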

 

III. Real Tough IQ
 
The following 16 positions are unsolved by Yace 0.99.75c after one hour per position on a P4 2.53:
 
6k1/1p3r1p/r2q1Pp1/2p1n1P1/p1PpP3/P2P2Q1/1P4B1/4RRK1 w - - bm Rf5; id "IQ.1023";
5rk1/p1q1ppb1/3p2p1/3P1bBp/1pr4P/5PN1/PPPQ2P1/1KR4R b - - bm Bc3; id "IQ.1027";
2rq1rk1/pp1bnpbp/4p1p1/3pP1N1/3P2Q1/2PB4/P4PPP/R1B1R1K1 w - - bm Nxh7; id "IQ.1031";
r3qrk1/1b3p2/1p1npnp1/2b1N1N1/p1P5/P1B5/1PB1QP1P/R2R2K1 w - - bm Nd7; id "IQ.1057";
r3k2r/2pn1pp1/p1p2qb1/Np2P1b1/6P1/3P2B1/PPP2P2/RN1Q1RK1 b kq - bm Nxe5; id "IQ.1111";
r2r4/pp3p2/4bkpp/8/7P/3B1P2/PP4P1/1K1R3R b - - bm Rxd3; id "IQ.1149";
1rb1r1k1/3n1ppp/p1p1p3/q3P3/7P/2N1Q1R1/PPP3P1/2KR1B2 w - - bm Rxd7; id "IQ.1182";
r2qnrk1/pp2ppb1/3p3B/2p5/7Q/1PNP3b/1PP3PP/R4RK1 w - - bm Ne4; id "IQ.1191";
r2r2k1/pp3ppp/2p1bn2/7q/NbP3P1/1P2B2P/PQ3PB1/R4RK1 b - - bm Bxg4; id "IQ.1194";
r1b2rk1/2q1bppp/p2pp3/2n3PQ/1p1BP3/1BN5/PPP2P1P/2KR2R1 w - - bm Bf6; id "IQ.1198";
2rq1rk1/1b3pb1/5np1/1pn1p1Np/4P2Q/2N1B3/BPP3PP/R4R1K w - - bm Nxf7; id "IQ.1226";
r4rk1/1bpq1ppp/3p1b2/2nP1N2/8/1p3Q1P/PPB2PP1/R1B1R1K1 w - - bm Nxg7; id "IQ.1248";
rnb3kr/1p1nqppp/p3p3/2ppP3/3P1N2/2NB1Q2/PPP2PP1/R3K2R w KQ - bm Bxh7+; id "IQ.1253";
2kr3r/pp1b1p2/1qn1pb2/2p5/4Q3/5N1B/PPP2PPP/R1B2RK1 b - - bm Rxh3; id "IQ.1259";
2r2rk1/p1n3pp/1p2p3/1q1pQP2/3Pn3/6N1/PP4PP/R1B2RK1 w - - bm Bh6; id "IQ.1270";
r2q1rk1/3nbpp1/4p2p/p1p1P3/1p1P3P/3B1b1R/PPQB1PP1/R3K3 w - - bm Bxh6; id "IQ.1274";
 
I need to analyse these positions more to see if Yace is "seeing" something that the solutions have missed.
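
The lines above are in EPD format: four position fields (piece placement, side to move, castling rights, en passant square) followed by semicolon-terminated operations such as `bm` (best move) and `id`. Here is a minimal parsing sketch, assuming no semicolons appear inside quoted operands. `parse_epd` is a hypothetical helper, not part of any library.

```python
def parse_epd(line):
    """Split one EPD record into its position string and an operations dict."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    operations = {}
    rest = fields[4] if len(fields) > 4 else ""
    for op in rest.split(";"):
        op = op.strip()
        if not op:
            continue
        opcode, _, operand = op.partition(" ")
        operand = operand.strip()
        if opcode in ("bm", "am"):
            # Move lists: one or more moves separated by spaces.
            operations[opcode] = operand.split()
        else:
            # Other operands (such as id) are kept as text without quotes.
            operations[opcode] = operand.strip('"')
    return position, operations


record = ('6k1/1p3r1p/r2q1Pp1/2p1n1P1/p1PpP3/P2P2Q1/1P4B1/4RRK1 '
          'w - - bm Rf5; id "IQ.1023";')
position, operations = parse_epd(record)
print(operations["id"], operations["bm"])  # IQ.1023 ['Rf5']
```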


IV. "Rating List"

George Lyapko ran the old IQ test on his AMD K6-2/450 for most of the free Winboard programs back in Nov. 2001 using 10 sec/move, no opening book, and no EGTBs.

Program          Score  % IQ Elo
Yace_09956        247  69  2387
Phalanx_XXII      245  68  2381
LG2000_30         241  67  2367
TCB_0045          228  63  2324
Bringer_18        226  63  2317
Gromit_300        225  63  2314
Crafty_1812       222  62  2304
Pepito_142        222  62  2304
Bionic_401        221  61  2301
Nejmet_260        221  61  2301
Zchess_222        219  61  2294
Glc_215c          219  61  2294
Pharaon_250       217  60  2287
Anmon_515         216  60  2284
Inmi_305          215  60  2281
Tao_44            214  59  2277
Amy_07c           213  59  2274
WildCat_261       210  58  2264
Comet_b37         209  58  2261
Exchess_402       209  58  2261
KingOfKings_200   205  57  2247
Bestia_083        204  57  2244
Terra_25          204  57  2244
Knightx_171a      201  56  2234
Ant_606           196  54  2217
Arasan_54         195  54  2214
Quark_150         193  54  2207
Gnuchess_414      192  53  2204
Fortress_162      190  53  2197
Beowulf_17        189  53  2194
LordKing_III      187  52  2187
Dragon_42         187  52  2187
Freyr_1067        184  51  2177
Queen_211         179  50  2161
Sjeng_11          178  49  2157
Ssechess_2045     178  49  2157
Gerbil_02         175  49  2147
Amyan_148         162  45  2104
Esc_104           161  45  2101
Olithink_305      156  43  2084
Tristram_416      154  43  2077
Ghost_v0_13       150  42  2064
Gully2_c          148  41  2057
Grizzly_125       137  38  2021
Monik_211         135  38  2014
Rzeznik_14        130  36  1997
EnginMax_287      129  36  1994
Ufim_143          128  36  1991
Holmes_050Beta    128  36  1991
Mint_23           119  33  1961
Chessterfield_i5a 112  31  1937
Aldebaran_070     103  29  1907
StAndersen_12      81  23  1834
Skaki_119c         81  23  1834
Storm_06           66  18  1784
Ozwald_043         53  15  1741

And two results on an ancient Am5x86-P75-S/133:

Yace_09956        147  41  2054
Bestia_083        139  39  2027

Although not perfect, IQ seems to be a moderately good predictor of playing strength.


V. Debug Notes

I've been debugging the old IQ suite lately. The following 10 positions needed to be replaced. For the first nine, I've found alternate solutions, so the second solution just needed to be added. The 10th position was missing the "+" symbol in its solution. These changes have been incorporated into the new IQ.epd file.

2rr2k1/pb3p1p/1pq3p1/4R1N1/2n5/P4P2/BP2Q1PP/4R2K w - - bm Nxf7 Re7; id "IQ.932";
2r1k1nr/3bbpp1/p2p2P1/4pP2/1pqNP3/2N1B3/PPP3Q1/1K1R2R1 w k - bm b3 Nd5; id "IQ.964";
1kqr2r1/ppp5/2nb1p2/1Q1N2pb/P2P4/1RN1P3/3B1PP1/5R1K b - - bm Be2 Rh8; id "IQ.1011";
5rk1/5pp1/3b4/1pp2qB1/4R2Q/1BPn4/1P3PPP/6K1 b - - bm Bf4 Ra8; id "IQ.1091";
5b2/pr4pk/4P1Rp/1pppBPP1/8/1P2P3/P6K/8 w - - bm Rxh6+ gxh6; id "IQ.1123";
2kr2r1/Qpq2p1p/1n2p3/2b2p2/8/2B2PP1/PPPN3P/R3K2R b - - bm Rxd2 Rxg3; id "IQ.1204";
r1b2k1r/1p4pp/p4B2/2bpN3/8/q2n4/P1P2PPP/1R1QR1K1 w - - bm Bxg7+ Qh5; id "IQ.1244";
r1n1nrk1/p4p1p/1q4pQ/2p1pN2/1pB1P1P1/5P2/PPP4P/1K1R3R w - - bm Rd6 Rhg1; id "IQ.1276";
r3r1k1/bpp1q1pp/p3bp2/2p4Q/4N3/1P2PP2/PB3P1P/R2R3K w - - bm Nxf6+ Rg1; id "IQ.1287";
r4rk1/4ppbp/1q2bnp1/n1p4P/4P1P1/2NBBP2/PP1Q4/1K1R2NR b - - bm Bxa2+; id "IQ.1290";

(11-14-2002) Four additional lines are corrected. Thanks to Uri, Andreas, and GCP.
4rk2/1p3r2/3q1nQp/1R1P2p1/P2pp3/3B2PP/1P3P2/2R3K1 w - - bm Qxh6+ Rxb7; id "IQ.967";
4rbk1/1q3ppp/2Rr4/1p1P1B2/2b1PR2/p5P1/5P1P/B1Q3K1 w - - bm Bxh7+ Rxd6 Rh4 Bxg7; id "IQ.973";
r1b1k2r/p1p1nppp/2p5/3q4/8/1P4P1/P1P1QK1P/RNB1R3 b kq - am O-O; id "IQ.1172";
r6r/pb1R1pk1/1p2p1pp/3nP3/4N2Q/4B3/P3qPPP/3R2K1 w - - bm Bxh6+; id "IQ.1267";

(11-15-2002) Two more corrections. Thanks Uri.
2qrrb2/pb3ppk/1npp3p/5N2/1p2n3/1P3NPB/PBPQ1P1P/3RR1K1 w - - bm Rxe4 Qf4; id "IQ.928";
4r2r/pp3k2/2p1pq2/3nR3/2PP1pp1/1B1Q2P1/PP3PK1/6R1 b - - bm Ne3+ Nb4; id "IQ.977";

(12-27-2002) Seven more positions with corrected solutions. Thanks to Dieter Buerssner and Yace.
7r/8/3p1p1r/p1kP2p1/Pp2P1P1/1PpR3P/5R1K/8 b - - bm Re8; id "IQ.1121";
8/r4p2/6p1/1pknP2p/2p1b3/P1P2N2/2BK2PP/R7 b - - bm Nxc3 Bxc2; id "IQ.1150";
2r5/1p6/pq2p2p/3rN3/k2P2Q1/3R2P1/1P3PP1/1K6 b - - bm Qxb2+ Qd6; id "IQ.1157";
3rr1k1/1p2bpp1/2ppq2p/p1n5/2P4P/1PN1P1P1/PBQR1PK1/2R5 b - - bm Bxh4 Bf6; id "IQ.1158";
2r1r1k1/4qpb1/p2p2p1/1p1Pn1P1/3BB3/P1P5/1P4Q1/1K2R2R w - - bm Bxg6 Qh2 Qh3; id "IQ.1282";
4rbk1/n2n3p/b1q1p1p1/1p1pP1B1/1PrP1N2/1R3NP1/3Q1PB1/R5K1 w - - bm Nxg6 Bf1; id "IQ.1285";
r4rk1/4ppbp/1q2bnp1/n1p4P/4P1P1/2NBBP2/PP1Q4/1K1R2NR b - - bm Bxa2+ Rfb8; id "IQ.1290";

(12-27-2002) Note: IQ.1162 seems suspect. The solution of 1. Bf4 tries to seal the BQ on the kingside and then after 1... exf4, White achieves a draw due to a perpetual attack on it. But Black's 1... e4! seems to cross this plan and Black looks better.

(12-31-2002) Two suspect positions have been removed. IQ.1162 and IQ.1167 were broken by Miguel and Uri respectively. Thanks, guys. These positions have been replaced with IQ.894 and IQ.895. Total positions are still 360.
r3rnk1/4qpp1/p5np/4pQ2/Pb2N3/1B5P/1P3PP1/R1BR2K1 w - - bm Bxh6; id "IQ.894";
r1br2k1/p1q2pp1/4p1np/2ppP2Q/2n5/2PB1N2/2P2PPP/R1B1R1K1 w - - bm Bxh6; id "IQ.895";
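
The corrected records above show why scoring has to accept more than one answer: `bm` may list several best moves, and IQ.1172 uses `am` (avoid move) instead. Here is a hedged sketch of that check, assuming the engine's choice is available as a SAN string. `is_solved` and `normalize` are hypothetical helpers. Ignoring a trailing "+" or "#" is an assumption made here so that a missing check symbol does not cost a point.

```python
def normalize(move):
    """Drop a trailing check or mate marker so 'Bxa2+' matches 'Bxa2'."""
    return move.rstrip("+#")


def is_solved(engine_move, bm=(), am=()):
    """True if the move is one of the best moves and none of the avoid moves."""
    played = normalize(engine_move)
    if played in {normalize(m) for m in am}:
        return False
    if bm:
        return played in {normalize(m) for m in bm}
    return True  # only avoid moves were given, and none was played


print(is_solved("Rxb7", bm=["Qxh6+", "Rxb7"]))  # True  (IQ.967)
print(is_solved("O-O", am=["O-O"]))             # False (IQ.1172)
```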

VI. Information and Improvements

If you run the test and would like to share your results, please send them along and they will be included here. If you find any alternative or questionable solutions, please send them so the test can be improved.

My thanks to the posters at CCC and the WB forum for pointing out errors and omissions.

Enjoy the test.