Datapoint 2200: the common ancestor!

June 17th, 2010

I wrote a post on my hackintosh blog about the Datapoint 2200: in so many ways the ancestor of all our modern computers…


The funny thing is that this 1971 “terminal”, built from hundreds of TTL modules implementing what became the Intel 8008 instruction set, with up to 8KB of RAM on the original model, could have run an adapted version of the Sargon chess program, in 1971! Datapoint 2200 on Wikipedia


Sargon Book by Dan & Kathe Spracklen

June 17th, 2010

I read many, many things these days, including interviews of Dan Spracklen & Kathe Spracklen. They have been writing computer chess history since their first tournament in 1978, which they unexpectedly won; they hold many world champion titles, and they deserve them. Sargon was their first program and their mainstream one for years, on Z80, 6502 and 68000.


Today I received the “Sargon book”, published by Hayden, which contains the full, fully annotated source code of Sargon for the Zilog Z80 CPU (a faster, 8080-compatible CPU that was itself cloned many times, for example as the NSC 800, a buggy version found in the Canon X-07 hand-held computer). This book needs to be read and annotated by anyone willing to write an efficient chess program. I saw some code that might be rewritten to be faster (e.g. the 8-bit x 8-bit multiply loops 8 times even when the multiplier has no set bits left after the right rotation, and the loop could have been unrolled), but honestly it’s a masterpiece!
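To make the multiply remark concrete, here is a minimal C sketch of the idea. It is not Sargon's Z80 code: the function names are made up for illustration, and it only assumes a classic shift-and-add 8-bit x 8-bit multiply. The first version always runs 8 rounds; the second stops as soon as the multiplier has no set bits left.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-count shift-and-add: always 8 rounds, whatever the multiplier. */
uint16_t mul8_fixed(uint8_t a, uint8_t b)
{
    uint16_t acc = 0;
    uint16_t addend = a;            /* multiplicand, doubled every round */
    for (int i = 0; i < 8; i++) {
        if (b & 1)                  /* lowest multiplier bit set? */
            acc += addend;
        addend <<= 1;
        b >>= 1;                    /* move on to the next multiplier bit */
    }
    return acc;
}

/* Early-exit variant: leaves the loop once the multiplier is exhausted. */
uint16_t mul8_early_exit(uint8_t a, uint8_t b)
{
    uint16_t acc = 0;
    uint16_t addend = a;
    while (b != 0) {                /* no set bits left: nothing more to add */
        if (b & 1)
            acc += addend;
        addend <<= 1;
        b >>= 1;
    }
    return acc;
}

/* Exhaustive check: both versions must agree with a * b for all inputs. */
void mul8_check(void)
{
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            assert(mul8_fixed((uint8_t)a, (uint8_t)b) == a * b);
            assert(mul8_early_exit((uint8_t)a, (uint8_t)b) == a * b);
        }
    }
}
```

With a small multiplier such as 3, the early-exit version does 2 rounds instead of 8. Whether that pays off on a real Z80 depends on the cost of the extra test in each round, which this sketch does not measure.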

They even ended up beating a 6-million-dollar Amdahl mainframe with a personal computer that cost thousands of times less!

In my collection I have the Fidelity Electronics Chess Challenger Excel Mach III (the first USCF Master-rated electronic chess computer!) with their program on it. It plays beautiful chess, solid and elegant (for a 1987 program!), with a 2079 Elo rating.

The most surprising thing, given their incredible achievements, is their humility. Thanks for your incredible work on chess computing! An interview of Dan & Kathe Spracklen you have to read here (PDF)

PS: The full source code of the Sargon chess program, in Z80 assembly language, is here. Enjoy, and appreciate the comments they wrote: some young programmers might learn a lesson or two!


Memory limits and new developer generation…

June 14th, 2010

The new generation of developers is used to counting in GB (gigabyte = 1 billion bytes) for main memory and TB (1,000 billion bytes!) for storage space!

I began programming in the 70’s, so my generation was used to counting memory in bytes, and to thinking of external storage not as a way to avoid running out of memory, but just as a way to access data and store its state, always keeping in mind that this data should be as close to the CPU or execution unit as possible!

My first programmable computer of my own was a TI-57, the one with the LED display, 50 program steps and 8 registers (1 dedicated to decrementing loops, 1 to comparison). There were no useless instructions, execution was really slow too, and you could not afford such luxuries as external storage or non-optimized code. A great lesson in using each resource to its fullest.

Chess programmers of the 70’s had to deal with exactly this kind of limitation. Imagine a full chess engine in ’72, on a 4-bit micro-controller (that is, a 4-bit CPU + peripherals on one chip), with 2KB of ROM and 80 bytes of memory (yes, 80, organized as 160 x 4 bits). David Levy and his team did that! Incredible to me!
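To get a feel for an 80-byte memory seen as 160 cells of 4 bits, here is a small C sketch that emulates such a layout on a byte-addressed machine. It says nothing about how the original micro-controller actually addressed its memory; the names (ram, nibble_get, nibble_set) and the choice of putting even cells in the low half of each byte are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define NIBBLE_COUNT 160                 /* 160 cells of 4 bits each */
#define BYTE_COUNT   (NIBBLE_COUNT / 2)  /* 160 x 4 bits = 640 bits = 80 bytes */

static uint8_t ram[BYTE_COUNT];          /* the whole working memory */

/* Read one 4-bit cell: even cells sit in the low half of a byte,
   odd cells in the high half (an arbitrary choice for this sketch). */
uint8_t nibble_get(unsigned idx)
{
    assert(idx < NIBBLE_COUNT);
    uint8_t byte = ram[idx / 2];
    return (idx & 1) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
}

/* Write one 4-bit cell without disturbing its neighbour in the same byte. */
void nibble_set(unsigned idx, uint8_t value)
{
    assert(idx < NIBBLE_COUNT);
    uint8_t *byte = &ram[idx / 2];
    value &= 0x0F;                       /* keep only the low 4 bits */
    if (idx & 1)
        *byte = (uint8_t)((*byte & 0x0F) | (value << 4));
    else
        *byte = (uint8_t)((*byte & 0xF0) | value);
}
```

Each cell holds a value from 0 to 15, which is enough to code a piece type plus a colour, so even a board alone would eat 64 of the 160 cells.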

Today, most developers of the new generation think that these limits of the past, and the know-how that old developers (like me) acquired to live with them and produce useful applications with such limited resources, are useless and belong in a museum…

But if you look at chess on CUDA, you will discover that these limits are still there, that you will have to cope with them, and that you had better not waste a single byte of storage, ’cause you may regret it:

On each SM, you have 8 SPs (Scalar Processors), which need at least 32 threads to be kept fully busy on basic instructions, and only 16KB of shared RAM. Yes, that is 512 bytes of RAM for each thread, in a world where you usually allocate megabytes to any thread just to get it started! You could use the videocard’s main memory, but you will be limited by the total memory bandwidth and will face scary latency. You will even have to launch more threads to hide that latency and use your GPU’s processing power, ending up with memory as a total bottleneck: the more threads you launch, the less shared memory each one has, and the more each thread will use main memory. A vicious circle!

So you will have to cope with 512 bytes of memory per thread if you want to use each GPU cycle efficiently on basic instructions. And it’s the same whether you consider a 2SM/16SP GeForce 9400M IGP or a 16SM/128SP GeForce 9800! The problem scales perfectly, although main memory bandwidth doesn’t on high-end cards!
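The arithmetic behind that 512-byte figure fits in a few lines of C. This is only a back-of-the-envelope sketch using the numbers quoted above (16KB of shared memory per SM, 32 threads as the minimum); the constant and function names are invented for illustration and it does not query any real GPU.

```c
#include <assert.h>
#include <stddef.h>

#define SHARED_BYTES_PER_SM (16u * 1024u)  /* 16KB of shared RAM per SM */

/* Shared memory left for each thread when `resident_threads`
   threads share one SM and split its shared RAM evenly. */
size_t shared_bytes_per_thread(unsigned resident_threads)
{
    assert(resident_threads > 0);
    return SHARED_BYTES_PER_SM / resident_threads;
}
```

shared_bytes_per_thread(32) gives the 512 bytes discussed here. Doubling the thread count to hide latency halves the budget each time: 256 bytes at 64 threads, 64 bytes at 256 threads.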

Now be prepared to code like David Levy’s team, Dan & Kathe Spracklen, and some other famous chess developers of the 70’s did: your resources are so limited that you may struggle just to hold the list of moves in a given position. 64 bytes for the chess board, 218 possible moves at worst, 2 bytes per move (packed): that is 64 + 218 x 2 = 500 bytes for your thread, with just 12 bytes (three 32-bit words) left! Ouch!
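Here is how that budget could look in C, as a sketch only. The 64-byte board, the 218-move worst case and the 2 bytes per move come from the paragraph above; the bit layout of a packed move (6 bits origin square, 6 bits destination square, 4 bits of flags) and all the names are assumptions of mine, one plausible packing among others.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MOVES     218   /* worst-case number of moves in a position */
#define THREAD_BUDGET 512   /* bytes of shared memory per thread */

/* Assumed layout: bits 0-5 origin square, bits 6-11 destination square,
   bits 12-15 flags (promotion, castling... left undefined here). */
typedef uint16_t packed_move;

static inline packed_move move_pack(unsigned from, unsigned to, unsigned flags)
{
    assert(from < 64 && to < 64 && flags < 16);
    return (packed_move)(from | (to << 6) | (flags << 12));
}

static inline unsigned move_from(packed_move m)  { return m & 0x3F; }
static inline unsigned move_to(packed_move m)    { return (m >> 6) & 0x3F; }
static inline unsigned move_flags(packed_move m) { return (unsigned)(m >> 12); }

/* Everything one thread keeps in its slice of shared memory. */
struct thread_state {
    uint8_t     board[64];          /* one byte per square */
    packed_move moves[MAX_MOVES];   /* 218 x 2 = 436 bytes */
};

/* Compile-time check of the arithmetic: 64 + 436 = 500, 12 bytes spare. */
_Static_assert(sizeof(struct thread_state) == 500,
               "board + move list should take 500 bytes");
_Static_assert(THREAD_BUDGET - sizeof(struct thread_state) == 12,
               "only 12 bytes should remain in the 512-byte budget");
```

The static assertions make the compiler refuse to build as soon as the state outgrows the budget, which is a cheap way to keep a 512-byte limit honest while the code evolves.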

So how do we overcome these limitations? And avoid using the video card’s main memory?


Mobile Fermi!

May 26th, 2010

nVidia just presented the GeForce GTX 480M, a Fermi mobile GPU. It’s more of an underclocked GeForce GTX 465 (to be presented soon), and will typically offer 50% of the performance of a desktop GeForce GTX 480.

Still, it’s well above the last-generation GeForce GTX 285M, which was just a GeForce 9800GTX in disguise, showing that the GT200 GPU (GTX 260..295) is simply inadequate for mobile platforms or for downsizing into mid-range gamer videocards!

The most interesting thing is not the gaming performance, which will be impressive, on a par with my desktop GTX 260 or better, but the fact that Fermi becomes available on (huge) laptops, with really impressive real-world OpenCL & CUDA performance.

If you compare this laptop GPU to desktop hardware, for example with the Folding@home distributed supercomputing project, created for CPUs and ported to GPUs with nVidia’s CUDA and ATI’s brooke(n) technology, you will have to compare it with:

- 3 high-end desktop Core i7 CPUs (or 6 laptop Core i7 Mobile CPUs!)

- 2 Radeon 5870 desktop GPUs (or one 5970 desktop GPU)

The raw numbers are more impressive on Radeon HD 5xxx GPUs, but the real-life OpenCL performance (and CUDA too) of Fermi is almost unbeatable when you take programs developed for CPUs and port them to the GPU!

And it’s a mobile GPU! I would like to see a PC card built on it, consuming half as much power as my GTX 260 while offering better performance and Fermi’s computing abilities :-)


GeForce GTX 465 Fermi

May 5th, 2010

Fermi for the rest of us: this is the mission of the GTX 465. Before GF 104-based videocards become available, nVidia seems to have decided to push a (relatively) affordable GTX 465, based on the GF 100 like the GTX 480 & 470.

With 352 SPs, a 256-bit memory bus, 1GB of video memory and a price tag around $249, it’s a GTX 2xx replacement that will offer an improved gaming experience, and it is the sweet spot (in terms of price and performance) for CUDA and OpenCL developers. This GTX 465 is what I was waiting for to upgrade my GTX 260 and go down the Fermi path, which is really promising! (Naturally I will keep a laptop with GeForce 9400M and 9600M GT.)

It should be unveiled at Computex, in early June, with availability this summer.

There’s one piece of bad news: as with the GeForce GTX 2xx series, which never got scaled-down variants, the GF 104 GPU seems to be late, really late, and may never show up at all. nVidia creates good designs but is unable to downsize them for the rest of its videocard line. Fermi may stay reserved for high-end videocards, and the rest of us will keep using the same G80-derived GPUs, without Fermi capabilities (a design dating from 2006!)
