Archive for June, 2009

My current platforms

Thursday, June 11th, 2009

I develop CUDA programs for Windows Vista, Mac OS X and Linux (not necessarily in that order ;-) ).

I have all three OSes installed on my personal laptop and on my home desktop (a powerful hackintosh PC). I also installed the development tools on my work laptop, which is not CUDA-enabled (its graphics rely on a GeForce 7900GS :-( ) but lets me develop, review code and run it in emulation mode (slow slow slowwwww).

In fact, my main development environment is my Mac OS X laptop for casual work, or my desktop PC under Linux when I want to focus on development.

My CUDA Projects

Thursday, June 11th, 2009

I have several CUDA projects that I am advancing in parallel with my regular CTO job:

  • CUDA Chess, to build a CUDA-based chess engine (maybe UCI compatible)
  • CUDA Bench, to compare different NVIDIA GPUs and/or CPUs
  • CUDA H264, to create a CUDA-boosted H264 encoder
  • CUDA SQL, to add a CUDA co-processor to MySQL

They are all advancing slowly, but I plan to give them more time :-)

Ouch! CUDA 1.0 Device!

Friday, June 5th, 2009

I had been working on my laptop (a GeForce 8600M GTS, which is a CUDA 1.1 device) and began porting my code to my home desktop, which has a GeForce 8800 GTS.

And I began to get some stupid errors about unsupported atomic operations…

I checked how my GeForce 8800 is recognized and discovered that it is a first-generation GeForce 8800 GTS, which is a CUDA 1.0 device. CUDA 1.0 devices don't include atomic operations, which complicates inter-thread communication (nearly impossible in some cases!).
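The check behind this surprise can be sketched in a few lines of plain host-side C++. This is only an illustration: the `ComputeCapability` struct and the helper names are hypothetical, and the version numbers are hard-coded from what is described in this post. In real code they would come from the `major` and `minor` fields of `cudaDeviceProp`, filled in by `cudaGetDeviceProperties`. The thresholds assume what the CUDA programming guide states: 32-bit atomics on global memory need compute capability 1.1, and atomics on shared memory need 1.2.

```cpp
// Hypothetical helper type: a CUDA compute capability as a (major, minor) pair.
// In a real program these values would be copied from cudaDeviceProp.
struct ComputeCapability {
    int major_version;
    int minor_version;
};

// True if `cc` is at least `req_major.req_minor`.
constexpr bool at_least(ComputeCapability cc, int req_major, int req_minor) {
    return cc.major_version > req_major ||
           (cc.major_version == req_major && cc.minor_version >= req_minor);
}

// 32-bit atomic operations on global memory: compute capability 1.1 or higher.
constexpr bool supports_global_atomics(ComputeCapability cc) {
    return at_least(cc, 1, 1);
}

// Atomic operations on shared memory: compute capability 1.2 or higher.
constexpr bool supports_shared_atomics(ComputeCapability cc) {
    return at_least(cc, 1, 2);
}

// The two machines from this post, checked at compile time.
static_assert(supports_global_atomics(ComputeCapability{1, 1}),
              "GeForce 8600M GTS (1.1) should have global atomics");
static_assert(!supports_global_atomics(ComputeCapability{1, 0}),
              "first-generation GeForce 8800 GTS (1.0) has no atomics");
```

Running a check like this at startup, before launching any kernel that relies on atomics, would turn the confusing errors into one clear message about the device being too old.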

This is why I *HATE* marketing names: a GeForce 8800 is not necessarily compatible with another GeForce 8800, nor with the software developed for it, and the same goes for the GTX 260 and GTX 260M.

I thought I had been cautious, but now I will have to replace it, probably with a CUDA 1.3 device such as a desktop GeForce GTX 260, which will let me develop for that architecture. Or I could wait for the first G300 cards, which may appear in September and will probably bring a better instruction set, specific optimizations and probably more registers or cache.