It's a busy week at work, but I got an interesting newsletter from Intel:
http://isdlibrary.intel-dispatch.com...abee_paper.pdf. They don't fail to mention how Larrabee is able to solve all of the problems of D3D. The last remaining fixed-function stages of the rendering pipeline, everything except texture sampling, have to be programmed. Their described software rendering method does not differ much from the GSdx/PS2 solution. It's tile based, where each tile fits into the local cache of one core. GSdx splits the work between threads by scanline; I could use 2D tiles if I wanted to, it depends on which has better memory locality. Another advantage is that the cache is managed transparently, as on current CPUs and unlike CUDA. The paper mentions that texture access can generate page faults; it's not clear yet how many cycles that costs. I tried the same with OS-managed virtual memory before, and that was awfully slow.

For multithreaded mode GSdx uses spinlocks to sync the cores; in Larrabee there seems to be a less wasteful solution for that too. We'll see how 10 cores and registers four times as wide as SSE do (and what they cost :P) once it gets released.
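To make the scanline vs. 2D tile split and the spin-wait sync a bit more concrete, here is a rough sketch. It is not GSdx code: every name in it (OwnerByScanline, OwnerByTile, Render, the 64x64 frame, the 16x16 tile size, 4 threads) is made up for illustration, and the "rendering" is just each thread stamping its id into the pixels it owns.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical sizes, chosen only to keep the example small.
constexpr int kWidth = 64, kHeight = 64, kThreads = 4, kTile = 16;

// Scanline split: row y belongs to thread y % kThreads.
int OwnerByScanline(int /*x*/, int y) { return y % kThreads; }

// 2D tile split: 16x16 tiles are handed out round-robin.
int OwnerByTile(int x, int y) {
    const int tilesPerRow = kWidth / kTile;
    return ((y / kTile) * tilesPerRow + x / kTile) % kThreads;
}

// Each worker writes only the pixels it owns, then bumps a counter.
// The calling thread busy-waits on that counter; that is the wasteful
// part, since the waiting core does no useful work while it spins.
std::vector<int> Render(int (*owner)(int, int)) {
    assert(owner != nullptr);
    std::vector<int> frame(kWidth * kHeight, -1);
    std::atomic<int> done{0};
    std::vector<std::thread> workers;
    for (int id = 0; id < kThreads; ++id) {
        workers.emplace_back([&, id] {
            for (int y = 0; y < kHeight; ++y)
                for (int x = 0; x < kWidth; ++x)
                    if (owner(x, y) == id) frame[y * kWidth + x] = id;
            done.fetch_add(1, std::memory_order_release);
        });
    }
    // Spin until every worker has reported in (yield is just a courtesy).
    while (done.load(std::memory_order_acquire) < kThreads) {
        std::this_thread::yield();
    }
    for (auto& w : workers) w.join();
    return frame;
}
```

Calling Render(OwnerByScanline) or Render(OwnerByTile) gives back a frame where each pixel holds the id of the thread that owned it. Which split has the better memory locality depends on how the real frame buffer and textures are laid out, so this only shows the ownership rule, not a benchmark.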
Last edited by gabest; September 4th, 2008 at 15:15..