I find AMD's new approach to L3 cache very inventive. I guess 2MB L3 + 4x512KB L2 was way faster than 4x512KB L2 alone when the designers ran their tests.
Dunno how fast 4x1MB L2 or 4MB of shared L2 would turn out, since that's the same total amount of cache. Maybe they found THE solution for multi-core CPU cache management.
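
Just to sanity-check the "same amount" claim, here is a quick sketch that adds up the three layouts mentioned above. The helper name `total_cache_kb` and the 4-core count are my own assumptions for illustration. It only counts capacity and says nothing about latency or real-world speed.

```python
# Rough capacity comparison of the three cache layouts discussed above.
# Assumption: a 4-core chip; sizes are in KB (1 MB = 1024 KB).
MB = 1024

def total_cache_kb(l2_per_core_kb, cores, shared_kb):
    """Hypothetical helper: private L2 across all cores plus any shared cache."""
    return l2_per_core_kb * cores + shared_kb

configs = {
    "2MB L3 + 4x512KB L2": total_cache_kb(512, 4, 2 * MB),
    "4x1MB L2": total_cache_kb(1 * MB, 4, 0),
    "4MB shared L2": total_cache_kb(0, 4, 4 * MB),
}

for name, kb in configs.items():
    print(f"{name}: {kb} KB")
```

All three layouts come out to 4096 KB, so the difference between them is purely in how the cache is split and shared, not in how much of it there is.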
It must be considered, though, that Phenoms are a 3-complex-instructions-per-cycle design and therefore very fast for their frequencies.
I guess that when they finish their 4+ IPC designs on 45nm, it will bring them back in the game. They should do something like that or even better.