Personal thoughts…

Tech, Hockey, and random thoughts…

A holding action: the real target is the PS4/Xbox3.

The more I think about it, the more I come to believe that for all the hype around the next generation of consoles (or at least the Xbox 360/PS3), they are in reality intended as little more than a holding action.
I've discussed this before in my own brief examination of the consoles, but significantly more information on the processors has come to light since then, along with developers' comments and some solid architectural overviews.

Without going into depth, since I've covered the respective processors in the past, it's clear that they're essentially massively parallel processors, relatively highly clocked and capable of executing many threads simultaneously. Both processors exhibit remarkably poor branch prediction, and both are purely in-order processors with no capability for out-of-order (OoO) execution of instructions.

The PS3's Cell processor stands out specifically in this respect: a cut-down PPC core (the PPE) working with 7 SPEs (Synergistic Processing Elements). The PPE we know all about; it's essentially a higher-clocked, cut-down PPC G5 derivative. The SPEs aren't so well understood as yet. Being heavily pipelined and in-order, with no branch predictor and no dedicated cache, effectively ensures that the SPEs are useless for any conventional single-threaded code, and indeed most other code isn't appropriate for SPE acceleration.
High-level game logic and AI are complete non-starters; those will have to rely entirely on the PPC core.
Despite Sony's indications that the SPEs will be heavily utilized for physics using the NovodeX SDK, I'm firmly of the opinion that physics is not even remotely a viable option to be calculated on the SPEs.
Physics is typically very branch heavy, and at least in current games the majority of physics execution time goes to collision detection via a Binary Space Partitioning (BSP) tree. To walk the BSP tree and test for a collision between some object and a polygon, you have to perform a lot of comparisons. You first search the tree to find the polygon you want to test for a collision against. Then you have to perform a number of checks to see whether a collision has occurred between the object you're comparing and the polygon itself. This process involves a lot of conditional branching, the kind of code that likes to be run on a high-performance OoO core with a very good branch predictor.
Which is exactly what the SPEs are not.
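To make the branching concrete, here is a minimal Python sketch of a BSP walk. It is a simplified illustration, not code from any real engine or from the NovodeX SDK: the Leaf/Split classes, the point_in_solid function, and the little box tree are all hypothetical names made up for this example, and it tests a single point against solid leaves rather than an object against a polygon.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Vec3 = Tuple[float, float, float]


@dataclass
class Leaf:
    solid: bool  # True if this region of space is inside geometry


@dataclass
class Split:
    normal: Vec3  # splitting plane is dot(normal, p) == offset
    offset: float
    front: Union["Leaf", "Split"]  # subtree on the positive side of the plane
    back: Union["Leaf", "Split"]   # subtree on the negative side of the plane


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def point_in_solid(root, point):
    """Walk the tree to the leaf containing `point`.

    Returns (is_solid, decisions), where `decisions` counts the
    data-dependent front/back choices made along the way.
    """
    node = root
    decisions = 0
    while isinstance(node, Split):  # loop exit depends on the data
        decisions += 1
        if dot(node.normal, point) - node.offset >= 0.0:  # depends on the data
            node = node.front
        else:
            node = node.back
    return node.solid, decisions


# Hypothetical test scene: a solid slab covering 0 <= x <= 1 and 0 <= y <= 1.
EMPTY, SOLID = Leaf(False), Leaf(True)
box = Split((1.0, 0.0, 0.0), 0.0,
            Split((-1.0, 0.0, 0.0), -1.0,
                  Split((0.0, 1.0, 0.0), 0.0,
                        Split((0.0, -1.0, 0.0), -1.0, SOLID, EMPTY),
                        EMPTY),
                  EMPTY),
            EMPTY)

print(point_in_solid(box, (0.5, 0.5, 0.0)))  # inside the slab
print(point_in_solid(box, (2.0, 0.5, 0.0)))  # outside, rejected early
```

Every step of the loop takes a direction that depends on the data just computed, so nothing about the next step is known ahead of time. That is the pattern that leans on a branch predictor.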

Every mispredicted branch forces a flush of the entire pipeline, and with no branch prediction that will happen an awful lot; physics code being branch heavy only worsens the scenario. Flushing the pipeline is doubly harsh given the SPE's heavily pipelined design, which is what lets it reach its high clock speeds. In effect, with every mispredicted branch the pipeline sits idle doing no useful work, and that will happen all too often.
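A rough back-of-the-envelope model shows how quickly this adds up. The sketch below uses the textbook approximation that average cycles per instruction equals the base rate plus (fraction of instructions that are branches) x (fraction of those mispredicted) x (flush penalty in cycles). Every number in it is an assumption picked purely for illustration, not a measured or published figure for the SPEs or any other chip.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once branch flushes are included."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# All inputs are hypothetical, chosen only to show the shape of the problem.
BRANCH_FRACTION = 0.20   # assume 1 in 5 instructions is a branch
FLUSH_PENALTY = 18       # assumed cycles lost per flush on a deep pipeline

no_predictor = effective_cpi(1.0, BRANCH_FRACTION, 0.50, FLUSH_PENALTY)
good_predictor = effective_cpi(1.0, BRANCH_FRACTION, 0.05, FLUSH_PENALTY)

print(f"no predictor:   {no_predictor:.2f} cycles/instruction")
print(f"good predictor: {good_predictor:.2f} cycles/instruction")
print(f"slowdown:       {no_predictor / good_predictor:.2f}x")
```

Under these made-up inputs, the core that guesses wrong half the time spends well over twice as many cycles per instruction as the one with a good predictor. The deeper the pipeline, the bigger the penalty term gets.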
Now, physics code is particularly parallelizable, and with 7 SPEs that will to a great extent offset the performance penalty on each SPE. Still, that will largely leave the SPEs dedicated to physics, which would put all the rest of the load on the PPC core… which by itself isn't more than 15-20% faster than the PS2's "Emotion Engine".
My best bet?
Sound processing will thrive on the SPEs given their rather specialized architecture aimed at streaming tasks, as will playing back pre-recorded FMV.

Now, the Xbox 360 isn't quite as massively threaded, but it is still more heavily threaded than we're accustomed to on the PC (and prior consoles have been purely single-threaded). The Xbox 360 effectively uses 3 cut-down PPC cores. Obviously that's putting it in very simplistic terms, but I've covered the Xbox 360's "Xenon" processor in the past, so there's no reason to reiterate things. The core itself is a very narrow 2-issue in-order execution core, featuring a 64KB L1 cache (32K instruction/32K data) and a 1MB L2 cache. Supporting SMT, the core can execute two threads simultaneously, similar to a Hyper-Threading enabled Pentium 4.
Now this is genius if you're after pure peak floating point output. That's not altogether useful, but it makes a damn nice marketing figure, and Microsoft has played it for all it's worth, trumpeting the figure of 1 trillion floating point operations per second (1 TFLOPS) to everyone. Alas, real-world code never even remotely approaches peak theoretical rates, even under best-case scenarios.
The very narrow 2-issue in-order core also happens to be very deeply pipelined, apparently with a branch predictor that's not exactly… strong. In real-world code, you'll probably see the Xenon executing at roughly twice the performance of the original Xbox's 733MHz Celeron/Pentium III hybrid.
This is definitely not a pleasant thing for the lifetime of the Xbox 360. Heavily multithreaded game engines are the future, but that future won't really take form for another 3 – 5 years. Even Microsoft has readily admitted that all developers are focusing on having, at most, one or two threads of execution for the game engine itself, not the six threads that the Xbox 360 was designed for.
Even when games become more aggressive with their multithreading, targeting 2 – 4 threads, most of the heavily performance-constrained work will still be done in a single thread.
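Amdahl's law puts a number on why that matters: if only a fraction p of the work can be spread across n threads, the best possible speedup is 1 / ((1 - p) + p / n). The sketch below just evaluates that formula. The parallel fractions are hypothetical values for illustration, and treating six SMT threads as six full cores is a generous assumption.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


THREADS = 6  # the hardware thread count mentioned above

# Hypothetical parallel fractions, not measurements from any real engine.
for p in (0.10, 0.30, 0.50, 0.90):
    print(f"{p:.0%} parallel -> at most {amdahl_speedup(p, THREADS):.2f}x")
```

With most of the work stuck in one thread, the extra hardware threads buy very little, and single-threaded speed ends up deciding how fast the game runs.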

What does this all mean?
The Xenon and Cell will typically only be 50-100% faster than the processors in their predecessor consoles. Such performance has long since been surpassed on desktop PCs by even the lowest-end processors. In recent weeks this has become obvious with the deluge of game developers speaking out about how disappointed they are in the Xbox 360's and PS3's processors.
So why did Sony and Microsoft go this route instead of utilizing cheaper, faster desktop PC processors?
For Sony, it doesn't take much to see that the Cell processor is quite similar to the Emotion Engine in the PlayStation 2, at least conceptually. Sony clearly has an idea of what direction they would like to go in, and it doesn't happen to be one that's aligned with much of the rest of the industry. Sony's past successes have really come, not because of the hardware, but because of the developers and their PSX/PS2 exclusive titles. A single hot title can ship millions of consoles, and Sony has had many more of those exclusive titles than Microsoft had with the first Xbox. Regardless of the hardware platform, a game developer won't turn down working with the PS2 – the install base is just that attractive. So for Sony, the Cell processor may be strange and even undesirable for game developers, but the developers will come regardless.

Microsoft is a bit more interesting. With the original Xbox they listened very closely to the wants and desires of game developers (outside of cutting out 64MB of system DRAM just before launch, for which they got a ton of flak from developers and which has come back to haunt them long-term). This time around, despite what has been said publicly, the Xbox 360's CPU architecture wasn't what game developers had asked for. They wanted a multi-core CPU, but not such a significant step back in single-threaded performance. When AMD and Intel moved to multi-core designs, they did so at the expense of a few hundred MHz in clock speed, not by taking a step back in architecture.

I suspect that a big part of Microsoft's decision to go with the Xenon core was its extremely small size. A smaller die means lower system costs, and with Microsoft launching the cheapest Xbox 360 at $299, the Xenon CPU will be a big reason why that was possible.
Granted, it is a bit more expensive in the short term, but not by much, and the very small die will ensure it becomes much cheaper as process technology improves and MS eventually takes advantage of that. Microsoft bought the IP from IBM and is fabbing its own Xenon processors (through TSMC).

A small die also equates to less heat, and helps to keep the overall platform relatively small. Both are among Microsoft's biggest goals with the Xbox 360.
Another contributing factor may be the fact that Microsoft wanted to own the IP of the silicon that went into the Xbox 360. I seriously doubt that either AMD or Intel would be willing to grant them the right to make Pentium 4 or Athlon 64 CPUs, so it may have been that IBM was the only partner willing to work with Microsoft's terms, and only with this one specific core.

In the end, during the lifetime of the Xbox 360 and PS3, processing performance will be a significant hindrance. Outdated at launch, they'll only fall progressively further behind.
Both consoles do, however, possess GPUs that are more advanced relative to the current state of the art than their predecessors' were.
From all appearances the Xbox 360/PS3 seem best suited to improving significantly on the visuals of their predecessors' games, while offering only a very incremental benefit in terms of AI, flexibility, physics, etc.
It's a holding action.
It'll accustom developers to working with multiple threads to get the best performance (as they will absolutely have to if they want any sort of decent processing performance out of these consoles). That will prepare them for the next generation of console processors, by which time multithreaded code will be in the majority and multiple cores won't have to be so heavily neutered as they are now to fit in a reasonable die space at a decent price. All of which means developers for the next generation of consoles will be ready to go immediately, with precious little time spent analyzing the consoles and developing new coding practices and new algorithms.
That will all but ensure that the next generation of consoles reaches its peak far sooner than its predecessors did, which will help to ensure many more high-quality games at a lower cost than ever before.

In the end, it's the GPUs that will be the saving grace and allow the PS3/Xbox 360 to impress people. And make no mistake, the performance of these GPUs will be very good, perhaps only slightly behind the PC GPUs that will be launching around mid-2006.

Take it to the bank: this generation is intended to do one thing really well. Great visuals. Everything else won't see much improvement over what we have right now.
If you're looking for great physics, longer gameplay, more depth, or intelligent AI… you're only going to see incremental improvements over what we have now. Indeed, in the first few years of development those areas may even be weaker than what we're accustomed to, until developers have learned the intricacies of the Xenon and Cell processors.
The next generation is when it all really comes together, and it may well be far more impressive for its time than any generation of consoles before.

On a side note: for someone who has no interest in consoles, I've spent a great deal of time examining them and the market. I suppose that's merely due to my interest in microarchitectures, be it processors or otherwise, but it's ironic.


August 23, 2005 | Console Examination