avatar
Calibrus: Reading through the reviews it looks like AMD's biggest problem may be that they skimped out on the floating-point unit. The Bulldozer 8-core isn't a true 8-core processor. It's 4 modules containing 2 partial cores each, sharing 1 FPU per module. With Intel's Sandy Bridge, each core has its own dedicated FPU. Even if you play a game that uses 8 cores, the performance will be lackluster because the 8 cores will be bottlenecking on the 4 FPUs.

Had AMD given each core its own FPU, I'm sure the benchmarks would have been a lot better. But it seems the new Bulldozer design is more about manufacturing processors on the cheap than delivering high performance.
avatar
cjrgreen: No, it's not skimped. Each core has its own 128-bit FPU. A pair of cores can gang their FPUs to have a 256-bit unit.

AMD's market is data centers; unless you do procurement for a data center, they don't much care what you think. Their division of the FPUs is driven by their knowledge of data center workloads, and their decision is one of the reasons they were able to get 8 full cores on the chip and still be able to manufacture it.
Then AMD shouldn't be selling it as a desktop processor (the server version will have all HT links enabled, not just one of the four).

The only problem is that, as of right now, they can't really supply the number of units needed to cover serious data center requirements. Couple this with the fact that server-grade motherboards which can take this processor are lacking compared to their Intel counterparts, and that support overall is better on the Intel side, and businesses and large data centers don't really have a reason to switch to this architecture.

Also, each module has two ALUs, but only one FPU/SIMD unit. If you're doing a bunch of parallel integer tasks, then yes, this could come close to double the performance. If you're doing a bunch of floating point tasks, as "distributed computing" workloads generally are, then this won't improve performance much at all.

Another thing: Anand's N x N queens test also shows that Bulldozer has worse branch prediction than the previous generation (branch prediction matters even more for Bulldozer because it has a longer pipeline), and that cache latencies are terrible (between 25% and 125% worse than on Deneb or Sandy Bridge processors).

Oh, forgot to add: the FX-8150 has a thermal design power of 125 W (for short periods it can spike above 125 W, but the long-term average is capped at that level), whereas the i5-2500K and i7-2600K are both rated at just 95 W. Another reason most data centers won't go with this one, as power/cooling costs are among the largest money sinks.
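To put that 30 W TDP gap in perspective, here's a back-of-the-envelope sketch; the fleet size and electricity price are made-up assumptions, not real data center figures:

```python
# Back-of-the-envelope: extra energy a 125 W part burns vs a 95 W part when
# run flat-out year-round. Fleet size and price per kWh are assumptions.
TDP_FX8150_W = 125
TDP_I7_2600K_W = 95
SERVERS = 1000              # hypothetical deployment size
HOURS_PER_YEAR = 24 * 365
PRICE_PER_KWH = 0.10        # assumed USD per kWh

extra_kwh = (TDP_FX8150_W - TDP_I7_2600K_W) * SERVERS * HOURS_PER_YEAR / 1000
extra_cost = extra_kwh * PRICE_PER_KWH
print(f"extra energy: {extra_kwh:,.0f} kWh/year, ~${extra_cost:,.0f}/year before cooling")
```

And that's before the extra cooling load, which scales with the same wattage.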
Post edited October 13, 2011 by AndrewC
avatar
cjrgreen: No, it's not skimped. Each core has its own 128-bit FPU. A pair of cores can gang their FPUs to have a 256-bit unit.

AMD's market is data centers; unless you do procurement for a data center, they don't much care what you think. Their division of the FPUs is driven by their knowledge of data center workloads, and their decision is one of the reasons they were able to get 8 full cores on the chip and still be able to manufacture it.
avatar
AndrewC: Then AMD shouldn't be selling it as a desktop processor (the server version will have all HT links enabled, not just one of the four).

The only problem is that, as of right now, they can't really supply the number of units needed to cover serious data center requirements. Couple this with the fact that server-grade motherboards which can take this processor are lacking compared to their Intel counterparts, and that support overall is better on the Intel side, and businesses and large data centers don't really have a reason to switch to this architecture.

Also, each module has two ALUs, but only one FPU/SIMD unit. If you're doing a bunch of parallel integer tasks, then yes, this could come close to double the performance. If you're doing a bunch of floating point tasks, as "distributed computing" workloads generally are, then this won't improve performance much at all.

Another thing: Anand's N x N queens test also shows that Bulldozer has worse branch prediction than the previous generation (branch prediction matters even more for Bulldozer because it has a longer pipeline), and that cache latencies are terrible (between 25% and 125% worse than on Deneb or Sandy Bridge processors).
Well, the fact that AMD has manufacturing problems should not come as a surprise, and the fact that they can't match Rony Friedman's CPU group on instruction performance is sort of a "given".

I believe you are wrong about the FPU, though. The two cores in a module share an FPU that can function either as two unganged 128-bit FPUs or as one ganged 256-bit FPU. The fact that it is shared does not mean each core cannot issue its own FPU instructions independently of the other, as would be the case with a fully shared FPU. There just isn't much 256-bit FPU instruction use in the usual mix of code, so there is little call for independent 256-bit FPUs.
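A deliberately simplified toy model of that shared FPU (the pairing rules and cycle counts here are illustrative, not actual Bulldozer timings): two 128-bit pipes can each serve a core independently, while a 256-bit op gangs both pipes for a cycle.

```python
# Toy model of a Bulldozer module's shared FPU: two 128-bit pipes that can
# serve one core each, or gang into a single 256-bit unit. Purely illustrative.
def fpu_cycles(ops):
    """ops: list of (core_id, width_bits). Naive model: each 128-bit pipe
    retires one 128-bit op per cycle; a 256-bit op occupies both pipes."""
    cycles = 0
    queue = list(ops)
    while queue:
        op = queue.pop(0)
        if op[1] == 256:
            cycles += 1                      # both pipes busy this cycle
        else:
            # a second 128-bit op (from either core) can pair in the same cycle
            if queue and queue[0][1] == 128:
                queue.pop(0)
            cycles += 1
    return cycles

# Eight 128-bit ops from two cores pair up across the pipes (4 cycles),
# while eight 256-bit ops must serialize (8 cycles).
print(fpu_cycles([(0, 128), (1, 128)] * 4))
print(fpu_cycles([(0, 256)] * 8))
```

The point of the sketch: for the common 128-bit-and-narrower mix, the two cores don't block each other; only heavy 256-bit use turns the shared unit into a bottleneck.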

Since the load experienced by a processor that isn't tasked with "distributed computing" number crunching (and I really don't give a flying fuck about bitcoin mining and other wastes of electricity) is overwhelmingly integer, occasionally floating point, and rarely 256-bit floating point, I think criticism of AMD's decision is more like personal claims of superiority looking for a place to strike than any legitimate claim that it was a poor business decision.
avatar
cjrgreen: Since the load experienced by a processor that isn't tasked with "distributed computing" number crunching (and I really don't give a flying fuck about bitcoin mining and other wastes of electricity) is overwhelmingly integer, occasionally floating point, and rarely 256-bit floating point, I think criticism of AMD's decision is more like personal claims of superiority looking for a place to strike than any legitimate claim that it was a poor business decision.
Almost entirely int, tbh; there are very few non-science uses for the FPU. There are reports of a huge cache-thrashing problem with the Win7 scheduler, BD benchmarks are showing improvements on Win8, and apparently there's a patch for the scheduler in the next update...
so it ain't good for gamers cause they are slower than six-month-old intel cpus and not cheaper
so it ain't good for servers as temperature and energy consumption are extremely vital in huge clusters
it might be good for video editing and such.

still... that doesn't make the cpu worth the time and money invested in it, does it? what does that mean for AMD?
avatar
lukaszthegreat: so it ain't good for gamers cause they are slower than six-month-old intel cpus and not cheaper
so it ain't good for servers as temperature and energy consumption are extremely vital in huge clusters
it might be good for video editing and such.

still... that doesn't make the cpu worth the time and money invested in it, does it? what does that mean for AMD?
This is the first set of chips on this line, they'll get a lot better in the next few years. The question is if they can get good enough fast enough to make a difference.
For AMD's sake I hope Interlagos and Valencia perform better for servers than Zambezi on desktops.
avatar
lukaszthegreat: so it ain't good for gamers cause they are slower than six-month-old intel cpus and not cheaper
so it ain't good for servers as temperature and energy consumption are extremely vital in huge clusters
it might be good for video editing and such.

still... that doesn't make the cpu worth the time and money invested in it, does it? what does that mean for AMD?
avatar
hedwards: This is the first set of chips on this line, they'll get a lot better in the next few years. The question is if they can get good enough fast enough to make a difference.
but it is new hardware which does not beat old intel hardware. so even if newer products are better, amd will still be behind intel (of course, if intel f***s up big time for one reason or another, amd can catch up)
avatar
hedwards: This is the first set of chips on this line, they'll get a lot better in the next few years. The question is if they can get good enough fast enough to make a difference.
avatar
lukaszthegreat: but it is new hardware which does not beat old intel hardware. so even if newer products are better, amd will still be behind intel (of course, if intel f***s up big time for one reason or another, amd can catch up)
True, but Intel has the advantage of abusing its monopoly position. AMD has been ahead in the past, and I'm sure they will be in the future, but having better products doesn't do you any good if you can't actually convince systems integrators to buy them because they're being bribed.

That being said, of course you're correct, but at the end of the day, I'll keep buying them even if they aren't quite as good, because I'm sick of the misbehavior that Intel regularly engages in.
avatar
lukaszthegreat: but it is new hardware which does not beat old intel hardware. so even if newer products are better, amd will still be behind intel (of course, if intel f***s up big time for one reason or another, amd can catch up)
avatar
hedwards: True, but Intel has the advantage of abusing its monopoly position. AMD has been ahead in the past, and I'm sure they will be in the future, but having better products doesn't do you any good if you can't actually convince systems integrators to buy them because they're being bribed.

That being said, of course you're correct, but at the end of the day, I'll keep buying them even if they aren't quite as good, because I'm sick of the misbehavior that Intel regularly engages in.
Intel abuses its monopoly position in much nastier ways than in the war over CPU sockets. The longstanding war between Intel and nVidia over chipsets and onboard graphics is far more costly in terms of product that other companies can't make and you can't buy because Intel doesn't want the competition.

In CPUs, the fact remains that Intel has an extraordinary CPU design team, and AMD, well, doesn't.
avatar
cjrgreen: Intel abuses its monopoly position in much nastier ways than in the war over CPU sockets. The longstanding war between Intel and nVidia over chipsets and onboard graphics is far more costly in terms of product that other companies can't make and you can't buy because Intel doesn't want the competition.

In CPUs, the fact remains that Intel has an extraordinary CPU design team, and AMD, well, doesn't.
Right now that's definitely the case; in the past they were doing something right. AMD did beat Intel to the 1 GHz mark, and they did manage to beat Intel to popularizing 64-bit processors for home use. Not to mention that the early dual-core AMD chips were substantially better than the Intel ones.

Unfortunately, right now and for some time, you're definitely correct. A lot of that was Intel's doing and some of it was AMD's own incompetence. I still can't believe that they were willing to pay so much for ATI.
Bulldozer is a completely different architecture from anything we have seen before. It will take a while for AMD to get things right, since they are in a bit of a conundrum right now, with a dispute over instruction sets and optimizations that don't seem to work quite as well on Windows 7. Windows 8 is touted to have an affinity for the optimizations Bulldozer has over the previous generation, but I am not expecting a miracle.

I wouldn't term AMD incompetent, for they have been fairly innovative in setting benchmarks for the industry: favoring better IPC over clock speed, the on-die memory controller, x86-64, etc., and this despite a measly R&D budget and Intel's questionable trade practices. Frankly, even at its peak AMD wasn't quite able to acquire market share, and that was perhaps its greatest failing.

The problem with Bulldozer is that it's not a very efficient approach to multi-threaded applications compared with Intel's existing Hyper-Threading. The branch prediction is weak and it sacrifices single-threaded performance far too much...
Some hope here. Just checked out the AnandTech review and discovered something.
AnandTech also tested x264 encoding using a modified binary compiled to support the AVX and AMD XOP instruction sets, and found that in the second pass (the pass where the video actually gets encoded) the FX-8150 beats the i7-2600K.
This is really interesting because it shows how an application optimized for the Bulldozer architecture can get a serious performance boost. Hoping to see some patch releases to optimize Bulldozer's performance.

In the Windows 8 preview there is a performance improvement ranging from 4% to 10% over Windows 7. The reason stated by AMD is that the Windows 7 scheduler is not aware of Bulldozer's module-based architecture and places threads wherever it finds a free core, rather than judging the state of the module. For example, suppose Bulldozer has two free modules (that is, 4 cores as far as the OS can tell) and there are two threads waiting to be scheduled. If the OS is not aware of the modules, it may assign both threads to the cores of a single module, resulting in low resource utilization, conflicts, etc., since the two cores of a module are not totally independent: they share fetch/decode, the FP unit, and the L2 cache.
If the OS is aware of the modules, the two threads can be assigned to two different modules. Each thread then gets a module's shared resources to itself, resulting in faster execution.
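The scheduling difference described above can be sketched like this; the core-to-module map and both policies are illustrative, not the actual Windows scheduler logic:

```python
# Illustrative sketch: cores 0-1 share module 0, cores 2-3 share module 1.
MODULE_OF = {0: 0, 1: 0, 2: 1, 3: 1}

def naive_assign(threads, free_cores):
    """Module-unaware: take the first free cores, which can pile both threads
    onto one module, where they contend for the shared FPU/decode/L2."""
    return dict(zip(threads, free_cores))

def module_aware_assign(threads, free_cores):
    """Spread threads across distinct modules while any module is untouched."""
    assignment, used_modules = {}, set()
    for t in threads:
        core = next((c for c in free_cores
                     if MODULE_OF[c] not in used_modules
                     and c not in assignment.values()), None)
        if core is None:  # every module already touched; take any free core
            core = next(c for c in free_cores if c not in assignment.values())
        assignment[t] = core
        used_modules.add(MODULE_OF[core])
    return assignment

print(naive_assign(["Th1", "Th2"], [0, 1, 2, 3]))         # both land on module 0
print(module_aware_assign(["Th1", "Th2"], [0, 1, 2, 3]))  # one per module
```

With two threads and two free modules, the naive policy puts both threads on module 0's cores, while the module-aware policy gives each thread a module to itself.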

Similarly, it will also help improve Turbo Core performance. For example, if there are two threads, say Th1 and Th2, where Th2 depends on Th1, they should be assigned to the two cores of a single module: that improves resource sharing, and all the other modules, being unused, can be cut from power so the one active module can reach peak turbo speed.
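That turbo-core pairing can be sketched as the opposite, packing policy (illustrative only; the module map and function are hypothetical, not real scheduler code):

```python
# Illustrative sketch: cores 0-1 share module 0, cores 2-3 share module 1.
# For dependent threads, pack them onto one module so the remaining modules
# can be power-gated and the active module can hit peak Turbo Core frequency.
MODULE_OF = {0: 0, 1: 0, 2: 1, 3: 1}

def pack_dependent(threads, free_cores):
    """Place all threads on the cores of a single module if one has room."""
    by_module = {}
    for c in free_cores:
        by_module.setdefault(MODULE_OF[c], []).append(c)
    for cores in by_module.values():
        if len(cores) >= len(threads):
            return dict(zip(threads, cores))
    return dict(zip(threads, free_cores))  # no single module fits; spread out

print(pack_dependent(["Th1", "Th2"], [0, 1, 2, 3]))  # both on module 0's cores
```

The trade-off versus spreading threads out is exactly the one the post describes: packing sacrifices per-thread resources for power gating and turbo headroom.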

In Windows 7 this is not possible all the time, due to the OS's inability to recognize the modules.
But if the OS is aware of a module and the cores inside it, those problems can be addressed and scheduling can be done in a much more organized manner. I don't expect to see a radical change in numbers, but it should be enough to make a difference, so don't write it off before we see how well it does on Windows 8. It may still be reasonably competitive in server segments once the OS gets more familiar with how the architecture works. Zambezi is more like a preview build, and we'll see things get better once Piledriver comes in... However, it might just be too little, too late.
Post edited October 14, 2011 by Lionel212008
avatar
Region: 8 Cores. Who the hell needs 8 Cores. It's ridiculously cheap for an 8 Core processor but who needs an 8 Core processor?
avatar
rampancy: Careful with a statement like that...remember what Bill Gates said about 640k of RAM?
Idle tittle tattle. My mate Bill said on the tinterweb that he didn't say it, that's good enough for me :P