Introduction
The latest trend in the x86-processor arena has finally caught up with AMD’s flagship, the Athlon processor. Since Intel started to integrate 128 kB of second level cache onto the die of its Celeron processor (the good old ‘Mendocino’ core) in late 1998, one processor after the other has followed this example. Number two was ‘Dixon’, Intel’s mobile Pentium II CPU with 256 kB L2 cache; then AMD followed with the K6-3, which also sported 256 kB of on-die L2-cache. After that, Intel launched the famous ‘Coppermine’ core, which again uses 256 kB of on-die L2-cache to make sure that Pentium III is able to compete with AMD’s Athlon CPU. Now AMD has followed suit and launched a new Athlon processor with the highly anticipated ‘Thunderbird’ core.
On-Die Second Level Cache Makes the Difference
What’s the big deal about an L2-cache that shares its silicon with the processor core? There are actually several big advantages that make this solution very attractive, but at the same time it’s not quite that easy to implement.
On-Die L2-Cache Runs Faster
The first advantage of an L2-cache that sits on the same silicon chip as the processor core is that both parts can run at the same clock. Core and second level cache can be clocked identically, so the core doesn’t have to wait long until the L2-cache delivers data. Before, when the second level cache was found on external chips, it was only able to run at half the speed of the processor core or even less. The worst case is found in Athlon CPUs that run at 900, 950 or 1000 MHz, where the L2-cache is clocked at only a third of the core clock, forcing the processor core to wait up to 2 clock cycles until it can receive data from the L2-cache. The third-party SRAM makers simply couldn’t supply L2-cache modules that would endure more than 400 MHz clock frequency, and so the great Giga-Athlon has so far come with an L2-cache that is hardly faster than the L2-cache of an Athlon 800. On-die L2-cache makes sure that processors can still receive data from their L2-caches without much wait, even when they run at clock speeds way beyond 1 GHz.
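To put numbers on the clock argument, here is a minimal sketch that derives the L2-cache clock from the core clock and the cache divider. The 1/3 ratio for the 1 GHz part comes from the text; the 2/5 ratio for the Athlon 800 is our assumption, picked from the ‘33 / 40 / 50 %’ row of the specification table, and the function name is purely illustrative.

```python
from fractions import Fraction


def l2_clock_mhz(core_mhz, ratio):
    """Return the L2-cache clock in MHz for a core clock and an L2:core ratio."""
    return float(core_mhz * ratio)


# The 2/5 ratio for the 800 MHz part is an assumption, not a measured value.
examples = {
    "Athlon 800 (external L2, 2/5)": (800, Fraction(2, 5)),
    "Athlon 1000 (external L2, 1/3)": (1000, Fraction(1, 3)),
    "Thunderbird 1000 (on-die L2, 1/1)": (1000, Fraction(1, 1)),
}

for name, (core, ratio) in examples.items():
    print(f"{name}: {l2_clock_mhz(core, ratio):.0f} MHz")
```

Under these assumptions the external cache of the 1 GHz part runs only about 13 MHz faster than that of the 800 MHz part, while the on-die cache scales with the core.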
On-Die L2-Cache Should Be Better Connected
The next great advantage of an on-die L2 is the data path between core and L2-cache. External L2-cache modules need to be connected to the processor core, and the wider the data path between the two, the more pins are required on both components. A processor chip has to stick to a reasonable number of pins though, which restricted external L2-caches to a data path width of only 64 bit. You can imagine that those pin restrictions don’t apply to on-die L2. Here the width of the data path is completely up to the CPU designers, and they will try to make this path as wide as possible. Intel widened the connection between CPU core and the L2-cache of the Pentium III ‘Coppermine’ core to 256 bit (part of Intel’s ‘Advanced Transfer Cache’ architecture), which meant a fourfold increase in data bandwidth per clock between CPU core and L2-cache over the previous ‘Katmai’ architecture. Unfortunately AMD’s engineers were either not willing or not able to do the same with Thunderbird, so AMD’s new processor is still forced to use a ‘one-lane’ instead of Coppermine’s ‘four-lane’ road for the data transport between its core and its on-die L2-cache.
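The ‘one-lane’ versus ‘four-lane’ comparison can be sketched as a theoretical peak figure. This assumes one full-width transfer per cache clock, which real caches do not sustain, so treat the results as upper bounds rather than measurements; the function name is hypothetical.

```python
def peak_bandwidth_gb_s(width_bits, clock_mhz):
    """Theoretical peak in GB/s, assuming one full-width transfer per clock."""
    bytes_per_transfer = width_bits / 8
    return bytes_per_transfer * clock_mhz * 1e6 / 1e9


# Both caches are assumed to run at a 1000 MHz core clock for comparison.
narrow = peak_bandwidth_gb_s(64, 1000)   # 64-bit path, as on Thunderbird
wide = peak_bandwidth_gb_s(256, 1000)    # 256-bit path, as on Coppermine

print(f"64-bit path:  {narrow:.1f} GB/s")
print(f"256-bit path: {wide:.1f} GB/s")
print(f"ratio: {wide / narrow:.0f}x")
```

At the same clock, the only factor that differs is the path width, which is where the factor of four comes from.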
On-Die L2-Cache Makes Processors Nice and Small
Advantage number three doesn’t have anything to do with performance, but it’s just as important. Most of you can certainly remember the times when Intel introduced ‘Slot1’ and the ‘single-edge-cartridge’ (‘SEC’). This ‘awkward’ package was necessary to host a printed circuit board (PCB) with the CPU-core chip as well as the external L2-cache modules. Once the L2-cache is integrated onto the same piece of silicon (‘die’) as the processor core, there’s no need for the ‘cartridge solution’ as used by ‘Slot1’ or ‘SlotA’ anymore. This is the reason why the trend goes back to the cheaper and easier to implement PGA (pin grid array) solutions. For Intel this means going from ‘Slot1’ to ‘Socket370’, and for AMD ‘SlotA’ will be replaced by ‘SocketA’. This new standard is a socket with 462 pins and it means the birth of yet another connector for x86-processors.
All in all it’s pretty obvious: an L2-cache integrated on the processor die is able to improve the processor’s performance and reduce system costs at the same time. In the case of Intel’s move from ‘Katmai’ to ‘Coppermine’ we could see a speed increase of up to 10%, as you can read in the article about Coppermine.
The downside of an on-die L2 is that it’s not easy to integrate all those millions of transistors into the processor die. Cache needs a lot of silicon and a lot of power. The on-die-L2 cores of ‘Mendocino’, ‘Dixon’ and K6-3 were still manufactured in a 0.25 micron process, but only the step to a 0.18 micron process makes on-die L2 really attractive.
Thunderbird’s Specifications
Now AMD has managed to equip its Athlon core with 256 kB of on-die, full-speed L2-cache as well, coming with all the goodies discussed above. The new core has received the modest name ‘Thunderbird’, but the actual product will follow Intel’s example of the Pentium III move from Katmai to Coppermine. ‘Thunderbird’ will still be sold as ‘Athlon Processor’, but it will be easier to distinguish the ‘new’ one from the ‘old’ one, because the ‘Thunderbird-Athlon’ comes as a ‘SocketA’ version, while the ‘Old-Athlon’ will naturally remain ‘SlotA’ only.
I tried to summarize the differences between Thunderbird and its predecessor in a little table. All the specs that are not listed there should be identical for the two.
Specification | ‘Old’ Athlon | ‘Thunderbird’ Athlon |
Manufacturing Process | 0.25 / 0.18 micron, Aluminum Interconnect | 0.18 micron, Aluminum / Copper Interconnect |
Die Size | 102 mm² (0.18 micron) | 117 mm² |
Number of Transistors / Die | 22 million | 37 million |
Voltage | 1.6 – 1.8 V | 1.7 V |
Thermal Power at 1 GHz | 65 W | 54 W |
Maximum Current at 1 GHz | 37 A | 33.6 A |
L2-Cache Location | External | On-Die |
L2-Cache Clock | 33 / 40 / 50 % of Core Clock | 100% of Core Clock |
L2-Cache Size | 512 kB | 256 kB |
L2-Cache Data Path | 64-bit wide | 64-bit wide |
L2-Cache Organization | 2-way set associative | 16-way set associative |
Package | SlotA Cartridge | SocketA ‘CPGA’; SlotA Cartridge (OEM only) |
Chipsets | AMD 750, AMD 760, VIA Apollo KX133 | AMD 750, AMD 760, VIA Apollo KX133 (questionable stability), VIA Apollo KT133 |
You can see that Thunderbird still requires a lot of power and current, so those of you who hoped that Thunderbird wouldn’t need a power supply as strong as the ‘old’ Athlon might be a bit disappointed. You can also see that there will be two different versions of Thunderbird. One uses aluminum interconnects and will be produced in Fab25 in Austin; the other uses the modern copper interconnect process and will be produced in Fab30 in Dresden. Currently it remains unclear whether there will be any difference between the two besides the color. Chips with aluminum interconnects seem to have a green shine to them, while the copper chips look rather blue.
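The ‘L2-Cache Organization’ row of the table is easier to picture when it is translated into a number of sets. The sketch below uses the textbook relation sets = size / (ways × line size). The 64-byte cache line is our assumption; the table itself does not state a line size.

```python
def num_sets(size_kb, ways, line_bytes=64):
    """Number of sets = cache size / (associativity * line size)."""
    # line_bytes=64 is an assumed line size, not a figure from the table.
    return size_kb * 1024 // (ways * line_bytes)


print("'Old' Athlon L2:", num_sets(512, 2), "sets of 2 lines")
print("Thunderbird L2: ", num_sets(256, 16), "sets of 16 lines")
```

With that assumption, the new cache has far fewer sets, but each set can hold sixteen lines instead of two, so lines that map to the same set are much less likely to evict one another.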
Thunderbird’s Specifications, Continued
We’ve tried to decipher the new codes of Thunderbird, as you can see in this picture:
The benchmarks below suggest that AMD made a few more enhancements to the core, but unfortunately we don’t have any detailed information yet on what in particular has changed in Thunderbird’s processor core compared to the old Athlon core.
Overclockers won’t be too happy about the fact that Thunderbird is of course multiplier-locked. Goldfinger devices won’t help, unless you get hold of one of the few SlotA versions of Thunderbird.
Thunderbird Pricing
What we do know is that Thunderbird is quite a bit cheaper to produce and AMD will therefore be able to offer it at very reasonable prices, particularly if you compare them to Intel’s P3-pricing:
Thunderbird Clock Speed | Price |
750 MHz | $319 |
800 MHz | $359 |
850 MHz | $507 |
900 MHz | $589 |
950 MHz | $759 |
1000 MHz | $990 |
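A quick way to read this price list is to look at dollars per MHz and at the premium charged for each speed step. The sketch below only uses the prices from the table; the helper names are our own.

```python
# Prices in US dollars, keyed by clock speed in MHz, copied from the table.
prices = {750: 319, 800: 359, 850: 507, 900: 589, 950: 759, 1000: 990}


def dollars_per_mhz(prices):
    """Price divided by clock speed for every speed grade."""
    return {mhz: price / mhz for mhz, price in prices.items()}


def step_premiums(prices):
    """Extra dollars charged for each step up to the next speed grade."""
    speeds = sorted(prices)
    return {(lo, hi): prices[hi] - prices[lo] for lo, hi in zip(speeds, speeds[1:])}


for mhz, ratio in dollars_per_mhz(prices).items():
    print(f"{mhz} MHz: ${ratio:.2f} per MHz")

for (lo, hi), extra in step_premiums(prices).items():
    print(f"{lo} -> {hi} MHz: +${extra}")
```

The last 50 MHz step alone costs $231 more, which is over half the price of the 750 MHz part.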
SlotA Thunderbird
We were lucky enough to get a SlotA version of Thunderbird as well. Officially those will be unavailable to the retail channel and only go to selected OEM customers. We tried the processor in an Asus K7V with the latest BIOS and found that this Thunderbird 750 ran without a glitch. However, overclocking this CPU was rather unsuccessful. Although the Goldfinger devices work fine with this processor, it was impossible to overclock it reliably even to 800 MHz. This could be due to our sample, but it could also be that there is some truth to the rumor that SlotA-Thunderbirds at 800 MHz and above aren’t able to run reliably on KX133 platforms. We will dedicate some more time to this issue and will keep you informed about our findings. We’ll also try to find out whether SocketA-Thunderbirds are able to run in SlotA systems with special ‘Slocket-Cards’. So far AMD has stated that those SocketA-to-SlotA cards would not work reliably.
You can see that the SlotA-Thunderbird looks very similar to its older brother, besides the fact that it’s missing the L2-cache chips. Goldfinger devices work just the same as well.
SocketA
The new socket for Thunderbird will have no less than 462 pins and therefore look a lot different than the Socket370 used by Intel processors.
Socket A
Socket370
Currently, SocketA is only supported by AMD’s good old 750 chipset and VIA’s recently renamed Apollo KT133 chipset. We are expecting a lot of new SocketA motherboards at Computex / Taipei, which is starting today.
The Chipsets
For the time being, the choice of chipsets for the new Thunderbird isn’t big. AMD announced proudly that its rather old and almost outdated 750 chipset is able to support Thunderbird, but the much newer Apollo KX133 chipset from VIA seems to have problems with the new AMD processor. Therefore VIA created the Apollo KT133, which is nothing more than a KX133 chipset with SocketA support, so please don’t expect any new features. KT133 was recently renamed from ‘KZ133’, because the combination ‘KZ’ evokes bad memories for millions of people worldwide.
AMD is currently working on the successor of the 750 chipset, which will carry the ‘surprising’ name ‘AMD 760’. This chipset is expected to support DDR-SDRAM, which is why there are high expectations for systems that come with Thunderbird and the new AMD chipset. We should not expect those new platforms before August or even September 2000.
Test Motherboards for Thunderbird
At the time of the test we had three different SocketA motherboards with VIA’s Apollo KT133 chipset available for our 1 GHz Thunderbird. One was the motherboard out of the official AMD evaluation system, which turned out to be an OEM product from Compaq by the name of ‘Pipeline-1’, which is rumored to be manufactured by Asus.
Unfortunately this board wouldn’t leave any room whatsoever for any kind of tweaking, which is not exactly surprising for an OEM-product. We weren’t even able to adjust the memory timing.
The second motherboard came directly from VIA and is the official KT133 reference platform VT5276D. Our tests showed that this board would produce the best results, which is why we used it for our testing suite.
Finally we had the chance to try MSI’s K7T Pro. This board seemed to be at too early a stage for testing, because the results it scored could not live up to those of the two boards above.
Benchmark Setup
As in the recent Solano article, we made sure that our test results were comparable to the scores produced for the ‘Giga-Battle’ article, which is why we used NVIDIA’s revision 5.08 drivers once more. It is still a fact that all later drivers are slower than revision 5.08, which is why we felt good about using it.
AMD Athlon Platform Information | |
Graphics card for all tests | NVIDIA GeForce 256 120MHz Core, 300MHz DDR-RAM 32MB |
Hard Drive for all tests | Seagate Barracuda ATA ST320430A |
CPU for all tests | Athlon with Thunderbird core / Athlon with ‘old’ core, 1 GHz |
VIA Apollo KT133 Chipset | |
Motherboard | VIA KT133 Reference Board VT 5276D |
Memory | 128 MB, Micron PC133 SDRAM CAS2 |
Network | 3Com 3C905B-TX |
VIA Apollo KX133 Chipset | |
Motherboard | Asus K7V, BIOS May 2000 |
Memory | 128 MB, Enhanced Memory Systems PC133 HSDRAM CAS2 |
Network | 3Com 3C905B-TX |
AMD 750 ‘Irongate’ Chipset | |
Motherboard | Asus K7M, rev. 1.04, Chipset rev. C6, Super Bypass Enabled, ACPI BIOS 128 beta (02.03.2000), AGP2x enabled in GeForce driver (IrongateEnable2x=1) |
Memory | 128 MB, Micron PC133 SDRAM CAS2 |
Network | 3Com 3C905B-TX |
Intel Pentium III Platform Information | |
Graphics card for all tests | NVIDIA GeForce 256 120MHz Core, 300MHz DDR-RAM 32MB |
Hard Drive for all tests | Seagate Barracuda ATA ST320430A |
CPU for all tests | Intel Pentium III 1GHz, 133 MHz FSB |
Intel i815 Chipset Pre-Release Stepping A2 | |
Motherboard | No Information |
Memory | 128 MB, Wichmann WorkX MXM128 PC133 SDRAM CAS2 |
IDE Interface | onboard |
Network | 3Com 3C905B-TX |
VIA Apollo Pro 133A Chipset | |
Motherboard | Asus P3V4X, ACPI BIOS 1002 final, March 2000 |
Memory | 128 MB, Enhanced Memory Systems PC133 HSDRAM CAS2 |
Network | 3Com 3C905B-TX |
Intel 440 BX Chipset | |
Motherboard | Asus P3B-F, ACPI BIOS 1005 beta 01, March 2000 |
Memory | 128 MB, Enhanced Memory Systems PC133 HSDRAM CAS2 |
IDE Interface | Promise Ultra66 PCI card |
Network | 3Com 3C905B-TX |
Intel 820 Chipset | |
Motherboard | Asus P3C-L, ACPI BIOS 1020 beta 05, March 2000 |
Memory | 128 MB, Samsung PC800 RDRAM, RDRAM clock adjusted in BIOS |
IDE Interface | onboard |
Network | Onboard i82559 |
Intel 840 Chipset | |
Motherboard | OR840, special unreleased BIOS |
Memory | 128 MB, Samsung PC800 RDRAM; 128 MB, Samsung PC700 RDRAM, running as PC600 RDRAM |
IDE Interface | onboard |
Network | Onboard i82559 |
Driver Information | |
Graphics Driver | NVIDIA 4.12.01.0508 |
viagart.vxd for VIA Chipsets | AGP-driver 4.22 |
ATA Driver | Promise Ultra66 driver rev. 1.43; Intel Ultra ATA BM driver v5.00.038; Latest VIA ATA BM Driver |
Environment Settings | |
OS Version | Windows 98 SE 4.10.2222 A |
Screen Resolution | 1024x768x16x85; 1280x1024x32x85 for SPECviewperf |
DirectX Version | 7.0 |
Quake 2 | Version 3.20; command line = +set cd_nocd 1 +set s_initsound 0; Crusher demo, 640x480x16 |
Quake 3 Arena | Retail Version; command line = +set cd_nocd 1 +set s_initsound 0; Graphics detail set to ‘Normal’, 640x480x16; Benchmark using ‘Q3DEMO1’ |
Expendable | Downloadable Demo Version; command line = -timedemo; 640x480x16 |
Unreal Tournament | Ver. 4.05b; high quality textures, medium quality skins, no tweaks; 640x480x16; Benchmark using ‘UTBench’ |
Benchmark Results
We can expect a performance increase similar to what we saw when Intel went from Katmai to Coppermine. Lately Intel has been pushing the integer benchmarks again, since even the old Athlon leaves Pentium III far behind in any pure floating point benchmark. In integer benchmarks, the Coppermine Pentium III was able to surpass Athlon, and that’s where we expect the most improvement from Thunderbird. Let’s see if AMD was able to do it.
Sysmark2000
The performance increase that Thunderbird shows over its predecessor lives up to expectations. AMD’s new processor is almost 9% faster than the ‘old’ Athlon in Sysmark2000 and therefore just as fast as Intel’s Pentium III with Coppermine core, unless you are using the overclocked BX133 or the expensive i840 platform for the Intel processor.
3D Gaming Performance
In Quake 3, Thunderbird is also able to surpass its predecessor. Only a Pentium III running on the unofficial ‘BX133’ platform is scoring significantly better results.
In Quake 2, Thunderbird leaves all its competitors behind; even the BX133 cannot save Intel from dropping into second place.
3D Gaming Performance, Continued
In Expendable the picture is similar once more. Thunderbird is able to leave its predecessor as well as Intel’s Coppermine behind it, as long as this processor does not run on the BX133-chipset.
Unreal Tournament is known as a very CPU-intensive 3D game and Thunderbird can almost reach the performance of Pentium III on BX133. Again, the whole ‘official’ competition is left behind.
Professional OpenGL Application Performance – SPECviewperf 6.1.1
SPECviewperf shows once more that Thunderbird marks a significant improvement over the ‘old’ Athlon CPU. Its fast L2-cache plus its powerful FPU are able to put it into first place in most of the five applications.
FPU Performance
We were surprised to see that Thunderbird was even able to score a bit better in our standard 3D Studio Max FPU benchmark. The enhanced on-die L2-cache of AMD’s new processor cannot really explain this, but it proves that Thunderbird incorporates some other improvements as well.
Conclusion
“Athlon was good, but the new Athlon is even better” is the best way to summarize AMD’s new Thunderbird processor. The new integrated L2-cache is able to boost Athlon’s performance to a level that’s now fully able to compete against Intel’s Pentium III in almost any benchmark that’s not one-sidedly enhanced for Intel’s ISSE-instructions only. The attractive pricing of Thunderbird will ensure the continued success of the Athlon CPU. However, those of you who expected that Thunderbird would leave Coppermine far behind may be a bit disappointed. AMD is facing quite a bit of work if Athlon is supposed to compete well against Intel’s upcoming ‘Willamette’ processor.
The bad news about Thunderbird is that current owners of Athlon systems will face problems upgrading to the new AMD CPU. Thunderbird will only be available as a SocketA version in the retail market, and it seems as if this will require new motherboards. We will have to see whether it is indeed impossible to use SocketA-to-SlotA converter cards as those products slowly become available. However, AMD’s move to a socket version of Athlon is certainly the right thing to do, and people who are interested in buying a new Athlon system now will be able to take advantage of the reduced system costs that come with SocketA. As ‘Duron’, Thunderbird’s little brother with only 64 kB of on-die L2-cache, was launched today as well, AMD is now offering a direct upgrade path for customers of this new chip. Duron also uses SocketA and can thus easily be replaced by Thunderbird.
With Intel struggling to deliver enough Pentium III processors in the performance segment above 800 MHz, AMD can jump into the gap with Thunderbird, provided that this processor is available in reasonable quantities. In the last half-year AMD proved that it could meet the demand for Athlon processors a lot better than Intel could for its Pentium III, so we should be able to expect the same for Thunderbird.
Last but not least, there is the platform issue. SocketA motherboards with VIA’s Apollo KT133 chipset are just about to become available, and there are more attractive Athlon platform solutions visible on the horizon already. VIA’s KT133 is only a KX133 with SocketA support, thus sporting AGP4x and ATA66, but only PC133 memory support. AMD’s 760 chipset as well as VIA’s next Athlon chipset, the KZ266, will both support DDR-SDRAM as well as ATA100 and therefore offer an even faster platform for the new Thunderbird and Duron processors. Keep this in mind before you go and buy the first KT133 motherboard that becomes available.
All in all we can expect a hot summer and fall this year. Thunderbird will be able to grab an even larger piece of the x86-processor market with its improved performance and its attractive pricing. Intel might be forced to drop the prices of its Pentium III processors earlier than planned, until finally ‘Willamette’ will start the performance race again.