Ahla Kalam Forum

Intel & AMD Processors


Posted by YaMan on Tuesday, August 10, 2010 12:06 pm



INTEL

Intel’s HyperThreading Technology (HT Technology)


This is probably one of the best things Intel has done for its higher-end CPUs. However, with the release of dual-core CPUs this technology is becoming less useful. Microsoft may also stop its operating systems from working with HT Technology. In that case the operating system would no longer be led to believe that there are two cores, so instead of one physical and one logical core it would see just one physical core.

Simultaneous Multi-Threading (SMT), which Intel calls Hyper-Threading, is really a way of fooling the operating system into thinking it is hooked up to two CPUs. This is done by making a single chip present itself as two separate logical processors. It offers some benefits, but not in every application, and especially not in games. To the operating system the result looks much like a symmetric multi-processor (SMP) system, even though there is only one physical processor.
Programs that use independent simultaneous threads can run those threads on both logical CPUs even though there is only one physical processor. In some applications there can be performance gains of up to 25%, but enabling HT will make other applications lag and run slower than with it disabled.

HT allows the CPU to process two independent threads. Some people believe that the two threads are processed fully in parallel; that is not entirely true. The two threads share one set of execution resources, and the second thread mostly fills in the gaps that the first thread leaves. This means that the second thread is less efficient and slightly slower than the first. It still offers some benefit, though, as you can run two threads even though one will be a bit faster than the other.
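As a small illustration of the point that the operating system counts logical processors rather than physical ones, the sketch below asks the OS how many CPUs it sees. It uses Python's standard os.cpu_count(); the helper name logical_cpu_count is my own and not part of the guide. On a single-core CPU with HT enabled one would expect the OS to report 2, but the actual number depends on the machine.

```python
import os


def logical_cpu_count() -> int:
    """Return the number of logical processors the OS reports.

    os.cpu_count() can return None when the count is unknown,
    so fall back to 1 in that case.
    """
    return os.cpu_count() or 1


if __name__ == "__main__":
    # With Hyper-Threading enabled, each physical core is counted twice here.
    print("Logical processors seen by the OS:", logical_cpu_count())
```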

Streaming SIMD Extensions

SSE3 is an addition to the SIMD (single instruction, multiple data) capabilities of a processor. SIMD processing is based on the idea that processors sometimes have to take large amounts of data and perform the same operation across the entire set. This works well for things like audio processing, video processing and multi-tasking. SIMD processing has also largely overshadowed the use of the x87 floating point unit on x86 processors. AMD now has the SSE3 instruction set, as it was traded with Intel for the AMD64 technology, which Intel calls EM64T.

However, this is odd, because Intel already had the IA-64 architecture and therefore had 64-bit technology before AMD. IA-64 is a pure 64-bit architecture and can only run 32-bit x86 code through emulation. AMD64/EM64T is NOT a pure 64-bit architecture: it is a set of 64-bit extensions to the existing 32-bit x86 architecture (IA-32), so 64-bit processing can be done on CPUs that still run 32-bit code natively.

Conroe SSE
Now this is special. SSE registers are 128 bits wide, but current processors only have 64-bit SSE lanes, so each SSE operation has to be split across two passes. With Conroe the lanes are 128 bits wide, so the data no longer needs to be split, resulting in much faster execution of SSE code.
Please note: the Core 2 microarchitecture adds 16 new SSE instructions. This guide refers to them as SSE4; Intel's official name for this set is SSSE3 (Supplemental SSE3).
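To make the SIMD idea and the lane-width point a little more concrete, here is a rough sketch in Python. It only models the behaviour; it does not use real SSE instructions. The function names simd_add and passes_needed are made up for this example, and the four-float vector is an assumption based on a 128-bit register holding four 32-bit values.

```python
import math


def simd_add(a, b):
    """Add two equal-length vectors element by element, as one SIMD add would."""
    return [x + y for x, y in zip(a, b)]


def passes_needed(register_bits: int, datapath_bits: int) -> int:
    """How many trips through the execution unit one SSE operation needs."""
    return math.ceil(register_bits / datapath_bits)


# One 128-bit SSE register holds four 32-bit floats.
a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
result = simd_add(a, b)

print(result)                   # [11.0, 22.0, 33.0, 44.0]
print(passes_needed(128, 64))   # 2 passes with 64-bit lanes
print(passes_needed(128, 128))  # 1 pass with 128-bit lanes
```

The same four additions are done either way; the wider lanes simply finish the operation in one pass instead of two.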


A large amount of Level 2 Cache

Intel processors have a large amount of level 2 cache. Level 2 cache is memory in the processor which is very fast. When you load up a program the processor starts looking for the data. It first checks the Level 1 cache to see if it is there, if it is not there then it checks the level 2 cache, then (if you have any) the level 3 cache. If it does not find it in there then it checks the system memory (RAM) for it. The CPU cache is much faster than the system RAM so having more cache is a big advantage for speed.
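The lookup order described above can be sketched roughly like this. It is a toy model only: each level is represented as a set of addresses, and the contents and the name find_data are hypothetical, chosen just to show the order in which the levels are checked.

```python
def find_data(address, levels):
    """Search each memory level in order and return the name of the first hit."""
    for name, contents in levels:
        if address in contents:
            return name
    return None  # not found anywhere in this toy model


# Hypothetical contents: fastest and smallest level first, RAM last.
levels = [
    ("L1 cache", {0x10}),
    ("L2 cache", {0x10, 0x20}),
    ("L3 cache", {0x10, 0x20, 0x30}),
    ("RAM", {0x10, 0x20, 0x30, 0x40}),
]

print(find_data(0x10, levels))  # L1 cache
print(find_data(0x20, levels))  # L2 cache
print(find_data(0x40, levels))  # RAM
```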

Intel 5xx series processors have 1024KB of L2 cache and 16KB of L1 data cache, plus a trace cache that holds 12K micro-ops (often quoted as 12KB). Intel 6xx series processors have 2048KB of L2 cache and the same amount of L1 cache as the 5xx series. The dual-core Intel 8xx series has double the cache of the 5xx series in total, as there are two cores. The 9xx series has even more L2 cache but the same amount of L1 cache as the 8xx series, as L1 cache is expensive to manufacture.



These are the main features of the Intel processors. Now onto Advanced Micro Devices (AMD) processors.


AMD

AMD 3DNow technology

3DNow significantly enhances floating-point-intensive, 3D graphics and multimedia performance. Benefits of 3DNow! technology include leading-edge 3D performance, more realistic and lifelike 3D imaging and graphics, big screen sound and video, and the ultimate Internet experience. 3DNow! technology is a group of instructions that open the traditional processing bottlenecks for floating-point-intensive 3D and multimedia applications. With 3DNow! technology users can implement more powerful hardware and software solutions to enable a richer visual computing experience.


Operations per clock cycle

Now this is very important. It improves AMD's performance everywhere, not just in gaming. However, with the release of Conroe this advantage is much smaller, as Conroe can equal or beat the IPC (instructions per clock) of AMD 64 processors.

Most people think: hang on, how come an AMD FX55 CPU beats a Pentium 4 3.8GHz CPU? Well, it's not all about clock speed anymore; something more complex determines the power of a processor.

AMD processors do more operations per clock cycle than Intel processors.

Currently AMD processors do 9 operations per clock cycle whereas Intel only do 6.

Here are some examples:

The 4 candidates are

Intel Pentium 4 570 (3.8GHz)

Intel Pentium 4 560 (3.6GHz)

AMD FX55 (2.6GHz)

AMD 64 4000 (2.4GHz)


The 570: 3.8 x 6 = 22.8 billion operations per second
The 560: 3.6 x 6 = 21.6 billion operations per second
The FX55: 2.6 x 9 = 23.4 billion operations per second
The 4000: 2.4 x 9 = 21.6 billion operations per second


Out of these 4 the AMD FX55 wins even though it is running at a lower clock speed than the Intel 570. The Intel may be clocked faster, but the AMD is doing more work in total even though it's slower.

Conroe has lower clock speeds than the Pentium 4 processors, but its IPC is higher and its pipeline is much shorter, making it much more efficient, so it will now beat most AMD processors at gaming tasks. AMD has its 3DNow! extensions to help it along a bit, but they may not hold out against Conroe with SSE4.
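The arithmetic above is simple enough to check with a few lines of Python. This is only a sketch of the guide's own back-of-the-envelope model (clock speed times operations per clock); real performance depends on far more than this. Clock speeds are written in MHz so the multiplication stays in whole numbers, and the function name peak_mops is my own.

```python
def peak_mops(clock_mhz: int, ops_per_cycle: int) -> int:
    """Peak throughput in millions of operations per second."""
    return clock_mhz * ops_per_cycle


# (clock in MHz, operations per clock cycle) as quoted in the guide.
cpus = {
    "Intel Pentium 4 570": (3800, 6),
    "Intel Pentium 4 560": (3600, 6),
    "AMD FX55": (2600, 9),
    "AMD 64 4000": (2400, 9),
}

results = {name: peak_mops(mhz, ops) for name, (mhz, ops) in cpus.items()}
best = max(results, key=results.get)

for name, mops in results.items():
    print(f"{name}: {mops / 1000} billion operations per second")
print("Winner:", best)  # AMD FX55
```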


Also Note:

Even though AMD manages to perform more operations than Intel Netburst in one clock cycle, Netburst gets through its clock cycles more quickly. This is because of the pipeline architecture. AMD's pipeline is only 10 stages long, so each stage has to do more work and the clock can't run very fast. Intel's Netburst processors have a 20-stage pipeline (Prescott/Prescott 2M/Presler core processors have 31 stages). Because less work is done in each stage of the pipeline, the processor can run at a higher clock speed.
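A very rough way to picture why more stages allow a higher clock: if the total amount of work per instruction is fixed, splitting it over more stages makes each stage shorter. The sketch below assumes a simple model where the cycle time is the total logic delay divided by the number of stages, plus a fixed per-stage overhead. The delay figures are invented for illustration and are not measurements of any real CPU.

```python
def max_clock_ghz(stages: int,
                  total_logic_ns: float = 10.0,
                  stage_overhead_ns: float = 0.1) -> float:
    """Estimate the highest clock for a pipeline under a simple delay model.

    total_logic_ns and stage_overhead_ns are hypothetical numbers.
    """
    cycle_time_ns = total_logic_ns / stages + stage_overhead_ns
    return 1.0 / cycle_time_ns  # 1 / nanoseconds gives GHz


for stages in (10, 20, 31):
    print(f"{stages} stages: about {max_clock_ghz(stages):.2f} GHz in this model")
```

The model ignores the cost of a deeper pipeline, such as the longer penalty when a branch is mispredicted, which is part of why a higher clock does not automatically mean more work done.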


AMD HyperTransport Technology

First of all, I know that some people confuse this with Intel's HT technology. They are actually 2 different things that are not related. HT technology is for Intel processors and splits them into two virtual CPUs. HyperTransport technology is nothing like that. It is the way data is sent around in AMD computers.

AMD 64 processors use something called HyperTransport. They do not rely on a Northbridge* or Southbridge* for memory access: the memory controller is on the CPU, so the RAM is connected directly to the CPU. Due to this, memory data does not need to go through the Northbridge. It goes straight to the CPU. Chipset devices sit on the HyperTransport link, and a device that passes traffic through to the next one is called a HyperTransport tunnel. There is also something called the I/O hub. This is like the Southbridge in Intel computers. It manages all of the input/output devices and ports. HyperTransport allows higher bus speeds of 800MHz (1600MHz effective, socket 754 only) and 1000MHz (2000MHz effective, socket 939) on AMD computers. The effective speed is double the speed it actually runs at because data is transferred on both the rising and falling edges of the clock (double data rate). The link is also bidirectional, so data can be sent both ways at the same time.
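The effective figures quoted above are simply the link clock multiplied by two transfers per clock. The sketch below works that out and, as an extra, estimates the bandwidth in one direction. The 16-bit link width is my assumption for these sockets and is not stated in the guide, and both function names are made up for the example.

```python
def effective_rate_mts(clock_mhz: int, transfers_per_clock: int = 2) -> int:
    """Effective transfer rate in millions of transfers per second."""
    return clock_mhz * transfers_per_clock


def link_bandwidth_mb_s(clock_mhz: int, link_width_bits: int = 16) -> int:
    """Bandwidth in MB/s in one direction, assuming a 16-bit wide link."""
    return effective_rate_mts(clock_mhz) * link_width_bits // 8


for socket, clock in (("socket 754", 800), ("socket 939", 1000)):
    print(socket, effective_rate_mts(clock), "MT/s effective,",
          link_bandwidth_mb_s(clock), "MB/s per direction")
```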

* Note: This statement does not imply that the Northbridge and Southbridge are absent from an Athlon 64 motherboard. It simply means the Northbridge is not a factor in CPU/RAM information flow. AMD's on-die memory controller cuts down the system bottlenecks of traditional CPU-Northbridge-RAM pipelines.



Conclusion

Well, a lot has changed here since my original conclusion. I have updated some of this guide; some of it did not need updating. AMD will no longer have the lead after the release of Conroe, due to its much improved architecture, called Core. With the addition of 128-bit SSE lanes and SSE4, I would expect it to perform multi-tasking at a much higher speed than any AMD processor, including the famous Opteron line. AMD has not lost yet: its CPUs are efficient and run cool while using less power than Netburst.
At the time of rewriting this guide, Intel has the lead in both parts of the market, gaming and multi-tasking.


Note: The latency of Intel Netburst CPUs is greater than that of AMD CPUs. Intel Northwoods have a 20-stage pipeline whereas Prescotts have a 31-stage pipeline, which means that Prescotts actually have a higher latency than Northwoods. However, above 3.8GHz Prescotts become more efficient and run much faster than Northwoods.
Conroe's latencies, however, are much lower.


Intel and AMD Temperature Reducing Technologies

Note: These 2 technologies are Intel's and AMD's ways of keeping processors cooler and more efficient when they are doing very little or nothing. I will only mention them briefly.


Intel SpeedStep Technology

Enhanced Intel SpeedStep® technology enables dynamic switching of the voltage and frequency between two performance modes based on CPU demand. The processor also features a new ultra low power alert state called Deeper Sleep, which enables the processor to retain critical data at very low voltages and minimizes power dissipation when the processor is not active.

Alongside this, Intel's thermal protection comes in 2 forms, called TM1 and TM2.

Thermal Monitor 1 inserts idle clock cycles into the CPU to try to cool it down. This happens once a set temperature is reached, and it reduces the amount of data that your CPU can process.

Thermal Monitor 2 drops your CPU's multiplier and lowers the vcore to drop temperatures even further. This obviously reduces the speed of the CPU, but it does lower temperatures a lot.


AMD Cool'n'Quiet

AMD Cool ‘n’ Quiet™ technology controls your system’s level of processor performance automatically, adjusting the operating frequency and voltage up to 30 times per second, according to the task at hand. When an application does not require full performance, significant amounts of power can be saved. However, the processor can respond to increased workloads, allowing the system to deliver a responsive and rewarding computing experience. The only difference the user will observe with most applications is that the system will run cooler and quieter. Performance is designed to still be responsive, with maximum processor performance being delivered when required, and automatic power savings when possible.
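To get a feel for why lowering both frequency and voltage saves so much power, the usual rule of thumb is that dynamic power scales with voltage squared times frequency. The sketch below applies that rule to made-up operating points; the voltages and clock speeds are hypothetical examples and are not taken from any AMD or Intel datasheet.

```python
def relative_power(volts: float, mhz: float,
                   base_volts: float, base_mhz: float) -> float:
    """Dynamic power relative to a baseline, using P proportional to V^2 * f."""
    return (volts / base_volts) ** 2 * (mhz / base_mhz)


# Hypothetical operating points: full speed versus an idle state.
full_speed = (1.4, 2400.0)
idle = (1.1, 1000.0)

ratio = relative_power(idle[0], idle[1], full_speed[0], full_speed[1])
print(f"Idle state uses about {ratio:.0%} of full-speed dynamic power")
```

This only covers dynamic (switching) power; leakage power is ignored in this simple model.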

Note: If you have this running then I would recommend turning it off.


A Bit More About The Cache


In the main part of the guide I talked about the cache in Intel processors. This time I will talk about it in general.

Right, first the Level 1 cache.

L1 cache is a small piece of very fast memory on the CPU chip itself. It sits between the CPU registers and the L2 cache. L1 cache typically has a lower latency than L2 cache. This makes it more expensive to produce, so we don't see a large amount of it in CPUs today.

Right, now onto Level 2 cache.

L2 cache is bigger than L1 cache. It has a higher latency than L1 cache, which makes it cheaper to produce, so we do see a decent amount of it in CPUs today. 128KB, 256KB, 512KB, 1024KB and 2048KB are the most common sizes in desktop processors.

Now, some people may have heard of Level 3 cache. It is quite rare in desktop processors. The Intel Extreme Edition processors have 2MB of it. I don't know a lot about it, but here is what I do know.

Level 3 cache is extra cache between the microprocessor and the main memory. On Itanium and older processors it is built into the motherboard, and it can come in various sizes.
On newer CPUs like the P4 EE, the L3 cache is on the actual chip.
It is usually used in server processors. The Intel Xeons have 9MB of L3 cache, and you can get 4 Xeons in 1 system, so you have 36MB of L3 cache in total!



Let’s take a look at some examples of the amount of cache in various cores.

AMD cores first.

ClawHammer: 1024KB or 512KB L2 cache, 64KB L1 Data Cache, 64KB L1 Code Cache

Newcastle: 512KB L2 cache, 64KB L1 Data Cache, 64KB L1 Code Cache

Sledgehammer: 1024KB L2 cache, 64KB L1 Data Cache, 64KB L1 Code Cache

Winchester: 512KB L2 cache, 64KB L1 Data Cache, 64KB L1 Code Cache

Paris: 256KB L2 cache, 64KB L1 Data Cache, 64KB L1 Code Cache

Manchester: 512KB L2 cache (x2), 64KB L1 Data Cache (x2), 64KB L1 Code Cache (x2)

Toledo: 1024KB L2 cache (x2), 64KB L1 Data Cache (x2), 64KB L1 Code Cache (x2)


Now onto Intel cores.

Northwood: Either 128KB, 256KB or 512KB L2 cache, 8KB L1 data cache, 12K micro-op trace cache

Prescott: Either 256KB, 1024KB or 2048KB L2 cache, 16KB L1 data cache, 12K micro-op trace cache

Dothan: 2048KB L2 cache, 32KB L1 data cache, 32KB L1 code cache

Smithfield: 1024KB L2 cache (x2), 16KB L1 data cache (x2), 12K micro-op trace cache (x2)

Dual Cores

When I wrote this guide there were no dual-core CPUs out for the desktop. By request, I will add some information on them now.

Dual core means 2 CPU cores in one processor package. This would theoretically give you twice the performance of a single-core chip, but there is currently little software that can take advantage of the second core. Some programs do benefit, like Folding@home. Intel and AMD both have dual-core processors available for the desktop. AMD's are called the Athlon 64 X2 and Intel's are called the Pentium D. There are some problems, though. On the Pentium D both cores share one FSB, so the amount of data that can be moved in and out of the CPU is less than what the two cores can process, and they will be bottlenecked by the FSB. More heat is produced as well, which will limit their overclocking ability.
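One common way to see why the second core often helps less than expected is Amdahl's law, which is not from the original guide but fits the point being made: the speedup depends on how much of a program can actually run in parallel. The parallel fractions below are hypothetical examples.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical programs: not threaded, half threaded, fully threaded.
for fraction in (0.0, 0.5, 1.0):
    print(f"{fraction:.0%} parallel on 2 cores: "
          f"{amdahl_speedup(fraction, 2):.2f}x")
```

Only a program that is parallel from start to finish gets the full 2x; a program with no threading gets nothing from the second core.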

YaMan