Test and review: AMD Radeon RX 480 – new mass-market leader

AMD has introduced a new graphics card, the Radeon RX 480, the first model of the Polaris generation, and it has arrived in our test lab. With the new generation AMD has changed its former approach, starting not with the high-end segment but with the mass market: the first Polaris graphics card is positioned in the mid-range. Below you will find all the details of the new Polaris architecture, as well as the changes compared with previous GPUs. Of course, we also present our test results.

AMD began promoting the Polaris architecture early on. The first signs appeared at the end of 2015, when AMD announced that new graphics cards were on the way, although without naming a date. Frankly, manufacturers rarely announce new GPU models six months before release. In the case of Polaris there were very few nuggets of information, so it was difficult to form an impression of the performance. For example, AMD immediately mentioned production on a 14 nm FinFET process, but gave no details.

At the same time, AMD has been actively promoting high dynamic range (HDR) for viewing photos and videos as well as for games. Of course, both monitors and interfaces have to support it. On the interface side, the DisplayPort 1.3 / 1.4 and HDMI 2.0b standards are supported. It was precisely over HDMI that AMD was seriously criticized with the Fury graphics cards, which for some reason did not support the HDMI 2.0 standard.

The final version of the Radeon RX 480 was presented during Computex. But even then AMD did not reveal the technical specifications. We now fill this gap.

For the first time, AMD has specified a base clock for the GPU, moving closer to NVIDIA's approach. The base clock is 1,120 MHz, the minimum at which the graphics card will run. In boost mode the frequency can reach up to 1,266 MHz and, unlike with NVIDIA, this is not a guaranteed minimum but the maximum.

The Radeon RX 480 uses 2,304 stream processors. Their organization is discussed in the architecture section. Alongside the 2,304 stream processors, the GPU has 144 texture units and 32 raster operations pipelines (ROPs). Note the 4 or 8 GB of GDDR5 memory, which runs at 2,000 MHz. So AMD has decided against GDDR5X, as NVIDIA did with the GeForce GTX 1070; GDDR5X memory is found only on the GeForce GTX 1080. Of course, High Bandwidth Memory could hardly be expected on a graphics card in this price and performance class.

The Radeon RX 480 consumes 150 W, of which 85 W is attributed to the GPU and the remainder to the other components.

AMD's recommended price for the Radeon RX 480 with 8 GB of memory is 255.85 euros or 18,970 rubles. The version with 4 GB of memory will cost 214.20 euros or 16,310 rubles.

We have compiled the known data on the Radeon RX 470 and 460 GPUs, obtained at Polaris Tech Day, in a table. Unfortunately, not all the details are known; the clock frequencies, for example, are missing. According to the latest plans, the rest of the RX line, namely the Radeon RX 470 and the Radeon RX 460, should appear in mid-July.

Our conversations with AMD employees at Polaris Tech Day revealed some interesting facts about the Polaris GPUs. In developing Polaris, AMD paid special attention to good software support and to optimizing software for the hardware. We asked why AMD did not start with a larger Polaris chip for the high-end segment and received this answer: high-end chips make sense only when software is adapted to them, and at the current stage of API development that is not yet the case. In other words, software plays a very important role, so today's assessment of AMD products may differ somewhat from the situation in six months or a year. We already saw this after the Fury launch. For this reason, the software division of the Radeon Technologies Group is currently growing faster than the hardware division.

Of course, AMD could already envisage a larger chip in addition to Polaris 10 and 11, but the company has to budget its resources carefully. The research and development budget is very limited, and AMD's resources are not sufficient to sustain a faster pace of development.

AMD Radeon RX 480: Polaris architecture

First, let us go back in time to compare the early stages of GPU development with the present. Take the years 2001 and 2002, when chips were manufactured on 180 or 150 nm processes. The move from the RV100 to the R300 was a major performance step: the chip area almost doubled despite the smaller process. Significant architectural improvements were made, with 110 million transistors instead of 30 million and 8 pixel shader units instead of 1. All of this led to a substantial increase in performance.

With Polaris, AMD plans to repeat this success, but the company is relying not only on a new architecture but also on many new developments in various areas. These include new display connection standards, multimedia features, caches, the memory controller, and the power management system.

With the first generation of the GCN (Graphics Core Next) architecture, AMD obtained not only high performance but also considerable power consumption, especially with the Radeon R9 290X and Radeon R9 290, the fastest graphics cards of that generation. Over the years AMD has added a variety of optimizations, and the Tonga and Fiji chips show the potential of the architecture well.

Polaris is again a major step forward. Terms such as Primitive Discard Accelerator, Hardware Scheduler and Instruction Pre-Fetch could be listed here, but they are difficult to explain without a deep dive into the architecture. The easiest things to understand are the improved shader efficiency and a feature that raises the efficiency of the entire Polaris GPU: memory compression. NVIDIA has used delta color compression since the first generation of Maxwell, which made it possible to work effectively with a memory interface that is not particularly wide. With High Bandwidth Memory on the Fiji GPU, AMD tackled the memory bandwidth limit from the other side, but the gain from compression will be visible as well. This applies especially to graphics cards without HBM, which in 2016 remains the preserve of high-end GPUs.

AMD is launching Polaris with two versions of the chip: Polaris 10 and Polaris 11. Polaris 10 will be used in the Radeon RX 480 and Radeon RX 470. In the Radeon RX 480 the GPU is available in its full configuration. The Graphics Command Processor works with two hardware schedulers (HWS) and four Asynchronous Compute Engines (ACE), which distribute the compute tasks. The 36 Compute Units (CU) contain 2,304 stream processors, that is, 64 stream processors per CU (36 x 64). Add to this 4 geometry processors and a pixel throughput of 32 pixels per clock. There are also 144 texture units, 2 MB of L2 cache, 576 32-bit load/store units and a 256-bit memory bus.

The Radeon RX 470 uses the same Polaris 10 GPU, but in cut-down form. Only 32 CUs are available, so the graphics card is based on 2,048 stream processors. The memory interface remains 256 bits wide, but AMD provides only a 4 GB memory configuration. The memory frequency is 1,750 MHz. Typical power consumption is 110 W, of which 85 W falls on the GPU.

The AMD Radeon RX 460 will use the Polaris 11 GPU. It relies on 14 CUs, which gives 896 stream processors. Note the 64 texture units and the 128-bit memory interface. Compared with Polaris 10, the L2 cache is trimmed to 1 MB and the number of 32-bit load/store units is reduced to 256. The 2 or 4 GB of GDDR5 memory will run at 1,750 MHz, giving a memory bandwidth of 112 GB/s. The graphics card's power consumption is 75 W, of which 48 W falls on the chip.
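
The stream processor counts quoted for the three cards follow directly from their CU counts. As a rough sketch, the arithmetic can be written out in Python. The 64 stream processors per CU come from the text; the peak FP32 figure assumes two floating-point operations per stream processor per clock (one fused multiply-add), which is a common convention and not a number given in this article.

```python
STREAM_PROCESSORS_PER_CU = 64  # from the article: 36 x 64 = 2,304


def stream_processors(compute_units):
    """Number of stream processors for a given number of Compute Units."""
    return compute_units * STREAM_PROCESSORS_PER_CU


def peak_fp32_tflops(compute_units, clock_mhz, flops_per_clock=2):
    """Theoretical peak in TFLOPS.

    flops_per_clock=2 is an assumption (one FMA per stream processor per clock).
    """
    return stream_processors(compute_units) * flops_per_clock * clock_mhz / 1e6


for name, cus in [("RX 480", 36), ("RX 470", 32), ("RX 460", 14)]:
    print(name, stream_processors(cus))

# RX 480 at the base and boost clocks quoted in the article
print(round(peak_fp32_tflops(36, 1120), 2), round(peak_fp32_tflops(36, 1266), 2))
```

Under that assumption the Radeon RX 480 lands between roughly 5.2 and 5.8 TFLOPS, depending on whether the base or the boost clock is held.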

In addition to updating the architecture, AMD has implemented support for several technologies, which we discuss in more detail below.

Primitive Discard Accelerator

The Primitive Discard Accelerator is designed to prevent excessive tessellation. Among other things, AMD is targeting certain GameWorks techniques that use tessellation in a way that benefits from features of NVIDIA's GPU architecture, which cannot be said of AMD graphics cards. At some tessellation levels, for example, the effect is no longer visible and performance is simply wasted.

The Primitive Discard Accelerator is a rendering filter that detects such excessive calculations and removes them. The freed-up compute performance also benefits MSAA anti-aliasing. To improve shader efficiency, AMD uses a new index cache for geometry, which keeps data available for longer. As a result, data has to be moved between cache and memory less often, and primitive throughput increases.

Another way to improve the efficiency of the architecture is improved instruction prefetch. It fills the rendering pipeline more effectively, so that there are fewer idle "bubbles". AMD has also enlarged or optimized caches and buffers. The L2 cache is particularly important here, as it increases the efficiency of the video memory.

The 4th generation of the GCN architecture has native FP16 and Int16 support, including native 16-bit registers and a 16-bit mapping. This reduces power consumption and lowers the footprint in video memory, memory bandwidth and registers, since FP16 calculations no longer occupy FP32 resources. FP16 calculations are important for graphics applications, photo and video processing, and deep learning.
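
The storage saving from 16-bit values can be illustrated on the CPU side. The following Python sketch only shows sizes and packing with the standard `struct` module (format code `e` is IEEE 754 half precision); it is an illustration and says nothing about how Polaris actually lays out its registers.

```python
import struct

fp32_size = struct.calcsize("<f")  # bytes for one single-precision value
fp16_size = struct.calcsize("<e")  # bytes for one half-precision value

# Two FP16 values occupy the same 32 bits as a single FP32 value.
packed = struct.pack("<2e", 1.5, -2.0)
unpacked = struct.unpack("<2e", packed)

print(fp32_size, fp16_size, len(packed), unpacked)
```

Both sample values are exactly representable in half precision, so the round trip is lossless here; values that need more precision or range would be rounded.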

The new memory controller of the Polaris GPU supports GDDR5 at speeds of up to 2,000 MHz. The memory bandwidth is 256 GB/s. Compared with its predecessor, the Radeon R9 380, AMD has increased the bandwidth by approximately 50 percent.
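
The bandwidth figures in this article can be reproduced from the bus width and the memory frequency. The sketch below assumes that the quoted GDDR5 frequency corresponds to four data transfers per pin per clock (so 2,000 MHz means 8 Gbit/s per pin); this is how such figures are usually quoted, but the article does not spell it out.

```python
TRANSFERS_PER_CLOCK = 4  # assumption: quoted GDDR5 clock x 4 = data rate per pin


def memory_bandwidth_gb_s(bus_width_bits, memory_clock_mhz):
    """Peak memory bandwidth in GB/s for a given bus width and quoted clock."""
    bits_per_second = bus_width_bits * memory_clock_mhz * 1e6 * TRANSFERS_PER_CLOCK
    return bits_per_second / 8 / 1e9


print(memory_bandwidth_gb_s(256, 2000))  # Radeon RX 480
print(memory_bandwidth_gb_s(128, 1750))  # Radeon RX 460
```

With these inputs the function gives 256 GB/s for the 256-bit bus at 2,000 MHz and 112 GB/s for the 128-bit bus at 1,750 MHz, matching the values stated for the Radeon RX 480 and RX 460.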

More important, however, is memory compression. Like NVIDIA, AMD has added new compression modes. Compression is particularly important with narrow interfaces. The 256-bit interface of the Radeon RX 480 looks very narrow next to the 512-bit interface of the Hawaii GPU or the 4,096-bit interfaces of AMD and NVIDIA cards with HBM or HBM2 memory. AMD therefore wants to compensate for the relatively narrow memory interface, at least to some extent, with delta color compression (DCC). Both AMD and NVIDIA have used delta color compression in their GPUs for several generations. NVIDIA, for example, has moved to the fourth generation of its compression technology. AMD introduced compression with the Tonga GPU on the Radeon R9 285. AMD has now once again refined the implementation of delta color compression in the GCN architecture. It is important to remember that the compression is lossless: no data is lost, and developers can use it without adapting their software.

With DCC, the full information is stored for a base pixel, while the surrounding pixels in an 8×8 matrix are stored as differences (deltas). Since neighboring pixels usually differ little in color, storing the difference takes less data than storing full color values. With delta compression the pixel information therefore occupies less space in memory, which also saves memory bandwidth. As an example, completely black and completely white blocks would be stored in memory as {1.0, 0.0, 0.0, 0.0} and {0.0, 1.0, 1.0, 1.0}. Resources can be saved by storing only 0.0 or 1.0 as the value.
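
The principle of storing a base pixel plus deltas can be sketched in a few lines of Python. This is only an illustration of the idea: the single 8-bit channel, the choice of the first pixel as the base and the bit-counting scheme are assumptions made for the example and do not describe AMD's hardware format.

```python
RAW_BITS_PER_PIXEL = 8  # assumption: one 8-bit channel per pixel


def compress_block(block):
    """Encode a flat list of pixels as a base value plus signed deltas.

    Returns (base, deltas, size_in_bits). Every delta is stored with the
    same number of bits, enough for the largest one plus a sign bit.
    """
    base = block[0]
    deltas = [pixel - base for pixel in block[1:]]
    largest = max(abs(d) for d in deltas)
    bits_per_delta = largest.bit_length() + 1 if largest else 0
    size_bits = RAW_BITS_PER_PIXEL + bits_per_delta * len(deltas)
    return base, deltas, size_bits


def decompress_block(base, deltas):
    """Rebuild the original pixels; the scheme is lossless."""
    return [base] + [base + d for d in deltas]


gradient = [100 + x + y for y in range(8) for x in range(8)]  # smooth 8x8 block
uniform = [255] * 64                                          # single-color block

for name, block in [("gradient", gradient), ("uniform", uniform)]:
    base, deltas, size_bits = compress_block(block)
    assert decompress_block(base, deltas) == block
    print(name, len(block) * RAW_BITS_PER_PIXEL, "->", size_bits, "bits")
```

A smooth gradient shrinks noticeably because its deltas are small, and a single-color block collapses to the base value alone, which mirrors the black and white block example in the text.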

To the already known compression ratios of 2:1 and 4:1 another has been added, namely 8:1, which provides even stronger compression. AMD states a performance increase of 38% with the new method. Of course, this value is theoretical. The effect of delta color compression is difficult to assess in practice: since it is performed in hardware, only AMD itself can disable it in the driver. So we have to make do with AMD's own tests.

Of course, the memory bandwidth of 384 or 320 GB/s on earlier cards has been cut to 256 GB/s on the new card, but with Delta Color Compression and a doubled cache AMD intends to deliver a significant increase in performance and efficiency. AMD has not only increased the size of the cache but also optimized its operation. Cache reads and writes are now faster, and the partitioning of different data areas has been accelerated. All these measures improve the energy efficiency of the memory subsystem and reinforce the advantages of DCC.

Asynchronous Compute

Thousands of stream processors offer enormous performance potential. But hardware and software developers need to consider how to use the available resources most efficiently for compute tasks. Everything becomes more complicated when GPU compute is added to the classic graphics tasks. Not every such mix of workloads can be executed simultaneously.

This is why asynchronous shaders were invented. They are designed to ease the interaction between engine, driver and hardware and to improve the distribution of tasks. With the first graphics cards based on the "Hawaii" GPU, AMD started talking about improving GPU compute through Asynchronous Compute Engines. However, these hardware blocks are present in all GPUs based on the GCN architecture. They are intended for GPU compute, but DirectX 12 games can use them as well.

When we look at a GPU architecture, we always describe the individual functional blocks and how they work. They can operate independently, but everything depends on the rendering process, whose operations can be represented as a linear queue. In all modern architectures a scheduler tries to optimize the execution of these operations in order to load the compute resources as fully as possible, but there are limits to what it can do.

With DirectX 12, developers have gained significantly more control over the compute units, including the ability to distribute tasks better. But the API imposes certain requirements. Modern games need to split the work into queues that can be executed independently of one another. For example, tasks are divided into graphics, compute and copy queues. The API, in this case DirectX 12, then assigns the queues to the available GPU resources.

The approach sounds simple, but in practice it is more complex, since the individual queues have to be distributed over the free GPU resources. AMD describes the rendering process using the example of road traffic at an intersection with traffic lights. The traffic light is the scheduler that distributes the work to the rendering engine. Here all road users have equal priority. They have to wait until a resource is free, which is not optimal scheduling.

One way to improve the process is prioritization, where certain tasks can be given priority. In the road traffic example, the light can change to let a queue with higher priority through, so that it is sent to rendering first. But this requires switching between the different queues, so efficiency does not always improve noticeably.

This is where AMD's Asynchronous Compute Engines (ACE) come in. They allow multiple command queues to be processed simultaneously and interleaved more efficiently in the rendering pipeline. In the road traffic example, ACE can be pictured as guiding each vehicle in a particular order and at the required intervals. If all the queues arrived at random, there would be collisions. The ACEs prevent them by slotting the commands into the queue. This method eliminates empty slots in the pipeline and makes better use of the potential performance. Certain processes can also be prioritized.

Quick Response Queue

The Quick Response Queue (QRQ) is part of Asynchronous Shaders, or at least directly associated with that technology. AMD first mentioned QRQ about a year ago, likewise in connection with Asynchronous Shaders. In addition, the Quick Response Queue works together with Asynchronous Time Warp, which we will cover later. The feature is designed to solve certain compute (and prioritization) problems that arise when a virtual reality headset is used.

AMD's first example shows the stream processors working in a simple pre-emption scenario. In this case graphics and other computations are executed one after another, depending on their position in the queue. The total runtime increases because resources are poorly allocated. The situation is different with Asynchronous Shaders. Here different tasks can be executed simultaneously by different stream processors. The total computation time is greatly reduced, but some priority tasks that need to finish quickly take longer because the stream processors are shared.

Using Asynchronous Shaders together with QRQ gives critical tasks a higher priority. The most important processes therefore receive more resources and finish sooner. As a result, the total computation time increases slightly, but important processes complete much faster than in the other scenarios.
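
The three scenarios can be compared with a toy model. The Python sketch below is not a description of AMD's scheduler: the task sizes, the number of execution units and the rule that each task can occupy only a limited number of units per step are all made-up assumptions, chosen so that a single task cannot fill the GPU on its own.

```python
# Each task: (name, total work, max units it can occupy per step, priority).
# All numbers are hypothetical and serve only as an illustration.
TASKS = [
    ("graphics", 80, 8, 0),
    ("timewarp", 20, 4, 1),  # latency-critical compute task
]
CAPACITY = 10  # hypothetical number of execution units


def simulate(tasks, capacity, mode):
    """Return {task name: step at which it finishes} for a scheduling mode."""
    remaining = {name: work for name, work, _, _ in tasks}
    finish = {}
    step = 0
    while len(finish) < len(tasks):
        step += 1
        pending = [t for t in tasks if t[0] not in finish]
        if mode == "serial":      # one queue, strictly in order
            pending = pending[:1]
        elif mode == "priority":  # asynchronous, high priority served first
            pending = sorted(pending, key=lambda t: t[3], reverse=True)
        # mode == "async": all pending tasks share the units in queue order
        free = capacity
        for name, _, width, _ in pending:
            used = min(width, free, remaining[name])
            remaining[name] -= used
            free -= used
            if remaining[name] == 0:
                finish[name] = step
    return finish


for mode in ("serial", "async", "priority"):
    print(mode, simulate(TASKS, CAPACITY, mode))
```

With these numbers the serial run finishes the critical task only at step 15, plain asynchronous execution finishes both tasks at step 10, and the prioritized run completes the critical task at step 5 while the total grows slightly to 12 steps, which is the qualitative behavior described above.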

AMD Radeon RX 480: The 14 nm FinFET process

An important feature of the new GPU generation from AMD and NVIDIA is the 14 or 16 nm FinFET process. After a very long time, AMD and NVIDIA have been able to work with their contract chip manufacturers to combine the transition to a new architecture with the transition to a new process. In the past this transition has not always gone smoothly. Perhaps that is why AMD and NVIDIA stayed with TSMC's 28 nm process for so long. The move to 14 nm FinFET should bring significant advantages in power consumption. Our tests of the GeForce GTX 1080 Founders Edition show clearly that the new process can deliver a significant efficiency advantage. In NVIDIA's case the potential was very good.

Even with an unchanged architecture, the transition to a new process is often very complicated. It is no coincidence that Intel adopted the "tick-tock" model, alternating between a new architecture and a new process. This reduces the risk of having to solve problems with a new architecture and a new process at the same time. One example is the difficulty NVIDIA ran into when moving to the Fermi architecture on a new process. Problems arose with the interconnect (fabric) on the GPU, which stopped working correctly at high frequencies, and NVIDIA had to rework the chip as a result. In AMD's case, the transition to the Polaris architecture and the new process was not accompanied by serious problems. Trial samples of the Polaris 10 and Polaris 11 chips were produced in November and December 2015. Today, eight months later, the graphics card has reached the market. Incidentally, AMD has apparently recently received the first samples of Vega 10, the high-end chip of the new generation.

NVIDIA continues to work with TSMC for GPU manufacturing and relies on its 16 nm FinFET process, but AMD and TSMC seem to have parted ways. AMD's partner for GPU production is now GlobalFoundries, which has a plant in the United States, in the state of New York. GlobalFoundries representatives took the stage at Polaris Tech Day. Whether Vega will also be manufactured at GlobalFoundries is still unknown.

AMD has noted on several occasions that efficiency and low power consumption were the focus in developing the Polaris architecture. AMD used the new process to pack more transistors into the same or a smaller area. Keep in mind that moving to smaller structures brings other problems, such as leakage currents, which prevent a doubling of efficiency from being achieved in practice.

Of course, manufacturers have worked hard for many years to offset these drawbacks. Modern GPUs no longer use the same voltage for all units. The die is divided into multi-voltage islands, regions with different voltages. This approach requires separate voltage regulators, which adds cost. But such measures make it possible to run different areas of the GPU at different frequencies or even to turn them off. All this helps to reduce overall power consumption and increase efficiency, but it does not directly extract more performance from the architecture.

Intel was the first to manufacture chips with FinFET technology, which adds a 3D structure to the planar transistor. The source, gate and drain are raised above the flat substrate, which gives better control over the current flowing through the transistor.

This approach may not seem logical at first, but FinFETs have significant advantages over planar transistors. The variation in characteristics during production is much smaller. With less variation, individual areas or even individual transistors can be controlled more accurately and efficiently.

AMD sees two advantages in the FinFET process: lower power consumption and potentially higher performance, which together should increase efficiency by a factor of 2.5.

Besides the efficiency gains, switching to a smaller process brings other benefits that are not obvious at first sight. These include a smaller PCB and a less powerful, less expensive power delivery subsystem. Finally, the GPU package can also be smaller, which again saves space on the PCB.

There is another effect that has not yet made itself felt: at least the smaller Polaris 11 GPU turns out much thinner. The package of the Bonaire GPU measured 0.29 x 0.29 inch (7.37 mm), whereas Polaris 11 measures only 0.245 x 0.245 inch (6.22 mm). More interesting still is the chip thickness, which has been reduced from 1.9 mm (Bonaire) to 1.5 mm (Polaris 11). The reason is a smaller number of layers in the package; in addition, substrate near the die is removed after lithography.

Adaptive Clocking

Like NVIDIA, AMD points to the difficulties of delivering power to the chip in the context of a smaller process.

Although a smaller process allows lower voltages, the power delivery subsystem still has to cope with fairly high currents. Voltage fluctuations of 10 to 15% are typical: with an operating voltage of 1 V, peaks can reach 1.15 V. To cover this spread, manufacturers have to raise the average voltage. This leads to higher power consumption that is entirely unnecessary. Adaptive Clocking addresses this problem, reducing the increase in power consumption by 25%.
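
Why a voltage margin is so costly can be estimated with the textbook approximation that dynamic power grows with the square of the supply voltage at a fixed clock frequency. That relation is an assumption brought in for this sketch, not a figure from AMD, and it ignores leakage.

```python
def relative_dynamic_power(voltage, nominal_voltage=1.0):
    """Dynamic power relative to nominal, assuming P ~ V^2 at fixed frequency."""
    return (voltage / nominal_voltage) ** 2


for margin in (0.10, 0.15):
    extra = relative_dynamic_power(1.0 + margin) - 1.0
    print(f"{margin:.0%} voltage margin -> about {extra:.0%} more dynamic power")
```

Under this assumption, running permanently at 1.15 V instead of 1 V would cost roughly a third more dynamic power, which shows how much there is to gain from trimming the margin.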

The power system responds to a voltage fluctuation as follows. Without Adaptive Clocking the voltage stays at the nominal level. With Adaptive Clocking, if the voltage rises above the desired level, a correction takes place with a delay of only nanoseconds. The voltage VDD is lowered for about a thousandth of a second, and the GPU frequency is reduced at the same time. The voltage and frequency then return to their operating levels.

Adaptive Voltage & Frequency Scaling (AVFS)

In addition to dynamically adjusting the GPU voltage, AMD has implemented some other functions in the Polaris architecture. One of them is Adaptive Voltage & Frequency Scaling, or AVFS.

AVFS describes a method that takes into account not only the voltage in different areas of the GPU but also temperature and frequency. Typically, a margin of 3.2 percent is built into the GPU frequency to cover aging of the GPU. With AVFS this margin is no longer required, because the GPU evaluates its own operation and autonomously adjusts its operating parameters via the P-states.

At the default clock frequencies, only the voltage that is actually required is applied, without excessive headroom. Adaptive Clocking constantly tries to reach a higher clock frequency (without, of course, exceeding the stated maximum), and AVFS tries to set the operating voltages for these frequencies without raising them excessively. In this way AMD delivers the highest performance while trying to achieve the best long-term lifespan for the components.

Boot Time Calibration (BTC) and Multi-Bit-Flip-Flops (MBFF)

The third important factor is Boot Time Calibration (BTC). AMD already uses this technology in the 7th generation of APUs, codenamed Bristol Ridge, which shows how far AMD has come in the joint development of APUs and GPUs.

The Boot Time Calibration procedure monitors the GPU's behavior during boot; as a rule, higher voltages may be applied under certain conditions. The Reliability Tracker also contributes, and VRM error is taken into account as well. The purpose of BTC is to reduce power consumption where possible and to allow more aggressive switching of P-states.

AMD has also tried to address the problem of chip aging. The techniques described above not only allow the chips to work more efficiently but also provide additional power when required.

Chip area and power consumption also depend on multi-bit flip-flops (MBFF), a technique in which smaller structures are combined into larger ones as needed. This saves both chip area and power. For the Polaris 10 GPU, AMD speaks of 21 million flip-flops, which account for 11.1 W of the total 85 W, or about 15% of total power consumption. With MBFF, AMD was able to save about 4-5% of the chip's total power consumption.

A document (PDF) by Chinese developers explains in detail where the savings from MBFF circuits come from. It pays particular attention to the chip area saved by MBFF.

On power consumption and efficiency, AMD paid special attention to the memory interface. Thanks to delta color compression and more efficient use of the L2 cache and registers, AMD was able to increase the effective performance of the memory interface despite the reduced bandwidth. Together with the memory chips, AMD was also able to reduce the power consumption of the memory interface by up to 58%, which has a positive effect on the efficiency of the graphics card as a whole.

For the Polaris architecture, AMD claims a 1.7-fold improvement in performance per watt from the transition to the 14 nm process alone. With the improvements to the architecture itself, the figure rises to 2.8 times. All these theoretical advantages are welcome, but we are unlikely to see the same level in practice.

GlobalFoundries

Representatives of GlobalFoundries spoke at Polaris Tech Day about the problems that had to be solved in manufacturing the Polaris GPUs. AMD and GlobalFoundries had to take the same approach as NVIDIA and TSMC in their cooperation. With the move to smaller processes, manufacturing and architecture have to be matched to each other even more closely.

GlobalFoundries used a technique called Design Technology Co-Optimization (DTCO). It alternates between optimizing the manufacturing process and optimizing the GPU architecture over several steps. After each step the results are compared with the production targets, and the necessary changes are then made to the manufacturing technology or the architecture. This process is repeated many times.

After the move to the smaller process, these steps were repeated many times over the past months. As a result, the 14 nm FinFET process has been optimized: it withstands higher drive currents, the transistors switch faster and require less voltage, and leakage currents have been reduced. All these effects were achieved through optimization.

According to GlobalFoundries, performance was improved by 55% at the same frequency, and power consumption was reduced by 65%.

AMD Radeon RX 480: TrueAudio Next and AFR Frame Pacing

TrueAudio Next:

With the Hawaii GPU generation, AMD added a dedicated digital signal processor (DSP) to the die, providing hardware audio processing for greater realism. But TrueAudio never reached the mass market: despite middleware support, it was implemented only in Thief. Even there, TrueAudio support was hardly the strongest argument for the game. With Polaris, AMD is making another attempt, this time with different goals.

The TrueAudio Next approach is quite similar to NVIDIA VRWorks Audio, although the two technologies were developed independently. TrueAudio Next is a form of ray tracing for audio: with many sound sources and reflective surfaces in the sound stage, it allows a high level of realism. The audio calculations benefit from Asynchronous Compute, because they can run at the same time as the graphics calculations.

When allocating compute resources, AMD does not rely solely on the hardware scheduler: Compute Unit Reservation makes it possible to dedicate a certain number of CUs to a particular process. For example, of the 36 CUs, 28 can be assigned to graphics and the remaining eight to audio calculations. Four CUs correspond roughly to the performance of a 4-core processor, which makes it possible to move some computing tasks onto the CUs.

Middleware for developers of game audio has already announced support for TrueAudio Next. It remains to be seen how many games will support the final version of the technology and what advantages it will bring. We will be able to answer this question in a few months.

As with NVIDIA, the architecture can calculate several viewports simultaneously. NVIDIA calls this technology Simultaneous Multi-Projection; AMD has chosen the name Variable Rate Shading. The technology is the same: with VR headsets, the scene must be calculated as seen from the viewpoints of the two eyes.

NVIDIA's Pascal architecture supports up to 16 viewports. AMD likewise supports up to 16, using the geometry shader, which allows 2 to 16 viewports to be calculated in a single pass. Variable Rate Shading is not new to the Polaris architecture; it has long been present in the GCN design, although it has not been used.

AFR Frame Pacing:

With Frame Pacing, AMD addresses a problem that often occurs in multi-GPU mode: frames take different amounts of time to render. Frame Pacing uses an algorithm that smooths out the difference. Initially the technology worked only in DirectX 11 games, and it has since gone through several phases of development. With the Polaris generation and new Radeon Software drivers, AMD adds Frame Pacing for DirectX 12 with CrossFire enabled.

But the current driver, with which we tested the Radeon RX 480, does not yet support the technology. The implementation should be similar to what we have seen in single-GPU mode: a delay is added to frames that are calculated too quickly, so that frames are displayed at uniform intervals.
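
The idea of delaying frames that arrive too early can be sketched as follows. This is a minimal Python illustration under stated assumptions, not AMD's algorithm: the function name, the fixed target interval and the sample timings are all hypothetical.

```python
def pace_frames(ready_times_ms, target_interval_ms):
    """Return presentation times so that frames are never shown closer
    together than target_interval_ms; early frames are held back."""
    presented = []
    for ready in ready_times_ms:
        if presented:
            earliest = presented[-1] + target_interval_ms
            presented.append(max(ready, earliest))
        else:
            presented.append(ready)
    return presented


# Two GPUs in alternate frame rendering: frames arrive in uneven pairs.
ready = [0, 2, 20, 22, 40, 42]
paced = pace_frames(ready, target_interval_ms=10)

print([b - a for a, b in zip(ready, ready[1:])])  # raw frame-to-frame gaps
print([b - a for a, b in zip(paced, paced[1:])])  # gaps after pacing
```

In this example the raw gaps alternate between 2 ms and 18 ms, while the paced frames are shown every 10 ms. The average frame rate is unchanged; only the delivery becomes even, at the cost of a small added delay for every second frame.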

DisplayPort 1.3 / 1.4, HDMI 2.0b and HDR

The first pieces of information AMD shared about the Polaris architecture included support for DisplayPort 1.3 / 1.4 and HDMI 2.0b outputs. Below we look at the special features and technical details of the display controller in the new Polaris generation.

It should be noted straight away that the Polaris architecture is compatible with DisplayPort 1.3 HBR3 and DisplayPort 1.4. Both standards offer significantly more bandwidth than the previous generation, which can be used not only for higher resolutions but also for other display features.

Gaming monitors (or their manufacturers) in particular need DisplayPort 1.3 for 4K displays with a resolution of 3840 x 2160 pixels at 120 Hz. FreeSync is supported as well; on such monitors FreeSync will operate between 30 and 120 Hz. The first models are expected in the fourth quarter of 2016.

The DisplayPort 1.3 standard also allows 5K monitors to be connected with a single cable. Until now this required two DisplayPort cables; otherwise the refresh rate could not exceed 30 Hz. With DisplayPort 1.3, a 5K resolution is possible at a refresh rate of 60 Hz. The corresponding monitors should reach the market by the end of the year.
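A rough back-of-the-envelope check shows why these modes need the faster link. The sketch below is an estimate under stated assumptions, not a timing calculation from the standard: it counts only visible pixels (blanking intervals are ignored), assumes 24 bits per pixel, and takes "5K" to mean 5120 x 2880. The per-lane rates are 5.4 Gbit/s for HBR2 (DisplayPort 1.2) and 8.1 Gbit/s for HBR3, with four lanes and 8b/10b encoding; the function names are hypothetical.

```python
HBR2 = 5.4  # Gbit/s per lane, DisplayPort 1.2
HBR3 = 8.1  # Gbit/s per lane, DisplayPort 1.3 / 1.4


def link_payload_gbps(lane_rate_gbps, lanes=4, encoding_efficiency=0.8):
    """Usable link bandwidth after 8b/10b encoding (8 data bits per 10 sent)."""
    return lane_rate_gbps * lanes * encoding_efficiency


def video_rate_gbps(width, height, refresh_hz, bits_per_pixel=24):
    """Uncompressed data rate of the visible pixels only (blanking ignored)."""
    return width * height * refresh_hz * bits_per_pixel / 1e9


def fits(width, height, refresh_hz, lane_rate_gbps, bits_per_pixel=24):
    """True if the mode's pixel data fits into the link's usable bandwidth."""
    needed = video_rate_gbps(width, height, refresh_hz, bits_per_pixel)
    return needed <= link_payload_gbps(lane_rate_gbps)


modes = {
    "4K at 120 Hz": (3840, 2160, 120),
    "5K at 60 Hz": (5120, 2880, 60),
    "5K at 30 Hz": (5120, 2880, 30),
}
for name, (w, h, hz) in modes.items():
    rate = video_rate_gbps(w, h, hz)
    print(f"{name}: {rate:.2f} Gbit/s, "
          f"HBR2: {fits(w, h, hz, HBR2)}, HBR3: {fits(w, h, hz, HBR3)}")
```

Under these assumptions, 5K at 60 Hz exceeds what a single HBR2 link can carry but fits into HBR3, which matches the single-cable claim above.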

Looking at the supported resolutions and refresh rates, you can see that HDR is supported at high resolutions and refresh rates: up to 240 Hz at 1920 x 1080 pixels, and up to 96 Hz on 4K monitors.

HDR (high dynamic range) significantly expands the color space of the monitor. Until now, monitor manufacturers have focused on resolution or refresh rate. HDR adds a third, no less important factor. The first HDR monitors, which will appear later this year, will support the P3 Digital Cinema color space (yellow triangle), which is considerably larger than the sRGB standard (blue triangle). In the longer term, an expansion to the Rec. 2020 color gamut (red triangle) is planned. Compared with sRGB, the color space doubles there, and monitors will display 75% of the colors that the human eye can see.

Of course, it may take some time before HDR displays dominate the market. The first monitors should appear at the end of the year, and their prices, as with any new technology, will be quite high. Another problem is the absence of standards. Each manufacturer can define its own HDR standard; the color space covered and the brightness levels are not yet standardized.

Brightness is the second important advantage of HDR, alongside gamut. Here too we should see significant improvements, but everything depends on how quickly the new standard spreads. The 10-bit ST 2084 standard should be sufficient for the next 20 years.

To support the new display standards, AMD had to modify the GPU's display pipeline. Polaris supports 10-bit and 12-bit HDR. The first implementations will use 10-bit HDR, but there are significant differences between HDR implementations; in particular, manufacturers have not yet reached a consensus on color gamut and brightness.

AMD designed the frame buffer and display controller to be programmable. The color gamut for a display can be reprogrammed without additional delay. The Radeon Photon SDK for HDR provides the appropriate adjustments.

The slide above shows the supported resolutions and refresh rates. With the DisplayPort 1.3 / 1.4 and HDMI 2.0b standards, AMD and NVIDIA are well ahead of the playback equipment (TVs and monitors). It will take some time before corresponding models reach the market.

AMD has paid attention not only to the quality of the displayed image but also to encoding and decoding functions. The Polaris architecture includes hardware encoding and decoding blocks covering HEVC, VP9, H.264, MPEG-4 Part 2 and VC-1 in their current versions, with support for up to 4K resolution and 120 fps.

AMD Radeon RX 480 inside

But enough theory; let us look at the graphics card built on the Polaris architecture. As usual, we check how the card behaves under real conditions, in a PC system under load. Here are the first tests.

With PowerTune on the Radeon R9 290X, AMD specified a maximum Boost frequency, but in reality the frequency could be lower. That is exactly what happened with the reference cooler of the Radeon R9 290X: it could not cope with cooling the card, so the maximum frequency was held only briefly. With the Radeon R9 Fury X, AMD sidestepped the problem with a liquid cooling system, which always kept the card below the temperature threshold. With the Radeon Fury Nano, we again had a situation in which the maximum GPU frequency of 1,000 MHz was reached only rarely.

Therefore, we always analyze the behavior of the graphics card and its frequency under varying load. AMD specifies a base frequency of 1,120 MHz, which can rise to 1,266 MHz in Boost mode, but that does not guarantee that we will see such a frequency in practice. In our tests, the GPU of the reference Radeon RX 480 often heated up to 87 °C. The peak frequencies we recorded were 1,225 and 1,266 MHz. As you can see, in practice the graphics card does not always run at its highest Boost frequency.

The GPU-Z screenshot confirms the above information about the AMD Radeon RX 480. Unfortunately, the GPU-Z utility is not able to distinguish between the base clock and the Boost frequency.

AMD Radeon RX 480 – impressions

Below we take a closer look at the Radeon RX 480 graphics card and its reference cooler.

Even from the first specifications you might have guessed that the graphics card would be compact; at least, this applies to the PCB. Unfortunately, the card does not switch off its fan at idle, although this feature is now available on many models. Both AMD and NVIDIA, however, have declined to stop the fan in their reference designs.

At first glance you notice the compact dimensions of the Radeon RX 480: the card is only 240 mm long. According to AMD, the basic design remains faithful to the principles established with the Radeon R9 Fury X. The cooler shroud is completely closed. A single radial fan at the rear of the card handles ventilation. The face of the shroud has a soft-touch coating.

As the rear view shows, the printed circuit board itself is noticeably shorter than the card's 240 mm. It is only 180 mm long, and the cooler extends about 60 mm beyond it. AMD chose not to use a backplate on the Radeon RX 480. Several AMD partners have announced cards based on the reference design but with a backplate. The rest of the back of the PCB holds nothing of interest. You can see contact pads for additional components, but their purpose is difficult to determine.

The radial fan of the Radeon RX 480 has a diameter of 60 mm. AMD, like NVIDIA, continues to use a blower. It draws air in from above, around the fan axis, and then pushes it through the heatsink. Most of the hot air is expelled from the PC case through the slot bracket. The good news here, of course, is that hot air does not accumulate inside the case.

On the edge of the card, next to the slot bracket, you can see the updated Radeon logo in a new font. On the front of the card, the soft-touch coating is a pleasant detail. The plate can, however, be removed by undoing four screws.

AMD specifies the typical power consumption of the Radeon RX 480 at 150 W. A single additional 6-pin power connector is therefore enough: it supplies up to 75 W, and another 75 W can be drawn through the PCI Express slot. Note that the GPU itself consumes up to 85 W, with the remaining 65 W going to other components such as memory. AMD also leaves some power headroom for overclocking: the 85 W mentioned can be raised by 50%. That gives up to 127.5 W for the chip, which leaves 22.5 W for memory and other components, assuming, of course, the 150 W level.
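The power arithmetic can be restated as a short calculation. This is just the figures quoted in the text put into code; the function and key names are hypothetical, and the 150 W board limit is the assumption the text itself makes.

```python
def power_budget(slot_w=75.0, six_pin_w=75.0, gpu_w=85.0, headroom=0.50):
    """Split the board power between the GPU and everything else.

    All defaults are the figures quoted in the review; the 50% headroom
    is applied to the GPU's 85 W only.
    """
    board_limit_w = slot_w + six_pin_w
    gpu_max_w = gpu_w * (1 + headroom)
    return {
        "board_limit_w": board_limit_w,
        "other_at_stock_w": board_limit_w - gpu_w,
        "gpu_max_w": gpu_max_w,
        "other_at_max_w": board_limit_w - gpu_max_w,
    }


for key, value in power_budget().items():
    print(f"{key}: {value} W")
```

At stock settings 65 W remain for memory and other components; with the full 50% headroom applied to the GPU, only 22.5 W remain if the card is to stay within 150 W.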

The new Radeon logo with the new font can also be seen on the front of the card. Unlike on the Fury cards, it is not illuminated on the Radeon RX 480. Overall, the card looks very dark.

The slot bracket has a set of vents through which hot air is expelled from the PC case. Only a small part of the heat remains inside the case. The video outputs are three DisplayPort 1.3 / 1.4 and one HDMI 2.0b. AMD has dropped dual-link DVI completely, although AMD's partners will probably continue to use the old interface.

The radial fan draws some of its fresh air through openings on the back of the card; here AMD took advantage of the cooler overhanging the PCB. Of course, AMD could have made the graphics card even more compact.

We continue our look at the reference version of the Radeon RX 480:

On the back of the PCB, next to the slot bracket, you can see contacts that are probably intended for a Crossfire bridge. But AMD enables Crossfire on Radeon RX 480 cards without an additional bridge, as we have already seen with the Hawaii, Tonga and Fiji GPUs. Apparently AMD decided to play it safe and did not remove the pads. They were probably left over from testing during the design's development stage.

The sticker on the back of the card identifies the model as a Radeon RX 480 with 8 GB of video memory. All memory chips are on the front of the card; there are none on the back. In the center of the image you can see pads for installing a second VRM controller.

It is interesting to compare the sizes of the Radeon RX 480 (top) and the Radeon Fury Nano (bottom). The new Polaris card is relatively small, of course, but the Radeon Fury Nano clearly proves that even more compact dimensions are possible with a high-end GPU and High Bandwidth Memory. The Radeon Fury Nano remains the most powerful graphics card for its size.

Without the cooler, the graphics card is even more compact. As you can see, in the reference design AMD has placed the power delivery components between the GPU and the slot bracket. The reason is the compact size. On longer graphics cards, the power subsystem is usually moved to the far third of the PCB.

The GPU die itself gives no further details about the chip; the information is engraved on the frame. In the upper left corner we found only the production date. The chip is called Polaris 10 XT; on the upcoming Radeon RX 470 we will probably get Polaris 10 PRO.

AMD used Samsung memory chips running at 2,000 MHz. They are marked K4G80325FB-AC25. The chips have a capacity of 8 Gb each, organized as 32 x 256 Mbit. The FBGA-170 package has 170 contacts on its underside, and the operating voltage ranges from 1.305 to 1.597 V. According to the specifications, the chips can run at 2,000 MHz with a cycle time of 0.25 ns.

Thus, our Radeon RX 480 sample has eight 1 GB chips, for a total of 8 GB of video memory. But, as mentioned above, a version of the card with 4 GB of memory is also being released. It is unknown whether AMD will halve the number of chips or use chips with half the capacity.
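The capacity figures follow from simple unit conversion, sketched below. The helper name is hypothetical, the 32 x 256 organization is read here as 32 x 256 Mbit, and the two 4 GB configurations are only the possibilities raised in the text, not confirmed designs.

```python
def total_capacity_gb(chip_count, chip_density_gbit):
    """Total memory in gigabytes for a number of identical chips (8 bits per byte)."""
    return chip_count * chip_density_gbit / 8


# Chip organization as read from the markings: 32 x 256 Mbit = 8192 Mbit = 8 Gbit.
chip_density_gbit = 32 * 256 / 1024

print(total_capacity_gb(8, chip_density_gbit))  # the tested 8 GB sample

# Two hypothetical ways to build the 4 GB version:
print(total_capacity_gb(4, 8))  # half as many 8 Gbit chips
print(total_capacity_gb(8, 4))  # the same number of 4 Gbit chips
```

Both hypothetical layouts reach the same 4 GB total, which is why the capacity alone does not reveal which one AMD will choose.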

Power for the Radeon RX 480 is supplied by six phases, which seems a considerable number for a card of this class. However, what matters is not the number of phases but their quality. We will certainly see alternative designs with more or fewer phases.

A cooler for dissipating 150 W hardly needs to be complicated. For the Radeon RX 480, AMD chose a copper base that is coated outside the GPU contact area. The heatsink is made of aluminum and sits behind the copper base plate. On the underside, the cooler uses an additional metal plate that draws heat away from the memory chips and the power delivery components.

Another picture shows the aluminum heatsink located behind the copper base plate. For the memory chips and other components, AMD used a thermal interface material. The GPU contact area is supplied with a sufficient amount of thermal paste.

AMD Radeon RX 480: Conclusion

With the Radeon RX 480, AMD does not claim the performance crown. This should be kept in mind when studying the test results. Instead, the card targets the best price/performance ratio in the range of 16,000 to 20,000 rubles (200 to 250 euros). But even here, AMD's advertising claims of recent months must be distinguished from reality.

Recall that AMD loudly announced a price of $199 for the Radeon RX 480. It soon became clear that this applies to the card with 4 GB of graphics memory. So that games are not limited by available memory, a card with 8 GB is the recommended choice today. AMD itself sent us a sample with 8 GB of memory. We plan to test cards with 4 GB of GDDR5 soon to compare the two versions. But back to the price: 255.85 euros or 18,970 rubles cannot be called low. The Radeon RX 480 with 4 GB of memory comes much closer to the promised mark at 214.20 euros or 16,310 rubles. We will draw conclusions on that version after additional tests.

With the Polaris architecture, AMD has moved to a new 14 nm manufacturing process. It allows more transistors to be packed into the same area. Moreover, the chips can operate at a lower voltage and a higher frequency. In terms of frequency, of course, AMD's leap is not as large as the one we saw from NVIDIA. The same applies to architecture and design. With Pascal, NVIDIA bet on high Boost frequencies, while AMD chose to compete with a large number of stream processors, which makes for a more “general” architecture. We see all of this clearly in Polaris.

AMD tried to squeeze the maximum out of the Polaris architecture. Along with the change of process technology, we were supposed to get substantially higher energy efficiency. But such a comparison needs the right baseline. We rarely see performance gains of 50-70% when moving to a new generation. Compared with the Radeon R9 380 (which the name hints at), the Radeon RX 480 gains between 40 and 60 percent, depending on the test and settings. So the Radeon RX 480 provides a significant performance boost.

Matters are less simple with the promised energy efficiency. The Radeon RX 480, together with the whole system, consumes about 20 W more than the Radeon R9 380, which puts it roughly on the level of the GeForce GTX 1070 Founders Edition, a card far ahead of AMD's in the tests. Frankly, we expected more.

AMD also stressed that the Radeon RX 480 provides enough performance to work with modern VR glasses. In terms of performance, the Radeon RX 480 is ahead of the Radeon R9 380 and competes with the GeForce GTX 970, so the minimum requirement is indeed met.

DirectX 12 can be called AMD's second major topic, and here the American manufacturer has kept its promises. But DirectX 12 performance depends heavily on the implementation. In Ashes of the Singularity we see a significant performance increase; in Rise of the Tomb Raider the advantage shows only with certain combinations of graphics card and CPU. Game developers still have hard work ahead of them, because so far there are not many reasons to switch to the DirectX 12 API.

What was said above about DirectX 12 also holds for Async Compute. If support is implemented well, the AMD graphics card benefits. NVIDIA's Maxwell architecture did not support Asynchronous Shaders or Async Compute, so those cards gain nothing from the technology. With the new Pascal architecture things are different, as the test results of the GeForce GTX 1080 Founders Edition show.

Cooling is less pleasing. Frankly, the card could run a little quieter both at idle and under load, especially since AMD had unused potential available. In any case, it will be interesting to look at cards with alternative cooling systems. It was rather strange to see our test sample heat up to 87 °C under load, at least before we disassembled the card. We decided to replace the thermal paste, spread it evenly, then reassembled the cooler and repeated the tests. After that, the GPU temperature did not rise above 84 °C.

In general, as the tests of the Radeon RX 480 show, AMD is on the right track with the Polaris architecture. Yet the card lacks a little “spice.” The performance could be slightly higher or the price slightly lower; the Radeon RX 480 cannot yet boast either. A verdict is therefore difficult, although AMD has done good work. The advantages include 8 GB of video memory and support for the DisplayPort 1.3 / 1.4 and HDMI 2.0b standards. But the relatively high noise level and GPU temperatures somewhat tarnish the overall impression.

Our conclusion may seem negative, but the fact is that expectations of AMD were very high. The Radeon RX 480 promised to be an excellent graphics card for its price, no more and no less. Unfortunately, we did not get a revolution. The price gap to the GeForce GTX 970 and to AMD's own predecessors is small. Still, for 250 euros or 18,000 to 19,000 rubles you get a decent graphics card for 1080p or 1440p resolutions that can cope with future games.

Benefits of AMD Radeon RX 480:

  • Performance increase of 40-60% compared to the Radeon R9 380
  • New Polaris architecture
  • 14 nm FinFET manufacturing process
  • 8 GB of video memory
  • DisplayPort 1.3 / 1.4
  • HDMI 2.0b

Disadvantages of AMD Radeon RX 480:

  • Too loud at idle and under load
  • High temperatures under load

Source: videocardz

Test and review: AMD Radeon RX 480 – new mass-market leader was last modified: July 20th, 2016 by Tomas Shellby