AMD Radeon Instinct™ MI60 and MI50 accelerators with supercharged compute performance, high-speed connectivity, fast memory bandwidth and updated ROCm open software platform power the most demanding deep learning, HPC, cloud and rendering applications
SAN FRANCISCO, Nov. 06, 2018 (GLOBE NEWSWIRE) -- AMD (NASDAQ: AMD) today announced the AMD Radeon Instinct™ MI60 and MI50 accelerators, the world’s first 7nm datacenter GPUs, designed to deliver the compute performance required for next-generation deep learning, HPC, cloud computing and rendering applications. Researchers, scientists and developers will use AMD Radeon Instinct™ accelerators to solve tough and interesting challenges, including large-scale simulations, climate change, computational biology, disease prevention and more.
“Legacy GPU architectures limit IT managers from effectively addressing the constantly evolving demands of processing and analyzing huge datasets for modern cloud datacenter workloads,” said David Wang, senior vice president of engineering, Radeon Technologies Group at AMD. “Combining world-class performance and a flexible architecture with a robust software platform and the industry’s leading-edge ROCm open software ecosystem, the new AMD Radeon Instinct™ accelerators provide the critical components needed to solve the most difficult cloud computing challenges today and into the future.”
The AMD Radeon Instinct™ MI60 and MI50 accelerators feature flexible mixed-precision capabilities, powered by high-performance compute units that expand the types of workloads these accelerators can address, including a range of HPC and deep learning applications. The new AMD Radeon Instinct™ MI60 and MI50 accelerators were designed to efficiently process workloads such as rapidly training complex neural networks, delivering higher levels of floating-point performance, greater efficiencies and new features for datacenter and departmental deployments [1].
The AMD Radeon Instinct™ MI60 and MI50 accelerators provide ultra-fast floating-point performance and hyper-fast HBM2 (second-generation High-Bandwidth Memory) with up to 1 TB/s memory bandwidth speeds. They are also the first GPUs capable of supporting next-generation PCIe® 4.0 [2] interconnect, which is up to 2X faster than other x86 CPU-to-GPU interconnect technologies [3], and feature AMD Infinity Fabric™ Link GPU interconnect technology that enables GPU-to-GPU communications that are up to 6X faster than PCIe® Gen 3 interconnect speeds [4].
AMD also announced a new version of the ROCm open software platform for accelerated computing that supports the architectural features of the new accelerators, including optimized deep learning operations (DLOPS) and the AMD Infinity Fabric™ Link GPU interconnect technology. Designed for scale, ROCm allows customers to deploy high-performance, energy-efficient heterogeneous computing systems in an open environment.
“Google believes that open source is good for everyone,” said Rajat Monga, engineering director, TensorFlow, Google. “We've seen how helpful it can be to open source machine learning technology, and we’re glad to see AMD embracing it. With the ROCm open software platform, TensorFlow users will benefit from GPU acceleration and a more robust open source machine learning ecosystem.”
Key features of the AMD Radeon Instinct™ MI60 and MI50 accelerators include:
- Optimized Deep Learning Operations: Provides flexible mixed-precision FP16, FP32 and INT4/INT8 capabilities to meet growing demand for dynamic and ever-changing workloads, from training complex neural networks to running inference against those trained networks.
- World’s Fastest Double Precision PCIe® [2] Accelerator [5]: The AMD Radeon Instinct™ MI60 is the world’s fastest double precision PCIe 4.0 capable accelerator, delivering up to 7.4 TFLOPS peak FP64 performance [5], allowing scientists and researchers to more efficiently process HPC applications across a range of industries including life sciences, energy, finance, automotive, aerospace, academics, government, defense and more. The AMD Radeon Instinct™ MI50 delivers up to 6.7 TFLOPS FP64 peak performance [1], while providing an efficient, cost-effective solution for a variety of deep learning workloads, as well as enabling high reuse in Virtual Desktop Infrastructure (VDI), Desktop-as-a-Service (DaaS) and cloud environments.
- Up to 6X Faster Data Transfer: Two Infinity Fabric™ Links per GPU deliver up to 200 GB/s of peer-to-peer bandwidth – up to 6X faster than PCIe 3.0 alone [4] – and enable the connection of up to 4 GPUs in a hive ring configuration (2 hives in 8 GPU servers).
- Ultra-Fast HBM2 Memory: The AMD Radeon Instinct™ MI60 provides 32GB of HBM2 Error-correcting code (ECC) memory [6], and the Radeon Instinct™ MI50 provides 16GB of HBM2 ECC memory. Both GPUs provide full-chip ECC and Reliability, Availability and Serviceability (RAS) [7] technologies, which are critical to deliver more accurate compute results for large-scale HPC deployments.
- Secure Virtualized Workload Support: AMD MxGPU Technology, the industry’s only hardware-based GPU virtualization solution, which is based on the industry-standard SR-IOV (Single Root I/O Virtualization) technology, makes it difficult for hackers to attack at the hardware level, helping provide security for virtualized cloud deployments.
Updated ROCm Open Software Platform
AMD today also announced a new version of its ROCm open software platform designed to speed development of high-performance, energy-efficient heterogeneous computing systems. In addition to support for the new Radeon Instinct™ accelerators, ROCm software version 2.0 provides updated math libraries for the new DLOPS; support for 64-bit Linux operating systems including CentOS, RHEL and Ubuntu; optimizations of existing components; and support for the latest versions of the most popular deep learning frameworks, including TensorFlow 1.11, PyTorch (Caffe2) and others. Learn more about ROCm 2.0 software here.
Availability
The AMD Radeon Instinct™ MI60 accelerator is expected to ship to datacenter customers by the end of 2018. The AMD Radeon Instinct™ MI50 accelerator is expected to begin shipping to datacenter customers by the end of Q1 2019. The ROCm 2.0 open software platform is expected to be available by the end of 2018.
Supporting Resources
- Visit the AMD Next Horizon event webpage to get the event materials
- Learn more about AMD Radeon Instinct™ MI60 and MI50 accelerators
- Learn more about AMD 7nm technology here
- Learn more about the ROCm 2.0 open software platform here
- Learn more about ROCm & MIOpen Docker Hub here
- Become a fan of AMD on Facebook
- Follow AMD Radeon Instinct on Twitter
About AMD
For more than 45 years AMD has driven innovation in high-performance computing, graphics and visualization technologies – the building blocks for gaming, immersive platforms and the datacenter. Hundreds of millions of consumers, leading Fortune 500 businesses and cutting-edge scientific research facilities around the world rely on AMD technology daily to improve how they live, work and play. AMD employees around the world are focused on building great products that push the boundaries of what is possible. For more information about how AMD is enabling today and inspiring tomorrow, visit the AMD (NASDAQ: AMD) website, blog, Facebook and Twitter pages.
Cautionary Statement
This press release contains forward-looking statements concerning Advanced Micro Devices, Inc. (AMD) including the features, functionality, availability, timing and expected benefits of the AMD Radeon Instinct™ MI60 and MI50 accelerators and the ROCm 2.0 open software platform, which are made pursuant to the Safe Harbor provisions of the Private Securities Litigation Reform Act of 1995. Forward-looking statements are commonly identified by words such as "would," "intends," "believes," "expects," "may," "will," "should," "seeks," "intends," "plans," "pro forma," "estimates," "anticipates," or the negative of these words and phrases, other variations of these words and phrases or comparable terminology. Investors are cautioned that the forward-looking statements in this document are based on current beliefs, assumptions and expectations, speak only as of the date of this document and involve risks and uncertainties that could cause actual results to differ materially from current expectations. Such statements are subject to certain known and unknown risks and uncertainties, many of which are difficult to predict and generally beyond AMD's control, that could cause actual results and other future events to differ materially from those expressed in, or implied or projected by, the forward-looking information and statements. Material factors that could cause actual results to differ materially from current expectations include, without limitation, the following: Intel Corporation’s dominance of the microprocessor market and its aggressive business practices may limit AMD’s ability to compete effectively; AMD has a wafer supply agreement with GF with obligations to purchase all of its microprocessor and APU product requirements, and a certain portion of its GPU product requirements, from GLOBALFOUNDRIES Inc. (GF) with limited exceptions. 
If GF is not able to satisfy AMD’s manufacturing requirements, its business could be adversely impacted; AMD relies on third parties to manufacture its products, and if they are unable to do so on a timely basis in sufficient quantities and using competitive technologies, AMD’s business could be materially adversely affected; failure to achieve expected manufacturing yields for AMD’s products could negatively impact its financial results; the success of AMD’s business is dependent upon its ability to introduce products on a timely basis with features and performance levels that provide value to its customers while supporting and coinciding with significant industry transitions; if AMD cannot generate sufficient revenue and operating cash flow or obtain external financing, it may face a cash shortfall and be unable to make all of its planned investments in research and development or other strategic investments; the loss of a significant customer may have a material adverse effect on AMD; AMD’s receipt of revenue from its semi-custom SoC products is dependent upon its technology being designed into third-party products and the success of those products; AMD products may be subject to security vulnerabilities that could have a material adverse effect on AMD; data breaches and cyber-attacks could compromise AMD’s intellectual property or other sensitive information, be costly to remediate and cause significant damage to its business and reputation; AMD’s operating results are subject to quarterly and seasonal sales patterns; global economic uncertainty may adversely impact AMD’s business and operating results; AMD may not be able to generate sufficient cash to service its debt obligations or meet its working capital requirements; AMD has a large amount of indebtedness which could adversely affect its financial position and prevent it from implementing its strategy or fulfilling its contractual obligations; the agreements governing AMD’s notes and the Secured Revolving Line of Credit impose restrictions on AMD that may adversely affect its ability to operate its business; the markets in which AMD’s products are sold are highly competitive; AMD's issuance to West Coast Hitech L.P. (WCH) of warrants to purchase 75 million shares of its common stock, if and when exercised, will dilute the ownership interests of its existing stockholders, and the conversion of the 2.125% Convertible Senior Notes due 2026 may dilute the ownership interest of its existing stockholders, or may otherwise depress the price of its common stock; uncertainties involving the ordering and shipment of AMD’s products could materially adversely affect it; the demand for AMD’s products depends in part on the market conditions in the industries into which they are sold. Fluctuations in demand for AMD’s products or a market decline in any of these industries could have a material adverse effect on its results of operations; AMD’s ability to design and introduce new products in a timely manner is dependent upon third-party intellectual property; AMD depends on third-party companies for the design, manufacture and supply of motherboards, software and other computer platform components to support its business; if AMD loses Microsoft Corporation’s support for its products or other software vendors do not design and develop software to run on AMD’s products, its ability to sell its products could be materially adversely affected; and AMD’s reliance on third-party distributors and AIB partners subjects it to certain risks. Investors are urged to review in detail the risks and uncertainties in AMD's Securities and Exchange Commission filings, including but not limited to AMD's Quarterly Report on Form 10-Q for the quarter ended September 29, 2018.
©2018 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, Radeon, Instinct and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
_____________________________
1 As of Oct 22, 2018. The results calculated for Radeon Instinct MI60 designed with Vega 7nm FinFET process technology resulted in 29.5 TFLOPS half precision (FP16), 14.8 TFLOPS single precision (FP32) and 7.4 TFLOPS double precision (FP64) peak theoretical floating-point performance. This performance increase is achieved with an improved transistor count of 13.2 billion on a smaller die size of 331.46 mm² than previous Gen MI25 GPU products with the same 300W power envelope.
The results calculated for Radeon Instinct MI50 designed with Vega 7nm FinFET process technology resulted in 26.8 TFLOPS peak half precision (FP16), 13.4 TFLOPS peak single precision (FP32) and 6.7 TFLOPS peak double precision (FP64) floating-point performance. This performance increase is achieved with an improved transistor count of 13.2 billion on a smaller die size of 331.46 mm² than previous Gen MI25 GPU products with the same 300W power envelope.
The results calculated for Radeon Instinct MI25 GPU based on the “Vega10” architecture resulted in 24.6 TFLOPS peak half precision (FP16), 12.3 TFLOPS peak single precision (FP32) and 768 GFLOPS peak double precision (FP64) floating-point performance. This performance is achieved with a transistor count of 12.5 billion on a die size of 494.8 mm² with 300W power envelope.
AMD TFLOPS calculations conducted with the following equation for Radeon Instinct MI25, MI50, and MI60 GPUs: FLOPS calculations are performed by taking the engine clock from the highest DPM state and multiplying it by xx CUs per GPU. Then, multiplying that number by xx stream processors, which exist in each CU. Then, that number is multiplied by 2 FLOPS per clock for FP32 and 4 FLOPS per clock for FP16. To calculate FP64 TFLOPS rate for Vega 7nm products MI50 and MI60 a 1/2 rate is used and for “Vega10” architecture based MI25 a 1/16th rate is used.
TFLOP calculations for MI50 and MI60 GPUs can be found at https://www.amd.com/en/products/professional-graphics/instinct-mi50 and https://www.amd.com/en/products/professional-graphics/instinct-mi60
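The peak-FLOPS equation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not AMD's tooling: the function name is hypothetical, and the example inputs (a 1500 MHz engine clock, 64 CUs and 64 stream processors per CU for the MI25) are assumptions that do not appear in this release. They are used because they reproduce the MI25 figures quoted in this footnote.

```python
def peak_tflops(engine_clock_mhz, compute_units,
                stream_processors_per_cu=64, fp64_rate=0.5):
    """Peak theoretical TFLOPS following the equation in footnote 1.

    clock * CUs * stream processors per CU, times 2 FLOPS per clock
    for FP32 and 4 FLOPS per clock for FP16; FP64 is a fraction of
    the FP32 rate (1/2 for Vega 7nm parts, 1/16 for "Vega10").
    """
    lanes = compute_units * stream_processors_per_cu
    clock_hz = engine_clock_mhz * 1e6
    fp32 = clock_hz * lanes * 2 / 1e12
    fp16 = clock_hz * lanes * 4 / 1e12
    fp64 = fp32 * fp64_rate
    return {"FP16": fp16, "FP32": fp32, "FP64": fp64}


# Assumed MI25 inputs (not stated in the release): 1500 MHz, 64 CUs.
mi25 = peak_tflops(1500, 64, fp64_rate=1 / 16)
for precision, value in mi25.items():
    print(f"MI25 {precision}: {value:.3f} TFLOPS")
```

Under those assumed inputs the sketch yields roughly 24.6 TFLOPS FP16, 12.3 TFLOPS FP32 and 0.768 TFLOPS (768 GFLOPS) FP64, in line with the MI25 results listed above.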
TFLOPS per Watt | MI25 | MI50 | MI60 |
FP16 | 0.082 | 0.089 | 0.098 |
FP32 | 0.041 | 0.045 | 0.049 |
FP64 | 0.003 | 0.022 | 0.025 |
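The per-watt figures in the table appear to be the peak results quoted in this footnote divided by the 300W power envelope, which gives values in TFLOPS per watt. The following sketch recomputes them under that assumption; the variable and function names are illustrative and not taken from AMD.

```python
BOARD_POWER_W = 300  # power envelope quoted in footnote 1

# Peak theoretical TFLOPS as quoted in footnote 1 (MI25 FP64 is 768 GFLOPS).
PEAK_TFLOPS = {
    "MI25": {"FP16": 24.6, "FP32": 12.3, "FP64": 0.768},
    "MI50": {"FP16": 26.8, "FP32": 13.4, "FP64": 6.7},
    "MI60": {"FP16": 29.5, "FP32": 14.8, "FP64": 7.4},
}


def tflops_per_watt(peak_tflops, board_power_w=BOARD_POWER_W):
    """Peak TFLOPS divided by board power, rounded to three decimals."""
    return round(peak_tflops / board_power_w, 3)


per_watt = {
    gpu: {precision: tflops_per_watt(value) for precision, value in rates.items()}
    for gpu, rates in PEAK_TFLOPS.items()
}

for gpu, rates in per_watt.items():
    print(gpu, rates)
```

Each computed value matches the corresponding table entry, for example 29.5 / 300 rounds to 0.098 for MI60 FP16.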
Industry supporting documents / web pages:
http://www.tsmc.com/english/dedicatedFoundry/technology/7nm.htm
https://www.globalfoundries.com/sites/default/files/product-briefs/product-brief-7lp-7nm-finfet-technology.pdf
AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-2
2 Pending
3 As of October 22, 2018. Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology-based accelerators are PCIe Gen 4.0 capable providing up to 64 GB/s Peak bandwidth per GPU card with PCIe Gen 4.0 x16 certified servers. Peak theoretical transport rate performance guidelines are estimated only and may vary. Previous Gen Radeon Instinct compute GPU cards are based on PCIe Gen 3.0 providing up to 32 GB/s peak theoretical transport rate bandwidth performance.
Peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions = GB/s
PCIe Gen 3: 8 * 2 * 2 = 32 GB/s
PCIe Gen 4: 16 * 2 * 2 = 64 GB/s
Refer to server manufacturer PCIe Gen 4.0 compatibility and performance guidelines for potential peak performance of the specified server models. Server manufacturers may vary configuration offerings yielding different results.
https://pcisig.com/
https://www.chipestimate.com/PCI-Express-Gen-4-a-Big-Pipe-for-Big-Data/Cadence/Technical-Article/2014/04/15
https://www.tomshardware.com/news/pcie-4.0-power-speed-express,32525.html
AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-5
4 As of Oct 22, 2018. Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology based accelerators are PCIe® Gen 4.0* capable providing up to 64 GB/s peak theoretical transport data bandwidth from CPU to GPU per card with PCIe Gen 4.0 x16 certified servers.
Previous Gen Radeon Instinct compute GPU cards are based on PCIe Gen 3.0 providing up to 32 GB/s peak theoretical transport rate bandwidth performance.
Peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions = GB/s per card
PCIe Gen3: 8 * 2 * 2 = 32 GB/s
PCIe Gen4: 16 * 2 * 2 = 64 GB/s
Vega20 to Vega20 xGMI = 25 * 2 * 2 = 100 GB/s * 2 links per GPU = 200 GB/s
xGMI (also known as Infinity Fabric Link) vs. PCIe Gen3: 200/32 = 6.25x
Radeon Instinct™ MI50 and MI60 “Vega 7nm” technology-based accelerators include dual Infinity Fabric™ Links providing up to 200 GB/s peak theoretical GPU to GPU or Peer-to-Peer (P2P) transport rate bandwidth performance per GPU card. Combined with PCIe Gen 4 compatibility, this provides an aggregate GPU card I/O peak bandwidth of up to 264 GB/s.
Performance guidelines are estimated only and may vary. Previous Gen Radeon Instinct compute GPU cards provide up to 32 GB/s peak PCIe Gen 3.0 bandwidth performance.
Infinity Fabric™ Link technology peak theoretical transport rate performance is calculated by Baud Rate * width in bytes * # directions * # links = GB/s per card
Infinity Fabric Link: 25 * 2 * 2 = 100 GB/s
MI50 and MI60 each have two links:
100 GB/s * 2 links per GPU = 200 GB/s
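The bandwidth arithmetic in this footnote can be checked with a short Python sketch. All inputs (rates of 8, 16 and 25, a width of 2 bytes, 2 directions and 2 links) are the ones stated above; the function and variable names are illustrative only.

```python
def peak_bandwidth_gbs(rate, width_bytes, directions=2, links=1):
    """Peak theoretical transport rate in GB/s.

    Follows the footnote's formula:
    Baud Rate * width in bytes * # directions * # links.
    """
    return rate * width_bytes * directions * links


pcie_gen3 = peak_bandwidth_gbs(8, 2)                   # 32 GB/s
pcie_gen4 = peak_bandwidth_gbs(16, 2)                  # 64 GB/s
if_link_single = peak_bandwidth_gbs(25, 2)             # 100 GB/s per link
if_link_per_gpu = peak_bandwidth_gbs(25, 2, links=2)   # 200 GB/s per GPU

# Ratio behind the "up to 6X faster" claim, and aggregate card I/O.
speedup_vs_gen3 = if_link_per_gpu / pcie_gen3          # 6.25
aggregate_io = if_link_per_gpu + pcie_gen4             # 264 GB/s

print(pcie_gen3, pcie_gen4, if_link_single, if_link_per_gpu)
print(speedup_vs_gen3, aggregate_io)
```

The 6.25x ratio and the 264 GB/s aggregate both follow directly from these peak theoretical figures; real-world transfer rates depend on the server platform.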
Refer to server manufacturer PCIe Gen 4.0 compatibility and performance guidelines for potential peak performance of the specified server model numbers. Server manufacturers may vary configuration offerings yielding different results.
https://pcisig.com/
https://www.chipestimate.com/PCI-Express-Gen-4-a-Big-Pipe-for-Big-Data/Cadence/Technical-Article/2014/04/15
https://www.tomshardware.com/news/pcie-4.0-power-speed-express,32525.html
AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
RIV-4
5 Calculated on Oct 22, 2018, the Radeon Instinct MI60 GPU resulted in 7.4 TFLOPS peak theoretical double precision floating-point (FP64) performance. AMD TFLOPS calculations conducted with the following equation: FLOPS calculations are performed by taking the engine clock from the highest DPM state and multiplying it by xx CUs per GPU. Then, multiplying that number by xx stream processors, which exist in each CU. Then, that number is multiplied by 1/2 FLOPS per clock for FP64. TFLOP calculations for MI60 can be found at https://www.amd.com/en/products/professional-graphics/instinct-mi60. External results on the NVIDIA Tesla V100 (16GB card) GPU accelerator resulted in 7 TFLOPS peak double precision (FP64) floating-point performance. Results found at: https://images.nvidia.com/content/technologies/volta/pdf/437317-Volta-V100-DS-NV-US-WEB.pdf. AMD has not independently tested or verified external/third party results/data and bears no responsibility for any errors or omissions therein.
6 ECC support on 2nd Gen Radeon Instinct™ GPU cards, based on the “Vega 7nm” technology has been extended to full-chip ECC including HBM2 memory and internal GPU structures.
7 Expanded RAS (Reliability, availability and serviceability) attributes have been added to AMD’s 2nd Gen Radeon Instinct™ Vega 7nm technology based GPU cards and their supporting ecosystem including software, firmware and system level features. AMD’s remote manageability capabilities using advanced out-of-band circuitry allow for easier GPU monitoring via I2C, regardless of the GPU state. For full system RAS capabilities, refer to the system manufacturer’s guidelines for specific system models.
A photo accompanying this announcement is available at http://www.globenewswire.com/NewsRoom/AttachmentNg/72ecde34-7b7d-47f0-a7e3-56e53d7368f5