Oracle’s AI with AMD Instinct MI355X GPUs
Oracle and AMD have teamed up to bring a new GPU offering to Oracle Cloud Infrastructure (OCI). The AMD Instinct™ MI355X GPUs on OCI are designed to provide value-driven scaling for large-scale AI workloads, with approximately twice the price-performance of the previous generation.
Oracle plans to offer zettascale AI clusters with as many as 131,072 of these GPUs. The goal is not just to throw processing power at the problem; it is to flexibly meet the scale and capacity needs of complex AI training and inference workloads across industries. According to Oracle Executive Vice President Mahesh Thiagarajan, the combination of AMD’s industry-leading GPUs with OCI’s world-class networking, processing, and security gives customers the opportunity to realize their AI innovations.
Current AI workloads, including but not limited to large language models and generative AI, simply require more memory, more processing, and smarter orchestration. The MI355X GPUs deliver almost three times the compute power of the previous generation, along with a substantial increase in high-bandwidth memory capacity. This allows customers to run larger models entirely in memory while using processing and memory efficiently.
Forrest Norrod of AMD pointed to the long-standing relationship between the two companies, emphasizing that the new capabilities allow for flexible system architectures and will support a greater variety of use cases, from real-time inference to training at scale.
Customers using these new GPU-powered cloud shapes on OCI can expect up to 2.8x faster throughput, along with new capabilities such as FP4 support for ultra-efficient large-model inference. The dense, liquid-cooled infrastructure also delivers high performance without sacrificing energy efficiency, which is essential in today’s demanding compute environment.
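To make the FP4 and in-memory points concrete, the following is a rough back-of-the-envelope sketch rather than anything from the announcement. It estimates how much memory a model's weights occupy at different precisions and how many GPUs would be needed just to hold them. The function names are hypothetical, the 288 GB per-GPU memory figure and the 405-billion-parameter model are assumptions chosen for illustration, and the estimate ignores KV cache, activations, and quantization scale overhead.

```python
import math

# Bits used to store one weight in each numeric format.
BITS_PER_PARAM = {"fp16": 16, "fp8": 8, "fp4": 4}


def weight_footprint_gb(num_params: float, fmt: str) -> float:
    """Approximate size of the model weights in gigabytes (10^9 bytes)."""
    return num_params * BITS_PER_PARAM[fmt] / 8 / 1e9


def min_gpus_for_weights(num_params: float, fmt: str, hbm_gb_per_gpu: float) -> int:
    """Smallest GPU count whose combined memory can hold the weights alone."""
    return math.ceil(weight_footprint_gb(num_params, fmt) / hbm_gb_per_gpu)


if __name__ == "__main__":
    params = 405e9   # assumed model size, for illustration only
    hbm_gb = 288.0   # assumed per-GPU high-bandwidth memory, not from the article
    for fmt in ("fp16", "fp8", "fp4"):
        size = weight_footprint_gb(params, fmt)
        gpus = min_gpus_for_weights(params, fmt, hbm_gb)
        print(f"{fmt}: {size:.1f} GB of weights, at least {gpus} GPU(s)")
```

Under these assumptions, halving the bits per weight halves the weight footprint, which is why a lower-precision format such as FP4 can let a larger model fit on fewer GPUs for inference.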
A highlight of these capabilities is the deployment of AMD’s Pollara™ network interface cards, which bring a suite of advanced networking features into OCI in line with the OEM specification. This allows Oracle to provide faster communication between nodes, lower latency, and improved congestion control for efficient high-speed AI operations.
With open-source compatibility through AMD’s ROCm software stack, customers gain the freedom to build, scale, and migrate without vendor lock-in.
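As one illustration of what that portability can look like in practice, here is a minimal sketch that assumes PyTorch as the framework (the announcement does not name one). ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface used on other GPUs, and `torch.version.hip` is set only on ROCm builds, so a script can report which backend it is running on without changing the rest of its code. The function name `describe_backend` is hypothetical.

```python
def describe_backend() -> str:
    """Report which GPU backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return "pytorch-not-installed"

    # torch.version.hip is a version string on ROCm builds and None otherwise.
    is_rocm_build = getattr(torch.version, "hip", None) is not None

    if not torch.cuda.is_available():
        return "rocm-build-no-gpu" if is_rocm_build else "cpu-only"
    return "rocm" if is_rocm_build else "cuda"


if __name__ == "__main__":
    print(describe_backend())
```

Because the device API is shared, existing code that selects a device with `torch.device("cuda")` would typically run on a ROCm build without modification, though individual libraries and custom kernels may still need checking.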
Together, Oracle and AMD are ushering in AI infrastructure that is not only faster, but also more inclusive, more secure, and ready for future agentic applications.