Amazon Ushers in a New Era of AI Training: Next-Gen Chips Deliver 4X Performance Leap

Tamim Rupo
November 29, 2023
12:30 pm

The AWS re:Invent conference in Las Vegas opened with a flurry of announcements, most of them centered on this year’s dominant technology, AI. Together, they offer a look at Amazon Web Services’ long-term objectives for artificial intelligence platforms.

To begin with, AWS introduced its latest AI chips for both training models and running trained models. Trainium2, dedicated to model training, offers up to 4x the performance and 2x the energy efficiency of its predecessor. Amazon says these chips will let developers train models faster and at lower cost, thanks to reduced energy consumption. Anthropic, an OpenAI competitor backed by Amazon, has already committed to building models with Trainium2 chips.

Graviton4, by contrast, is geared more towards general usage. Built on Arm architecture, these processors typically consume less energy than comparable Intel or AMD chips. Amazon promises a 30 percent boost in overall performance when running a trained AI model on a Graviton4 processor. That should cut cloud-computing expenses for organizations that regularly use AI models, while providing a slight speedup for casual users engaged in tasks like creating fictional photos of Harry Potter at a rave.

Overall, Graviton4 is meant to let AWS customers “process larger amounts of data, scale their workloads, improve time-to-results, and lower their total cost of ownership.” Graviton4 is available in preview today, with a broader release scheduled for the coming months.

Typically, when a company introduces in-house chips, it signals trouble for existing third-party providers such as NVIDIA, a major player in the enterprise AI arena. NVIDIA’s GPUs are widely used for training, and its Arm-based datacenter CPU, Grace, has a significant presence. Rather than moving away from this partnership in favor of proprietary chips, Amazon is strengthening the collaboration by giving enterprise customers cloud access to NVIDIA’s latest H200 AI GPUs. Additionally, Amazon will operate over 16,000 NVIDIA GH200 Grace Hopper Superchips specifically for NVIDIA’s research and development team. This approach mirrors the strategy of its primary AI competitor, Microsoft, which unveiled its proprietary AI chip, Maia 100, while also announcing an enhanced partnership with NVIDIA.

Amazon has also introduced a new AI chatbot named Q, designed for business purposes. The name is likely inspired by the Star Trek character rather than any conspiracy association. Described as a “new type of generative AI-powered personal assistant,” Q is specifically crafted to streamline work projects and customer service tasks for businesses. It can be customized to fit any business and provides relevant responses to frequently asked questions. Amazon Q is also capable of generating content autonomously and taking actions based on customer requests, with personalized interactions based on a user’s role within a company.

Q will be available on communication platforms like Slack and in text editors widely used by software developers. In this capacity, Q can modify source code and connect to over 40 enterprise systems, including Microsoft 365, Dropbox, Salesforce, Zendesk, and others. Currently in preview, Amazon Q is set for a broader release soon, with pricing ranging from $20 to $30 per user per month, depending on the included features.

In summary, Amazon is making substantial investments in AI, in line with the broader industry trend. More precisely, it is competing with its longstanding cloud rival, Microsoft, to establish itself as the premier choice for enterprise-centric AI solutions. By leveraging AI, Amazon aims to reinforce its dominance in the cloud computing sector and to limit the market share gains of competitors such as Microsoft, Google, and Alibaba.