The proliferation of artificial intelligence (AI) relies on continuous alignment between hardware and software innovation. It’s this combination that will improve AI capabilities at every technology touchpoint, from the smallest sensor running inference at the edge to the largest servers training large language models (LLMs). As the ecosystem works to realize the true potential of AI, it’s clear that security, sustainability, and data ceilings will all be challenges. It’s therefore vital that we continue to pursue industry-wide collaboration so that we can achieve AI at scale, including more inference at the edge. Arm is engaged in several new strategic partnerships that will fuel AI innovation and bring AI-based experiences to life.
In addition to our own technology platforms where AI development is already happening, Arm is working with leading technology companies, including AMD, Intel, Meta, Microsoft, NVIDIA, and Qualcomm Technologies, Inc., on a range of initiatives focused on enabling advanced AI capabilities for more responsive and more secure user experiences. These partnerships will create the foundational frameworks, technologies, and specifications that the more than 15 million Arm developers need to deliver next-generation AI experiences across every corner of computing.
Powering AI at the edge
While generative AI and LLMs may be capturing headlines today, Arm has been at the forefront of delivering AI at the edge for years: in smartphones, 70% of third-party AI applications run on Arm CPUs. However, delivering AI sustainably and moving data around efficiently will require the industry to run more AI and machine learning (ML) models at the edge, which is challenging because developers are working with increasingly limited computing resources.
Arm is working with NVIDIA to adapt NVIDIA TAO, a low-code, open-source AI toolkit, for Arm Ethos-U NPUs, helping developers create performance-optimized vision AI models for deployment on these processors. NVIDIA TAO provides an easy-to-use interface built on TensorFlow and PyTorch, two leading open-source AI and ML frameworks. For developers, this means seamless development and deployment of their models, while bringing more complex AI workloads to edge devices for enhanced AI-based experiences.
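To make the end-to-end flow concrete, here is a minimal sketch of the kind of pipeline this enables: post-training int8 quantization of a Keras vision model (a stand-in for a TAO-exported one), followed by compilation with Arm’s open-source Vela compiler for Ethos-U. The model choice and calibration data are illustrative assumptions, not part of the TAO toolkit itself.

```python
# Illustrative sketch: quantize a vision model to int8 TFLite so it can
# target an Ethos-U NPU. The model and random calibration data are
# placeholders; a real pipeline would use a TAO-trained model and
# representative images.
import numpy as np
import tensorflow as tf

model = tf.keras.applications.MobileNetV2(weights=None)  # stand-in model

def representative_data():
    # Calibration samples drive the quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())

# The quantized model can then be compiled for an Ethos-U NPU with Arm's
# Vela compiler, e.g.:
#   vela model_int8.tflite --accelerator-config ethos-u55-128
```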
Advancing neural networks across all devices and markets
A vital aspect of the continued growth of AI is advancing the deployment of neural networks at the edge.
Arm and Meta are working to bring PyTorch to Arm-based mobile and embedded platforms at the edge through ExecuTorch. ExecuTorch makes it far easier for developers to deploy the state-of-the-art neural networks needed for advanced AI and ML workloads across mobile and edge devices. Moving forward, the collaboration between Arm and Meta will ensure AI and ML models can be easily developed with PyTorch and deployed with ExecuTorch.
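As a rough illustration of what that looks like for a developer, the sketch below follows ExecuTorch’s documented export path: capture a PyTorch module, lower it to the Edge dialect, and serialize a `.pte` program for the on-device runtime. The toy model is our own assumption, and exact API names may shift between ExecuTorch releases.

```python
# Minimal sketch of the ExecuTorch export flow: capture a PyTorch model,
# lower it, and serialize a .pte program that the on-device ExecuTorch
# runtime can load.
import torch
from executorch.exir import to_edge

class TinyNet(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(16, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet().eval()
example_inputs = (torch.randn(1, 16),)

exported = torch.export.export(model, example_inputs)  # capture the graph
edge_program = to_edge(exported)                       # lower to Edge dialect
et_program = edge_program.to_executorch()              # lower to ExecuTorch

with open("tinynet.pte", "wb") as f:
    f.write(et_program.buffer)  # flatbuffer loaded by the on-device runtime
```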
The work with Meta builds on significant investments that we have already made in the Tensor Operator Set Architecture (TOSA), which provides a common framework for AI and ML accelerators and supports a broad range of workloads employed by deep neural networks. TOSA will be the cornerstone of AI and ML for a diverse range of processors and billions of devices that are built on the Arm architecture.
Industry-wide scalable AI
Broad support for compact data formats is crucial for scaling AI at relatively low cost. Arm has been working hard to support a variety of emerging small data types focused on AI workloads.
Last year, Arm, Intel, and NVIDIA jointly published a new 8-bit floating-point specification, ‘FP8’. Since then, the format has gained momentum, and the group expanded to AMD, Arm, Google, Intel, Meta, and NVIDIA, who together created the official OCP 8-bit Floating Point Specification (OFP8). In our latest A-profile architecture update, we added OFP8 support consistent with this standard to accelerate its adoption in neural networks across the industry. OFP8 is an 8-bit interchange data format that allows the software ecosystem to share neural network models easily, facilitating the continuous advancement of AI computing capabilities across billions of devices.
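For intuition about what an 8-bit float encodes, here is a small decoder for OFP8’s E4M3 variant (1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits). Per the OCP spec, E4M3 has no infinities and reserves a single NaN encoding, so its largest finite value is ±448. This is an illustrative sketch, not a production codec.

```python
# Decode one byte as an OFP8 E4M3 value: 1 sign, 4 exponent (bias 7),
# 3 mantissa bits. Exponent field 0 denotes subnormals; the only NaN is
# exponent=0b1111 with mantissa=0b111, and there are no infinities.
def decode_e4m3(byte: int) -> float:
    sign = -1.0 if (byte >> 7) & 1 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:
        return float("nan")
    if exp == 0:  # subnormal: no implicit leading 1
        return sign * (mant / 8.0) * 2.0 ** (1 - 7)
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)

print(decode_e4m3(0b0_1111_110))  # 448.0, the largest finite E4M3 value
print(decode_e4m3(0b0_0000_001))  # 2**-9, the smallest positive subnormal
```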
Open standards are critical to driving innovation, consistency, and interoperability in the AI ecosystem. Continuing our work to support industry collaboration on these standards, we recently joined the Microscaling Formats (MX) Alliance, which includes AMD, Arm, Intel, Meta, Microsoft, NVIDIA, and Qualcomm Technologies, Inc. The MX Alliance recently published a specification for a new technology known as microscaling: a fine-grained scaling method for narrow-bit (8-bit and sub-8-bit) AI training and inference that builds on years of design-space exploration and research. The specification standardizes these narrow-bit data formats to remove fragmentation across the industry and enable scalable AI.
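The core idea is easy to sketch: a block of 32 elements shares a single power-of-two scale, so each element needs only a few bits. The toy below uses int8 elements for simplicity; the actual MX specification defines an 8-bit E8M0 shared scale and element formats such as FP8, FP6, and FP4, so treat this as intuition rather than the normative algorithm.

```python
# Simplified illustration of microscaling: a block of K elements shares a
# single power-of-two scale, and each element is stored in a narrow format
# (int8 here for simplicity).
import math

K = 32  # block size used by the MX formats

def mx_quantize_block(block):
    """Quantize K floats to (shared_exponent, int8 elements)."""
    max_abs = max(abs(v) for v in block) or 1.0
    # Pick a power-of-two scale so the largest element lands in [64, 128).
    shared_exp = math.floor(math.log2(max_abs)) - 6
    scale = 2.0 ** shared_exp
    elems = [max(-128, min(127, round(v / scale))) for v in block]
    return shared_exp, elems

def mx_dequantize_block(shared_exp, elems):
    scale = 2.0 ** shared_exp
    return [e * scale for e in elems]

data = [math.sin(i / 3.0) for i in range(K)]
exp, q = mx_quantize_block(data)
approx = mx_dequantize_block(exp, q)
print(max(abs(a - b) for a, b in zip(data, approx)))  # small quantization error
```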
In the spirit of collaboration, the MX Alliance released the MX specification in an open, license-free format through the Open Compute Project, a community of hyperscale data center operators and other industry players in computing infrastructure, to encourage broad industry adoption. This reflects the need to provide equitable access to scalable AI solutions across the ecosystem.
Unprecedented AI innovation
Arm is already foundational to AI deployments around the world, and these collaborations are just some of the ways we are providing the technologies developers need to create advanced, complex AI workloads. From sensors, smartphones, and software-defined vehicles to servers and supercomputers, the future of AI will be built on Arm.