Researchers Develop 'Ring Attention' Approach to Boost AI Model Capacity

A recent research paper titled ‘Ring Attention with Blockwise Transformers for Near-Infinite Context’ introduces a new approach called ‘Ring Attention,’ which aims to remove the memory limitations on training and running AI models, limitations that stem primarily from the memory capacity of individual GPUs. The researchers, including a Google researcher, Databricks CTO Matei Zaharia, and UC Berkeley professor Pieter Abbeel, propose arranging GPUs in a ring and passing blocks of the attention computation from one device to the next, so that no single GPU has to hold the entire sequence and the memory bottleneck is eliminated. This method enables AI models to handle much larger context windows, accommodating millions of words of input instead of tens of thousands.
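To make the mechanism concrete, here is a minimal single-process sketch of the idea in NumPy. It simulates the ring with ordinary Python lists rather than real devices, and the function name, block sizes, and online-softmax bookkeeping are illustrative assumptions, not the authors' implementation, which runs across real accelerators and overlaps the block transfers with computation.

```python
import numpy as np

def ring_attention(q_blocks, k_blocks, v_blocks):
    """Simulate ring attention: each 'device' i holds one query block and
    one key/value block; KV blocks rotate around the ring while every
    device accumulates its output with a numerically stable online softmax,
    so no device ever materializes the full attention matrix."""
    n_dev = len(q_blocks)
    d = q_blocks[0].shape[-1]
    outs = [np.zeros_like(q) for q in q_blocks]                # running weighted sums
    maxes = [np.full(q.shape[0], -np.inf) for q in q_blocks]   # running row maxima
    sums = [np.zeros(q.shape[0]) for q in q_blocks]            # running softmax denominators

    kv = list(zip(k_blocks, v_blocks))
    for _ in range(n_dev):
        for i in range(n_dev):
            k, v = kv[i]
            scores = q_blocks[i] @ k.T / np.sqrt(d)
            new_max = np.maximum(maxes[i], scores.max(axis=-1))
            scale = np.exp(maxes[i] - new_max)                 # rescale earlier accumulators
            p = np.exp(scores - new_max[:, None])
            outs[i] = outs[i] * scale[:, None] + p @ v
            sums[i] = sums[i] * scale + p.sum(axis=-1)
            maxes[i] = new_max
        kv = kv[1:] + kv[:1]                                   # pass KV blocks to the next device
    return [o / s[:, None] for o, s in zip(outs, sums)]

# Sanity check against ordinary full attention on a toy input.
rng = np.random.default_rng(0)
blocks = [rng.standard_normal((4, 8)) for _ in range(3)]
ring_out = np.concatenate(ring_attention(blocks, blocks, blocks))
q = np.concatenate(blocks)
scores = q @ q.T / np.sqrt(8)
weights = np.exp(scores - scores.max(-1, keepdims=True))
full_out = (weights / weights.sum(-1, keepdims=True)) @ q
assert np.allclose(ring_out, full_out, atol=1e-6)
```

The key property the sketch illustrates is that each simulated device only ever holds one query block and one key/value block at a time, so per-device memory stays constant while the total context length grows with the number of devices in the ring.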

The potential applications are extensive. According to Hao Liu, one of the co-authors, AI models would be able to process entire codebases or even videos as input and generate coherent responses. The team validated the approach in real-world experiments, showing that it increases the context length from 16,000 tokens to a theoretically possible 4 million tokens. Liu emphasizes that this breakthrough will push developers and technology companies to explore what more can be achieved with AI models, so it is unlikely to decrease demand for Nvidia’s AI chips.


Source: AI models can analyze thousands of words at a time. A Google researcher has found a way to increase that by millions.
