Introducing Infinite Context Transformers with Infini-attention
Google’s latest research introduces a technique for scaling Transformer-based Large Language Models (LLMs) to infinitely long inputs with bounded memory and computation. The new attention mechanism, called “Infini-attention,” incorporates a compressive memory and combines masked local attention with long-term linear attention in a single Transformer block. This design addresses the context-length limitation of standard LLMs, whose attention cost and memory footprint grow with input length, and lets models process and generate responses conditioned on arbitrarily long context. Read the research paper, “Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention,” to learn more.
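To make the mechanism concrete, here is a minimal, single-head sketch of the Infini-attention idea as described in the paper: masked local attention within each segment, plus a compressive memory that is read with linear attention and updated after every segment. Tensor names, dimensions, the feature map, and the gating scalar below are illustrative assumptions, not the authors’ reference implementation.

```python
import torch
import torch.nn.functional as F

def elu_plus_one(x):
    # Non-negative feature map commonly used for linear attention.
    return F.elu(x) + 1.0

def infini_attention_segment(q, k, v, memory, z, beta):
    """Process one segment of shape (seg_len, d), carrying a fixed-size memory state."""
    d = q.size(-1)

    # 1) Masked (causal) local attention within the segment.
    scores = q @ k.T / d ** 0.5
    causal_mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(causal_mask, float("-inf"))
    a_local = torch.softmax(scores, dim=-1) @ v

    # 2) Retrieve long-term context from the compressive memory via linear attention.
    sq = elu_plus_one(q)
    a_mem = (sq @ memory) / (sq @ z).clamp_min(1e-6).unsqueeze(-1)

    # 3) Update the memory with this segment's keys and values
    #    (simple associative update; the paper also describes a delta-rule variant).
    sk = elu_plus_one(k)
    memory = memory + sk.T @ v
    z = z + sk.sum(dim=0)

    # 4) Blend long-term and local attention with a learned gate.
    gate = torch.sigmoid(beta)
    out = gate * a_mem + (1.0 - gate) * a_local
    return out, memory, z

# Usage: stream an arbitrarily long sequence segment by segment while the
# memory state stays a constant size (a d x d matrix plus a d-vector).
d, seg_len = 64, 128
memory, z = torch.zeros(d, d), torch.zeros(d)
beta = torch.tensor(0.0)  # learnable parameter in a real model
for _ in range(4):  # four random segments stand in for an unbounded stream
    q, k, v = (torch.randn(seg_len, d) for _ in range(3))
    out, memory, z = infini_attention_segment(q, k, v, memory, z, beta)
```

The key point the sketch illustrates is that, unlike a growing key-value cache, the state carried across segments has a fixed size, which is what bounds memory and computation regardless of total input length.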