Research scientists at OpenAI, including Leo Gao, have emphasized that artificial intelligence (AI) systems are being integrated into increasingly important areas and must be made safe as they evolve. A recent study introduces a new model known as a weight-sparse transformer, which is considerably smaller and less capable than leading models such as OpenAI’s GPT-5, Anthropic’s Claude, and Google DeepMind’s Gemini. According to Gao, the new model is at most as capable as OpenAI’s earlier GPT-1, developed in 2018, although no direct comparison has yet been conducted.
The primary objective of this research is not to compete with existing high-performance models but to shed light on the mechanisms at work inside more advanced AI systems. The approach belongs to the emerging field of mechanistic interpretability, which seeks to understand how models carry out tasks through their internal processes.
Experts in the field have recognized the potential significance of this research. Elisenda Grigsby, a mathematician at Boston College who studies large language models, called the research interesting and said it could have a substantial impact on the understanding of how AI models operate. Similarly, Lee Sharkey, a research scientist at the AI startup Goodfire, praised the work for effectively targeting a crucial problem.
The difficulty of understanding AI models stems from their architecture. They are built from neural networks, which consist of interconnected nodes, or neurons, organized in layers. In dense networks, each neuron connects to every neuron in the adjacent layers, creating a vast web of relationships. This design has two consequences that hamper interpretation: simple concepts can end up spread across many different neurons, and, in a phenomenon known as superposition, an individual neuron may represent multiple features. Consequently, mapping specific components of a model to distinct concepts can be difficult.
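To make the contrast between dense and sparse connectivity concrete, here is a minimal sketch in Python. It is an illustration only, not the method used in the study: the function names, the layer sizes, and the choice to keep the largest-magnitude weights per neuron are all assumptions made for this example.

```python
import random


def dense_layer(n_in, n_out, rng):
    """Build a dense weight matrix: every input neuron feeds every output neuron."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def sparsify(weights, keep_per_neuron):
    """Zero out all but the largest-magnitude weights of each output neuron.

    Hypothetical pruning rule chosen for illustration; the study's actual
    procedure is not described in this article.
    """
    sparse = []
    for row in weights:
        # Indices of the keep_per_neuron weights with the largest absolute value.
        ranked = sorted(range(len(row)), key=lambda i: abs(row[i]), reverse=True)
        keep = set(ranked[:keep_per_neuron])
        sparse.append([w if i in keep else 0.0 for i, w in enumerate(row)])
    return sparse


def count_connections(weights):
    """Count the non-zero weights, i.e. the active neuron-to-neuron links."""
    return sum(1 for row in weights for w in row if w != 0.0)


rng = random.Random(0)  # fixed seed so the example is reproducible
dense = dense_layer(n_in=8, n_out=4, rng=rng)
sparse = sparsify(dense, keep_per_neuron=2)

print("dense connections: ", count_connections(dense))
print("sparse connections:", count_connections(sparse))
```

In this toy layer, each of the 4 output neurons starts with 8 incoming connections and ends with 2. With so few links per neuron, it becomes much easier to trace which inputs a given neuron actually depends on, which is the intuition behind studying sparse models.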
Source: https://www.technologyreview.com/2025/11/13/1127914/openais-new-llm-exposes-the-secrets-of-how-ai-really-works/

