Ai2's MolmoAct model ‘thinks in 3D’ to challenge Nvidia and Google in robotics AI

Physical AI, which combines robotics and foundation models, is an emerging field with significant interest from companies like Nvidia, Google, and Meta. These organizations are exploring ways to integrate large language models (LLMs) with robotic systems.

The Allen Institute for AI (Ai2) has introduced MolmoAct 7B, a new open-source model designed to let robots reason in three-dimensional space. The model builds on Ai2’s earlier Molmo and is intended to challenge existing models from major tech companies. Ai2 is also releasing its training data: the model is available under an Apache 2.0 license and the datasets under a CC BY-4.0 license.

MolmoAct is classified as an Action Reasoning Model that helps robots understand and plan actions based on their physical environment. The model outputs “spatially grounded perception tokens,” which allow for an estimation of distances between objects, subsequently enabling the prediction of movement pathways.
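The idea of going from distance estimates to a movement path can be illustrated with a toy sketch. This is not MolmoAct's implementation: the hard-coded 3D positions, the `distance` and `waypoints` helpers, and the straight-line interpolation are all assumptions made for illustration, standing in for whatever the model's perception tokens actually encode.

```python
import math

# Hypothetical 3D positions in metres. In a real system these would come
# from the model's depth-aware perception, not from hard-coded values.
gripper = (0.0, 0.0, 0.0)
cup = (0.3, 0.4, 0.0)


def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.dist(a, b)


def waypoints(start, goal, steps):
    """Evenly spaced points on the straight line from start to goal, inclusive."""
    return [
        tuple(s + (g - s) * i / steps for s, g in zip(start, goal))
        for i in range(steps + 1)
    ]


d = distance(gripper, cup)        # how far the gripper is from the cup
path = waypoints(gripper, cup, 4)  # five points: start, three intermediate, goal

print(f"distance to cup: {d:.2f} m")
for point in path:
    print(point)
```

A real planner would also need to account for obstacles, joint limits, and gripper orientation; the sketch only shows how an estimated distance and a sequence of intermediate points relate to each other.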

In benchmark tests, MolmoAct 7B achieved a task success rate of 72.1%, outperforming models from Google, Microsoft, and Nvidia. However, experts caution that, although the result is an improvement, existing benchmarks do not fully capture the complexity of real-world scenarios.

Interest in physical AI continues to grow among developers and researchers, as breakthroughs in LLMs might help address challenges of robotic cognition and movement. Current advancements, such as Google Research’s SayCan and Meta’s OK-Robot, showcase the application of LLMs in robotic tasks.

The field is moving toward more sophisticated physical intelligence models, although researchers acknowledge that hard problems remain. Automation and better spatial awareness are still central goals, which suggests there is room for significant further development in physical AI.

Source: https://venturebeat.com/ai/ai2s-molmoact-model-thinks-in-3d-to-challenge-nvidia-and-google-in-robotics-ai/
