The robotics industry is gaining momentum with the help of foundation models like large language models (LLMs) and vision-language models (VLMs), which are now enabling robots to handle more sophisticated tasks involving reasoning and strategic planning.
These advances represent a shift from conventional programming toward models that allow robots to perform complex operations autonomously. By integrating such models, researchers aim to improve robots’ ability to function in dynamic environments, potentially transforming how they assist in areas like healthcare, logistics, and human-robot interaction.
Meta has introduced Sparsh, a family of encoder models developed in collaboration with the University of Washington and Carnegie Mellon, focused on improving tactile perception in robots. Sparsh employs vision-based tactile sensing to help robots interpret touch, a vital ability for tasks that require delicate handling and nuanced physical interaction.
Earlier tactile-perception methods relied heavily on labeled data, which limited their applicability across different tasks and sensors. In contrast, Sparsh's self-supervised learning approach overcomes this barrier: it learns from unlabeled sensor output, adapts across various vision-based tactile sensors, and shows notable improvements in handling diverse tactile data.
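The core idea behind self-supervised pretraining is that the training targets come from the data itself rather than from human annotation. The sketch below illustrates one common recipe, masked-patch reconstruction, on a synthetic tactile frame; the patch size, mask ratio, and trivial mean-predictor are illustrative stand-ins, not Sparsh's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def mask_patches(frame, patch=4, mask_ratio=0.5, rng=rng):
    """Split a tactile 'image' into patches and hide a random subset.

    Returns the visible patches, the hidden (target) patches, and the
    boolean mask. The targets are taken from the frame itself, so no
    human labels are needed.
    """
    h, w = frame.shape
    patches = frame.reshape(h // patch, patch, w // patch, patch)
    patches = patches.transpose(0, 2, 1, 3).reshape(-1, patch * patch)
    hidden = rng.random(patches.shape[0]) < mask_ratio
    return patches[~hidden], patches[hidden], hidden

# A synthetic 16x16 frame standing in for real vision-based sensor output.
frame = rng.random((16, 16))
visible, targets, hidden = mask_patches(frame)

# A trivial predictor: guess the mean of the visible patches.
# A learned encoder like Sparsh would produce this prediction instead.
prediction = np.full_like(targets, visible.mean())
loss = np.mean((prediction - targets) ** 2)
print(f"{int(hidden.sum())} patches masked, reconstruction MSE = {loss:.4f}")
```

Minimizing this reconstruction loss forces the encoder to build a useful internal representation of touch without any task-specific labels, which is why the same pretrained model can transfer across sensors and tasks.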
To further enhance tactile sensing, Meta has developed Digit 360, a highly detailed, finger-shaped sensor that captures complex touch information using over 8 million taxels. This sensor enables robots to detect and interpret fine-grained, multidirectional touch inputs, improving their capacity for precise object manipulation.
Digit 360 is designed with embedded AI that allows it to process information locally, reducing response time and enabling rapid adjustments to touch in real time. Meta is making the code and design for Digit 360 publicly accessible, encouraging research in robotic touch that could benefit fields from medical technology to virtual reality.
In addition to Digit 360, Meta has launched Digit Plexus, a platform that integrates various tactile sensors into a single robotic hand, allowing for comprehensive data gathering and streamlined transmission to host computers. The platform can encode and transfer data from multiple touch sensors via a single cable, simplifying sensory integration on robotic devices.
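Funneling several sensor streams down one cable amounts to framing each reading with enough metadata to demultiplex it on the host. The sketch below shows the general pattern with a hypothetical packet layout (sensor id, sample count, float32 payload); Digit Plexus's actual wire format is not public, so this is an illustration of the concept, not its protocol.

```python
import struct

def pack_readings(readings):
    """Encode readings from several touch sensors into one byte stream.

    Each frame is: 1-byte sensor id, 2-byte sample count (little-endian),
    then the samples as float32 values. The layout is hypothetical.
    """
    frames = []
    for sensor_id, samples in readings.items():
        header = struct.pack("<BH", sensor_id, len(samples))
        payload = struct.pack(f"<{len(samples)}f", *samples)
        frames.append(header + payload)
    return b"".join(frames)

def unpack_readings(stream):
    """Decode the combined stream back into per-sensor readings (host side)."""
    readings, offset = {}, 0
    while offset < len(stream):
        sensor_id, count = struct.unpack_from("<BH", stream, offset)
        offset += 3  # header size: 1 + 2 bytes
        samples = struct.unpack_from(f"<{count}f", stream, offset)
        offset += 4 * count  # 4 bytes per float32
        readings[sensor_id] = list(samples)
    return readings

wire = pack_readings({0: [0.1, 0.9], 1: [0.5, 0.5, 0.2]})
print(unpack_readings(wire))
```

Because every frame is self-describing, sensors with different sample rates and channel counts can share the same link, which is what lets a single cable replace a bundle of per-sensor connections.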
Meta has partnered with GelSight Inc. to manufacture the sensor and with Wonik Robotics to develop a complete robotic hand built around it, laying the foundation for improved robotic touch capabilities. By releasing the Digit Plexus platform design and code, Meta is supporting community-led research to further refine tactile perception technologies in robotics.
To advance human-robot collaboration, Meta has also developed PARTNR, a benchmark tool for evaluating AI models in household task settings. Built on Meta’s Habitat platform, PARTNR includes a large dataset of natural language tasks that test robots’ ability to follow human instructions accurately in complex, home-like environments.
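At its core, a benchmark of this kind pairs each natural-language instruction with goal conditions that must hold in the simulated home afterwards, then measures how often an agent satisfies them. The harness below is a simplified, hypothetical stand-in for that evaluation loop, not the actual PARTNR or Habitat API.

```python
def evaluate(episodes, agent):
    """Score an agent on instruction-following episodes.

    Each episode pairs a natural-language instruction with the set of
    goal predicates that must hold in the simulated environment after
    the agent acts. The episode format and metric are illustrative.
    """
    successes = 0
    for ep in episodes:
        achieved = agent(ep["instruction"])
        # An episode counts as a success only if every goal predicate holds.
        if ep["goals"] <= achieved:
            successes += 1
    return successes / len(episodes)

episodes = [
    {"instruction": "Put the mug in the sink.",
     "goals": {("in", "mug", "sink")}},
    {"instruction": "Bring the book to the table and turn on the lamp.",
     "goals": {("on", "book", "table"), ("powered", "lamp")}},
]

# A toy agent that only ever manages the first task.
toy_agent = lambda instruction: {("in", "mug", "sink")}
print(f"success rate: {evaluate(episodes, toy_agent):.0%}")  # → 50%
```

Checking goal predicates rather than action sequences is what lets such a benchmark compare very different models (LLM planners, VLM policies) on the same tasks: any strategy that leaves the home in the right state gets credit.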
This benchmark is part of a broader industry trend toward improving robot planning and reasoning abilities through LLMs and VLMs. Other companies, like Google DeepMind with their RT-X project, are similarly focused on expanding AI’s role in robotics to make robots more capable and adaptive across different settings and tasks.