Unlocking Embodied AI.
With Human Intelligence.
Istari is a platform for hyperscaling humanoid robot training data. By fusing diverse human insights with advanced AI, Istari empowers robots to learn, understand, and operate effectively in the complexities of the real world.
The Data Bottleneck
"When it comes to humanoid AI, data is the ultimate constraint. The robotics field is facing a scaling crisis."
Even the most advanced humanoid systems today train on only about 10,000 hours of data, a small fraction of what GPT and other foundation models consume. Scaling laws indicate that a 500,000× increase is needed to unlock truly generalized capability.
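To make the size of that gap concrete, here is a minimal back-of-the-envelope sketch. It uses only the two figures quoted above; the variable names and the conversion to years of continuous recording are illustrative assumptions, not part of Istari's methodology.

```python
# Figures quoted in the text above.
current_hours = 10_000         # approximate training data behind today's systems
required_multiplier = 500_000  # increase that scaling laws are said to indicate

# Hours of data needed if the multiplier is applied to today's volume.
target_hours = current_hours * required_multiplier


def hours_to_years(hours: float) -> float:
    """Convert hours of continuous recording to years (24 h/day, 365 days/year)."""
    return hours / (24 * 365)


print(f"Target: {target_hours:,} hours")
# Target: 5,000,000,000 hours
print(f"About {hours_to_years(target_hours):,.0f} years of continuous recording")
# About 570,776 years of continuous recording
```

Under these assumptions the target is on the order of billions of hours, which is why no single lab or fleet can plausibly collect it alone.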
Current Teleoperation Challenges
- Traditional setups require $40,000+ per robot platform, limiting scale
- Lab-confined demonstrations fail to capture real-world complexity
- Cumbersome interfaces and frequent hardware issues slow data collection
The Economics Don't Work
- Prohibitive infrastructure: Traditional telerobotic stations cost $100,000+ each
- Impossible scale: At current rates, reaching 1B hours would cost $15-30B, before even hitting 1% of needed data
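The cost figure above implies a per-hour price for teleoperated data. The sketch below derives it from the quoted numbers only; the variable names are illustrative, and the calculation assumes the $15-30B range covers the full 1B hours.

```python
# Figures quoted in the text above.
target_hours = 1_000_000_000      # 1B hours of training data
total_cost_range = (15e9, 30e9)   # USD, low and high estimates

# Implied cost of a single hour of data at current rates.
cost_per_hour = tuple(cost / target_hours for cost in total_cost_range)

low, high = cost_per_hour
print(f"Implied cost: ${low:.2f} to ${high:.2f} per hour of data")
# Implied cost: $15.00 to $30.00 per hour of data
```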
LLM Datasets Dwarf Robotics Datasets
Open X-Embodiment, used to train models like Google DeepMind's RT-2-X, is among the largest open robotics datasets, yet it is dwarfed by text-based datasets.
Common Crawl, a massive openly available web crawl corpus, is a key resource for training Large Language Models (LLMs). The disparity in scale highlights the immense data gap robotics needs to bridge.
The ISTARI Network
Global Data Infrastructure
Advanced Capture Devices
High-fidelity sensors capture nuanced human-environment interactions
Intelligent Data Classification
Advanced models and algorithms ensure optimal data quality, segmentation and relevance
Global Contributor Network
Istari's incentive system provides fair, transparent compensation to contributors
Enterprise Data Pipeline
Structured, annotated data streams flow into industry-standard formats compatible with NVIDIA Cosmos and leading robotics platforms
Competitive Edge
- Unprecedented scale: Enabling millions of global contributors to provide training data through consumer-friendly hardware
- Diversity advantage: Capturing the full spectrum of human behaviors across cultures, tasks, and environments that lab settings can't match
- Economic breakthrough: Reducing data acquisition costs by 95% makes billion-hour datasets financially viable for the first time
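As a rough illustration of the economic claim, the sketch below applies the 95% reduction to the $15-30B estimate for 1B hours quoted earlier on this page. Treating that estimate as the baseline for the reduction is an assumption made here for illustration; the page does not state which baseline the 95% figure refers to.

```python
# Figures quoted on this page.
baseline_cost_range = (15e9, 30e9)  # USD for 1B hours at current rates
hours = 1_000_000_000               # 1B hours
reduction = 0.95                    # claimed cut in data acquisition cost

# ASSUMPTION: the 95% reduction applies uniformly to the baseline above.
reduced = tuple(cost * (1 - reduction) for cost in baseline_cost_range)
reduced_per_hour = tuple(cost / hours for cost in reduced)

print(f"1B hours: ${reduced[0] / 1e9:.2f}B to ${reduced[1] / 1e9:.2f}B")
# 1B hours: $0.75B to $1.50B
print(f"Per hour: ${reduced_per_hour[0]:.2f} to ${reduced_per_hour[1]:.2f}")
# Per hour: $0.75 to $1.50
```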
Capture Hardware
Smart Glasses
- Advanced video capture capabilities
- Environmental awareness technology
- Ergonomic design for extended use
- On-device processing capabilities
Haptic Gloves
- Advanced tactile feedback system
- High-precision motion tracking
- Extended battery life
- Seamless wireless connectivity
Scalable hardware at 1/100th the cost of traditional teleoperation rigs
Enterprise Solutions
Multimodal Data Streams
Access high-fidelity sensory data from our global contributor network spanning vision, audio, motion, and haptics.
Ideal for research teams pioneering novel training architectures and embeddings.
Structured Task Datasets
Curated collections of human demonstrations organized by task domain, complexity, and environment type.
Perfect for robotics teams developing targeted capabilities and task-specific intelligence.
Foundation Models
Pre-trained robotics foundation models that generalize across embodiments and accelerate deployment.
Jumpstart development with ready-to-deploy intelligence trained on our massive dataset.
Empowering the next generation of robotics with the world's largest human-embodiment dataset.
Contact Us
Interested in learning more about Istari or partnering with us? Get in touch!
The future of robotics relies on unlocking human intelligence at unprecedented scale. Our global network, paired with revolutionary capture hardware, offers the only economically viable path to true general AI.