How it works

From a spec to a dataset, end to end.

We run the entire pipeline. You specify the requirement; we field it, film it to a standard, clear the rights, and deliver it ready to use.

Spec intake

You give us the task, the person type, the environment, the location, the modality, the volume, and the quality bar. We turn it into a machine-readable rubric.
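As a rough illustration, a spec like this could be held as a small structured record and serialised to JSON. This is a minimal sketch: the class name, field names, and example values below are hypothetical and are not our actual rubric schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CaptureSpec:
    """Hypothetical spec record; every field name here is illustrative."""
    task: str
    person_type: str
    environment: str
    location: str
    modality: str
    volume_hours: float    # requested volume, assumed to be in hours
    min_acceptance: float  # quality bar as a fraction, e.g. 0.95

    def to_rubric(self) -> str:
        """Validate the spec and serialise it to JSON for downstream checks."""
        if not 0.0 < self.min_acceptance <= 1.0:
            raise ValueError("min_acceptance must be in (0, 1]")
        if self.volume_hours <= 0:
            raise ValueError("volume_hours must be positive")
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example values are made up for the sketch.
spec = CaptureSpec(
    task="load a dishwasher",
    person_type="home cook",
    environment="domestic kitchen",
    location="UK",
    modality="egocentric RGB",
    volume_hours=100.0,
    min_acceptance=0.95,
)
rubric = spec.to_rubric()
print(rubric)
```

Keeping the spec in one validated record means the same fields can drive sourcing, capture, and review without being retyped at each stage.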

Source & dispatch

Our AI deep-search sourcing finds and fields matching contributors, targeted by task, profession, demographic, and geography. Not just one person, but as many of the exact type as the spec needs.

Standardised capture

Contributors film their own real tasks in their own private, consented spaces, on a standardised rig we supply. Same setup, angles, and sensor config, so every clip is consistent and comparable.

Multi-stage QA

On-device checks catch bad framing, desync, and poor exposure before upload. Every uploaded clip is then reviewed against the rubric. Rejected clips are reworked or dropped. We target an acceptance rate of 95% or higher.
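To make the idea of a pre-upload check concrete, here is a minimal sketch of a gate covering framing, desync, and exposure, plus an acceptance-rate calculation. The metric names, units, and thresholds are assumptions chosen for illustration; they are not the checks or limits we run in production.

```python
from dataclasses import dataclass


@dataclass
class ClipMetrics:
    """Hypothetical per-clip measurements; names and units are assumed."""
    subject_in_frame_ratio: float  # fraction of frames with the subject visible
    stream_offset_ms: float        # timing offset between synced streams
    mean_luma: float               # average brightness on a 0-255 scale


def check_clip(m, min_in_frame=0.9, max_offset_ms=40.0, luma_range=(40.0, 220.0)):
    """Return the list of failed checks; an empty list means the clip passes.

    All thresholds are placeholder values for the sketch.
    """
    reasons = []
    if m.subject_in_frame_ratio < min_in_frame:
        reasons.append("framing")
    if abs(m.stream_offset_ms) > max_offset_ms:
        reasons.append("desync")
    if not luma_range[0] <= m.mean_luma <= luma_range[1]:
        reasons.append("exposure")
    return reasons


def acceptance_rate(clips):
    """Fraction of clips that pass every check (0.0 for an empty batch)."""
    if not clips:
        return 0.0
    accepted = sum(1 for c in clips if not check_clip(c))
    return accepted / len(clips)
```

Returning the failure reasons rather than a bare pass or fail lets a contributor see what to fix before re-recording.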

Rights & consent

Contributors sign the releases that make the data licensable. Everyone identifiable in a clip consents, or it does not ship. Consent and chain-of-title travel with the data.

Delivery

A loadable LeRobot dataset: video, per-frame motion and labels, with optional depth, hand pose, and object masks, plus consent and provenance. Delivered to your storage, or pulled via API.

Quality at scale

Standardised capture is how we keep datasets consistent.

Standardised capture means every contributor shoots to the same setup. That is the quality edge over phone-and-hope collection: a whole dataset that is consistent and comparable, not just the best few clips. On-device validation plus a multi-stage review ladder keeps acceptance high.

Loadable LeRobot dataset

lerobot-dataset/
  meta/                        # schema · episodes · tasks · diversity stats
  data/chunk-000/              # per-frame parquet: state, action, labels
  videos/chunk-000/            # the synced video streams
    observation.images.head    # egocentric RGB · 1080p60
    observation.depth.head     # depth maps (optional)
    observation.masks.head     # object masks (optional)
  consent/                     # signed release · chain-of-title
  processing.json              # every model · time · cost per hour
  DATASHEET · LICENSE          # datasheet, license, readme

Native LeRobot. RLDS and LeRobot v3 export on request.
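On receipt, a quick sanity check of the layout above might look like the sketch below. It uses only the Python standard library and checks the top-level entries named in the listing; the function name is hypothetical, and the datasheet and license files are left out because their exact file names are not given here.

```python
from pathlib import Path

# Top-level entries taken from the layout listing. Datasheet and license
# files are not checked because their exact names are not specified.
REQUIRED_DIRS = ("meta", "data", "videos", "consent")
REQUIRED_FILES = ("processing.json",)


def missing_entries(root):
    """Return the expected top-level entries that are absent under root."""
    root = Path(root)
    missing = [f"{d}/" for d in REQUIRED_DIRS if not (root / d).is_dir()]
    missing += [f for f in REQUIRED_FILES if not (root / f).is_file()]
    return missing
```

For example, `missing_entries("lerobot-dataset")` returns an empty list when all five entries are present, and otherwise names what is missing.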

Get the real-world data your robot needs.

Tell us the task, the person, and the place. We field it from a network of 800k contributors and deliver it to spec, cleared for commercial training, in about four weeks.
