Audit Quickstart
This guide starts from a LeRobot evaluation you already ran. WorldFlux does not run the model again; it reads the saved eval_info.json and turns it into an evidence package. The same audit tree also accepts openpi, lbm_eval, vla-eval, gr00t-n1.7, pi-0.7, embodied-gov-bench, and bridge-generated cosmos-predict audit inputs.
1. Prerequisites
You need:
- an existing LeRobot install
- an eval_info.json from a past LeRobot eval run
- a local WorldFlux install
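Before starting, it can help to confirm that the saved eval_info.json is present and parses cleanly. The sketch below is a minimal check and is not part of WorldFlux; the function name load_eval_info is hypothetical, and the only assumption about the file is that its top level is a JSON object.

```python
import json
from pathlib import Path


def load_eval_info(path):
    """Parse a saved eval_info.json and return it as a dict.

    Hypothetical helper: it assumes only that the file holds a JSON
    object at the top level and checks nothing about its keys.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"no eval_info.json at {p}")
    with p.open(encoding="utf-8") as fh:
        data = json.load(fh)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data
```

If the call returns without raising, the file is at least readable input for the later steps.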
2. Install WorldFlux
During beta, design partners install from a private checkout or private package index. Public PyPI installation will be documented after the public package release.
3. Create a claim package
The fastest path is the built-in OpenPI/LIBERO template via worldflux claim create, which writes claim_pkg/claim.json and claim_pkg/protocol.json.
If you need a paper-derived draft instead, scaffold it locally with worldflux claim from-paper-url.
4. Create protocol.json
Skip this step if you used worldflux claim create or worldflux claim from-paper-url; both commands already wrote protocol.json.
5. Run Audit
6. Inspect
7. Publish
Publishing takes either --cloud-run-id <cloud-run-uuid> together with
--confirm-public-share-upload, so WorldFlux can upload customer-approved
sanitized package bytes, or --evidence-package-artifact-id <artifact-uuid> for
an already uploaded Cloud evidence package artifact.
Production Sigstore publication also requires --sigstore-policy-config.
Self-sign publication requires --trusted-signer-config so Cloud can bind the
share to an explicitly trusted signer.
For messy already-extracted customer run folders, start with a private,
read-only import report:
8. Share
The local evidence package contains claim.json, protocol.json, evidence.json, evidence.md, audit_input.json, audit_provenance.json, episode_results.jsonl, raw_evidence_manifest.json, failure_evidence_index.jsonl, failure_replay_manifest.jsonl, and artifact_manifest.json.
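The file list above can be checked mechanically before a package is handed to anyone. This is a minimal sketch rather than a WorldFlux command: the function name missing_package_files is hypothetical, and it assumes the files sit directly in the package directory, which the guide does not state.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = (
    "claim.json",
    "protocol.json",
    "evidence.json",
    "evidence.md",
    "audit_input.json",
    "audit_provenance.json",
    "episode_results.jsonl",
    "raw_evidence_manifest.json",
    "failure_evidence_index.jsonl",
    "failure_replay_manifest.jsonl",
    "artifact_manifest.json",
)


def missing_package_files(package_dir):
    """Return the expected file names that are absent from package_dir.

    Assumption: the package is a flat directory (no subfolders).
    """
    root = Path(package_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every listed file is present; it says nothing about whether the contents are valid.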
audit_input.json is the normalized input WorldFlux audited. episode_results.jsonl stores per-episode summaries plus bounded metadata that adapters supplied. raw_evidence_manifest.json stores safe references to raw/source evidence and redacted export hashes; it does not automatically copy raw videos, traces, provider responses, or model outputs. failure_evidence_index.jsonl is a searchable seed for later failure graph ingestion, not a graph database. failure_replay_manifest.jsonl records replay hints such as task, seed, and input hashes when available, but it is not a replay runner.
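The three .jsonl files hold one JSON value per line, so they can be inspected with a few lines of standard-library Python. The sketch below assumes nothing about the record fields, which this guide does not specify; read_jsonl is a hypothetical helper name.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Return the records of a JSON Lines file as a list.

    Blank lines are skipped. No schema is assumed: each record is
    returned exactly as json.loads parsed it.
    """
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

For example, len(read_jsonl("episode_results.jsonl")) gives the number of per-episode summaries without touching any raw evidence.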
Hosted public shares are narrower than the local package. Cloud accepts only the signed package members claim.json, protocol.json, evidence.json, evidence.md, audit_input.json, audit_provenance.json, artifact_manifest.json, and the signature files required by the signing backend. The reviewer URL exposes a public-safe DTO: expiry and password status as computed fields; reviewer label, audience, approver, revocation owner, and retention policy from the approval record; display id, claim/protocol/result/scope/recommendation, verification status, package-derived deployment summary, missing evidence labels, and next falsification axes from the verified package; and sanitized artifact display name/type/size bucket/state. It does not publish raw logs, raw videos, checkpoints, provider responses, local paths, signed URLs, workspace controls, or private object keys. It also hides raw client_run_id, raw recipe/runtime, raw artifact paths, workspace/project/user ids, API key ids, and token hashes.
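The difference between the local package and the hosted member list can be made concrete with a small partition. This is an illustrative sketch, not Cloud's actual validation: split_for_hosted_share is a hypothetical name, and signature files are not modeled because their names depend on the signing backend, so they land in the second group for manual review.

```python
# Signed package members that the hosted share accepts, per the text above.
HOSTED_MEMBERS = frozenset({
    "claim.json",
    "protocol.json",
    "evidence.json",
    "evidence.md",
    "audit_input.json",
    "audit_provenance.json",
    "artifact_manifest.json",
})


def split_for_hosted_share(file_names):
    """Split file names into (accepted_members, needs_review).

    needs_review holds everything not on the hosted member list,
    including any signature files, whose names are backend-specific.
    """
    names = set(file_names)
    accepted = sorted(names & HOSTED_MEMBERS)
    needs_review = sorted(names - HOSTED_MEMBERS)
    return accepted, needs_review
```

Running it over a full local package puts episode_results.jsonl, raw_evidence_manifest.json, failure_evidence_index.jsonl, and failure_replay_manifest.jsonl in the second group, which matches the statement that hosted shares are narrower.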
Reliability metadata is separate from publishing. It is opt-in only, and when enabled it stores allowlisted labels derived from the verified public-safe summary, not customer model weights, raw folders, logs, or videos.