Comparison to Dreamer V3

Dreamer 4 extends Dreamer V3's philosophy of domain-invariant learning into the offline-imagination era, unifying perception, dynamics, and control in a single scalable framework.

| Aspect | Dreamer V3 | Dreamer 4 |
| --- | --- | --- |
| Training mode | Online RL + imagination | Offline → imagination-only RL |
| World model | Recurrent RSSM + block GRU | Transformer-based latent dynamics + shortcut forcing |
| Data source | Environment interaction | Unlabeled videos + small labeled subset |
| Generalization | Fixed hyperparameters across 150 tasks | Cross-domain, cross-modality robustness from offline data |
| Compute scaling | Improves with model size & replay ratio | Linear scaling to real-time world simulation |
| Signature demo | Diamonds in Minecraft (from scratch) | Diamonds in Minecraft from offline data only |
| Core losses | Symlog, two-hot, KL balancing + free bits | Adds shortcut forcing + object-interaction fidelity terms |
| Goal | General RL without retuning | General AI agent training without real-world steps |
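The "imagination-only RL" distinction can be made concrete with a toy sketch: the policy is rolled forward entirely inside a learned dynamics model, so policy improvement consumes no real environment steps. Everything below is hypothetical scaffolding (a linear stand-in for the world model, not Dreamer 4's transformer over latents); it only illustrates the training loop's shape.

```python
import numpy as np

class ToyWorldModel:
    """Hypothetical stand-in for a learned latent dynamics model."""
    def __init__(self, dim=4, seed=0):
        rng = np.random.default_rng(seed)
        self.A = 0.9 * np.eye(dim)                     # latent transition
        self.B = 0.1 * rng.standard_normal((dim, 2))   # action effect
        self.w = rng.standard_normal(dim)              # reward head

    def step(self, z, a):
        z_next = self.A @ z + self.B @ a
        reward = float(self.w @ z_next)
        return z_next, reward

def imagine_rollout(model, policy, z0, horizon):
    """Roll the policy forward inside the model only -- no real env steps."""
    z, traj = z0, []
    for _ in range(horizon):
        a = policy(z)
        z, reward = model.step(z, a)
        traj.append((z, a, reward))
    return traj

# Usage: any callable mapping latent state to action works as a policy.
model = ToyWorldModel()
traj = imagine_rollout(model, lambda z: np.tanh(z[:2]), np.zeros(4), horizon=15)
```

In the offline setting, the same loop runs against a model trained purely from logged video, which is what removes the need for real-world interaction.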
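Dreamer V3's symlog and two-hot machinery is simple enough to sketch directly. This is a minimal NumPy version; the bin layout and clipping behavior are illustrative choices, not the exact reference implementation.

```python
import numpy as np

def symlog(x):
    # Squashes large magnitudes while staying roughly linear near zero.
    return np.sign(x) * np.log1p(np.abs(x))

def symexp(x):
    # Exact inverse of symlog.
    return np.sign(x) * np.expm1(np.abs(x))

def two_hot(value, bins):
    """Encode a scalar as probability mass on its two nearest bins."""
    value = float(np.clip(value, bins[0], bins[-1]))
    above = int(np.searchsorted(bins, value))
    probs = np.zeros(len(bins))
    if bins[above] == value:
        probs[above] = 1.0          # value lands exactly on a bin
    else:
        below = above - 1
        w = (value - bins[below]) / (bins[above] - bins[below])
        probs[below], probs[above] = 1.0 - w, w
    return probs
```

The two-hot target preserves the expected value: `probs @ bins` recovers the (clipped) scalar, which is why it pairs naturally with symlog-transformed returns as a regression-as-classification loss.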
