Embodied AI Characters

Realistic 3D digital humans with personal likeness, emotional expression, and real-time conversation — running natively in AR.

A continuing exploration in embodied AI: 3D characters built to feel real, respond with emotion, and converse in real time on both desktop and standalone AR hardware.

This work extends character systems I've built in the past and applies them to a more demanding question: how close can a digital character get to feeling like a present, responsive, real person in mixed reality?

The result is a workflow that can quickly generate a roster of distinct, individually believable characters — each with their own face, their own presence, and the full range of facial movement needed for emotional and verbal expression.

Realistic characters from a single photo

Each character can take on the facial likeness of any person from a single photograph, while preserving full expression and speech capability.

Emotionally responsive

Characters respond emotionally — joy, sadness, surprise, anger, fear, disgust — with smooth transitions, micro-expressions, and natural idle behavior that keeps them feeling alive between reactions.

The expression system runs continuously alongside conversation, so what a character is saying and how they feel about it move together. The character reads less like an avatar and more like a presence.
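As a rough illustration of what "smooth transitions" can mean in practice, the sketch below eases a set of per-emotion weights toward a target each frame using frame-rate-independent exponential smoothing. It is not the project's actual code: the `EmotionBlender` class, the `half_life` parameter, and the idea of driving the face from one weight per emotion are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# The six emotions named in the text; one blend weight per emotion (assumed design).
EMOTIONS = ("joy", "sadness", "surprise", "anger", "fear", "disgust")


@dataclass
class EmotionBlender:
    """Hypothetical helper that eases emotion weights toward a target."""

    # Seconds for a weight to close half the gap to its target (assumed value).
    half_life: float = 0.25
    weights: dict = field(default_factory=lambda: {e: 0.0 for e in EMOTIONS})
    targets: dict = field(default_factory=lambda: {e: 0.0 for e in EMOTIONS})

    def set_emotion(self, name: str, intensity: float = 1.0) -> None:
        """Make `name` the only target emotion, clamped to [0, 1]."""
        if name not in self.targets:
            raise ValueError(f"unknown emotion: {name}")
        for e in self.targets:
            self.targets[e] = 0.0
        self.targets[name] = max(0.0, min(1.0, intensity))

    def update(self, dt: float) -> dict:
        """Advance by `dt` seconds and return a copy of the current weights."""
        # Deriving the blend factor from dt keeps the motion the same
        # whether the frame rate is high or low.
        alpha = 1.0 - 0.5 ** (dt / self.half_life)
        for e in EMOTIONS:
            self.weights[e] += (self.targets[e] - self.weights[e]) * alpha
        return dict(self.weights)


blender = EmotionBlender(half_life=0.25)
blender.set_emotion("joy")
frame = blender.update(1 / 72)  # one frame at an assumed 72 Hz refresh rate
```

Because the old emotion fades out while the new one fades in, switching from joy to sadness passes through a blended state rather than snapping. Micro-expressions and idle behavior could be layered on top as small, short-lived offsets to the same weights, though that is outside this sketch.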

Real-time AR

The full system runs natively on Quest 3 — no PC tether, no cloud streaming. The user puts the headset on, sees the character in their physical space, and has a conversation.

Performance, conversational latency, and rendering quality are all tuned to hold together at headset frame rates. The result is a 3D character that feels present in the same room as the user.

Capture and replay

Conversations can be recorded as full performances — voice, expressions, gestures, and behavioral state — and replayed deterministically later.

This was originally built as a research and debugging tool, but it turned out to be just as useful for content production and training: capture a great interaction once, replay it with different cameras, lighting, or characters, and experience it in headset or render the result as needed.
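One way to get the deterministic replay described above is to store every change as a timestamped event and drive playback from those timestamps rather than from the wall clock. The sketch below shows that idea only; the `Event`, `Recorder`, and `replay` names, the channel labels, and the JSON storage format are assumptions for illustration, not the project's actual format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Event:
    """One recorded change (hypothetical schema)."""

    t: float       # seconds since the start of the recording
    channel: str   # e.g. "voice", "expression", "gesture", "state"
    payload: dict  # channel-specific data


class Recorder:
    """Collects events during a live conversation."""

    def __init__(self) -> None:
        self.events: list[Event] = []

    def record(self, t: float, channel: str, payload: dict) -> None:
        self.events.append(Event(t, channel, payload))

    def dumps(self) -> str:
        """Serialize the whole performance to a JSON string."""
        return json.dumps([asdict(e) for e in self.events])


def loads(text: str) -> list[Event]:
    """Rebuild the event list from a JSON string produced by `dumps`."""
    return [Event(**d) for d in json.loads(text)]


def replay(events: list[Event], frame_times: list[float]):
    """Yield (frame_time, events_due) for each playback frame.

    Playback depends only on recorded timestamps and the given frame
    times, so the same inputs always produce the same sequence.
    """
    # sorted() is stable, so events with equal timestamps keep recorded order.
    ordered = sorted(events, key=lambda e: e.t)
    i = 0
    for ft in frame_times:
        due = []
        while i < len(ordered) and ordered[i].t <= ft:
            due.append(ordered[i])
            i += 1
        yield ft, due


rec = Recorder()
rec.record(0.10, "expression", {"emotion": "joy", "intensity": 0.8})
rec.record(0.40, "voice", {"clip": "line_01"})
rec.record(0.90, "gesture", {"name": "nod"})

saved = rec.dumps()
frames = list(replay(loads(saved), [0.0, 0.5, 1.0]))
```

Since the recording holds behavior rather than rendered pixels, the same event stream can in principle be applied to a different character, camera, or lighting setup, which matches the reuse described above.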