SERGI

CABALLER

About

Case Studies

Credits

Contact

Art Director, Avatars

2022-2025

Meta - Horizon Avatars

Meta Avatars: A Character System for 2 Billion Users

TLDR: Joined Meta in October 2022 after almost a decade at Disney Animation. The brief was a ground-up redesign of the avatar system carrying over a billion users across Facebook, Instagram, WhatsApp, Messenger, and Horizon. There was no approved visual direction when I arrived, and the 2D-concept-first process was not clicking at a tech company. I materialized the north star through six direct-to-3D maquettes, was promoted to Art Director as the scope formalized, then designed and prototyped the parametric systems underneath: a new neutral head, roughly 150 face parameters, a body system with ten regions and 250+ identity shapes after Aspirational Bodies, FACS expressions, and the ML pipeline that turns a selfie into an on-style avatar. Quality held across internal teams, ML pipelines, and vendor studios on three continents. Representation score landed at 3.8 at launch and 4.0 after Aspirational Bodies, against a 3.0 target. 72% user preference. 300%+ Messenger usage spike.

DETAILS

Studio: Meta Reality Labs
Project: Horizon Avatars (Style 2.0, launched at Meta Connect, plus Aspirational Bodies follow-up)
Role: Art Director, Avatars (promoted from Character Art Lead)
Scale: 1B+ avatars across Facebook, Instagram, WhatsApp, Messenger, and Horizon
Tools: Maya, ZBrush, Python, USD
Year: 2022-2025
Partners: Visual development team, ML researchers, tech art and engineering, product and UXR, and vendor studios on three continents
Shipped: Style 2.0 visual redesign across every Meta platform, Style 2.0 neutral head (LODprime), face parametric system (~150 parameters), body parametric system (10 regions, 250+ identity shapes after Aspirational Bodies), FACS expression layer, ML training pipeline that generates avatars from selfie input, style frameworks scaling consistent quality across internal teams, ML pipelines, and vendor studios on three continents, 16 new body presets across masculine and feminine archetypes, and face depth controls addressing structural representation gaps
Results: Representation score 3.8 at launch, reaching 4.0 after Aspirational Bodies, against a 3.0 target. 72% user preference in a qualitative study. 300%+ Messenger usage spike after rollout.
Links: [Meta Connect 2024 launch], [Meta Avatars]

THE CONTEXT

Meta's avatars are the digital identity layer across Facebook, Instagram, WhatsApp, Messenger, and Horizon. Over a billion avatars in active use. When someone creates an avatar, that character becomes how they show up across every Meta product. Their face in a video call, their presence in a Story reaction, their body in Horizon, their identity in chat, an emote in a thread.

I joined in October 2022 as Character Art Lead, coming out of almost a decade at Disney Animation. The goal was simple to state and large to execute: define what the next generation of Meta avatars should look like, build the systems that would produce them at platform scale, and set the quality standard that every team touching avatars (internal, vendor, ML) would work against. There was no defined visual direction for what came next, and no framework for scaling quality once a direction was chosen. I defined the visual strategy, built the initial prototypes, and was promoted to Art Director as the scope formalized.

Directing a character on a film and directing a character at platform scale are not the same job, and not the same challenge. On a film, the character is one named individual (Mirabel, Asha, Anna) whose design is tailored, reviewed, and crafted over months until it lands. At Meta, the character is every user, or as many characters as a user wants: a billion people creating their own self-representation, no two alike, all generated through a parametric system that has to hold up without a human reviewer in the loop for each one. The work shifts from crafting a specific designed face to designing a system whose rules guarantee that any generated face stays on style and above the established quality bar.

THE CHALLENGES

The existing avatar style had a fundamental problem: it was a low-fidelity cartoon with childlike proportions and a limited identity system, and it didn't resonate with users. The pain point became public when Mark Zuckerberg posted his own avatar in front of the Eiffel Tower to celebrate Horizon Worlds launching in Europe. The mockery was global and instant. The reaction to the post was the visible symptom of a problem the team had been measuring internally for months: users had reported low acceptance of Meta avatars.

Underneath the style issue was a structural one. The body system couldn't produce genuinely diverse body types. The face system couldn't represent users from underrepresented groups with real fidelity, especially in face structure, skin tones, and hair. Every surface symptom traced back to the same root: a style that didn't resonate, and a limited parametric architecture that wasn't built to carry the range of identities it was being asked to represent.

The platform needed a ground-up redesign of both the visual style and the parametric systems underneath it. Not a refresh. A rebuild.

THE APPROACH

Building the visual foundation

There was no approved execution path of any kind: no 2D, no 3D, no direction to build from. There was only the understanding that the style needed to mature and put distance between itself and the Eiffel Tower avatar, and that the org needed a materialized direction to build toward.

The concept-sketch-and-moodboard approach had already been tried for a while without reaching approval. The translation from 2D designs to 3D kept breaking down, especially given the nature of the company. Studios with an art foundation are used to that process: 2D concept art, then 3D visual development, with the awareness that the back and forth not only materializes the concept but also evolves it organically from the initial 2D. A tech company, and the leadership at Meta, is less familiar with that process, so the lack of a 1-to-1 translation came as a surprise.

This was the core reason the avatars team brought me in. They were looking for someone who could drive visual exploration without needing a 2D-established direction first, and creative leadership had seen references of the Style 2.0 direction they wanted to pursue in my personal portfolio. The trust to run ahead of a traditional visdev process, at a tech company with no in-house precedent for it, is the kind of trust I do not forget.

My approach was direct-to-3D visual development sculpts, delivered in record time, that could land the visual north star without the bumps of the back and forth and without disrupting set expectations. I brought the film and game studio practice of fully realized 3D maquettes to a platform product, building six maquettes from scratch, each a different persona spanning gender, ethnicity, age, and background. These weren't just visual development. They were the system specification. Every downstream decision, from parametric ranges to ML training targets to vendor quality bars, was built against these references. If a future output didn't match what one of those six maquettes would look like in that configuration, the system wasn't working.

Once the maquette work concluded, the style crystallized around five principles: strong first visual read, graphic shapes, planarity, visual tension, and careful attention to proportions, distribution, and spacing. The balance toward stylization was deliberately uneven, elevating the new style to a language more mature and more sophisticated than the previous attempt, with more anatomy and naturalistic shapes, but staying far from uncanny territory. Clearly stylized, not attempting photorealism, but grounded enough in real physiology that a user could recognize themselves in the result. Stylized enough to live in a product at platform scale. Grounded enough to feel personal.

Architecting the parametric systems

As the style solidified, my role formalized into Art Director. The scope expanded from defining the visual language to owning the architecture of the parametric systems generating every user's face and body. Five pieces had to land together for the system to function: a neutral/average body and head to anchor everything and become the canonical foundation for the system, a face parametric system, a body parametric system, a FACS expression layer, and an ML training pipeline that could learn a user's face identity from a picture and produce an automatic avatar interpretation that belonged to the style. Each one pushed the next, and all five had to hold up to the highest possible quality bar.

The neutral head. Before any parametric system could exist, the style needed a base mesh to run on. I conceived and built the Style 2.0 base mesh, referred to internally as LODprime (on the assumption that other LODs would run in engine, while this one was the production-layer foundation). LODprime is the foundational topology that every rig, parametric nuance, blendshape, and animation runs on. It had to be neutral but malleable enough to represent billions of users across the full range of ethnicities, ages, and shapes without breaking, and conservative enough that rigging and animation could trust its behavior under stylized deformation while the identity layer stayed above the quality bar. This is the piece every other system quietly depends on. If the topology is wrong (too dense, too noisy with artifacts, too many poles, the wrong flow, the wrong density for its contexts), everything downstream accumulates problems and fails. Getting it right was one of the highest-leverage decisions in the whole system at that early stage.

Face parametrics. I decomposed the face identity layer into roughly 150 independently controllable parameters, breaking the face down into micro shapes with minimal individual influence but enormous combined plasticity. The design choice was deliberately granular, to give us the nuance needed to represent billions of users. The parametric space is vast and goes deep into face morphology. Giving full control over that space carried a risk: the system could meet high quality and representation bars, but in the hands of less experienced users, or even artists, it could produce undesirable, out-of-style outputs. We needed guardrails for public use.

We made two decisions. On one side, users get a curated, fine-grained control surface that leverages the whole system underneath. On the other, the ML models get access to the full parameter space, large enough to represent real identity without collapsing into presets. The initial parameter list exposed to the user went through multiple stress-test rounds against real identities, validated by the team, cross-functional (XFN) partners, and UXR studies, and each round revealed gaps the design hadn't anticipated. After those rounds, the user-facing controls in the editor were formalized.
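
One way to picture the split between the curated user controls and the full parameter space is a mapping in which each editor slider drives several underlying micro-shape weights, each clamped to an on-style range. The sketch below is illustrative only and is not the shipped system: the slider name, parameter names, gains, and ranges are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParamRange:
    """On-style limits for one underlying face parameter (hypothetical values)."""
    low: float
    high: float

    def clamp(self, value: float) -> float:
        return max(self.low, min(self.high, value))


# Hypothetical three-parameter subset standing in for the full parameter space.
FULL_SPACE = {
    "jaw_width": ParamRange(-1.0, 1.0),
    "chin_depth": ParamRange(-0.5, 0.8),
    "cheekbone_height": ParamRange(-0.6, 0.6),
}

# Hypothetical user-facing slider: one control drives several micro shapes,
# each scaled by a gain.
USER_CONTROLS = {
    "face_depth": {"chin_depth": 1.0, "cheekbone_height": 0.5},
}


def resolve(slider_values: dict[str, float]) -> dict[str, float]:
    """Expand slider values into clamped weights over the full parameter space."""
    weights = {name: 0.0 for name in FULL_SPACE}
    for slider, value in slider_values.items():
        for param, gain in USER_CONTROLS[slider].items():
            weights[param] += value * gain
    # Clamping keeps every output inside the assumed on-style range.
    return {name: FULL_SPACE[name].clamp(w) for name, w in weights.items()}
```

In this sketch a user can only reach the parameters the curated sliders touch, while a model writing weights directly could address every entry in FULL_SPACE and still pass through the same clamp.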

Additionally, keeping each of the billions of editor-producible permutations on style required an internal evaluation system, which I led. That system became the backbone for editor preset creation, dynamic parametric space driven by preset selection, and the annotator guidance framework for the synthetic data that trained the ML model translating user selfies into in-style avatars. Over 30,000 annotations were created across multiple vendors in different countries. I led that effort from the AD side, providing the system framework as well as the knowledge transfer through lectures, documentation, and countless Zoom review sessions.

Body parametrics. Composition axes, shape decomposition, and skeletal controls working together to produce the full range of body types within the style language. Built in Maya, validated through the same stress-testing methodology used for the face, handed off to engineering for integration. The body system is where the Aspirational Bodies work (below) later opened up significantly.

FACS expressions. The full set of art-directed facial expression blendshapes, establishing how stylized expression should behave in the new style. Expression research started on the maquettes, where range and exaggeration could be tested outside of rig constraints. Once the emotional vocabulary was settled, those findings were brought onto the neutral head and built out as the shipping blendshape library.

ML training pipeline. I served as primary quality director for the ML model that translates a selfie into an avatar. This is the piece that turned the system into a product: a user takes a picture, the model returns an avatar, and that avatar has to feel both like them and like it belongs to the style. The evaluation framework established for the face parametrics fed the training; on top of it, I ran structured style sessions to align internal art directors on the same bar, and kept tuning the feedback loop as the model iterated. Human-in-the-loop ML direction at scale: define the target, build the process to get there, make the standard reproducible so the model can keep improving without me being the bottleneck on every decision.

Aspirational Bodies

After Style 2.0 launched, representation scores climbed substantially, especially on faces. The body system, though, hit a ceiling. Users could build a wider range of bodies than before, but the parametric space still struggled with specific physiques, especially athletic and muscular frames that the original architecture wasn't built to resolve cleanly.

Instead of patching the existing system, I designed and prototyped a completely new iteration. The previous system was limited from the beginning due to on-device performance concerns at the time, concerns that lifted during the path to 2.0 delivery. That gave me room for a completely new architecture: two main composition axes (thickness and fitness) with gender-specific shapes, ten independently controllable body regions, and extended skeletal capabilities to support the broader range. I built the prototype in Maya with a custom data-centric Python toolkit (procedural connections, pose switching, preset management, import and export) so the full iteration loop could run inside a single tool rather than across a chain of handoffs. This was crucial for setting the target of the new body parametric iterations and for defining the new body preset configurations users would see. Production happened in a 3.5-week sprint with tech art and engineering, which is fast for a system change at that depth, and only possible because the underlying body parametric architecture was already solid and the partner teams moved in lockstep with the prototype. Following the same process we had run early with faces (internal evaluations, UXR studies), we landed 16 new body presets across masculine and feminine archetypes.
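
As a rough illustration of the preset management and import/export side of such a toolkit, the sketch below models a body preset as two composition axes plus per-region values and round-trips it through JSON. It is a standalone assumption-laden example, not the Maya toolkit itself: the region names, the 0 to 1 axis range, and the BodyPreset class are hypothetical, and only the two axes and the count of ten regions come from the description above.

```python
import json
from dataclasses import asdict, dataclass, field

# Hypothetical region names; the text only states that there are ten regions.
REGIONS = (
    "neck", "shoulders", "chest", "arms", "waist",
    "hips", "glutes", "thighs", "calves", "feet",
)


@dataclass
class BodyPreset:
    name: str
    thickness: float = 0.0  # composition axis, assumed range 0..1
    fitness: float = 0.0    # composition axis, assumed range 0..1
    regions: dict[str, float] = field(
        default_factory=lambda: {r: 0.0 for r in REGIONS}
    )

    def validate(self) -> None:
        """Reject unknown regions and out-of-range axis values."""
        unknown = set(self.regions) - set(REGIONS)
        if unknown:
            raise ValueError(f"unknown regions: {sorted(unknown)}")
        for axis in (self.thickness, self.fitness):
            if not 0.0 <= axis <= 1.0:
                raise ValueError("axis value out of range")

    def to_json(self) -> str:
        """Export the preset as a JSON string after validating it."""
        self.validate()
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "BodyPreset":
        """Import a preset from JSON and validate it before returning."""
        preset = cls(**json.loads(text))
        preset.validate()
        return preset
```

Keeping presets as plain data with validation at the import and export boundaries is one way a single tool can own the whole iteration loop without depending on scene state.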

Pushing representation

Even with Style 2.0 live, some users still couldn't find themselves in the system. Representation scores were strong overall, but certain demographic groups consistently scored lower than the average, and the editor couldn't represent specific facial morphologies beyond surface-level markers. The same limitations applied to age representation. The real nuances of identity live in face structure, particularly face depth and other bone-structure indicators, not in the ethnicity-coded presets that most avatar systems fall back on.

The representation capability already existed in the underlying full parametric space; it just wasn't exposed to users. The solution was to expose and bundle additional sliders in the editor, rather than add more static preset categories, which would have created a tiered approach competing with a capability the system already had. My artistic eye, craft skills, and in-depth knowledge of anatomy let me identify where the system was failing structurally and propose a set of new controls addressing structural facial diversity. The controls shipped and pulled representation scores up in the groups the previous system had underserved.

Shipping, and what it actually means

Style 2.0 launched at Meta Connect and rolled out across every Meta platform. Over a billion avatars migrated to the new style during the rollout. The key metrics cleared their targets:

Representation score: target was 3.0 on a 5-point scale. Landed at 3.8 at launch, reaching 4.0 after the Aspirational Bodies update.
User preference: 72% preferred the new style in a qualitative study.
Messenger usage: 300%+ spike after rollout.

What shipped was the full stack behind those numbers. A complete visual redesign live across every Meta platform. The Style 2.0 neutral head (LODprime) as the foundational topology. A face parametric system carrying roughly 150 parameters. A body parametric system with ten regions and 250+ identity shapes after the Aspirational Bodies work. A FACS expression layer. An ML training pipeline that turns a selfie into an avatar that belongs to the style. Face depth controls addressing the structural representation gaps. And the style frameworks that kept visual quality consistent across internal teams, ML pipelines, and vendor studios on three continents without needing a single reviewer to look at every output.

That last piece is the one I am most proud of. Traditional art direction depends on one person reviewing every output, and that model works on a film, where the total asset and shot count is in the hundreds or thousands and every frame passes through a human review chain. It breaks immediately at platform scale, where the system produces novel outputs continuously across internal teams, ML pipelines, and vendor studios on multiple continents. The structured style framework made quality measurable and reproducible. It translated artistic intuition into language that engineers, ML researchers, and vendor teams could follow, which meant style decisions could propagate through the system without routing every call back through me.

None of this ships without strong partners: the visual development team who built against the maquettes, the tech art and engineering teams who turned my initial prototypes into shippable systems, the ML researchers who trained and retrained the pipeline against the quality bar we had set together, the product and UXR partners who kept us grounded in what users actually needed, and the vendor studios on three continents who carried the style framework into production at scale. The character on screen is the visible part of this work. The system underneath is everybody's.

A billion avatars sounds like a number. It is also a billion small moments of someone looking at a screen and seeing themselves, or not, in the character that represents them to their family, their friends, their coworkers, their communities. That is what platform-scale character work actually is. A responsibility, a craft problem, and the most meaningful extension of twenty years of character work that I could have asked for.

Additional work case studies

3D Visual Development & Character Modeling Supervisor

Disney - Wish

Character Modeling Supervisor on Disney's 100th anniversary feature. Built a Python autotagging tool that compressed months of manual cataloging into a week, then used the reclaimed time to develop the 3D interpretation of Asha. Learn more…

Founder, Designer, Developer

MeshSynergy

Maya/Python tools for character workflows: mesh data handling, crease management, semantic landmarks, and a base mesh topology built for production deformation and scalability. Tested against real rigs, real pipelines. From artist to artist. Learn more…

Follow Me

Terms & Conditions
