Motion Synthesis with Sparse and Flexible Keyjoint Control

Inwoo Hwang1, Jinseok Bae1,

Donggeun Lim1, Young Min Kim1

1Seoul National University

Abstract

Creating expressive character animations is labor-intensive, requiring intricate manual adjustments by animators across space and time. Previous works on controllable motion generation often rely on a predefined set of dense spatio-temporal specifications (e.g., dense pelvis trajectories with exact per-frame timing), limiting practicality for animators. To process high-level intent and intuitive control in diverse scenarios, we propose a practical controllable motion synthesis framework that respects sparse and flexible keyjoint signals. Our approach employs a decomposed diffusion-based framework that first synthesizes keyjoint movements from sparse input control signals and then generates full-body motion based on the completed keyjoint trajectories. The low-dimensional keyjoint movements easily adapt to various control signal types, such as end-effector positions for diverse goal-driven motion synthesis, and can incorporate functional constraints on a subset of keyjoints. Additionally, we introduce a time-agnostic control formulation that eliminates the need for frame-specific timing annotations and enhances control flexibility. The shared second stage then synthesizes natural whole-body motion that precisely satisfies the task requirements encoded in the dense keyjoint movements. We demonstrate the effectiveness of sparse and flexible keyjoint control through comprehensive experiments on diverse datasets and scenarios.
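A minimal sketch of the two-stage decomposition described above, in PyTorch-style code. The stage1/stage2 sampler interfaces and the NaN-as-unobserved convention are illustrative assumptions, not the released API.

    # Sketch of the decomposed pipeline (assumed interfaces, not the released API).
    import torch

    def synthesize(sparse_signals, text_emb, stage1, stage2):
        """sparse_signals: (T, J_key, 3) keyjoint controls, NaN where unobserved."""
        mask = ~torch.isnan(sparse_signals)              # which entries are constrained
        # Stage 1: densify sparse controls into complete keyjoint trajectories.
        keyjoints = stage1.sample(cond=sparse_signals.nan_to_num(),
                                  mask=mask, text=text_emb)      # (T, J_key, 3)
        # Stage 2: lift dense keyjoint trajectories to full-body motion.
        return stage2.sample(cond=keyjoints, text=text_emb)      # (T, J_all, D)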

Teaser image.

We enable a wide range of practical and controllable motion generation with high quality and precision. Our approach synthesizes natural human motion from explicit signals, including (a) dense signals with multiple joints, (b) sparse signals, and (c) goal-driven scenarios. Additionally, we generate motion from (d) implicit control signals defined via objective functions, including time-agnostic control.

Explicit Control

Dense Control Examples

From single- to multi-joint dense trajectories, we synthesize natural motion with high precision.

Dense Signal

a person walks in a curved line.

a person running to the right

a person walked in a clockwise circle.

jogging forward in medium pace.

he is leaning on something and cleaning it with a towel.

the person crouches and walks forward.

Sparse Control Examples

From highly sparse signals, we synthesize plausible motion with high accuracy. Synthesized intermediate keyjoint trajectories are visualized in sky blue.

Sparse Signal

a person moves sideways to the right and then sideways back to the left and then one more step to the right.

a person slowly walks backwards.

a person takes a huge leap forward.

a person walks forward with a limp.

a person starts walking with their right foot first and takes eleven steps forward.

a person takes two steps forward, then walks sideways three steps, then walks forward diagonally and to the left three steps.

Comparison with Baselines

Existing controllable motion synthesis methods often struggle with sparse control, whereas our approach maintains high performance even as the control signal becomes sparse.


MotionLCM

TLControl

Ours

MotionLCM

TLControl

Ours

Goal-Driven Scenarios

From highly sparse signals, namely an initial pose and a target end-effector goal, we synthesize goal-driven motion. We train a unified network for different tasks and demonstrate its performance across various goal-driven scenarios.
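As a sketch, such a goal-driven task can be packed into the same sparse keyjoint format consumed by the first stage; the shapes, joint indices, and NaN-as-unobserved convention below are illustrative assumptions.

    import torch

    # Hypothetical packing of a goal-driven control (initial pose + end-effector
    # goal) into the sparse keyjoint format; shapes and conventions are assumed.
    def make_goal_condition(init_keyjoints, goal_pos, goal_joint, num_frames):
        """init_keyjoints: (J_key, 3) frame-0 pose; goal_pos: (3,) target position."""
        cond = torch.full((num_frames, init_keyjoints.shape[0], 3), float('nan'))
        cond[0] = init_keyjoints           # constrain the full initial pose
        cond[-1, goal_joint] = goal_pos    # constrain the end-effector at the goal
        return cond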


Reaching Hand Target

Climbing with Rock Constraints

Sitting with Hand Control

Implicit Control

We define control signals as objective functions over a set of keyjoints, enabling more user-friendly, implicit control.

Head-Hand Touching

Thanks to the decomposed framework, our approach better satisfies objective constraints while preserving the realism of the full-body motion.
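For instance, the head-hand touching constraint can be written as a simple differentiable objective over the keyjoint trajectory; the joint indices and the exact form of the loss below are illustrative assumptions, not the paper's formulation.

    import torch

    HEAD, R_HAND = 0, 1  # assumed indices into the keyjoint dimension

    def touch_objective(keyjoints):
        """keyjoints: (T, J_key, 3). Scalar loss, minimal when the hand
        reaches the head at its closest frame (timing left free)."""
        dist = torch.linalg.norm(keyjoints[:, R_HAND] - keyjoints[:, HEAD], dim=-1)
        return dist.min()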


DNO

Ours w/o decomposed

Ours

Time-Agnostic Control

Without exact timing information, the model fails to accurately follow the trajectory. Our time-agnostic control enables trajectory control without the need for exact timesteps, allowing temporally flexible synthesis.
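One way to realize such timing-free control is a nearest-frame matching loss between the untimed waypoints and the generated path, sketched below; this is an illustrative formulation, not necessarily the paper's exact one.

    import torch

    def time_agnostic_loss(gen_path, waypoints):
        """gen_path: (T, 3) generated keyjoint positions; waypoints: (K, 3)
        control points without timestamps. Each waypoint is matched to its
        nearest generated frame, so the model chooses when to reach it."""
        d = torch.cdist(waypoints, gen_path)    # (K, T) pairwise distances
        return d.min(dim=1).values.mean()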


w/o time-agnostic

Ours

Our trajectory

Input (no timing information) & our trajectory

w/o time-agnostic

Ours

Our trajectory

Input (no timing information) & our trajectory

BibTeX


        @misc{hwang2025motionsynthesissparseflexible,
            title={Motion Synthesis with Sparse and Flexible Keyjoint Control}, 
            author={Inwoo Hwang and Jinseok Bae and Donggeun Lim and Young Min Kim},
            year={2025},
            eprint={2503.15557},
            archivePrefix={arXiv},
            primaryClass={cs.GR},
            url={https://arxiv.org/abs/2503.15557}, 
        }