OmniInsert

Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models

* Equal contribution, Corresponding author, Project lead

Intelligent Creation Lab, Bytedance

Research Paper GitHub

Demo Video

Various Video Insertion Results
Given any reference, OmniInsert seamlessly inserts the subjects into the original scenes, demonstrating robustness in various scenarios.
Video Insertion Comparisons
Compared with other methods, OmniInsert performs strongly on Subject-Scene Equilibrium and Insertion Harmonization.

Ethical Considerations

The reference images and videos used in these demo videos are sourced from the public domain or generated by models, and are intended solely to demonstrate the capabilities of this research. If you have any concerns, please contact us (li-xh21@tsinghua.org.cn) and we will remove the material promptly.

BibTeX

@misc{chen2025omniinsert,
  title={OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models},
  author={XXX},
  year={2025},
  eprint={2509.17627},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2509.17627},
}