MagicEdit: High-Fidelity Temporally Coherent Video Editing

ByteDance Inc.   *Equal Contribution

MagicEdit explicitly disentangles the learning of appearance and motion to achieve high-fidelity and temporally coherent video editing. It supports various editing applications, including video stylization, local editing, video-MagicMix and video outpainting.

Video stylization

Video stylization enables one to (1) transform the source video into a new video with a style of interest (e.g., realistic, cartoon), or (2) create a new scene with a different subject (e.g., dog → cat) and a different background (e.g., living room → beach).


Source video "a monkey, playing on the coast, sea" "a teddy bear, eating apple, green grassland with flowers"
Source video "a cute cat, sitting on the ground" "a labrador dog, sitting on the table"
Source video "a husky dog, lying on the sands" "a sea lion, lying on the ice, winter, snow"


Source video "a red car, moving on the road, autumn, maple leaves" "a red car, moving on the road, mountain, green grass and trees"
Source video "black and white biscuits, falling down sequentially" "bricks, falling down sequentially"
Source video "stools, moving on the railway track" "white cupcakes, moving on the table"


Source video "boats and buildings, floating in the space, stars" "boats, floating on the sea, villas on the coastal"
Source video "a submarine travelling in the sea, fish swimming" "motorbike travelling in the tunnel, with graffiti on the wall"
Source video "red train travelling, snow, winter" "train travelling, japanese, Cherry blossoms"


Source video "a pretty girl, pink dress, living room" Source video "a pretty girl, white dress, dark hair, flowers" Source video "a pretty girl, white singlet, dark pants, on the stage"
Source video "a pretty girl, yellow dress, beach, coconut trees" Source video "a pretty girl, red dress, ballroom" Source video "a pretty girl, white dress, dark hair"

Local editing

MagicEdit also allows users to make local modifications to the video while leaving other regions untouched (e.g., make the young lady wear glasses).

Source video "a handsome man" Source video "wearing sunglasses" Source video "a handsome man"
Source video "a senior lady" Source video "a young lady" Source video "a pretty girl"


Video-MagicMix

Similar to MagicMix, we can also mix two different concepts (e.g., rabbit and tiger) in the video domain to create a novel concept (e.g., a rabbit-like tiger).

Source video "+ tiger" "+ piglet"
Source video "+ tiger" "+ cat"

Video outpainting

MagicEdit also supports video outpainting without any re-training.

Source video Outpainted Source video Outpainted Source video Outpainted
"a handsome man, jogging, on the road, sunny" "a cute dog, garden, flowers" "a pretty girl, sunset"
"a pretty girl, grey t-shirt" "an eagle, sea" "a brown bear, forest"

Video outpainting with different ratios and prompts

Prompt = "a handsome man, jogging, on the road, trees, sunny" Prompt + "long pants"
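
To make "different ratios" concrete, the sketch below shows one common way to set up an outpainting canvas: centre each source frame on a larger canvas with the target aspect ratio and mark the surrounding border as the region to be generated. This page does not describe how MagicEdit prepares its inputs, so the helper names (`outpaint_canvas`, `outpaint_mask`) and the centred placement are assumptions made for illustration only.

```python
def outpaint_canvas(width, height, target_ratio):
    """Size a canvas with aspect ratio target_ratio (width / height) that
    contains a width x height frame, and centre the frame on it.

    Hypothetical helper, not part of MagicEdit.
    Returns (canvas_w, canvas_h, left, top).
    """
    if width <= 0 or height <= 0 or target_ratio <= 0:
        raise ValueError("width, height and target_ratio must be positive")

    if target_ratio >= width / height:
        # Target is wider than the source: keep the height, extend the width.
        canvas_h = height
        canvas_w = max(width, round(height * target_ratio))
    else:
        # Target is taller than the source: keep the width, extend the height.
        canvas_w = width
        canvas_h = max(height, round(width / target_ratio))

    left = (canvas_w - width) // 2
    top = (canvas_h - height) // 2
    return canvas_w, canvas_h, left, top


def outpaint_mask(width, height, target_ratio):
    """Build a binary mask over the canvas as a list of rows:
    0 = pixel copied from the source frame, 1 = pixel to be generated.
    """
    canvas_w, canvas_h, left, top = outpaint_canvas(width, height, target_ratio)
    return [
        [
            0 if (top <= y < top + height and left <= x < left + width) else 1
            for x in range(canvas_w)
        ]
        for y in range(canvas_h)
    ]
```

Because the frame size is fixed across a clip, the same canvas geometry and mask would be reused for every frame, so only the text prompt (for example, the base prompt with "long pants" appended) needs to change between runs.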


    author = {Liew, Jun Hao and Yan, Hanshu and Zhang, Jianfeng and Xu, Zhongcong and Feng, Jiashi},
    title = {MagicEdit: High-Fidelity and Temporally Coherent Video Editing},
    year = {2023}