Haoran MO | 莫浩然 's Papers

Haoran MO | 莫浩然

Creative Intelligence and Synergy Lab

Computational Media and Arts (CMA), Information Hub

The Hong Kong University of Science and Technology (Guangzhou)

Guangzhou, China

Email: haoranmo (at) hkust-gz.edu.cn / mohaor (at) mail2.sysu.edu.cn

Github Google Scholar Resume

All Publications

'#' indicates equal contribution. '*' indicates corresponding author.

2025

DoodleAssist: Progressive Interactive Line Art Generation with Latent Distribution Alignment

Haoran Mo, Yulin Shen, Edgar Simo-Serra and Zeyu Wang

IEEE Transactions on Visualization and Computer Graphics (TVCG 2025) (CCF-A)

Project Page Paper Supplementary Code Abstract Bibtex

Creating high-quality line art in a fast and controlled manner plays a crucial role in anime production and concept design. We present DoodleAssist, an interactive and progressive line art generation system controlled by sketches and prompts, which helps both experts and novices concretize their design intentions or explore possibilities. Built upon a controllable diffusion model, our system performs progressive generation based on the last generated line art, synthesizing regions corresponding to drawn or modified strokes while keeping the remaining ones unchanged. To facilitate this process, we propose a latent distribution alignment mechanism to enhance the transition between the two regions and the seamlessness of their blending, thereby alleviating issues of region incoherence and line discontinuity. An interactive user interface is built to allow the convenient creation of line art through sketches and prompts. Qualitative and quantitative comparisons against existing approaches and an in-depth user study demonstrate the effectiveness and usability of our system. Our system can benefit various applications such as anime concept design, drawing assistant, and creativity support for children.

@article{mo2025doodleassist,
  title   = {DoodleAssist: Progressive Interactive Line Art Generation with Latent Distribution Alignment},
  author  = {Mo, Haoran and Shen, Yulin and Simo-Serra, Edgar and Wang, Zeyu},
  journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)},
  year    = {2025}
}
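
The abstract above describes regenerating only the regions touched by drawn or modified strokes while keeping the rest of the previous line art unchanged. The following is a minimal sketch of that region-preserving idea as plain mask compositing on pixel grids. It is not the paper's latent distribution alignment mechanism, and the names `composite`, `previous`, `generated`, and `mask` are hypothetical.

```python
def composite(previous, generated, mask):
    """Take `generated` where mask is 1 and `previous` everywhere else."""
    return [
        [g if m else p for p, g, m in zip(prev_row, gen_row, mask_row)]
        for prev_row, gen_row, mask_row in zip(previous, generated, mask)
    ]


# Toy 2x3 "images": only the masked pixels are replaced.
previous = [[0, 0, 0], [1, 1, 1]]
generated = [[9, 9, 9], [9, 9, 9]]
mask = [[0, 1, 0], [0, 0, 1]]
result = composite(previous, generated, mask)
```

A hard composite like this can leave visible seams at the mask boundary, which is the kind of region incoherence and line discontinuity the abstract says the alignment mechanism is meant to alleviate.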

DiFusion: Flexible Stylized Motion Generation Using Digest-and-Fusion Scheme

Yatian Wang, Haoran Mo and Chengying Gao*

IEEE Transactions on Visualization and Computer Graphics (TVCG 2025) (CCF-A)

Paper Abstract Bibtex

To address the issue of style expression in existing text-driven human motion synthesis methods, we propose DiFusion, a framework for diversely stylized motion generation. It offers flexible control of content through text and of style through multiple modalities, i.e., textual labels or motion sequences. Our approach employs a dual-condition motion latent diffusion model, enabling independent control of content and style through flexible input modalities. To tackle the issue of imbalanced complexity between the text-motion and style-motion datasets, we propose the Digest-and-Fusion training scheme, which digests domain-specific knowledge from both datasets and then adaptively fuses it in a compatible manner. Comprehensive evaluations demonstrate the effectiveness of our method and its superiority over existing approaches in terms of content alignment, style expressiveness, realism, and diversity. Additionally, our approach can be extended to practical applications, such as motion style interpolation.

@article{wang2025difusion,
  title   = {DiFusion: Flexible Stylized Motion Generation Using Digest-and-Fusion Scheme},
  author  = {Wang, Yatian and Mo, Haoran and Gao, Chengying},
  journal = {IEEE Transactions on Visualization and Computer Graphics (TVCG)},
  year    = {2025}
}

2024

Joint Stroke Tracing and Correspondence for 2D Animation

Haoran Mo, Chengying Gao* and Ruomei Wang

ACM Transactions on Graphics (Presented at SIGGRAPH 2024) (CCF-A)

Project Page Paper Supplementary Code Abstract Bibtex

To alleviate human labor in redrawing keyframes with ordered vector strokes for automatic inbetweening, we propose, for the first time, a joint stroke tracing and correspondence approach. Given consecutive raster keyframes along with a single vector image of the starting frame as guidance, the approach generates vector drawings for the remaining keyframes while ensuring one-to-one stroke correspondence. Our framework trained on clean line drawings generalizes to rough sketches, and the generated results can be imported into inbetweening systems to produce inbetween sequences. Hence, the method is compatible with the standard 2D animation workflow. An adaptive spatial transformation module (ASTM) is introduced to handle non-rigid motions and stroke distortion. We collect a dataset for training, with 10k+ pairs of raster frames and their vector drawings with stroke correspondence. Comprehensive validations on real clean and rough animated frames demonstrate the effectiveness of our method and its superiority over existing methods.

@article{mo2024joint,
  title   = {Joint Stroke Tracing and Correspondence for 2D Animation},
  author  = {Mo, Haoran and Gao, Chengying and Wang, Ruomei},
  journal = {ACM Transactions on Graphics (TOG)},
  year    = {2024}
}

Text-based Vector Sketch Editing with Image Editing Diffusion Prior

Haoran Mo, Xusheng Lin, Chengying Gao* and Ruomei Wang

IEEE International Conference on Multimedia & Expo (ICME 2024) (CCF-B)

Paper Supplementary Code Abstract Bibtex

We present a framework for text-based vector sketch editing to improve the efficiency of graphic design. The key idea behind the approach is to transfer the prior information from raster-level diffusion models, especially those from image editing methods, into the vector sketch-oriented task. The framework presents three editing modes and allows iterative editing. To meet the editing requirement of modifying the intended parts only while avoiding changing the other strokes, we introduce a stroke-level local editing scheme that automatically produces an editing mask reflecting locally editable regions and modifies strokes within the regions only. Comparisons with existing methods demonstrate the superiority of our approach.

@inproceedings{mo2024text,
  title={Text-based Vector Sketch Editing with Image Editing Diffusion Prior},
  author={Mo, Haoran and Lin, Xusheng and Gao, Chengying and Wang, Ruomei},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}

Video-Driven Sketch Animation via Cyclic Reconstruction Mechanism

Zhuo Xie, Haoran Mo and Chengying Gao*

IEEE International Conference on Multimedia & Expo (ICME 2024) (CCF-B)

Paper Abstract Bibtex

Considering the time-consuming manual workflow in 2D sketch animation production, we present an automatic solution that uses videos as references to animate static sketch images. This includes motion extraction from the videos and injection into the sketches to produce animated sketch sequences in which appearance properties from the source sketches should be preserved. To reduce blurry artifacts caused by complex motions and maintain stroke line continuity, we propose to incorporate inner masks of the sketches as explicit guidance to indicate inner regions and ensure component integrity. Moreover, to bridge the domain gap between the video frames and the sketches when modelling the motions, we introduce a cyclic reconstruction mechanism to increase compatibility with different domains and improve motion consistency between the sketch animation and the driving video.

@inproceedings{xie2024video,
  title={Video-Driven Sketch Animation via Cyclic Reconstruction Mechanism},
  author={Xie, Zhuo and Mo, Haoran and Gao, Chengying},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}

Controllable Anime Image Editing via Probability of Attribute Tags

Zhenghao Song, Haoran Mo and Chengying Gao*

Computer Graphics Forum (Pacific Graphics 2024) (CCF-B)

Paper Code Abstract Bibtex

Editing anime images via probabilities of attribute tags allows controlling the degree of the manipulation in an intuitive and convenient manner. Existing methods fall short in the progressive modification and preservation of unintended regions in the input image. We propose a controllable anime image editing framework based on adjusting the tag probabilities, in which a probability encoding network (PEN) is developed to encode the probabilities into features that capture the continuous characteristics of the probabilities. Thus, the encoded features are able to direct the generative process of a pre-trained diffusion model and facilitate the linear manipulation. We also introduce a local editing module that automatically identifies the intended regions and constrains the edits to those regions only, leaving the others unchanged. Comprehensive comparisons with existing methods indicate the effectiveness of our framework in both one-shot and linear editing modes. Results in additional applications further demonstrate the generalization ability of our approach.

@inproceedings{song2024controllable,
  title={Controllable Anime Image Editing via Probability of Attribute Tags},
  author={Song, Zhenghao and Mo, Haoran and Gao, Chengying},
  booktitle={Pacific Graphics},
  year={2024}
}

2023

Controllable Garment Image Synthesis Integrated with Frequency Domain Features

Xinru Liang, Haoran Mo and Chengying Gao*

Computer Graphics Forum (Pacific Graphics 2023) (CCF-B)

Paper Abstract Bibtex

Synthesizing garment images from sketches and textures conveniently previews realistic visual effects in the design phase, which greatly increases the efficiency of fashion design. Existing methods that synthesize garment images from a sketch and a texture tend to fail on complex textures, especially those with periodic patterns. We propose a controllable garment image synthesis framework that takes as inputs an outline sketch and a texture patch and generates garment images with complicated and diverse texture patterns. To improve the performance of global texture expansion, we exploit frequency domain features in the generative process, which are obtained from a Fast Fourier Transform (FFT) and can represent the periodic information of the patterns. We also introduce a perceptual loss in the frequency domain to measure the similarity of two texture pattern patches in terms of their intrinsic periodicity and regularity. Comparisons with existing approaches and sufficient ablation studies demonstrate the effectiveness of our method, which is capable of synthesizing impressive garment images with diverse texture patterns while guaranteeing proper texture expansion and pattern consistency.

@inproceedings{liang2023controllable,
  title={Controllable Garment Image Synthesis Integrated with Frequency Domain Features},
  author={Liang, Xinru and Mo, Haoran and Gao, Chengying},
  booktitle={Pacific Graphics},
  year={2023}
}
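
The abstract above relies on the fact that a Fourier transform exposes the periodicity of a texture pattern. The following is a minimal one-dimensional illustration of that fact, using a naive discrete Fourier transform from the standard library in place of an FFT (an FFT computes the same transform, only faster). It is not the paper's feature extractor or its frequency-domain loss, and the helper names `dft_magnitudes` and `dominant_frequency` are hypothetical.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin of a real 1D signal (naive O(n^2) version)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def dominant_frequency(signal):
    """Index of the strongest non-constant bin in the lower half of the spectrum."""
    mags = dft_magnitudes(signal)
    half = len(signal) // 2
    return max(range(1, half + 1), key=lambda k: mags[k])


# A striped "texture row" that repeats every 4 samples.
stripes = [1, 1, 0, 0] * 4
k = dominant_frequency(stripes)
period = len(stripes) / k  # samples per repetition
```

The strongest bin sits at the number of repetitions in the row, so dividing the row length by that index recovers the stripe period. Real textures are two-dimensional and would need a 2D transform.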

2022

Multi-instance Referring Image Segmentation of Scene Sketches based on Global Reference Mechanism

Peng Ling, Haoran Mo and Chengying Gao*

Pacific Graphics (PG 2022) (CCF-B)

Paper Code Abstract Bibtex

Scene sketch segmentation based on referring expressions plays an important role in sketch editing in the anime industry. While most existing referring image segmentation approaches are designed for the standard task of generating a binary segmentation mask for a single target or a group of targets, we think it necessary to equip these models with the ability of multi-instance segmentation. To this end, we propose GRM-Net, a one-stage framework tailored for multi-instance referring image segmentation of scene sketches. We extract the language features from the expression and fuse them into a conventional instance segmentation pipeline for filtering out the undesired instances in a coarse-to-fine manner and keeping the matched ones. To model the relative arrangement of the objects and the relationship among them from a global view, we propose a global reference mechanism (GRM) to assign references to each detected candidate to identify its position. We compare with existing methods designed for multi-instance referring image segmentation of scene sketches and for the standard task of referring image segmentation, and the results demonstrate the effectiveness and superiority of our approach.

@inproceedings{ling2022multi,
  title={Multi-instance Referring Image Segmentation of Scene Sketches based on Global Reference Mechanism},
  author={Ling, Peng and Mo, Haoran and Gao, Chengying},
  booktitle={Pacific Graphics},
  year={2022}
}

Unpaired Motion Style Transfer with Motion-oriented Projection Flow Network

Yue Huang, Haoran Mo, Xiao Liang and Chengying Gao*

IEEE International Conference on Multimedia & Expo (ICME 2022, Oral) (CCF-B)

Paper Abstract Bibtex

Existing motion style transfer methods trained with unpaired samples tend to generate motions with inconsistent content or an inconsistent number of frames when compared with the source motion. Moreover, due to the limited training samples, these methods perform worse on unseen styles. In this paper, we propose a novel unpaired motion style transfer framework that generates complete stylized motions with consistent content. We introduce a motion-oriented projection flow network (M-PFN) designed for temporal motion data, which encodes the content and style motions into latent codes and decodes the stylized features produced by adaptive instance normalization (AdaIN) into stylized motions. The M-PFN contains dedicated operations and modules, e.g., Transformer, to process the temporal information of motions, which help to improve the continuity of the generated motions. Comparisons with the state-of-the-art methods show that our method effectively transfers the style of the motions while retaining the complete content and has stronger generalization ability on unseen style features.

@inproceedings{huang2022unpaired,
  title={Unpaired Motion Style Transfer with Motion-oriented Projection Flow Network},
  author={Huang, Yue and Mo, Haoran and Liang, Xiao and Gao, Chengying},
  booktitle={2022 IEEE International Conference on Multimedia and Expo (ICME)},
  pages={1--6},
  year={2022},
  organization={IEEE}
}
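
The abstract above mentions adaptive instance normalization (AdaIN), which re-normalizes content features so that each channel takes on the mean and standard deviation of the corresponding style channel. The following is a minimal sketch of that standard operation on plain Python lists. Treating each channel as a sequence of values over time is an assumption made for illustration, the function name `adain` and the `eps` value are hypothetical, and this is not the M-PFN encoder or decoder.

```python
import statistics


def adain(content, style, eps=1e-5):
    """Shift and scale each content channel to match the style channel statistics.

    `content` and `style` are lists of channels; each channel is a list of
    floats (here: one value per frame). Frame counts may differ between them.
    """
    out = []
    for c_chan, s_chan in zip(content, style):
        c_mean, c_std = statistics.mean(c_chan), statistics.pstdev(c_chan)
        s_mean, s_std = statistics.mean(s_chan), statistics.pstdev(s_chan)
        # Normalize the content channel, then apply the style statistics.
        out.append([s_std * (v - c_mean) / (c_std + eps) + s_mean for v in c_chan])
    return out


content = [[1.0, 2.0, 3.0]]            # one channel, three frames
style = [[10.0, 20.0, 30.0, 40.0]]     # one channel, four frames
stylized = adain(content, style)
```

The output keeps the content's frame count and ordering while its per-channel statistics follow the style input, which is why the operation suits content and style motions of different lengths.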

2021

General Virtual Sketching Framework for Vector Line Art

Haoran Mo, Edgar Simo-Serra, Chengying Gao*, Changqing Zou and Ruomei Wang

ACM Transactions on Graphics (SIGGRAPH 2021, Journal track) (CCF-A)

Project Page Paper Supplementary Code Abstract Bibtex

Vector line art plays an important role in graphic design; however, it is tedious to create manually. We introduce a general framework to produce line drawings from a wide variety of images, by learning a mapping from raster image space to vector image space. Our approach is based on a recurrent neural network that draws the lines one by one. A differentiable rasterization module allows for training with only raster data as supervision. We use a dynamic window around a virtual pen while drawing lines, implemented with the proposed aligned cropping and differentiable pasting modules. Furthermore, we develop a stroke regularization loss that encourages the model to use fewer and longer strokes to simplify the resulting vector image. Ablation studies and comparisons with existing methods corroborate the efficiency of our approach, which is able to generate visually better results in less computation time, while generalizing better to a diversity of images and applications.

@article{mo2021virtualsketching,
  title   = {General Virtual Sketching Framework for Vector Line Art},
  author  = {Mo, Haoran and Simo-Serra, Edgar and Gao, Chengying and Zou, Changqing and Wang, Ruomei},
  journal = {ACM Transactions on Graphics (TOG)},
  year    = {2021},
  volume  = {40},
  number  = {4},
  pages   = {51:1--51:14}
}

Line Art Colorization Based on Explicit Region Segmentation

Ruizhi Cao, Haoran Mo and Chengying Gao*

Computer Graphics Forum (Pacific Graphics 2021) (CCF-B)

Paper Supplementary Code Abstract Bibtex

Automatic line art colorization plays an important role in the anime and comic industry. While existing methods for line art colorization are able to generate plausible colorized results, they tend to suffer from the color bleeding issue. We introduce an explicit segmentation fusion mechanism to aid colorization frameworks in avoiding color bleeding artifacts. This mechanism is able to provide region segmentation information for the colorization process explicitly so that the colorization model can learn to avoid assigning the same color across regions with different semantics or inconsistent colors inside an individual region. The proposed mechanism is designed in a plug-and-play manner, so it can be applied to a diversity of line art colorization frameworks with various kinds of user guidance. We evaluate this mechanism in tag-based and reference-based line art colorization tasks by incorporating it into the state-of-the-art models. Comparisons with these existing models corroborate the effectiveness of our method, which largely alleviates the color bleeding artifacts.

@article{cao2021line,
  title={Line Art Colorization Based on Explicit Region Segmentation},
  author={Cao, Ruizhi and Mo, Haoran and Gao, Chengying},
  journal={Computer Graphics Forum},
  volume={40},
  number={7},
  year={2021},
  publisher={Wiley Online Library}
}

2019

Language-based Colorization of Scene Sketches

Changqing Zou#, Haoran Mo#, Chengying Gao*, Ruofei Du and Hongbo Fu

ACM Transactions on Graphics (SIGGRAPH Asia 2019, Journal track) (CCF-A)

Project Page Paper Supplementary Code Slide Abstract Bibtex

Being natural, touchless, and fun-embracing, language-based inputs have been shown to be effective for various tasks from image generation to literacy education for children. This paper presents, for the first time, a language-based system for interactive colorization of scene sketches, based on semantic comprehension. The proposed system is built upon deep neural networks trained on a large-scale repository of scene sketches and cartoon-style color images with text descriptions. Given a scene sketch, our system allows users, via language-based instructions, to interactively localize and colorize specific foreground object instances to meet various colorization requirements in a progressive way. We demonstrate the effectiveness of our approach via comprehensive experimental results including alternative studies, comparison with the state-of-the-art methods, and generalization user studies. Given the unique characteristics of language-based inputs, we envision a combination of our interface with a traditional scribble-based interface for a practical multimodal colorization system, benefiting various applications.

@article{zouSA2019sketchcolorization,
  title   = {Language-based Colorization of Scene Sketches},
  author  = {Zou, Changqing and Mo, Haoran and Gao, Chengying and Du, Ruofei and Fu, Hongbo},
  journal = {ACM Transactions on Graphics (TOG)},
  year    = {2019},
  volume  = {38},
  number  = {6},
  pages   = {233:1--233:16}
}

2018

SketchyScene: Richly-Annotated Scene Sketches

Changqing Zou#, Qian Yu#, Ruofei Du, Haoran Mo, Yi-Zhe Song, Tao Xiang, Chengying Gao, Baoquan Chen* and Hao Zhang

European Conference on Computer Vision (ECCV 2018) (CCF-B)

Project Page Paper Poster Code Abstract Bibtex

We contribute the first large-scale dataset of scene sketches, SketchyScene, with the goal of advancing research on sketch understanding at both the object and scene level. The dataset is created through a novel and carefully designed crowdsourcing pipeline, enabling users to efficiently generate large quantities of realistic and diverse scene sketches. SketchyScene contains more than 29,000 scene-level sketches, 7,000+ pairs of scene templates and photos, and 11,000+ object sketches. All objects in the scene sketches have ground-truth semantic and instance masks. The dataset is also highly scalable and extensible, easily allowing augmenting and/or changing scene composition. We demonstrate the potential impact of SketchyScene by training new computational models for semantic segmentation of scene sketches and showing how the new dataset enables several applications such as image retrieval, sketch colorization, editing, and captioning.

@inproceedings{zou2018sketchyscene,
  title={{SketchyScene}: Richly-Annotated Scene Sketches},
  author={Zou, Changqing and Yu, Qian and Du, Ruofei and Mo, Haoran and Song, Yi-Zhe and Xiang, Tao and Gao, Chengying and Chen, Baoquan and Zhang, Hao},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={421--436},
  year={2018}
}