Exploring across domains
Diffusion · MLLMs · Efficiency

About Me


Finding order within noise, and resonance across modalities.

6 Featured Papers
4 Research Directions

I am Gary, a PhD at USTC focusing on diffusion models and multimodal large language models.

My research explores the boundaries of generative AI — turning noise into art and enabling text, images, video, and 3D to flow freely across modalities. I believe AI is an extension of creativity.


Diffusion Models
Multimodal Large Language Models
Noise Optimization
Text-to-Content Generation
Style Transfer
Hyperspectral Imaging
Efficient Inference
AI Art

// Research Directions

Generation · Sampling

Diffusion

In-depth study of the mathematics of the diffusion process, aimed at more efficient sampling algorithms and noise scheduling strategies that bring generation time down toward milliseconds.

Text · Image · Video · 3D

MLLMs

Building a unified representation space in which content can be converted freely between modalities in latent space, with the goal of true cross-modal understanding and generation.

Compression · Distillation

Efficiency

Bringing large models to edge devices through compression, quantization, and distillation, making AI more ubiquitous and practical.

Theory · Optimization

Machine Learning Theory

Studying the theoretical foundations of machine learning, including model generalization, optimization dynamics, representation learning, and the mathematical principles behind modern generative models.

Selected Papers

01 —

Beyond Randomness: Understand the Order of the Noise in Diffusion

2025.11 · Gary, et al. · Under Review

Challenges the conventional view that the initial noise in diffusion generation is merely random, reveals analyzable semantic patterns in it, and proposes a training-free Semantic Erasure-Injection process.

Diffusion · Noise Optimization · Training-free
02 —

Break Stylistic Sophon: Are We Really Meant to Confine the Imagination in Style Transfer?

2025.05 · Gary, et al. · Under Review

Introduces StyleWallfacer, a unified framework for high-quality style transfer that addresses semantic drift, overfitting, and color limitation through triple diffusion and semantic style injection.

AI Art · Style Transfer · Diffusion
03 —

SHSRD: Efficient Conditional Diffusion Model for Single Hyperspectral Image Superresolution

2025.03 · Gary, et al. · JSTARS 2025

Proposes SHSRD, an efficient conditional diffusion framework for hyperspectral image superresolution that uses spectral information injection and two-stage transfer learning on small HSI datasets.

Diffusion · HSI · Low-Level Vision
04 —

Reusing Source Diffusion Model for Domain Perception: Towards Few-shot Image Generation via Fine-tuning

2025 · Gary, et al. · Expert Systems with Applications, 130797

Studies how a source diffusion model can be reused for domain perception and adapted through fine-tuning, aiming to improve few-shot image generation under limited target-domain data.

Diffusion · Few-shot Generation · Domain Adaptation
05 —

Using Dynamic Knowledge for Kernel Modulation: Towards Image Generation via One-shot Multi-domain Adaptation

2025 · Gary, et al. · Pattern Recognition, 112489

Explores dynamic knowledge for kernel modulation in one-shot multi-domain adaptation, targeting image generation when only extremely limited domain-specific supervision is available.

Image Generation · One-shot Adaptation · Kernel Modulation
06 —

Dual Objectives in Few-Shot Domain Adaptation: Image Restoration and Cross-Domain Alignment

2025 · Gary, et al. · Expert Systems with Applications, 130759

Formulates few-shot domain adaptation with dual objectives, jointly considering image restoration quality and cross-domain alignment to improve transfer under scarce target-domain examples.

Few-shot Adaptation · Image Restoration · Cross-domain Alignment

Contact Me


Email
Gary_144@mail.ustc.edu.cn
GitHub / Homepage
songyan888.github.io
Google Scholar
Publications profile
RedNote 小红书
Gary