Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance

The University of Tokyo, Microsoft Research Asia, National Institute of Informatics


Animation-from-Blur supports various interfaces for extracting multiple plausible sharp video clips from the same motion-blurred image.

Abstract

We study the challenging problem of recovering detailed motion from a single motion-blurred image. Existing solutions to this problem estimate a single image sequence without considering the motion ambiguity of each region. Therefore, the results tend to converge to the mean of the multi-modal possibilities. In this paper, we explicitly account for such motion ambiguity, allowing us to generate multiple plausible solutions, all in sharp detail. The key idea is to introduce a motion guidance representation, which is a compact quantization of 2D optical flow with only four discrete motion directions. Conditioned on the motion guidance, the blur decomposition is guided to a specific, unambiguous solution by a novel two-stage decomposition network. We propose a unified framework for blur decomposition that supports various interfaces for generating our motion guidance, including human input, motion information from adjacent video frames, and learning from a video dataset. Extensive experiments on synthesized datasets and real-world data show that the proposed framework is qualitatively and quantitatively superior to previous methods, and also offers the merit of producing physically plausible and diverse solutions.


Video

Directional Ambiguity

Blur decomposition from a single blurry image faces the fundamental problem of directional ambiguity. Each independent, uniformly motion-blurred region can correspond to either a forward or a backward motion sequence, so the number of potential solutions grows exponentially with the number of such regions. Existing methods for blur decomposition, however, are designed to predict only a single solution among them. This directional ambiguity destabilizes training and leads to low-quality results with little diversity.
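
To make the combinatorics concrete, here is a minimal Python sketch (our illustration, not code from the paper) that enumerates the forward/backward assignments over K independently blurred regions, giving 2^K plausible decompositions:

    # Illustrative only: count the decompositions that directional
    # ambiguity admits. Each of K independently blurred regions can be
    # read as moving forward or backward.
    from itertools import product

    def plausible_decompositions(num_regions):
        """All forward/backward assignments over the blurred regions."""
        return list(product(("forward", "backward"), repeat=num_regions))

    for k in (1, 2, 3):
        print(f"{k} region(s) -> {len(plausible_decompositions(k))} solutions")
    # 1 region(s) -> 2 solutions
    # 2 region(s) -> 4 solutions
    # 3 region(s) -> 8 solutions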



Methodology

Network architecture

We propose a novel motion guidance representation to address the directional ambiguity in blur decomposition. The motion guidance is an optical flow representation quantized into four major quadrant directions, roughly describing the motion field. Given the input blurry image and conditioned on this compact motion guidance, blur decomposition becomes a nearly deterministic one-to-one mapping problem without directional ambiguity. We propose a two-stage network to predict the image sequence: the first stage expands the blurry image into an image sequence based on the motion guidance, and the second stage refines the visual details in a residual fashion to generate high-quality images. Conditioning on this additional guidance input gives the decomposition network significantly better training convergence.
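
To illustrate the guidance representation, the following sketch (our own; the paper's exact quantization rule, e.g. its handling of nearly static pixels, may differ) quantizes a dense 2D flow field into the four quadrant labels:

    import numpy as np

    def quantize_flow(flow):
        """Quantize a dense flow field of shape (H, W, 2) into four
        quadrant direction labels {0, 1, 2, 3}, one per pixel."""
        angle = np.arctan2(flow[..., 1], flow[..., 0])        # in (-pi, pi]
        return np.floor(angle / (np.pi / 2)).astype(int) % 4  # quadrant index

    flow = np.random.randn(64, 64, 2)  # stand-in for an estimated flow field
    guidance = quantize_flow(flow)     # (64, 64) labels for the decomposer

Collapsing flow to so few labels is what makes the guidance cheap to annotate by hand or to sample from a network, while still pinning down each region's motion direction.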

Multi-modal interfaces

Due to the compactness of the motion guidance representation, our unified framework needs to be trained only once, while supporting various decomposition scenarios under different modalities. We provide three interfaces for acquiring the motion guidance:

(1) a network that predicts plausible motion guidance, (2) motion computed from adjacent video frames (see the sketch below), and (3) user annotation.
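
For interface (2), one plausible realization (our sketch, not the paper's released code) is to estimate optical flow between adjacent frames and quantize it as above; Farneback flow is used here merely as a convenient stand-in estimator:

    import cv2
    import numpy as np

    def guidance_from_frames(prev_gray, next_gray):
        """Dense flow between adjacent grayscale frames, quantized into
        the four-direction guidance. Farneback is a stand-in estimator."""
        # Arguments after None: pyr_scale, levels, winsize, iterations,
        # poly_n, poly_sigma, flags.
        flow = cv2.calcOpticalFlowFarneback(
            prev_gray, next_gray, None, 0.5, 3, 15, 3, 5, 1.2, 0)
        angle = np.arctan2(flow[..., 1], flow[..., 0])
        return np.floor(angle / (np.pi / 2)).astype(int) % 4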


For interface (1), we follow the cVAE-GAN framework to build the guidance prediction network, so that diverse guidance maps can be sampled for a single blurry image.
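
At test time, multi-modality comes from sampling the latent code. Below is a heavily simplified, hypothetical stand-in for such a predictor (the layer sizes and structure are our assumptions, not the paper's architecture): decoding the same blurry image with different latent samples yields different plausible guidance maps.

    import torch
    import torch.nn as nn

    class GuidancePredictor(nn.Module):
        """Hypothetical minimal stand-in for a cVAE-style guidance
        decoder: maps a blurry image plus a latent code to per-pixel
        logits over the four motion directions."""
        def __init__(self, z_dim=8, num_dirs=4):
            super().__init__()
            self.z_dim = z_dim
            self.decoder = nn.Sequential(
                nn.Conv2d(3 + z_dim, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, num_dirs, 3, padding=1),
            )

        def forward(self, blurry, z):
            # Broadcast the latent code spatially, then decode jointly.
            b, _, h, w = blurry.shape
            z_map = z.view(b, self.z_dim, 1, 1).expand(b, self.z_dim, h, w)
            return self.decoder(torch.cat([blurry, z_map], dim=1))

    net = GuidancePredictor()
    blurry = torch.rand(1, 3, 64, 64)
    for _ in range(3):                           # each latent sample = one mode
        z = torch.randn(1, net.z_dim)
        guidance = net(blurry, z).argmax(dim=1)  # (1, 64, 64) direction labels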



Visual Results

Results of using predicted guidance


Results of using guidance from video



Related Links

Our follow-up work proposes BiT (Blur Interpolation Transformer), a fast and powerful transformer-based technique for arbitrary-factor blur interpolation with state-of-the-art performance: Blur Interpolation Transformer for Real-World Motion from Blur (Website, Code).
In addition, we have another work that exploits a different kind of motion artifact, rolling shutter distortion, to realize image-to-video generation: Bringing Rolling Shutter Images Alive with Dual Reversed Distortion (Website, Code).

BibTeX

@inproceedings{zhong2022animation,
  title={Animation from blur: Multi-modal blur decomposition with motion guidance},
  author={Zhong, Zhihang and Sun, Xiao and Wu, Zhirong and Zheng, Yinqiang and Lin, Stephen and Sato, Imari},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XIX},
  pages={599--615},
  year={2022},
  organization={Springer}
}
@inproceedings{zhong2023blur,
  title={Blur Interpolation Transformer for Real-World Motion from Blur},
  author={Zhong, Zhihang and Cao, Mingdeng and Ji, Xiang and Zheng, Yinqiang and Sato, Imari},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5713--5723},
  year={2023}
}