Blur Interpolation Transformer for
Real-World Motion from Blur

¹The University of Tokyo, ²National Institute of Informatics


BiT (Blur Interpolation Transformer) is a fast, transformer-based method for arbitrary-factor blur interpolation with state-of-the-art performance.

Synthetic Demos

Abstract

This paper studies the challenging problem of recovering motion from blur, also known as joint deblurring and interpolation or blur temporal super-resolution. The challenges are twofold: 1) the current methods still leave considerable room for improvement in terms of visual quality even on the synthetic dataset, and 2) poor generalization to real-world data. To this end, we propose a blur interpolation transformer (BiT) to effectively unravel the underlying temporal correlation encoded in blur. Based on multi-scale residual Swin transformer blocks, we introduce dual-end temporal supervision and temporally symmetric ensembling strategies to generate effective features for time-varying motion rendering. In addition, we design a hybrid camera system to collect the first real-world dataset of one-to-many blur-sharp video pairs. Experimental results show that BiT has a significant gain over the state-of-the-art methods on the public dataset Adobe240. Besides, the proposed real-world dataset effectively helps the model generalize well to real blurry scenarios.

Methodology

BiT is a model for blur interpolation built from Multi-scale Residual Swin Transformer Blocks (MS-RSTBs). To improve its performance, we incorporate two temporal strategies: Dual-end Temporal Supervision (DTS) and Temporally Symmetric Ensembling (TSE). DTS applies temporal supervision at both ends of the exposure time, while TSE ensembles the features obtained from the forward and backward directions for the same time point. Together, these strategies significantly improve the performance of BiT on blur interpolation.
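The idea behind TSE can be illustrated with a minimal sketch. It is not the BiT implementation: `toy_model` is a hypothetical stand-in for the network, and the sketch averages final predictions, whereas the description above ensembles intermediate features. It also assumes that the backward direction corresponds to reversing the input order and querying time `1 - t`, so that both passes refer to the same time point.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy "frame": a flat list of pixel intensities
Model = Callable[[Sequence[Frame], float], Frame]


def toy_model(frames: Sequence[Frame], t: float) -> Frame:
    """Hypothetical stand-in for a blur interpolation network.

    Blends the first and last input frames with a deliberately
    direction-biased weight, so forward and backward passes disagree.
    """
    w = t ** 2  # biased on purpose; an unbiased model would use w = t
    first, last = frames[0], frames[-1]
    return [(1.0 - w) * a + w * b for a, b in zip(first, last)]


def symmetric_ensemble(model: Model, frames: Sequence[Frame], t: float) -> Frame:
    """Average the forward prediction at t with the backward prediction at 1 - t."""
    forward = model(frames, t)
    # Reversed input order plus mirrored time targets the same instant.
    backward = model(list(reversed(frames)), 1.0 - t)
    return [0.5 * (f + b) for f, b in zip(forward, backward)]


if __name__ == "__main__":
    frames = [[0.0, 0.0], [1.0, 1.0]]
    print(toy_model(frames, 0.5))                    # biased toward the first frame
    print(symmetric_ensemble(toy_model, frames, 0.5))  # bias cancels at the midpoint
```

By construction, the ensembled prediction is unchanged when the input order is reversed and `t` is replaced by `1 - t`, which is the symmetry the strategy's name refers to.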




Real-world Blur Interpolation Dataset (RBI)

Currently available synthetic datasets, such as Adobe240, exhibit unrealistic spikes or steps in the blur trajectory. To address this, we introduce the Real-World Blur Interpolation (RBI) dataset, the first dataset of its kind, captured with a custom hybrid camera system. RBI offers a more authentic representation of real-world blur trajectories and is designed specifically for blur interpolation tasks.
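One plausible source of such artifacts is that synthetic blur is commonly produced by averaging a finite number of consecutive sharp frames. The toy 1-D sketch below works under that assumption: a bright point moving fast relative to the frame rate leaves isolated spikes in the averaged result, whereas densely sampled motion leaves a uniform streak. All function names and parameters here are hypothetical and are not taken from the dataset tooling.

```python
from typing import List


def render_point(width: int, position: int) -> List[float]:
    """Render a 1-D sharp frame containing a single bright pixel."""
    frame = [0.0] * width
    frame[position] = 1.0
    return frame


def average_frames(frames: List[List[float]]) -> List[float]:
    """Synthesize blur as the per-pixel mean of the sharp frames."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def synth_blur(width: int, start: int, step: int, num_frames: int) -> List[float]:
    """Average num_frames frames of a point moving `step` pixels per frame."""
    frames = [render_point(width, start + k * step) for k in range(num_frames)]
    return average_frames(frames)


if __name__ == "__main__":
    # Fast motion, few frames: the trajectory shows up as separated spikes.
    print(synth_blur(width=12, start=1, step=3, num_frames=4))
    # Slow motion, many frames: the trajectory is a continuous streak.
    print(synth_blur(width=12, start=1, step=1, num_frames=10))
```

A physical exposure integrates light continuously, so a real capture of the same motion would resemble the second case regardless of speed. The sketch is meant only to suggest why real captured blur can differ from averaged synthetic blur.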




Quantitative Results

Our method surpasses prior work by a significant margin while also being much faster.



Qualitative Results

The predictions of BiT and BiT++ are closer to the ground truth, with clearer details, on both Adobe240 and RBI. The optical flow computed from our results is also closer to that of the ground truth, which indicates better motion consistency.



Related Links

We have another interesting work that explicitly solves the directional ambiguity problem in the blur interpolation task: Animation from Blur: Multi-modal Blur Decomposition with Motion Guidance (Website, Code).

BibTeX

@inproceedings{zhong2023blur,
  title={Blur Interpolation Transformer for Real-World Motion from Blur},
  author={Zhong, Zhihang and Cao, Mingdeng and Ji, Xiang and Zheng, Yinqiang and Sato, Imari},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5713--5723},
  year={2023}
}
@inproceedings{zhong2022animation,
  title={Animation from blur: Multi-modal blur decomposition with motion guidance},
  author={Zhong, Zhihang and Sun, Xiao and Wu, Zhirong and Zheng, Yinqiang and Lin, Stephen and Sato, Imari},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XIX},
  pages={599--615},
  year={2022},
  organization={Springer}
}