Pixel-Wise Motion Deblurring of Thermal Videos


Manikandasriram Srinivasan Ramanagopal (University of Michigan); Zixu Zhang (University of Michigan); Ram Vasudevan (University of Michigan); Matthew Johnson Roberson (University of Michigan)

Abstract

Uncooled microbolometers can enable robots to see in the absence of visible illumination by imaging the “heat” radiated from the scene. Despite this ability to see in the dark, these sensors suffer from significant motion blur, which has limited their application on robotic systems. As described in this paper, this motion blur arises from the thermal inertia of each pixel. As a result, traditional motion deblurring techniques, which rely on identifying an appropriate spatial blur kernel to perform spatial deconvolution, are unable to reliably perform motion deblurring on thermal camera images. To address this problem, this paper formulates reversing the effect of thermal inertia at a single pixel as a Least Absolute Shrinkage and Selection Operator (LASSO) problem, which we can solve rapidly using a quadratic programming solver. By leveraging sparsity and a high frame rate, this pixel-wise LASSO formulation is able to recover motion deblurred frames of thermal videos without using any spatial information. To compare its quality against state-of-the-art visible camera based deblurring methods, this paper evaluated the performance of a family of pre-trained object detectors on a set of images restored by different deblurring algorithms. All evaluated object detectors performed systematically better on images restored by the proposed algorithm than on images restored by any of the other state-of-the-art methods tested.
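The pixel-wise idea in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the first-order recursion `y[k] = a*y[k-1] + (1-a)*u[k]` used as the thermal inertia model, the value `a = 0.8`, the sparsity of the scene signal `u` itself, and the helper names `blur_matrix`, `soft_threshold`, and `lasso_ista`. The abstract says the LASSO problem is solved with a quadratic programming solver; this sketch swaps in iterative soft-thresholding (ISTA) to stay dependency-free.

```python
import numpy as np


def blur_matrix(n, a):
    """Matrix form of y[k] = a*y[k-1] + (1-a)*u[k] with y[-1] = 0 (assumed model)."""
    idx = np.arange(n)
    lag = idx[:, None] - idx[None, :]          # lag[k, j] = k - j
    decay = a ** np.clip(lag, 0, None)         # a^(k-j) for current and past samples
    return np.where(lag >= 0, (1.0 - a) * decay, 0.0)


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(A, y, lam, n_iter=3000):
    """Minimize 0.5*||A u - y||^2 + lam*||u||_1 by iterative soft-thresholding."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / Lipschitz constant of the gradient
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ u - y)
        u = soft_threshold(u - step * grad, step * lam)
    return u


# Hypothetical single-pixel trace: a warm object covers the pixel for 3 frames.
n, a = 50, 0.8
u_true = np.zeros(n)
u_true[20:23] = 1.0
rng = np.random.default_rng(0)
y = blur_matrix(n, a) @ u_true + 0.005 * rng.standard_normal(n)

u_hat = lasso_ista(blur_matrix(n, a), y, lam=1e-3)
print("blurred peak :", y.max())
print("restored peak:", u_hat.max())
```

Only the time series of one pixel enters the problem, so every pixel could in principle be processed independently; no spatial neighbourhood is used.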

Live Paper Discussion Information

Start Time: 07/14 15:00 UTC
End Time: 07/14 17:00 UTC

Paper Reviews

Review 1

The paper makes a significant novel contribution in my opinion; it is well-written and well-presented, and the maths is consistent. There are some minor comments below to improve the latter.

I don’t have major criticism, but I wanted to point out the use of the term hysteresis: I am not convinced that hysteresis is the right term to describe the main phenomenon underlying the blur in thermal images... How about "thermal inertia" or the like? See for instance https://en.wikipedia.org/wiki/Hysteresis. To quote: “…where there are different values of one variable depending on the direction of change of another variable…” and: “Systems with hysteresis are nonlinear, and can be mathematically challenging to model”. All of these characteristics of hysteresis are decisively not the case here. I don’t want to appear as a nit-picker, but I think this will be a core reference for the thermal deblurring papers to come, which makes correct use of terminology all the more essential.

Here are some smaller points for improvement:
- Please introduce the LASSO acronym in the abstract.
- substrate -> substrate
- Please define the indicator function in (12).
- Eqn. (12): it would be good to say that this is simply a piece-wise constant signal with K_n equal-length intervals already here to give the reader easier access to the maths employed and the intuition behind it.
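For readers without the paper at hand, the kind of signal the reviewer describes can be written generically as follows. Equation (12) itself is not reproduced on this page, so the symbols below (coefficients c_{n,k}, breakpoints t_k, interval [t_0, t_{K_n}]) are assumptions chosen for illustration and may differ from the authors' notation.

```latex
x_n(t) = \sum_{k=1}^{K_n} c_{n,k}\, \mathbb{1}_{[t_{k-1},\, t_k)}(t),
\qquad
t_k = t_0 + k\,\frac{t_{K_n} - t_0}{K_n},
\qquad
\mathbb{1}_{S}(t) =
\begin{cases}
  1, & t \in S \\
  0, & \text{otherwise}
\end{cases}
```

Here the indicator function selects one of K_n equal-length intervals, so x_n is constant on each interval and takes the value c_{n,k} on the k-th one.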

Review 2

The paper is well written, original and with a clear theoretical contribution that translates into significant results. Experiments compare the proposed method against five deblurring methods in the state of the art, including learning-based methods, and the proposed method outperforms them in the metrics utilized: visual quality and the output of an object detector.

Other comments:
- Some symbols are not always explained. Example: function II(x) in Eq (12) or (14b) is not introduced.
- The absence of ground truth is not desirable. It would be beneficial to try to devise and use image quality metrics for thermal cameras, similar in spirit to the ones developed for natural images (SSIM, etc.). Relying on the results of a pre-trained object detector to justify that the method is better is not the most satisfying approach.
- There is a gap in the content of the text at the end of Section 1, when the organization of the paper is introduced.
- "...every A satisfying (11) ..." -> strictly speaking A does not appear in (11). Maybe the authors refer to another equation, such as (16)-(17). This reference to "A satisfying (11)" appears also in Section IV.
- It is difficult to see the 11 ms in Fig. 3.
- In the conclusion, the last sentence about "the blur in thermal cameras can be explained by a fixed kernel that can be estimated..." needs to be revised, since it is not illustrated in the paper: Sections 3 and 4 do not talk about the kernel, and Section 5 does not plot any kernel. So, what kernel are the authors referring to?
- References need to be revised: the acronyms are not capitalized and sometimes the publication venue is missing [32].

Suggestions:
- Section 1 could be split into introduction (& contributions) and the related work.
- I think it would be nice to mention in the introduction that the linear system of equations arises from discretizing at high rate a differential equation that models the thermal image formation process.
- The introduction of the method in Eq. (12) could benefit from using an easier to follow description, such as "we consider the class of signals given by piecewise constant functions...". This appears much later in the text (Assumption 1). I also suggest including the clarification in Folland's book: that the chosen class of functions is dense in L1 in the L1 metric. I think it makes it easier to follow.
- Why not use (continuous) piecewise linear functions for a better approximation?
- Please include units whenever possible, e.g., standard deviation 0.5 (Celsius?).
- It would be nice to show some sensitivity analysis, to show how the algorithm behaves as its main parameters are changed.

Review 3

The paper is well reasoned and presents an interesting approach to forming and solving the linear set of equations to deblur microbolometer images using temporal filtering. I have a few points that would greatly improve the readability and connection to time series analysis.

1) Hysteresis is used incorrectly throughout the paper, and should be removed. The authors are not modeling hysteresis in any way, but instead simple transient response of the pixel sensor. This is highly confusing and unnecessary, just replace hysteresis with transient response everywhere.

2) It should be noted that the dataset is using static cameras with moving objects, and that the background, a large proportion of the image in all scenes presented, is at steady state such that no deblurring is required. This is why the sparsity arguments work, and makes the method extremely time-consuming and problematic for the stated application of robotics. If every pixel in the image is affected by transients, does the problem become overly computationally complex?

3) It is clear that both temporal and spatial effects should be corrupting each pixel, as the underlying signal from a motion blurred pixel is driven by the source at each timestep, and that source moves to an adjacent pixel at the next time step. The authors' conclusions should be that the temporal effect outweighs the spatial effect, not that their model is correct. As such, it seems the authors overstate their conclusions in the intro and results, and more performance improvement may be possible if a spatial regularization was also included in solving the resulting source intensity problem over the image.

4) Finally, the authors over-complicate the presentation of multiple standard components in their formulation, choosing to derive or define without connecting to well established methods in signal processing and linear systems. For example, Equation 2 is simply a particular form of numerical integration (ZOH), which could easily be done any number of ways.
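The reviewer's remark about zero-order hold (ZOH) can be made concrete with a small sketch. The first-order model `tau * dT/dt = u - T`, the time constant `tau`, the frame period `dt`, and the function names `simulate_zoh` and `simulate_euler` are all assumptions for illustration; Equation 2 of the paper is not reproduced on this page. The sketch compares ZOH discretization, which is exact when the input is held constant over each frame, with forward Euler as one of the "other ways" the same integration could be done.

```python
import numpy as np


def simulate_zoh(u, dt, tau, T0=0.0):
    """Exact discretization of tau*dT/dt = u - T for input held constant per frame."""
    a = np.exp(-dt / tau)                      # per-frame decay of the pixel state
    T = np.empty(len(u) + 1)
    T[0] = T0
    for k, uk in enumerate(u):
        T[k + 1] = a * T[k] + (1.0 - a) * uk
    return T


def simulate_euler(u, dt, tau, T0=0.0):
    """Forward-Euler integration of the same model (first-order accurate)."""
    T = np.empty(len(u) + 1)
    T[0] = T0
    for k, uk in enumerate(u):
        T[k + 1] = T[k] + (dt / tau) * (uk - T[k])
    return T


# Hypothetical values: 10 ms time constant, 5 ms frame period, unit step input.
tau, dt = 0.010, 0.005
u = np.ones(20)
t = dt * np.arange(len(u) + 1)
exact = 1.0 - np.exp(-t / tau)                 # analytic step response

print("max ZOH error  :", np.max(np.abs(simulate_zoh(u, dt, tau) - exact)))
print("max Euler error:", np.max(np.abs(simulate_euler(u, dt, tau) - exact)))
```

For a step input the ZOH recursion reproduces the analytic response at the sample times, while forward Euler deviates noticeably when the frame period is comparable to the time constant.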