Open Access

Probabilistic motion pixel detection for the reduction of ghost artifacts in high dynamic range images from multiple exposures

EURASIP Journal on Image and Video Processing 2014, 2014:42

DOI: 10.1186/1687-5281-2014-42

Received: 15 April 2014

Accepted: 7 August 2014

Published: 21 August 2014

Abstract

This paper presents an algorithm for compositing a high dynamic range (HDR) image from multi-exposure images, considering inconsistent pixels for the reduction of ghost artifacts. In HDR images, ghost artifacts may appear when objects move while the multiple images with different exposures are taken. To prevent such artifacts, it is important to detect inconsistent pixels caused by moving objects in consecutive frames and then to assign zero weights to the corresponding pixels in the fusion process. This problem is formulated as a binary labeling problem in a Markov random field (MRF) framework, the solution of which is a binary map for each exposure image that identifies the pixels to be excluded from the fusion process. To obtain the ghost map, the distribution of the zero-mean normalized cross-correlation (ZNCC) of an image with respect to the reference frame is modeled as a mixture of Gaussian functions, and the parameters of this mixture are used to design the energy function. However, this method does not reliably detect faint objects in low-contrast regions caused by over- or under-exposure, because the ZNCC shows little difference in such areas. Hence, we obtain an additional ghost map for the low-contrast regions based on the intensity relationship between the frames. Specifically, the intensity mapping function (IMF) between the frames is estimated using pixels from high-contrast regions free of inconsistent pixels, and pixels outside the tolerance range of the IMF are considered moving pixels in the low-contrast regions. As a result, inconsistent pixels in both the low- and high-contrast areas are correctly found, and thus, HDR images without noticeable ghosts can be obtained.

Keywords

Exposure fusion; High dynamic range image; Image fusion; Ghost artifacts

Introduction

The dynamic ranges of most commercial image sensors and display devices are narrower than the radiance range of an actual scene, and hence, under- or over-exposure is often inevitable. In order to overcome such limitations of image sensors and displays, a number of multi-exposure capturing and processing techniques have been proposed, which can be roughly categorized into two approaches: high dynamic range imaging (HDRI) with tone mapping [1–4] and image fusion methods [5–10]. The former generates an image of higher dynamic range (i.e., higher bit depth for each pixel) from multiple images having different exposures. To obtain this image, the camera response function (CRF) must be known or estimated, and a tone mapping process is needed when showing the synthesized HDR image on a low dynamic range (LDR) display. On the other hand, the latter generates a tone-mapped-like high-quality image by the weighted addition of multiple exposure images and thus needs no CRF estimation, HDR image generation, or tone mapping. Hence, fusion approaches tend to require fewer computations than conventional HDRI, while providing comparable image quality on LDR displays. Of course, HDRI is the more appropriate solution when showing images on HDR devices.

Conventional exposure fusion and HDRI work well for a static scene, where the multi-exposure images are well registered and there is no moving object. However, ghost artifacts are often observed in HDR images of dynamic scenes, where the images are not aligned and/or some objects are moving. Hence, there have been many efforts to alleviate the ghosting problem in HDRI approaches. Some of the existing algorithms consider misalignment of input frames and moving objects simultaneously, while others assume well-aligned input or pre-registration of misaligned frames and concentrate on the detection of moving objects that cause inconsistency. For example, the study in [11] exploits local entropy differences to identify regions that might contain moving pixels, which are then excluded from the HDRI generation process. In addition, Khan et al. [12] proposed an iterative method that gives larger weights to static and well-exposed pixels, thereby diminishing the weights of pixels that can cause ghosts. Li et al. [13, 14] proposed methods to detect and modify moving pixels based on the intensity mapping function (IMF) [15]. There are also patch-based methods, in which patches including moving objects are excluded [16, 17]. To deal with misalignment and moving objects simultaneously, Zimmer et al. [18] proposed an optical flow-based energy minimization method, and Hu et al. [19] used non-rigid dense correspondence and a color transfer function. Recently, low-rank matrix-based algorithms [20, 21] have also been presented, based on the assumption that irradiance maps are linearly related to LDR exposures.

In the case of exposure fusion, there are also similar approaches for ghost removal. For example, the median threshold bitmap approach was proposed to detect clusters of inconsistent pixels [22], which are then excluded when fusing the images. In addition, a gradient domain approach was introduced that gives smaller weights to inconsistent pixels [23]. The IMF has also been used to exclude regions of inconsistent pixels from the fusion process [24, 25], where the images are over-segmented and the IMF is used to detect the inconsistent regions. In our previous work [26], we proposed a method to detect inconsistent pixels based on a test of the reciprocity law of exposure and the zero-mean normalized cross-correlation (ZNCC). Note that the ZNCC between a region in an image and its corresponding region in the reference is close to 1 when there is no moving object. Hence, a pixel was considered inconsistent when the region around it showed a ZNCC below a certain threshold, i.e., hard thresholding of the ZNCC was used.

In this paper, we propose a probabilistic approach to constructing a ghost map, a binary image depicting the pixels to be excluded from the exposure fusion process. We assume that the images are well registered and otherwise apply a registration algorithm first. The basic measure is again the ZNCC, but probabilistic soft thresholding is used instead of the hard thresholding of our previous work. Specifically, the ZNCC histogram is modeled as a Gaussian mixture, whose parameters are found by an expectation maximization (EM) algorithm. Generating a ghost map is then posed as a binary labeling problem in a Markov random field (MRF) framework, where the energy to be minimized is designed as a function of the ZNCC distribution parameters. It will be shown that the proposed method provides a less noisy and more accurate binary map than simple hard thresholding.

However, as with other feature-based methods, the ZNCC shows meaningful differences only in well-contrasted and highly textured regions. Hence, feature-based methods often give incorrect results in low-contrast regions, where the pixel values are nearly saturated due to over- or under-exposure, and also in low-textured regions. For these regions, we exploit the IMF between the images, which was successfully used in [13, 14, 24, 25]. In this paper, the IMF is estimated only from regions having high ZNCC, because the remaining regions are saturated or contain moving objects and thus have low credibility for estimating the IMF. Then, pixels lying outside the IMF tolerance are considered to belong to faint moving objects. To determine the ghost map in this region, we also develop an optimization technique, which yields less noisy results than conventional IMF-based thresholding methods. Experimental results show that the proposed method constructs plausible ghost maps and hence yields pleasing HDR images without noticeable ghost artifacts.

The rest of this paper is organized as follows: In the second section, we review the conventional weight map generation method [6]. In the third section, we describe the proposed algorithm that excludes the ghost pixels from the weight map. Then, we show some experimental results, and finally, conclusions are given in the last section.

Review of exposure fusion

Conventional exposure fusion methods create an output as a weighted sum of multiple exposure images, in which the weights reflect the quality of pixels in terms of contrast, saturation, and well-exposedness. The contrast is computed by Laplacian filtering [27], and the saturation is defined as the standard deviation across the color components at each pixel. The measure of well-exposedness is designed to have its largest value when a pixel value is near the center of the dynamic range. The weight map for each exposure image is calculated from these measures as
W_k(p) = C_k(p)^{\omega_C} \times S_k(p)^{\omega_S} \times E_k(p)^{\omega_E}
(1)
where p is the pixel index; k denotes the k-th exposure image; C, S, and E represent contrast, saturation, and well-exposedness, respectively; and ω_C, ω_S, and ω_E are the corresponding weighting factors. After the weighted images are added, multi-resolution blending is performed using pyramidal image decomposition [28]. Figure 1 presents weight maps for the corresponding multi-exposure images, as well as the output of the weighted sum. Ghost artifacts can be observed in the red box of Figure 1c, caused by moving people.
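As an illustration, the three quality measures and the weight map of Equation 1 can be sketched in NumPy as follows. This is a minimal sketch, not the reference implementation of [6]: the well-exposedness spread σ = 0.2 and the small ε guard against all-zero weights are our assumptions.

```python
import numpy as np
from scipy.ndimage import laplace

def weight_map(img, w_c=1.0, w_s=1.0, w_e=1.0, sigma=0.2):
    """Per-pixel quality weights W_k(p) of Equation 1 for one exposure.

    img: RGB image as floats in [0, 1], shape (H, W, 3).
    """
    gray = img.mean(axis=2)
    # Contrast C: absolute response of a Laplacian filter on the grayscale image [27].
    contrast = np.abs(laplace(gray))
    # Saturation S: standard deviation across the R, G, B components.
    saturation = img.std(axis=2)
    # Well-exposedness E: Gaussian bump around mid-gray, multiplied over channels.
    well_exposed = np.prod(np.exp(-((img - 0.5) ** 2) / (2 * sigma ** 2)), axis=2)
    eps = 1e-12  # assumption: keep weights nonzero in perfectly flat regions
    return ((contrast + eps) ** w_c
            * (saturation + eps) ** w_s
            * (well_exposed + eps) ** w_e)
```

In the fusion itself, the weight maps of all exposures are normalized to sum to 1 at each pixel before the pyramid blending of [28] is applied.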
Figure 1

An example of the exposure fusion. (a) A sequence of multi-exposure images. (b) Weight maps for the corresponding input images. (c) Fused image where ghost effect can be found in the red box due to moving people.

Proposed algorithm

The core of the ghost reduction algorithm is to find inconsistent pixels that can cause artifacts, so that they can be excluded from the fusion process. For this task, we first determine a reference frame among the multi-exposure images, namely the one with the largest well-contrasted region. Then, in every input frame other than the reference, we find regions that contain moving objects with respect to the reference. More specifically, we construct a ghost map (a binary image) for each non-reference frame, which indicates whether each pixel is to be included in or excluded from the fusion. When a pixel in the ghost map is 1, the corresponding pixel in the input frame is included in the fusion process; when it is 0, the pixel is excluded.

The proposed method begins by finding the reference frame that has the largest well-contrasted region (i.e., the smallest saturated region), as in conventional methods [16, 24–26, 29, 30]. Note that our method identifies the inconsistent pixels in high-contrast and low-contrast regions separately. For this, we define a saturation map b, a binary matrix of the same size as the input image. This matrix can be constructed while finding the reference frame, because the contrast of regions is already checked at that stage. Precisely, if we denote the element of b at pixel position p as b(p), then b(p) is set to 1 when p belongs to a well-contrasted region of the reference frame and to 0 when it belongs to a low-contrast region. In summary, for each input frame except the reference, we find ghost maps for the regions b(p)=1 and b(p)=0 separately. In the rest of this paper, the ghost map for the well-contrasted region (b(p)=1) is denoted g_w, and that for the low-contrast region (b(p)=0) is denoted g_l. After finding these ghost maps for an input frame, the overall ghost map for the frame is constructed as g = g_w ∧ g_l. Finding g_w and g_l is posed as binary labeling problems, i.e., as energy minimization problems solved by graph cuts [31].
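A small sketch of how the two region-wise maps could be merged into the overall ghost map g. Since each map is only defined on its own region, we assume each is authoritative where b selects it; this composition convention is our reading of g = g_w ∧ g_l, not spelled out in the text.

```python
import numpy as np

def overall_ghost_map(g_w, g_l, b):
    """Combine the per-region ghost maps into g (1 = use the pixel in fusion).

    g_w is valid on the well-contrasted region (b == 1), g_l on the
    low-contrast region (b == 0); each map is ignored outside its region.
    """
    g = np.ones_like(b)
    g[b == 1] = g_w[b == 1]
    g[b == 0] = g_l[b == 0]
    return g
```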

Construction of g_w

The construction of g_w is based on the ZNCC measure, from the observation that the ZNCC of a region containing a moving object is low compared with that of a static region. The energy function for this binary labeling problem is defined as
E_W(g_w) = \sum_{p \in P_W} E_W^D(g_w(p)) + \gamma_W \times \sum_{(p,q) \in N_W} E_W^S(g_w(p), g_w(q))
(2)

where g_w(p) is the label (1 or 0) at pixel p, P_W is the set of pixels in the well-contrasted region (all p with b(p)=1), N_W is the set of all unordered pairs of neighboring pixels over the area b(p)=1, and γ_W is a weighting factor balancing the data cost E_W^D and the smoothness cost E_W^S.

The data cost E_W^D

The ZNCC of a region R centered at a pixel p, with respect to the corresponding region of the reference image, is defined as
Z(p) = \frac{\sum_{p' \in R} D_{\mathrm{ref}}(p') \times D(p')}{\sqrt{\sum_{p' \in R} D_{\mathrm{ref}}(p')^2 \times \sum_{p' \in R} D(p')^2}}
(3a)
where
D_{\mathrm{ref}}(p') = I_{\mathrm{ref}}(p') - \bar{I}_{\mathrm{ref}}
(3b)
D(p') = I(p') - \bar{I}
(3c)
where I_ref and I denote the reference and a given frame, respectively, and Ī_ref and Ī represent the mean values of I_ref and I over the region R. When there is no moving object in the scene, the histogram of the ZNCC usually appears as in Figure 2, because the ZNCC is close to 1 at most pixels. Hence, the histogram can be modeled as a left-sided normal distribution with mean close to 1. On the other hand, Figure 3 shows another pair of images capturing a scene with moving objects. Since the ZNCC becomes very small at the pixels of moving objects, the ZNCC distribution becomes multi-modal, which can be considered a mixture of two or more Gaussian distributions. Since each pixel has only two states in our problem (moving or not), we model the distribution as a sum of two Gaussian functions as
pr(x) = \sum_{i=1}^{2} pr(x \mid i)\, Pr(i)
(4)
Figure 2

A pair of static images without moving objects (left) and the histogram of ZNCC for this set of images (right).

Figure 3

A pair of dynamic images with moving people (left) and the histogram of ZNCC for this set of images (right).

where i is the state and the input data x is the ZNCC value. By learning with the EM algorithm, we find the parameters of the two Gaussian density functions, namely the means μ_i, the standard deviations σ_i, and the weights P_r(i).
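The ZNCC of Equation 3 and the two-component mixture fit of Equation 4 can be sketched as below. This is a brute-force illustration: the 7×7 window is an assumed size, and scikit-learn's `GaussianMixture` stands in for the paper's EM procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def zncc_map(ref, img, win=7):
    """ZNCC (Equation 3) of each win x win window of img against ref."""
    h, w = ref.shape
    r = win // 2
    z = np.ones((h, w))
    for y in range(r, h - r):
        for x in range(r, w - r):
            a = ref[y - r:y + r + 1, x - r:x + r + 1].astype(float)
            b = img[y - r:y + r + 1, x - r:x + r + 1].astype(float)
            da, db = a - a.mean(), b - b.mean()
            denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
            if denom > 0:
                z[y, x] = (da * db).sum() / denom
    return z

def fit_zncc_mixture(z):
    """Fit the two-Gaussian mixture of Equation 4 to the ZNCC values by EM.

    Returns (mu, sigma, weight) sorted so that index 0 is the low-mean
    ('ghost') component and index 1 the high-mean ('non-ghost') one.
    """
    gmm = GaussianMixture(n_components=2, random_state=0).fit(z.reshape(-1, 1))
    order = np.argsort(gmm.means_.ravel())
    mu = gmm.means_.ravel()[order]
    sigma = np.sqrt(gmm.covariances_.ravel()[order])
    weight = gmm.weights_[order]
    return mu, sigma, weight
```

In practice, the brute-force double loop would be replaced by a filtering-based (box-filter) ZNCC for speed; the math is unchanged.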

With these models and parameters, the data term E_W^D is designed to penalize mis-labeled pixels (e.g., a pixel labeled 1 while it is likely a ghost). Specifically, the data cost is constructed as the negative log-likelihood of each of the two Gaussian density functions as
E_W^D(\text{`ghost'}) = -\ln \Pr(x_p \mid G)
(5a)
E_W^D(\text{`non-ghost'}) = -\ln \Pr(x_p \mid \mathrm{NG})
(5b)
where
\Pr(x_p \mid G) = \frac{1}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{(x_p - \mu_1)^2}{2\sigma_1^2}\right)
(5c)
\Pr(x_p \mid \mathrm{NG}) = \frac{1}{\sqrt{2\pi}\,\sigma_2} \exp\!\left(-\frac{(x_p - \mu_2)^2}{2\sigma_2^2}\right).
(5d)

The smoothness cost E_W^S

The smoothness cost E_W^S is designed using the Potts model [32] as
E_W^S(g_w(p), g_w(q)) = B(p,q) \times \delta(g_w(p), g_w(q))
(6a)
where
\delta(g_w(p), g_w(q)) = \begin{cases} 1, & \text{if } g_w(p) \neq g_w(q) \\ 0, & \text{otherwise} \end{cases}
(6b)
B(p,q) = \exp\!\left(-\frac{(I(p) - I(q))^2}{2\sigma^2}\right)
(6c)

where B(p,q) is an edge cue that reflects the intensity difference between the neighboring pixels. When adjacent pixels straddle an edge, the smoothness cost is diminished by B(p,q), so that label changes are cheap across edges.
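For one neighboring pair (p, q), the Potts term of Equations 6a to 6c reads as below; σ = 10 for the edge cue is an assumed value, not one given in the paper.

```python
import numpy as np

def smoothness_cost(gw_p, gw_q, i_p, i_q, sigma=10.0):
    """Potts smoothness term of Equations 6a-6c for one neighbor pair (p, q)."""
    if gw_p == gw_q:
        return 0.0  # delta = 0: equal labels incur no penalty
    # Edge cue B(p, q): a large intensity difference shrinks the penalty,
    # so label changes become cheap across image edges.
    return float(np.exp(-(float(i_p) - float(i_q)) ** 2 / (2 * sigma ** 2)))
```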

For each input frame, the binary map g_w is found by constructing and minimizing Equation 2. For example, Figure 4a,b shows the same area of two multi-exposure images: Figure 4a is a crop of the reference frame, and Figure 4b is a crop of an under-exposed image in which moving people appear. The binary map resulting from the above equation is shown in Figure 4d, where the white area (labeled 1) is the non-ghost area and the dark area (labeled 0) contains motion pixels to be excluded. Figure 4c shows the binary map obtained with our previous method [26]; the comparison with Figure 4d shows that the proposed method gives a more accurate ghost map.
Figure 4

A comparison of ghost map with hard thresholding method [26].(a) Reference image. (b) A differently exposed image where people appear on the left region. (c) Ghost map by hard thresholding method [26]. (d) Ghost map by the proposed method.

Construction of g_l

The ghost map g_w for the well-contrasted region is found by the above procedure, and we now find the ghost map g_l for the low-contrast region (the region with b(p)=0). The problem with the low-contrast region is that there is too little texture to apply feature-based measures (such as the median pixel value [22], the gradient [23], or the ZNCC). Hence, we resort to the intensity relationship between the frames to detect motion pixels in these areas. The basic idea is that the intensity of a static area changes according to the amount of exposure difference, whereas that of an area with motion pixels does not. In other words, the luminance of a static area changes according to the IMF, whereas that of a dynamic area does not.

Based on the above observation, we design the energy function for finding g l as
E_L(g_l) = \sum_{p \in P_L} E_L^D(g_l(p)) + \gamma_L \times \sum_{(p,q) \in N_L} E_L^S(g_l(p), g_l(q))
(7)

where g_l(p) is the label (1 or 0) at pixel p, P_L is the set of pixels in the low-contrast region (all p with b(p)=0), N_L is the set of all unordered pairs of neighboring pixels over the area b(p)=0, and γ_L is a weighting factor balancing the data cost E_L^D and the smoothness cost E_L^S. The smoothness cost E_L^S, which prevents noisy results, is defined as in Equations 6a and 6c, with g_w replaced by g_l.

The data cost E_L^D

As stated above, we use compliance with the IMF to detect moving pixels in the low-contrast region, and thus we first have to estimate the IMF. Unlike the existing IMF estimation methods [24, 25], which use all pixels without considering pixel quality, we use only pixels in high-contrast regions without moving objects, i.e., the region with g_w(p)=1 for the given image.

The IMF estimation is shown graphically in Figure 5, where Figure 5a shows the binary map g_w; the white pixels are those with g_w(p)=1. Figure 5b shows the overlap of this map with the reference frame, i.e., the pixel-wise multiplication of the map in Figure 5a and the reference image. Likewise, Figure 5c shows the multiplication of the map with the given input image to be compared with the reference. The IMF is then estimated by comparing only the colored pixels of Figure 5b,c. Figure 5d plots the pairs of pixel values from these colored regions; the red line, obtained by fitting a fourth-order polynomial to the dots [24, 25], is taken as the IMF. We then define a tolerance range of intensity variation around the IMF (the upper and lower blue lines in Figure 5): a dot outside this range is a pair of pixels of which one possibly belongs to a moving object. Figure 5e shows how this range is determined: it shows the histogram of the distances of the dots from the IMF curve, and a dot outside the 4σ range (outside the blue lines in Figure 5f) is considered a ghost pixel pair.
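The curve fitting and tolerance selection can be sketched as follows, assuming the reliable pixel pairs (those with g_w(p) = 1) have already been collected into two 1-D arrays; taking the 4σ tolerance as four times the standard deviation of the fitting residuals is our reading of Figure 5e.

```python
import numpy as np

def estimate_imf(ref_vals, img_vals):
    """Fit the IMF as a fourth-order polynomial [24, 25] over reliable
    pixel pairs, and derive the 4-sigma tolerance from the residuals."""
    coeffs = np.polyfit(ref_vals, img_vals, deg=4)
    imf = np.poly1d(coeffs)
    residuals = img_vals - imf(ref_vals)
    tol = 4.0 * residuals.std()
    # A pair with |I(p) - imf(I_ref(p))| > tol possibly contains a moving pixel.
    return imf, tol
```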
Figure 5

Estimation of IMF between a given exposure image and the reference. (a) A binary map where the white pixel denotes g w (p)=1 (well contrasted and static). (b) Overlap of the reference image with the binary map. (c) Overlap of comparing image with the binary map. (d) Each dot represents a pair of pixel values (only for the pair of colored pixels of (b) and (c)), and the red line is the estimated IMF by curve fitting. (e) Histogram of the distances of dots from the IMF. (f) The dots out of 4σ range are considered to include ghost pixels.

Based on the estimated IMF, we design the data cost of energy function as
E_L^D(g_l(p)) = \begin{cases} 0, & \text{if } \big(g_l(p) = 0 \text{ and } G(p) > 4\sigma\big) \text{ or } \big(g_l(p) = 1 \text{ and } G(p) \le 4\sigma\big) \\ 1, & \text{otherwise} \end{cases}
(8a)
where
G(p) = \left| I(p) - \mathrm{IMF}(I_{\mathrm{ref}}(p)) \right|
(8b)
where I(p) is the intensity of pixel p in the given frame and IMF(I_ref(p)) is the mapping of I_ref(p) by the IMF. Minimizing the total energy by graph cuts then gives the binary map g_l for the given input frame. Figure 6 compares the binary maps (ghost pixels shown in black) generated by the hard thresholding method [29] and by the proposed optimization method. Figure 6a,b shows the reference image and a differently exposed image, respectively, and Figure 6c,d shows the ghost maps of the thresholding and proposed methods, respectively. Figure 6e,f shows crops of the above images for closer comparison. It can be observed that the optimization leads to a less noisy ghost map.
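For a single low-contrast pixel, the data term of Equations 8a and 8b can be written as below; imf is any callable implementing the estimated IMF, and tol is the 4σ tolerance (a sketch of the cost definition, not of the graph-cut minimization itself).

```python
def low_contrast_data_cost(label, i_p, i_ref_p, imf, tol):
    """Data term E_L^D of Equations 8a-8b for one low-contrast pixel p.

    label: proposed g_l(p) (1 = keep in fusion, 0 = ghost).
    imf:   callable implementing the estimated IMF; tol: the 4-sigma tolerance.
    """
    g = abs(float(i_p) - float(imf(i_ref_p)))  # G(p) of Equation 8b
    agrees = (label == 0 and g > tol) or (label == 1 and g <= tol)
    return 0.0 if agrees else 1.0
```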
Figure 6

Ghost maps obtained by the IMF-based methods. (a) Reference image. (b) Differently exposed image. (c) Ghost map by hard thresholding method [29]. (d) Ghost map by the proposed method. (e) Magnification of (c). (f) Magnification of (d).

Experimental results

In the previous section, we have seen that each step provides a less noisy and/or more plausible ghost map than the hard thresholding methods [26, 29]. In this section, we compare the accuracy of the overall ghost map g = g_w ∧ g_l with those of existing methods. First, we compare the ghost maps with those of [22, 23, 26] for the pair of the reference image in Figure 7a and a differently exposed image in Figure 7b, in which people appear in the red box area. Hence, the ground truth ghost map for this area should be as in Figure 7c, and the ghost maps produced by [22, 23, 26] and the proposed method appear in Figure 7d,e,f,g, respectively. As can be observed, the median threshold bitmap approach [22] fails to detect the moving object, and the gradient domain method [23] labels non-ghost regions as ghost; the reason seems to be that the brightness/gradient difference between the moving object and the background is too small. Our previous work [26] used a weighting factor on the ZNCC measure to suppress the ghost effect aggressively, so it also labels some non-ghost pixels as ghost. On the other hand, the ghost map produced by the proposed algorithm is closest to the ground truth. Figures 8 and 9 show further results in which the proposed method provides the better ghost map.
Figure 7

Comparison of the overall ghost maps. (a) Reference image and the magnification of the red box region. (b) Under-exposed image and the magnification of the red box region. (c) Ground truth ghost map for the red box region. (d) Motion map by [22]. (e) Consistency map by [23]. (f) Ghost map by [26]. (g) Ghost map by the proposed method.

Figure 8

Another comparison of ghost maps. (a) Reference image and the magnification of red box region. (b) Over-exposed image and its magnification in the red box. (c) Ground truth ghost map for the red box region. (d) Motion map by [22]. (e) Consistency map by [23]. (f) Ghost map by [26]. (g) Ghost map by our method.

Figure 9

Comparison of ghost maps for other set of images. (a) Reference image and the magnification of red box region. (b) Under-exposed image and its magnification in the red box. (c) Ground truth ghost map for the red box region. (d) Motion map by [22]. (e) Consistency map by [23]. (f) Ghost map by [26]. (g) Ghost map by the proposed method.

Figures 10 and 11 compare [25] and the proposed method in detecting the static region and estimating the IMF between the reference and a given frame. Figure 10a shows the static region detected by [25] (the dynamic region is shown in black and only the static region remains) for the pair of images in Figure 8a,b, together with the IMF estimated from this result. It can be seen that the moving people (dynamic region) are not removed and, conversely, the floor is detected as moving pixels. In the result of our method (Figure 10b), the dynamic regions are successfully detected, and hence we obtain a more plausible IMF, which should be a monotonically increasing function [15]. Figure 11 shows a similar result for the multi-exposure images in Figure 9a,b.
Figure 10

Comparison with Raman and Chaudhuri [25].(a) (top) Detected static region of over-exposed source image in Figure 8b by [25] and (bottom) estimation of IMF (the red line which is obtained by curve fitting the dots by the fourth-order polynomial) using the static region. (b) The result by the proposed method.

Figure 11

Another comparison with Raman and Chaudhuri [25].(a) Detected static region of under-exposed source image in Figure 9b by [25] and (bottom) estimation of IMF (the red line which is obtained by curve fitting the dots by the fourth-order polynomial) using the static region. (b) Result by the proposed method.

Figures 12, 13, and 14 compare the final fusion results of eight existing methods [16, 17, 20–23, 26, 30] and the proposed method. Specifically, Figure 12a shows the sequence of multi-exposure images, and Figure 12b,c,d compares [16, 22] and [26] with the proposed method. The first row of these figures shows the areas for comparison in the red boxes, the second row shows the results of the compared methods in the order [16, 22, 26], and the bottom row shows our results for the corresponding areas. The compared methods show some noticeable ghosts, whereas the proposed method does not. Figure 12f shows the overall fusion result of our method, and Figure 12e shows the result of [20]. The comparison shows that our method yields output comparable to the HDRI approach by Oh et al. [20], which also yields almost no noticeable artifacts for this image sequence. Likewise, Figure 13 compares [16, 20, 21, 23, 30] and [17] with the proposed method. The photos in Figure 13a are the input images, and the second rows of Figure 13b,c,d,e show the results of [16, 23, 30] and [20], respectively, while the images in the third rows are the results of the proposed method for the same areas. The method of Gallo et al. [16] removes ghosts successfully; however, when the brightness difference among neighboring patches is large, it causes some visible seams, as shown in Figure 13b. In Figure 13c,d,e, ghost artifacts are visible in the existing methods, whereas no artifact is noticeable in our result (third row). Figure 13f,g,h shows the results of [17, 21] and the proposed method for this set of images, where no ghost artifact is noticeable over the whole area. Figure 14 shows another comparison of [22, 23, 30] and [21] with the proposed method.
Figure 14a shows the set of input images, and the second row of Figure 14b,c,d,e shows the results of [22, 23, 30] and [21], respectively, for the red box areas of the images in the first row. The images in the third row are the results of the proposed method for the same areas, and Figure 14f shows the result of our method for the overall area. The proposed method shows almost no ghost artifacts, while the others have some noticeable artifacts, as seen in the second row.
Figure 12

Comparison of results for a set of multi-exposure images. (a) Multi-exposure images. (b) Result by the patch-based algorithm based on the HDRI method [16]. (c) Result by the median threshold bitmap approach [22]. (d) Result by our previous method [26]. (e) Result by the low-rank matrix-based approach [20]. (f) Result by the proposed method.

Figure 13

Another comparison of results for a set of multi-exposure images. (a) Multi-exposure images. (b) Result by the patch-based algorithm based on the HDRI method [16]. (c) Gradient domain approach [23]. (d) PatchMatch-based method [30]. (e) Low-rank approach in [20]. (f) Low-rank matrix-based approaches in [21]. (g) Hybrid patch-based approach [17]. (h) The proposed method.

Figure 14

Another comparison of results for a set of multi-exposure images. (a) Multi-exposure images. (b) Result by the median threshold bitmap approach [22]. (c) Gradient domain approach [23]. (d) PatchMatch-based method [30]. (e) Low-rank matrix-based approach [21]. (f) The proposed method.

It is noted that our method has some limitations when the selected reference contains a saturated, moving foreground object. For example, Figure 15a shows a set of multi-exposure images in which the third image is selected as the reference frame because it has the largest well-contrasted background region. In this case, the proposed algorithm yields the fusion result shown in Figure 15c, because the foreground object is moving and hence excluded from the fusion process. On the other hand, since the algorithm in [17] sets the first image as the reference and tracks the inconsistent pixels, it keeps the foreground object very well, as shown in Figure 15b. To keep the contrast of the foreground object, we have to select the reference manually in such cases: if we select the first frame as the reference, we obtain the result shown in Figure 15d. Finally, it is worth noting that the fusion results can be further enhanced by conventional histogram equalization or an edge-preserving enhancement method such as [33], just as HDRI performs tone mapping for the optimal display of HDR content on LDR devices. The executables for our algorithm and full-resolution results with this post-processing are available at http://ispl.snu.ac.kr/~jhahn/deghost/ and as Additional files 1 and 2.
Figure 15

Another comparison of results. (a) Multi-exposure images. (b) Result by the hybrid patch-based approach [17]. (c) The proposed method. (d) The proposed method with the manual selection of the reference frame.

Conclusions

We have proposed an HDR image fusion algorithm with reduced ghost artifacts, which detects inconsistent pixels in high-contrast and low-contrast regions separately. To detect inconsistent pixels in high-contrast areas, a ZNCC measure is used, based on the observation that the ZNCC histogram is unimodal for static regions and multimodal for dynamic ones. A cost function based on the parameters of these probability distributions is designed, whose minimization yields the ghost map for the well-contrasted region. For the low-contrast region, the IMF is first estimated using pixels from high-contrast regions containing no moving objects; then a cost function encoding the IMF compliance of the pixel pairs is designed, whose minimization gives the ghost map for the low-contrast areas. The overall ghost map is the logical combination of these two maps, and the ghost pixels are excluded from the fusion process. Since the proposed algorithm can find faint moving objects in areas where the pixel values are nearly saturated due to over- or under-exposure, it provides satisfactory HDR outputs with no noticeable ghost artifacts. However, the proposed method has limitations in handling a moving foreground object that is saturated in the reference frame (Figure 15), because such pixels are simply excluded from the fusion process. In this case, we have to manually select a reference frame with a well-exposed foreground object, which can degrade the fusion result because its well-exposed background region is narrower than that of the automatically selected reference. Otherwise, we would need to correct the inconsistent pixels instead of simply excluding them, which is a very challenging problem, especially when the moving foreground object is not consistently detected in each frame due to saturation, noise, or non-rigid motion.

Declarations

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2009-0083495).

Authors’ Affiliations

(1)
Department of Electrical and Computer Engineering, INMC, Seoul National University
(2)
Samsung SDS

References

  1. Debevec PE, Malik J: Recovering high dynamic range radiance maps from photographs. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH’97). Los Angeles; Aug 1997:369-378.
  2. Reinhard E, Ward G, Pattanaik S, Debevec P: High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting (The Morgan Kaufmann Series in Computer Graphics). Morgan Kaufmann, San Francisco; 2005.
  3. Mann S, Picard RW: On being ‘undigital’ with digital cameras: extending dynamic range by combining differently exposed pictures. In Proceedings of the 48th IS&T’s Annual Conference. Washington, DC; May 1995:442-448.
  4. Devlin K: A review of tone reproduction techniques. Technical report CSTR-02-005. Department of Computer Science, University of Bristol; 2002.
  5. Goshtasby AA: Fusion of multi-exposure images. Image Vis. Comput 2005, 23(6):611-618. 10.1016/j.imavis.2005.02.004
  6. Mertens T, Kautz J, Reeth FV: Exposure fusion: a simple and practical alternative to high dynamic range photography. Comput. Graph. Forum 2009, 28(1):161-171. 10.1111/j.1467-8659.2008.01171.x
  7. Raman S, Chaudhuri S: A matte-less, variational approach to automatic scene compositing. In Proceedings of the 11th IEEE International Conference on Computer Vision. Los Alamitos; Oct 2007:1-6.
  8. Malik MH, Asif S, Gilani M: Wavelet based exposure fusion. In Proceedings of the World Congress on Engineering. London; July 2008:688-693.
  9. Raman S, Chaudhuri S: Bilateral filter based compositing for variable exposure photography. In Proceedings of Eurographics Short Papers. Munich; Mar 2009:1-4.
  10. Shen J, Zhao Y, He Y: Detail-preserving exposure fusion using subband architecture. Vis. Comput 2012, 28(5):463-473. 10.1007/s00371-011-0642-3
  11. Jacobs K, Loscos C, Ward G: Automatic high-dynamic range image generation for dynamic scenes. IEEE Comput. Graph. Appl 2008, 28(2):84-93.
  12. Khan E, Akyuz A, Reinhard E: Robust generation of high dynamic range images. In Proceedings of the IEEE International Conference on Image Processing. Atlanta; Oct 2006:2005-2008.
  13. Li Z, Rahardja S, Zhu Z, Xie S, Wu S: Movement detection for the synthesis of high dynamic range images. In Proceedings of the IEEE International Conference on Image Processing. Hong Kong; Sept 2010:3133-3136.
  14. Wu S, Xie S, Rahardja S, Li Z: A robust and fast anti-ghosting algorithm for high dynamic range imaging. In Proceedings of the IEEE International Conference on Image Processing. Hong Kong; Sept 2010:397-400.
  15. Grossberg MD, Nayar SK: Determining the camera response from images: what is knowable? IEEE Trans. Pattern Anal. Mach. Intell 2003, 25(11):1455-1467. 10.1109/TPAMI.2003.1240119
  16. Gallo O, Gelfand N, Chen W, Tico M, Pulli K: Artifact-free high dynamic range imaging. In IEEE International Conference on Computational Photography. San Francisco; Apr 2009:1-7.
  17. Zheng J, Li Z, Zhu Z, Wu S, Rahardja S: Hybrid patching for a sequence of differently exposed images with moving objects. IEEE Trans. Image Process 2013, 22(12):5190-5201.
  18. Zimmer H, Bruhn A, Weickert J: Freehand HDR imaging of moving scenes with simultaneous resolution enhancement. Comput. Graph. Forum 2011, 30(2):405-414. 10.1111/j.1467-8659.2011.01870.x
  19. Hu J, Gallo O, Pulli K: Exposure stacks of live scenes with hand-held cameras. In 12th European Conference on Computer Vision. Firenze; Oct 2012:499-512.
  20. Oh T-H, Lee J-Y, Kweon IS: High dynamic range imaging by a rank-1 constraint. In IEEE International Conference on Image Processing. Melbourne; Sept 2013:790-794.
  21. Lee C, Li Y, Monga V: Ghost-free high dynamic range imaging via rank minimization. IEEE Signal Process. Lett 2014, 21(9):1045-1049.
  22. Pece F, Kautz J: Bitmap movement detection: HDR for dynamic scenes. In The 11th European Conference on Visual Media Production. London; Nov 2010:1-8.
  23. Zhang W, Cham W-K: Gradient-directed composition of multi-exposure images. In IEEE Conference on Computer Vision and Pattern Recognition. San Francisco; June 2010:530-536.
  24. Raman S, Chaudhuri S: Bottom-up segmentation for ghost-free reconstruction of a dynamic scene from multi-exposure images. In Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing. Chennai; Dec 2010:56-63.
  25. Raman S, Chaudhuri S: Reconstruction of high contrast images for dynamic scenes. Vis. Comput 2011, 27(12):1099-1114. 10.1007/s00371-011-0653-0
  26. An J, Lee SH, Kuk JG, Cho NI: A multi-exposure image fusion algorithm without ghost effect. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Prague; May 2011:1565-1568.
  27. Malik J, Perona P: Preattentive texture discrimination with early vision mechanisms. J. Opt. Soc. Am. A 1990, 7(5):923-932. 10.1364/JOSAA.7.000923
  28. Burt P, Adelson E: The Laplacian pyramid as a compact image code. IEEE Trans. Comm 1983, 31(4):532-540. 10.1109/TCOM.1983.1095851
  29. An J, Ha SJ, Kuk JG, Cho NI: Reduction of ghost effect in exposure fusion by detecting the ghost pixels in saturated and non-saturated regions. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Kyoto; Mar 2012:1101-1104.
  30. Hu J, Gallo O, Pulli K, Sun X: HDR deghosting: how to deal with saturation? In IEEE Conference on Computer Vision and Pattern Recognition. Portland; June 2013:1163-1170.
  31. Boykov YY, Jolly MP: Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In Proceedings of the International Conference on Computer Vision. Vancouver; July 2001:105-112.
  32. Boykov Y, Veksler O, Zabih R: Markov random fields with efficient approximations. In IEEE Conference on Computer Vision and Pattern Recognition. Santa Barbara; June 1998:648-655.
  33. He K, Sun J, Tang X: Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell 2013, 35(6):1397-1409.

Copyright

© An et al.; licensee Springer. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.