ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Articles  (2,319)
  • Institute of Electrical and Electronics Engineers (IEEE)  (2,319)
  • IEEE Transactions on Image Processing  (2,319)
  • 1412
  • Electrical Engineering, Measurement and Control Technology  (2,319)
  • Sociology
  • 1
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: Throughout the past few decades, the separable discrete cosine transform (DCT), particularly the DCT type II, has been widely used in image and video compression. It is well-known that, under first-order stationary Markov conditions, DCT is an efficient approximation of the optimal Karhunen–Loève transform. However, for natural image and video sources, the adaptivity of a single separable transform with fixed core is rather limited for the highly dynamic image statistics, e.g., textures and arbitrarily directed edges. It is also known that non-separable transforms can achieve better compression efficiency for images with directional texture patterns, yet they are computationally complex, especially when the transform size is large. In order to achieve higher transform coding gains with relatively low-complexity implementations, we propose a joint separable and non-separable transform. The proposed separable primary transform, named enhanced multiple transform (EMT), applies multiple transform cores from a pre-defined subset of sinusoidal transforms, and the transform selection is signaled in a joint block level manner. Moreover, a non-separable secondary transform (NSST) method is proposed to operate in conjunction with EMT. Unlike the existing non-separable transform schemes which require excessive amounts of memory and computation, the proposed NSST efficiently improves coding gain with much lower complexity. Extensive experimental results show that the proposed methods, in a state-of-the-art video codec, such as high efficiency video coding, can provide significant coding gains (average 6.9% and 4.5% bitrate reductions for intra and random-access coding, respectively).
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
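The separable DCT type II mentioned in the abstract above can be illustrated with a minimal sketch. It is not the EMT or NSST scheme from the article; it only shows what "separable" means: a 1D transform applied along rows, then along columns. The function names `dct_ii` and `dct_ii_2d` are hypothetical, and the orthonormal scaling is an assumption.

```python
import math


def dct_ii(x):
    """Orthonormal DCT type II of a 1D sequence (illustrative helper)."""
    n = len(x)
    out = []
    for k in range(n):
        # Orthonormal scaling: the DC term uses sqrt(1/n), the others sqrt(2/n).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(scale * acc)
    return out


def dct_ii_2d(block):
    """Separable 2D DCT-II: transform every row, then every column."""
    row_pass = [dct_ii(row) for row in block]
    # zip(*...) transposes, so the same 1D routine runs along columns.
    col_pass = [dct_ii(list(col)) for col in zip(*row_pass)]
    return [list(row) for row in zip(*col_pass)]


# A flat 4x4 block has all of its energy in the DC coefficient.
flat_block = [[1.0] * 4 for _ in range(4)]
coeffs = dct_ii_2d(flat_block)
```

For a block with strong diagonal edges the energy would spread over many coefficients, which is the limitation of a fixed separable core that the abstract describes.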
  • 2
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Non-blind image deconvolution is an ill-posed problem. The presence of noise and band-limited blur kernels makes the solution of this problem non-unique. Existing deconvolution techniques produce a residual between the sharp image and the estimation that is highly correlated with the sharp image, the kernel, and the noise. In most cases, different restoration models must be constructed for different blur kernels and different levels of noise, resulting in low computational efficiency or highly redundant model parameters. Here we aim to develop a single model that handles different types of kernels and different levels of noise: general non-blind deconvolution. Specifically, we propose a very deep convolutional neural network that predicts the residual between a pre-deconvolved image and the sharp image rather than the sharp image itself. The residual learning strategy makes it easier to train a single model for different kernels and different levels of noise, encouraging high effectiveness and efficiency. Quantitative evaluations demonstrate the practical applicability of the proposed model for different blur kernels. The model also achieves state-of-the-art performance on synthesized blurry images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 3
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Removing undesired reflections from images taken through glass has broad application in image processing and computer vision tasks. Existing single image-based solutions rely heavily on scene priors such as separable sparse gradients caused by different levels of blur, and they are fragile when such priors are not observed. In this paper, we observe that strong reflections usually dominate a limited region of the whole image, and propose a region-aware reflection removal approach that automatically detects and heterogeneously processes regions with and without reflections. We integrate content and gradient priors to jointly achieve missing content restoration, as well as background and reflection separation, in a unified optimization framework. Extensive validation using 50 sets of real data shows that the proposed method outperforms the state of the art in both quantitative metrics and visual quality.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 4
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Unlike image blending algorithms, video blending algorithms have been little studied. In this paper, we investigate six popular blending algorithms: feather blending, multi-band blending, modified Poisson blending, mean value coordinate blending, multi-spline blending, and convolution pyramid blending. We consider their application to blending real-time panoramic videos, a key problem in various virtual reality tasks. To evaluate the performance and suitability of the six algorithms for this problem, we have created a video benchmark with several videos captured under various conditions. We analyze the time and memory needed by the six algorithms, for both CPU and GPU implementations (where readily parallelizable). The visual quality provided by these algorithms is also evaluated both objectively and subjectively. The video benchmark and algorithm implementations are publicly available at http://cg.cs.tsinghua.edu.cn/blending/.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 5
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Feature extraction is a very important step for polarimetric synthetic aperture radar (PolSAR) image classification. Many dimensionality reduction (DR) methods have been employed to extract features for supervised PolSAR image classification. However, these DR-based feature extraction methods consider each pixel independently and thus fail to take into account the spatial relationship of neighboring pixels, so their performance may not be satisfactory. To address this issue, we introduce a novel tensor local discriminant embedding (TLDE) method for feature extraction for supervised PolSAR image classification. The proposed method combines the spatial and polarimetric information of each pixel by characterizing the pixel with the patch centered at this pixel. Each pixel is then represented as a third-order tensor in which the first two modes indicate the spatial information of the patch (i.e., the row and the column of the patch) and the third mode denotes the polarimetric information of the patch. Based on the label information of samples and the redundancy of the spatial and polarimetric information, a supervised tensor-based DR technique, called TLDE, is introduced to find three projections which project each pixel, that is, each third-order tensor, into a low-dimensional feature. Finally, classification is completed based on the extracted features using the nearest neighbor classifier and the support vector machine classifier. The proposed method is evaluated on two real PolSAR data sets and on simulated PolSAR data sets with various numbers of looks. The experimental results demonstrate that the proposed method not only improves the classification accuracy greatly but also alleviates the influence of speckle noise on classification.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 6
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: In this paper, we propose two novel regularization models, one patch-wise and one pixel-wise, which efficiently reconstruct a high-resolution (HR) face image from a low-resolution (LR) input. Unlike the conventional patch-based models, which depend on the assumption of local geometry consistency in LR and HR spaces, the proposed method directly regularizes the relationship between the target patch and the corresponding training set in the HR space. It avoids dealing with the tough problem of preserving local geometry across resolutions. Taking advantage of the ability of kernel functions to describe intrinsic features efficiently, we further conduct the patch-based reconstruction in the high-dimensional kernel space to capture nonlinear characteristics. Meanwhile, a pixel-based model is proposed to regularize the relationship of pixels in the local neighborhood, which can be employed to enhance the fuzzy details in the target HR face image. It favors the reconstruction of pixels along the dominant orientation of structure, which is useful for preserving high-frequency information on complex edges. Finally, we combine the two reconstruction models into a unified framework. The output HR face image is optimized by performing an iterative procedure. Experimental results demonstrate that the proposed face hallucination method outperforms the state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 7
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Most existing image denoising methods learn image priors from either external data or the noisy image itself to remove noise. However, priors learned from external data may not be adaptive to the image to be denoised, while priors learned from the given noisy image may not be accurate due to the interference of the corrupting noise. Meanwhile, the noise in real-world noisy images is very complex and is hard to describe by simple distributions such as the Gaussian distribution, making real-world noisy image denoising a very challenging problem. We propose to exploit the information in both external data and the given noisy image, and develop an external prior guided internal prior learning method for real-world noisy image denoising. We first learn external priors from an independent set of clean natural images. With the aid of the learned external priors, we then learn internal priors from the given noisy image to refine the prior model. The external and internal priors are formulated as a set of orthogonal dictionaries to efficiently reconstruct the desired image. Extensive experiments are performed on several real-world noisy image datasets. The proposed method demonstrates highly competitive denoising performance, outperforming state-of-the-art denoising methods including those designed for real-world noisy images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 8
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Human motion capture data has been widely used in many areas, but it involves a complex capture process and the captured data inevitably contains missing data due to occlusions caused by the actor’s body or clothing. Motion recovery, which aims to recover the underlying complete motion sequence from its degraded observation, remains a challenging task due to the nonlinear structure and kinematic properties embedded in motion data. Low-rank matrix completion-based methods have shown promising performance in short-time-missing motion recovery problems. However, low-rank matrix completion, which is designed for linear data, lacks a theoretical guarantee when applied to the recovery of nonlinear motion data. To overcome this drawback, we propose a tailored nonlinear matrix completion model for human motion recovery. Within the model, we first learn a combined low-rank kernel via multiple kernel learning. By exploiting the learned kernel, we embed the motion data into a high-dimensional Hilbert space where the motion data has the desired low rank, and we then use low-rank matrix completion to recover motions. In addition, we add two kinematic constraints to the proposed model to preserve the kinematic properties of human motion. Extensive experimental results and comparisons with five other state-of-the-art methods demonstrate the advantage of the proposed method.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 9
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Various 3D reconstruction methods have enabled civil engineers to detect damage on a road surface. To achieve the millimeter accuracy required for road condition assessment, a disparity map with subpixel resolution needs to be used. However, none of the existing stereo matching algorithms is especially suitable for the reconstruction of the road surface. Hence, in this paper, we propose a novel dense subpixel disparity estimation algorithm with high computational efficiency and robustness. This is achieved by first transforming the perspective view of the target frame into the reference view, which not only increases the accuracy of the block matching for the road surface but also improves the processing speed. The disparities are then estimated iteratively using our previously published algorithm, where the search range is propagated from three estimated neighboring disparities. Since the search range is obtained from the previous iteration, errors may occur when the propagated search range is not sufficient. Therefore, a correlation maxima verification is performed to rectify this issue, and the subpixel resolution is achieved by conducting a parabola interpolation enhancement. Furthermore, a novel disparity global refinement approach developed from Markov random fields and fast bilateral stereo is introduced to further improve the accuracy of the estimated disparity map, where disparities are updated iteratively by minimizing the energy function that is related to their interpolated correlation polynomials. The algorithm is implemented in C with near real-time performance. The experimental results illustrate that the absolute error of the reconstruction varies from 0.1 to 3 mm.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
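The parabola interpolation step named in the abstract above can be sketched as follows. This is the generic three-point parabola fit around an integer correlation maximum, not necessarily the exact enhancement used in the article; the function name `subpixel_offset` and the sample correlation values are assumptions made for illustration.

```python
def subpixel_offset(c_prev, c_peak, c_next):
    """Offset of the parabola vertex relative to the integer peak position.

    c_prev, c_peak, c_next are correlation scores at disparities
    d - 1, d and d + 1, where d is the integer disparity with the best score.
    """
    denom = c_prev - 2.0 * c_peak + c_next
    if denom == 0.0:
        # The three samples are collinear, so there is no vertex to locate.
        return 0.0
    return 0.5 * (c_prev - c_next) / denom


# Hypothetical correlation curve that peaks at disparity 5.3.
def correlation(d):
    return -(d - 5.3) ** 2


integer_disparity = 5
refined = integer_disparity + subpixel_offset(
    correlation(4), correlation(5), correlation(6)
)
```

Because the sample curve is itself a parabola, the fit recovers the true peak; on real correlation data the result is only an approximation.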
  • 10
    Publication Date: 2018-03-31
    Description: The discriminability of the bag-of-words representations can be increased via encoding the spatial relationship among virtual words on 3D shapes. However, this encoding task involves several issues, including arbitrary mesh resolutions, irregular vertex topology, orientation ambiguity on 3D surface, invariance to rigid, and non-rigid shape transformations. To address these issues, a novel unsupervised spatial learning framework based on deep neural network, deep spatiality (DS), is proposed. Specifically, DS employs two novel components: spatial context extractor and deep context learner. Spatial context extractor extracts the spatial relationship among virtual words in a local region into a raw spatial representation. Along a consistent circular direction, a directed circular graph is constructed to encode relative positions between pairwise virtual words in each face ring into a relative spatial matrix. By decomposing each relative spatial matrix using singular value decomposition, the raw spatial representation is formed, from which deep context learner conducts unsupervised learning of the global and local features. Deep context learner is a deep neural network with a novel model structure to adapt the proposed coupled softmax layer, which encodes not only the discriminative information among local regions but also the one among global shapes. Experimental results show that DS outperforms state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 11
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Video segmentation is an important building block for high level applications, such as scene understanding and interaction analysis. While outstanding results are achieved in this field by the state-of-the-art learning and model-based methods, they are restricted to certain types of scenes or require a large amount of annotated training data to achieve object segmentation in generic scenes. On the other hand, RGBD data, widely available with the introduction of consumer depth sensors, provide actual world 3D geometry compared with 2D images. The explicit geometry in RGBD data greatly helps in computer vision tasks, but the lack of annotations in this type of data may also hinder the extension of learning-based methods to RGBD. In this paper, we present a novel generic segmentation approach for 3D point cloud video (stream data) that thoroughly exploits the explicit geometry in RGBD. Our proposal is based only on low level features, such as connectivity and compactness. We exploit temporal coherence by representing the rough estimation of objects in a single frame with a hierarchical structure and propagating this hierarchy along time. The hierarchical structure provides an efficient way to establish temporal correspondences at different scales of object-connectivity and to temporally manage the splits and merges of objects. This allows updating the segmentation according to the evidence observed in the history. The proposed method is evaluated on several challenging data sets, with promising results.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 12
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: In this paper, we propose a novel single image Bayesian super-resolution (SR) algorithm where the hyperspectral image (HSI) is the only source of information. The main contribution of the proposed approach is to convert the ill-posed SR reconstruction problem in the spectral domain to a quadratic optimization problem in the abundance map domain. To do so, a Markov random field based energy minimization approach is proposed, and the solution is proved to be quadratic. The proposed approach consists of five main steps. First, the number of endmembers in the scene is determined using virtual dimensionality. Second, the endmembers and their low resolution abundance maps are computed using simplex identification via the split augmented Lagrangian and fully constrained least squares algorithms. Third, high resolution (HR) abundance maps are obtained using our proposed maximum a posteriori based energy function. This energy function is minimized subject to smoothness, unity, and boundary constraints. Fourth, the HR abundance maps are further enhanced with texture preserving methods. Finally, the HR HSI is reconstructed using the extracted endmembers and the enhanced abundance maps. The proposed method is tested on three real HSI data sets, namely the Cave, Harvard, and Hyperspectral Remote Sensing Scenes, and compared with state-of-the-art alternative methods using peak signal to noise ratio, structural similarity, spectral angle mapper, and relative dimensionless global error in synthesis metrics. It is shown that the proposed method outperforms the state-of-the-art methods in terms of quality while preserving the spectral consistency.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 13
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-04-04
    Description: This paper presents a method leveraging coded motion information to obtain fast, high quality motion field estimation. The method is inspired by a recent trend followed by a number of top-performing optical flow estimation schemes that first estimate a sparse set of features between two frames, and then use an edge-preserving interpolation scheme (EPIC) to obtain a piecewise-smooth motion field that respects moving object boundaries. In order to skip the time-consuming estimation of features, we propose to directly derive motion seeds from decoded HEVC block motion; we call the resulting scheme “HEVC-EPIC”. We propose motion seed weighting strategies that account for the fact that some motion seeds are less reliable than others. Experiments on a large variety of challenging sequences and various bit-rates show that HEVC-EPIC runs significantly faster than EPIC flow, while producing motion fields that have a slightly lower average endpoint error. HEVC-EPIC opens the door to seamlessly integrating HEVC motion into video analysis and enhancement tasks. When employed as input to a framerate upsampling scheme, the average Y-PSNR of the interpolated frames using HEVC-EPIC motion slightly outperforms EPIC flow across the tested bit-rates, while running an order of magnitude faster.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 14
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-04-04
    Description: Extracting the background from a video in the presence of various moving patterns is the focus of several background-initialization approaches. To model the scene background using rank-one matrices, this paper proposes a background-initialization technique that relies on the singular-value decomposition (SVD) of spatiotemporally extracted slices from the video tensor. The proposed method is referred to as spatiotemporal slice-based SVD (SS-SVD). To determine the SVD components that best model the background, a depth analysis of the computation of the left/right singular vectors and singular values is performed, and the relationship with tensor-tube fibers is determined. The analysis proves that a rank-1 matrix extracted from the first left and right singular vectors and singular value represents an efficient model of the scene background. The performance of the proposed SS-SVD method is evaluated using 93 complex video sequences of different challenges, and the method is compared with state-of-the-art tensor/matrix completion-based methods, statistical-based methods, search-based methods, and labeling-based methods. The results not only show better performance over most of the tested challenges, but also demonstrate the capability of the proposed technique to solve the background-initialization problem in less computational time and with fewer frames.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
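The rank-1 model described in the abstract above (first left and right singular vectors scaled by the first singular value) can be sketched in plain Python with power iteration. This is only an illustration of a rank-1 approximation of one matrix slice, not the SS-SVD method itself; the function name `rank1_approximation`, the iteration count, and the toy slice are assumptions.

```python
import math


def rank1_approximation(matrix, iterations=50):
    """Return sigma * u * v^T for the dominant singular triple of `matrix`.

    Uses power iteration started from an all-ones vector. This assumes the
    start vector is not orthogonal to the dominant right singular vector,
    which holds for nonnegative, nonzero pixel data.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    u, sigma = [0.0] * rows, 0.0
    for _ in range(iterations):
        # u <- A v, normalized
        u = [sum(matrix[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        u = [x / norm_u for x in u]
        # v <- A^T u, whose norm is the current singular value estimate
        w = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in w))
        if sigma == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        v = [x / sigma for x in w]
    return [[sigma * u[i] * v[j] for j in range(cols)] for i in range(rows)]


# Toy slice: each row is the same static background seen in a different frame.
static_slice = [[10.0, 20.0, 30.0] for _ in range(4)]
background = rank1_approximation(static_slice)
```

A slice that shows only static background is exactly rank 1, so the approximation reproduces it; moving objects add components that the rank-1 term largely leaves out.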
  • 15
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Existing learning-based atmospheric particle-removal approaches such as those used for rainy and hazy images are designed with strong assumptions regarding spatial frequency, trajectory, and translucency. However, the removal of snow particles is more complicated because they possess additional attributes of particle size and shape, and these attributes may vary within a single image. Currently, hand-crafted features are still the mainstream for snow removal, making significant generalization difficult to achieve. In response, we have designed a multistage network named DesnowNet to deal in turn with the removal of translucent and opaque snow particles. We also differentiate snow attributes of translucency and chromatic aberration for accurate estimation. Moreover, our approach individually estimates residual complements of the snow-free images to recover details obscured by opaque snow. Additionally, a multi-scale design is utilized throughout the entire network to model the diversity of snow. As demonstrated in the qualitative and quantitative experiments, our approach outperforms state-of-the-art learning-based atmospheric phenomena removal methods and one semantic segmentation baseline on the proposed Snow100K dataset. The results indicate our network would benefit applications involving computer vision and graphics.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 16
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: The explosive availability of remote sensing images has challenged supervised classification algorithms such as support vector machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this paper, an SVM-based sequential classifier training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is first predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with that obtained without the assistance of previous images. These results demonstrate that leveraging a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 17
    Publication Date: 2018-03-31
    Description: In this paper, we deal with the problem of denoising 3D scene range measurements acquired by time-of-flight (ToF) range sensors and composed in the form of 2D image-like depth maps. We address the specific case of a ToF low-sensing environment (LSE). Such an environment is set by low-light sensing conditions, low-power hardware requirements, and low-reflectivity scenes. We demonstrate that data captured by a device in such a mode can be effectively post-processed in order to reach the same measurement accuracy as if the device were working in normal operating mode. In order to achieve this, we first present an elaborate analysis of the noise properties of ToF data sensed in LSE and verify the derived noise models by empirical measurements. Then, we develop a related novel non-local denoising approach working in the complex domain and demonstrate its superiority over the state of the art for data acquired by an off-the-shelf ToF device.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 18
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-31
    Description: Most existing part-based tracking methods are part-to-part trackers, which usually have two separate steps: part matching and target localization. Different from existing methods, in this paper, we propose a novel part-to-target (P2T) tracker in a unified fashion by inferring target location from parts directly. To achieve this goal, we propose a novel deep regression model for P2T regression in an end-to-end framework via convolutional neural networks. The proposed model is designed not only to exploit the part context information to preserve object spatial layout structure, but also to learn part reliability to emphasize part importance for the robust P2T regression. We evaluate the proposed tracker on four challenging benchmark sequences, and extensive experimental results demonstrate that our method performs favorably against state-of-the-art trackers because of the powerful capacity of the proposed deep regression model.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 19
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-04-07
    Description: Hashing, a widely studied solution to the approximate nearest neighbor search, aims to map data points in the high-dimensional Euclidean space to the low-dimensional Hamming space while preserving the similarity between original points. As directly learning binary codes can be NP-hard due to discrete constraints, a two-stage scheme, namely, “projection and quantization”, has already become a standard paradigm for learning similarity-preserving hash codes. However, most existing hashing methods typically separate these two stages and thus fail to investigate complementary effects of both stages. In this paper, we systematically study the relationship between “projection and quantization”, and propose a novel minimal reconstruction bias hashing (MRH) method to learn compact binary codes, in which the projection learning and quantization optimizing are jointly performed. By introducing a lower bound analysis, we design an effective ternary search algorithm to solve the corresponding optimization problem. Furthermore, we conduct some insightful discussions on the proposed MRH approach, including the theoretical proof and computational complexity. Distinct from previous works, the MRH can adaptively adjust the projection dimensionality to balance the information loss between the projection and quantization. The proposed framework not only provides a unique perspective to view traditional hashing methods, but also motivates further research, e.g., guiding the design of the loss functions in deep networks. Extensive experimental results have shown that the proposed MRH significantly outperforms a variety of state-of-the-art methods over eight widely used benchmarks.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
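The two-stage "projection and quantization" paradigm mentioned in the abstract above can be illustrated with a minimal sketch. This is a generic random-projection baseline with sign quantization, not the MRH method itself; the function names, the Gaussian projection, and the sample vectors are assumptions made for illustration only.

```python
import random


def make_projection(dim_in, n_bits, seed=0):
    """Draw a random Gaussian projection matrix (n_bits rows, dim_in columns)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(n_bits)]


def hash_code(x, projection):
    """Stage 1: project x onto each row. Stage 2: quantize each value by its sign."""
    return [1 if sum(w * v for w, v in zip(row, x)) >= 0 else 0
            for row in projection]


def hamming(a, b):
    """Number of bit positions in which two binary codes differ."""
    return sum(p != q for p, q in zip(a, b))


P = make_projection(dim_in=4, n_bits=16)
x = [0.9, 0.1, -0.3, 0.5]
near = [0.8, 0.2, -0.3, 0.4]      # a small perturbation of x
far = [-0.9, -0.1, 0.3, -0.5]     # the negation of x

print(hamming(hash_code(x, P), hash_code(near, P)))  # small for similar points
print(hamming(hash_code(x, P), hash_code(far, P)))   # every bit flips for -x
```

Because the two stages here are chosen independently, the sketch shows exactly the separation that the abstract says most existing methods share.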
  • 20
    Publication Date: 2018-02-09
    Description: We present an efficient deep learning framework for identifying, segmenting, and classifying cell membranes and nuclei from human epidermal growth factor receptor-2 (HER2)-stained breast cancer images with minimal user intervention. This is a long-standing issue for pathologists because the manual quantification of HER2 is error-prone, costly, and time-consuming. Hence, we propose a deep learning-based HER2 deep neural network (Her2Net) to solve this issue. The convolutional and deconvolutional parts of the proposed Her2Net framework consisted mainly of multiple convolution layers, max-pooling layers, spatial pyramid pooling layers, deconvolution layers, up-sampling layers, and trapezoidal long short-term memory (TLSTM). A fully connected layer and a softmax layer were also used for classification and error estimation. Finally, HER2 scores were calculated based on the classification results. The main contributions of our proposed Her2Net framework include the implementation of TLSTM and a deep learning framework for cell membrane and nucleus detection, segmentation, and classification and HER2 scoring. Our proposed Her2Net achieved 96.64% precision, 96.79% recall, 96.71% F-score, 93.08% negative predictive value, 98.33% accuracy, and a 6.84% false-positive rate. Our results demonstrate the high accuracy and wide applicability of the proposed Her2Net in the context of HER2 scoring for breast cancer evaluation.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
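As a quick consistency check on the figures quoted in the abstract above, the F-score can be recomputed from the reported precision and recall. The sketch assumes the standard balanced F-score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name is illustrative.

```python
def f_score(precision, recall):
    """Balanced F-score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
print(round(f_score(96.64, 96.79), 2))  # 96.71
```

The result matches the reported 96.71% F-score.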
  • 21
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: The blind quality evaluation of screen content images (SCIs) and natural scene images (NSIs) has become an important, yet very challenging issue. In this paper, we present an effective blind quality evaluation technique for SCIs and NSIs based on a dictionary of learned local and global quality features. First, a local dictionary is constructed using local normalized image patches and conventional $K$-means clustering. With this local dictionary, the learned local quality features can be obtained using a locality-constrained linear coding with max pooling. To extract the learned global quality features, the histogram representations of binary patterns are concatenated to form a global dictionary. The collaborative representation algorithm is used to efficiently code the learned global quality features of the distorted images using this dictionary. Finally, kernel-based support vector regression is used to integrate these features into an overall quality score. Extensive experiments involving the proposed evaluation technique demonstrate that in comparison with most relevant metrics, the proposed blind metric yields significantly higher consistency in line with subjective fidelity ratings.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 22
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: Facial landmark detection is typically cast as a point-wise regression problem that focuses on how to build an effective image-to-point mapping function. In this paper, we propose an end-to-end deep learning approach for contextually discriminative feature construction together with effective facial structure modeling. The proposed learning approach is able to predict more contextually discriminative facial landmarks by capturing their associated contextual information. Moreover, we present a tree model to characterize human face structure and a structural loss function to measure the deformation cost between the ground-truth and predicted tree models, which are further incorporated into the proposed learning approach and jointly optimized within a unified framework. The presented tree model effectively characterizes the spatial layout patterns of facial landmarks, capturing the facial structure information. Experimental results demonstrate the effectiveness of the proposed approach against the state-of-the-art over the MTFL and AFLW-full data sets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 23
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: In this paper, we propose a vanishing-point constrained Dijkstra road model for road detection in a stereo-vision paradigm. First, the stereo camera is used to generate the u- and v-disparity maps of the road image, from which the horizon can be extracted. With the horizon and ground region constraints, we can robustly locate the vanishing point of the road region. Second, a weighted graph is constructed using all pixels of the image, and the detected vanishing point is treated as the source node of the graph. By computing a vanishing-point constrained Dijkstra minimum-cost map, where both the disparity and the gradient of the gray image are used to calculate the cost between two neighboring pixels, the problem of detecting road borders in the image is transformed into that of finding two shortest paths that originate from the vanishing point and end at two pixels in the last row of the image. The proposed approach has been implemented and tested on 2600 grayscale images of different road scenes in the KITTI data set. The experimental results demonstrate that this training-free approach can detect the horizon, vanishing point, and road regions accurately and robustly.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
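The minimum-cost map described in the abstract above can be sketched with a plain Dijkstra search over a 4-connected pixel grid, using the vanishing point as the source node. The step cost used here (one plus the absolute intensity difference) is an illustrative assumption; the paper combines disparity and gray-level gradient, and its exact cost function is not given in the abstract. The function names and the toy image are likewise hypothetical.

```python
import heapq


def min_cost_map(image, source):
    """Dijkstra minimum-cost map over a 4-connected grid, starting at `source`.

    Step cost between neighbours is 1 + |intensity difference| (an assumption).
    Returns the cost map and a predecessor table for path tracing.
    """
    rows, cols = len(image), len(image[0])
    dist = [[float("inf")] * cols for _ in range(rows)]
    prev = {}
    sr, sc = source
    dist[sr][sc] = 0.0
    heap = [(0.0, sr, sc)]
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r][c]:
            continue  # stale queue entry
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + 1 + abs(image[nr][nc] - image[r][c])
                if nd < dist[nr][nc]:
                    dist[nr][nc] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, nr, nc))
    return dist, prev


def trace_path(prev, source, target):
    """Walk the predecessor table back from `target` to `source`."""
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


# Toy image: a bright obstacle (9) below the source, uniform road (5) around it.
image = [
    [5, 5, 5],
    [5, 9, 5],
    [5, 9, 5],
]
vanishing_point = (0, 1)
dist, prev = min_cost_map(image, vanishing_point)
print(dist[2][0], dist[2][2])                      # 3.0 3.0
print(trace_path(prev, vanishing_point, (2, 0)))   # [(0, 1), (0, 0), (1, 0), (2, 0)]
```

In the toy image the two cheapest paths to the bottom-row corners pass on either side of the obstacle, which mirrors the idea of tracing two border paths from the vanishing point to the last image row.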
  • 24
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: We propose novel convolutional sparse and low-rank coding-based methods for cartoon and texture decomposition. In our method, we first learn a set of generic filters that can efficiently represent cartoon- and texture-type images. Then, using these learned filters, we propose two optimization frameworks to decompose a given image into cartoon and texture components: convolutional sparse coding-based image decomposition and convolutional low-rank coding-based image decomposition. By working directly on the whole image, the proposed image separation algorithms do not need to divide the image into overlapping patches for learning local dictionaries. The shift-invariance property is directly modeled into the objective function for learning filters. Extensive experiments show that the proposed methods perform favorably compared with state-of-the-art image separation methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 25
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: Median filtering is a smoothing technique for noise removal in images. While there are various implementations of median filtering for a single-core CPU, there are few implementations for accelerators and multi-core systems. Many parallel implementations of median filtering use a sorting algorithm for rearranging the values within a filtering window and taking the median of the sorted values. While using sorting algorithms allows for simple parallel implementations, the cost of the sorting becomes prohibitive as the filtering windows grow. This makes such algorithms, sequential and parallel alike, inefficient. In this work, we introduce the first software parallel median filtering that is non-sorting-based. The new algorithm uses efficient histogram-based operations. These reduce the computational requirements of the new algorithm while also accessing the image fewer times. We show an implementation of our algorithm for both the CPU and NVIDIA’s CUDA supported graphics processing unit (GPU). The new algorithm is compared with several other leading CPU and GPU implementations. The CPU implementation has near perfect linear scaling with a $3.7\times$ speedup on a quad-core system. The GPU implementation is several orders of magnitude faster than the other GPU implementations for mid-size median filters. For small kernels, $3 \times 3$ and $5 \times 5$, comparison-based approaches are preferable as fewer operations are required. Lastly, the new algorithm is open-source and can be found in the OpenCV library.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
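The contrast between sorting-based and histogram-based median filtering in the abstract above can be made concrete with a small sketch. It shows the classic sequential sliding-histogram idea on a single row of 8-bit samples, not the parallel algorithm proposed in the paper; the function names and the sample row are assumptions for illustration.

```python
def window_median(hist, count):
    """Return the median of `count` samples by walking the cumulative histogram."""
    target = (count + 1) // 2
    running = 0
    for value, n in enumerate(hist):
        running += n
        if running >= target:
            return value
    raise ValueError("empty histogram")


def sliding_median(row, width):
    """Median of every full window of `width` samples in a row of 8-bit values."""
    if width < 1 or len(row) < width:
        raise ValueError("row must contain at least one full window")
    hist = [0] * 256
    for v in row[:width]:
        hist[v] += 1
    out = [window_median(hist, width)]
    for i in range(width, len(row)):
        hist[row[i - width]] -= 1   # sample leaving the window
        hist[row[i]] += 1           # sample entering the window
        out.append(window_median(hist, width))
    return out


row = [10, 200, 12, 11, 255, 13, 14]
print(sliding_median(row, 3))  # [12, 12, 12, 13, 14]
```

Each window update touches only two histogram bins, so the per-pixel cost does not grow with a sort over the whole window; for an even `width` the sketch returns the lower of the two middle values.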
  • 26
    Publication Date: 2018-02-09
    Description: Thermographic inspection has been widely applied to non-destructive testing and evaluation with the capabilities of rapid, contactless, and large surface area detection. Image segmentation is considered essential for identifying and sizing defects. To attain a high level of performance, specific physics-based models that describe defect generation and enable the precise extraction of the target region are of crucial importance. In this paper, an effective genetic first-order statistical image segmentation algorithm is proposed for quantitative crack detection. The proposed method automatically extracts valuable spatial-temporal patterns using an unsupervised feature extraction algorithm and avoids a range of issues associated with human intervention in the laborious manual selection of specific thermal video frames for processing. An internal genetic functionality is built into the proposed algorithm to automatically control the segmentation threshold to render enhanced accuracy in sizing the cracks. Eddy current pulsed thermography is used as a platform to demonstrate surface crack detection. Experimental tests and comparisons have been conducted to verify the efficacy of the proposed method. In addition, a global quantitative assessment index, the F-score, has been adopted to objectively evaluate the performance of different segmentation algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 27
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: The compact descriptors for visual search (CDVS) standard from the ISO/IEC moving pictures experts group has succeeded in enabling interoperability for efficient and effective image retrieval by standardizing the bitstream syntax of compact feature descriptors. However, the intensive computation of a CDVS encoder unfortunately hinders its wide deployment in industry for large-scale visual search. In this paper, we revisit the merits of the low-complexity design of CDVS core techniques and present a very fast CDVS encoder by leveraging the massive parallel execution resources of the graphics processing unit (GPU). We shift the computation-intensive and parallel-friendly modules to state-of-the-art GPU platforms, in which the thread block allocation and the memory access mechanism are jointly optimized to eliminate performance loss. In addition, operations with heavy data dependence are allocated to the CPU to relieve the GPU of extra, unnecessary computation. Furthermore, we demonstrate that the proposed fast CDVS encoder works well with convolutional neural network approaches, which makes it possible to leverage the advantages of GPU platforms harmoniously and yields significant performance improvements. Comprehensive experimental results on benchmarks show that the fast CDVS encoder using GPU-CPU hybrid computing is promising for scalable visual search.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 28
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: Video contents are inherently heterogeneous. To exploit different feature modalities in a diverse video collection for video summarization, we propose to formulate the task as a multiview representative selection problem. The goal is to select visual elements that are representative of a video consistently across different views (i.e., feature modalities). We present in this paper the multiview sparse dictionary selection with centroid co-regularization method, which optimizes the representative selection in each view, and enforces the view-specific selections to be similar by regularizing them towards a consensus selection. We also introduce a diversity regularizer to favor a selection of diverse representatives. The problem can be efficiently solved by an alternating minimization scheme with the fast iterative shrinkage thresholding algorithm. Experiments on synthetic data and benchmark video datasets validate the effectiveness of the proposed approach for video summarization, in comparison with other video summarization methods and representative selection methods such as K-medoids, sparse dictionary selection, and multiview clustering.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 29
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-13
    Description: Over-the-top adaptive video streaming services are frequently impacted by fluctuating network conditions that can lead to rebuffering events (stalling events) and sudden bitrate changes. These events visually impact video consumers’ quality of experience (QoE) and can lead to consumer churn. The development of models that can accurately predict viewers’ instantaneous subjective QoE under such volatile network conditions could potentially enable the more efficient design of quality-control protocols for media-driven services, such as YouTube, Amazon, Netflix, and so on. However, most existing models only predict a single overall QoE score on a given video and are based on simple global video features, without accounting for relevant aspects of human perception and behavior. We have created a QoE evaluator, called the time-varying QoE Indexer, that accounts for interactions between stalling events, analyzes the spatial and temporal content of a video, predicts the perceptual video quality, models the state of the client-side data buffer, and consequently predicts continuous-time quality scores that agree quite well with human opinion scores. The new QoE predictor also embeds the impact of relevant human cognitive factors, such as memory and recency, and their complex interactions with the video content being viewed. We evaluated the proposed model on three different video databases and attained standout QoE prediction performance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 30
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-13
    Description: We propose a new pixel binning scheme for color image sensors. We minimize the distortion caused by binning by requiring that the superpixels lie on a square sampling lattice. The proposed binning schemes achieve the equivalent of a 4.42 times signal strength improvement with an image resolution loss of 5 times, which is better in both noise performance and resolution than the existing binning schemes. As a result, the proposed binning has considerably fewer artifacts and better noise performance compared with the existing binning schemes. In addition, we provide an extension to the proposed binning scheme for performing single-shot high dynamic range image acquisition.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 31
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-13
    Description: Action prediction on a partially observed action sequence is a very challenging task. To address this challenge, we first design a global-local distance model, where a global-temporal distance compares subsequences as a whole and a local-temporal distance focuses on individual segments. Our distance model introduces temporal saliency for each segment to adapt its contribution. Finally, a global-local temporal action prediction model is formulated in order to jointly learn and fuse these two types of distances. Such a prediction model is capable of recognizing the action of: 1) an on-going sequence and 2) a sequence with arbitrary frames missing between the beginning and end (known as gap-filling). Our proposed model is tested and compared with related action prediction models on the BIT, UCF11, and HMDB data sets. The results demonstrate the effectiveness of our proposal. In particular, we show the benefit of our proposed model in predicting unseen action types and its advantage in addressing the gap-filling problem as compared with recently developed action prediction models.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 32
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: A fundamental problem in Nyström-based kernel matrix approximation is the sampling method by which the training set is built. In this paper, we suggest using kernel $k$-means sampling, which is shown in our work to minimize the upper bound of a matrix approximation error. We first propose a unified kernel matrix approximation framework, which is able to describe most existing Nyström approximations under many popular kernels, including the Gaussian kernel and the polynomial kernel. We then show that the matrix approximation error upper bound, in terms of the Frobenius norm, is equal to the $k$-means error of data points in kernel space plus a constant. Thus, the $k$-means centers of data in kernel space, or the kernel $k$-means centers, are the optimal representative points with respect to the Frobenius norm error upper bound. Experimental results, with both the Gaussian kernel and the polynomial kernel, on real-world data sets and image segmentation tasks show the superiority of the proposed method over the state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
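For orientation, the standard Nyström approximation that sampling schemes of this kind feed into can be written as below. The notation ($\kappa$ for the kernel, $x_i$ for the $n$ data points, $z_j$ for the $m$ representative points, and the matrices $C$ and $W$) is assumed here and does not come from the abstract; the paper's unified framework may be formulated differently.

```latex
\tilde{K} \;=\; C \, W^{\dagger} C^{\top},
\qquad
C_{ij} = \kappa(x_i, z_j), \quad i = 1,\dots,n,\; j = 1,\dots,m,
\qquad
W_{jl} = \kappa(z_j, z_l), \quad j, l = 1,\dots,m .
```

Here $W^{\dagger}$ is the pseudo-inverse of $W$. The choice of the points $z_j$ is the sampling step the abstract refers to: uniform sampling draws them from the data, whereas the proposed scheme takes them to be kernel $k$-means centers.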
  • 33
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-09
    Description: Light field imaging extends traditional photography by capturing both the spatial and angular distribution of light, which enables new capabilities, including post-capture refocusing, post-capture aperture control, and depth estimation from a single shot. Micro-lens array (MLA) based light field cameras offer a cost-effective approach to capturing light fields. A major drawback of MLA based light field cameras is low spatial resolution, which is due to the fact that a single image sensor is shared to capture both spatial and angular information. In this paper, we present a learning based light field enhancement approach. Both the spatial and angular resolution of the captured light field are enhanced using convolutional neural networks. The proposed method is tested with real light field data captured with a Lytro light field camera, clearly demonstrating spatial and angular resolution improvement.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 34
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-13
    Description: Hyperspectral unmixing while considering endmember variability is usually performed by the normal compositional model, where the endmembers for each pixel are assumed to be sampled from unimodal Gaussian distributions. However, in real applications, the distribution of a material is often not Gaussian. In this paper, we use Gaussian mixture models (GMM) to represent endmember variability. We show, given the GMM starting premise, that the distribution of the mixed pixel (under the linear mixing model) is also a GMM (and this is shown from two perspectives). The first perspective originates from random variable transformations and gives a conditional density function of the pixels given the abundances and GMM parameters. With proper smoothness and sparsity prior constraints on the abundances, the conditional density function leads to a standard maximum a posteriori (MAP) problem which can be solved using generalized expectation maximization. The second perspective originates from marginalizing over the endmembers in the GMM, which provides us with a foundation to solve for the endmembers at each pixel. Hence, compared to the other distribution based methods, our model can not only estimate the abundances and distribution parameters, but also the distinct endmember set for each pixel. We tested the proposed GMM on several synthetic and real datasets, and showed its potential by comparing it to current popular methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 35
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: This paper addresses the multi-attributed graph matching problem, which considers multiple attributes jointly while preserving the characteristics of each attribute for graph matching. Since most conventional graph matching algorithms integrate multiple attributes to construct a single unified attribute in an oversimplified manner, the information from multiple attributes is often not completely utilized. In order to solve this problem, we propose a novel multi-layer graph structure that can preserve the characteristics of each attribute in separate layers, and also propose a multi-attributed graph matching algorithm based on random walk centrality with the proposed multi-layer graph structure. We compare the proposed algorithm with other state-of-the-art graph matching algorithms based on a single-layer structure using synthetic and real data sets and demonstrate the superior performance of the proposed multi-layer graph structure and the multi-attributed graph matching algorithm.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 36
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: Two-stream CNNs have proved very successful for video-based action recognition. However, the classical two-stream CNNs are time costly, mainly due to the bottleneck of calculating optical flows (OFs). In this paper, we propose a two-stream-based real-time action recognition approach that uses motion vectors (MVs) to replace OFs. MVs are encoded in the video stream and can be extracted directly without extra calculation. However, directly training a CNN with MVs degrades accuracy severely due to the noise and the lack of fine details in MVs. In order to relieve this problem, we propose four training strategies which leverage the knowledge learned from the OF CNN to enhance the accuracy of the MV CNN. Our insight is that MVs and OFs share inherently similar structures, which allows us to transfer knowledge from one domain to another. To fully utilize the knowledge learned in the OF domain, we develop a deeply transferred MV CNN. Experimental results on various datasets show the effectiveness of our training strategies. Our approach is significantly faster than OF based approaches and achieves a processing speed of 390.7 frames per second, surpassing the real-time requirement. We release our model and code to facilitate further research (https://github.com/zbwglory/MV-release).
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 37
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: As a simple representation of interactions among distributed brain regions, brain networks have been widely applied to automated diagnosis of brain diseases, such as Alzheimer’s disease (AD) and its early stage, i.e., mild cognitive impairment (MCI). In brain network analysis, a challenging task is how to measure the similarity between a pair of networks. Although many graph kernels (i.e., kernels defined on graphs) have been proposed for measuring the topological similarity of a pair of brain networks, most of them are defined using general graphs, thus ignoring the uniqueness of each node in brain networks. That is, each node in a brain network denotes a particular brain region, which is a specific characteristic of brain networks. Accordingly, in this paper, we construct a novel sub-network kernel for measuring the similarity between a pair of brain networks and then apply it to brain disease classification. Different from current graph kernels, our proposed sub-network kernel not only takes into account the inherent characteristic of brain networks, but also captures multi-level (from local to global) topological properties of nodes in brain networks, which are essential for defining the similarity measure of brain networks. To validate the efficacy of our method, we perform extensive experiments on subjects with baseline functional magnetic resonance imaging data obtained from the Alzheimer’s disease neuroimaging initiative database. Experimental results demonstrate that the proposed method outperforms several state-of-the-art graph-based methods in MCI classification.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 38
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: This paper presents a new supervised classification algorithm for remotely sensed hyperspectral images (HSIs) which integrates spectral and spatial information in a unified Bayesian framework. First, we formulate the HSI classification problem from a Bayesian perspective. Then, we adopt a convolutional neural network (CNN) to learn the posterior class distributions using a patch-wise training strategy to better use the spatial information. Next, spatial information is further considered by placing a spatial smoothness prior on the labels. Finally, we iteratively update the CNN parameters using stochastic gradient descent and update the class labels of all pixel vectors using an $\alpha$-expansion min-cut-based algorithm. Compared with the other state-of-the-art methods, the classification method achieves better performance on one synthetic data set and two benchmark HSI data sets in a number of experimental settings.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 39
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: In this paper, we aim to predict human eye fixation with view-free scenes based on an end-to-end deep learning architecture. Although convolutional neural networks (CNNs) have made substantial improvements in human attention prediction, CNN-based attention models still need to be improved by efficiently leveraging multi-scale features. Our visual attention network is proposed to capture hierarchical saliency information, from deep, coarse layers with global saliency information to shallow, fine layers with local saliency response. Our model is based on a skip-layer network structure, which predicts human attention from multiple convolutional layers with various receptive fields. The final saliency prediction is achieved via the cooperation of those global and local predictions. Our model is learned in a deep supervision manner, where supervision is directly fed into multi-level layers, instead of previous approaches of providing supervision only at the output layer and propagating this supervision back to earlier layers. Our model thus incorporates multi-level saliency predictions within a single network, which significantly decreases the redundancy of previous approaches of learning multiple network streams with different input scales. Extensive experimental analysis on various challenging benchmark data sets demonstrates that our method yields state-of-the-art performance with competitive inference time. Our source code is available at https://github.com/wenguanwang/deepattention.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 40
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: As one of the most common human helminths, hookworm is a leading cause of maternal and child morbidity, which seriously threatens human health. Recently, wireless capsule endoscopy (WCE) has been applied to automatic hookworm detection. Unfortunately, it remains a challenging task. In recent years, deep convolutional neural networks (CNNs) have demonstrated impressive performance in various image and video analysis tasks. In this paper, a novel deep hookworm detection framework is proposed for WCE images, which simultaneously models visual appearances and tubular patterns of hookworms. This is the first deep learning framework specifically designed for hookworm detection in WCE images. Two CNNs, namely an edge extraction network and a hookworm classification network, are seamlessly integrated in the proposed framework, which avoids edge feature caching and speeds up the classification. Two edge pooling layers are introduced to integrate the tubular regions induced from the edge extraction network and the feature maps from the hookworm classification network, leading to enhanced feature maps emphasizing the tubular regions. Experiments have been conducted on one of the largest WCE datasets with $440K$ WCE images, which demonstrate the effectiveness of the proposed hookworm detection framework. It significantly outperforms the state-of-the-art approaches. The high sensitivity and accuracy of the proposed method in detecting hookworms show its potential for clinical application.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 41
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-02-24
    Description: Variational Level Set (LS) has been a widely used method in medical segmentation. However, it is limited when dealing with multi-instance objects in the real world. In addition, its segmentation results are quite sensitive to initial settings and depend heavily on the number of iterations. To address these issues and bring the classic variational LS methods to the level of learnable deep learning approaches, we propose a novel definition of contour evolution named Recurrent Level Set (RLS), which employs a Gated Recurrent Unit under the energy minimization of a variational LS functional. The curve deformation process in RLS is formulated as a hidden state evolution procedure and updated by minimizing an energy functional composed of fitting forces and contour length. By sharing the convolutional features in a fully end-to-end trainable framework, we extend RLS to Contextual RLS (CRLS) to address semantic segmentation in the wild. The experimental results show that our proposed RLS improves both computational time and segmentation accuracy over the classic variational LS-based method, whereas the fully end-to-end system CRLS achieves competitive performance compared with the state-of-the-art semantic segmentation approaches. Source code will be made publicly available.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 42
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: In this paper, we focus on improving the proposal classification stage in the object detection task and present implicit negative sub-categorization and sink diversion to lift performance by strengthening the loss function in this stage. First, based on the observation that the “background” class is generally very diverse and thus challenging to handle as a single indiscriminative class in existing state-of-the-art methods, we propose to divide the background category into multiple implicit sub-categories to explicitly differentiate the diverse patterns within it. Second, since the ground truth class inevitably has low probability scores for certain images, we propose to add a “sink” class and divert the probabilities of wrong classes to this class when necessary, such that the ground truth label will still have a higher probability than the wrong classes even though its probability output is low. Additionally, we propose to use dilated convolution, which is widely used in the semantic segmentation task, for efficient and valuable context information extraction. Extensive experiments on the PASCAL VOC 2007 and 2012 data sets show that our proposed methods, based on a Faster R-CNN implementation, can achieve state-of-the-art mAPs of 84.1% and 82.6%, respectively, and obtain a 2.5% improvement on ILSVRC DET compared with that of ResNet.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 43
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: As more and more stereo cameras are installed on electronic devices, we are motivated to investigate how to leverage disparity information for autofocus. The main challenge is that stereo images captured for disparity estimation are subject to defocus blur unless the lenses of the stereo cameras are at the in-focus position. Therefore, it is important to investigate how the presence of defocus blur would affect stereo matching and, in turn, the performance of disparity estimation. In this paper, we give an analytical treatment of this fundamental issue of disparity-based autofocus by examining the relation between image sharpness and disparity error. A statistical approach that treats the disparity estimate as a random variable is developed. Our analysis provides a theoretical backbone for the empirical observation that, regardless of the initial lens position, disparity-based autofocus can bring the lens to the hill zone of the focus profile in one movement. The insight gained from the analysis is useful for the implementation of an autofocus system.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 44
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: This paper aims to understand the practical features of hierarchies of morphological segmentations, namely the quasi-flat zones hierarchy and watershed hierarchies, and to evaluate their potential in the context of natural image analysis. We propose a novel evaluation framework for hierarchies of partitions designed to capture various aspects of those representations: the precision of their regions and contours, the possibility of extracting high-quality horizontal cuts and optimal non-horizontal cuts for image segmentation, and the ease of finding a set of regions representing a semantic object. This framework is used to assess and to optimize hierarchies with respect to the possible pre- and post-processing steps. We show that, used in conjunction with a state-of-the-art contour detector, watershed hierarchies are competitive with the complex state-of-the-art methods for hierarchy construction. In particular, the proposed framework allows us to identify a watershed hierarchy, based on a novel extinction value (the number of parent nodes), that outperforms the other hierarchies of morphological segmentations. This, coupled with the fact that watershed hierarchies satisfy clear global optimality properties and can be efficiently computed on large data, makes them valuable candidates for various computer vision tasks.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 45
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: In this paper, we propose a novel no-reference quality assessment method that incorporates statistical luminance and texture features (NRLT) for screen content images (SCIs) with both local and global feature representation. The design of the proposed method is inspired by the perceptual property that the human visual system (HVS) is sensitive to luminance change and texture information in image perception. In the proposed method, we first calculate the luminance map through local normalization, which is further used to extract the statistical luminance features in global scope. Second, inspired by existing studies from neuroscience showing that high-order derivatives can capture image texture, we adopt four filters with different directions to compute gradient maps from the luminance map. These gradient maps are then used to extract the second-order derivatives by local binary pattern. We further extract the texture feature from the histogram of high-order derivatives in global scope. Finally, support vector regression is applied to train the mapping function from quality-aware features to subjective ratings. Experimental results on the public large-scale SCI database show that the proposed NRLT achieves better performance in predicting the visual quality of SCIs than relevant existing methods, including some full-reference visual quality assessment methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 46
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: In order to achieve efficient similarity searching, hash functions are designed to encode images into low-dimensional binary codes with the constraint that similar features will have a short distance in the projected Hamming space. Recently, deep learning-based methods have become more popular and outperform traditional non-deep methods. However, without label information, most state-of-the-art unsupervised deep hashing (DH) algorithms suffer from severe performance degradation. One of the main reasons is that the ad hoc encoding process cannot properly capture the visual feature distribution. In this paper, we propose a novel unsupervised framework that has two main contributions: 1) we convert the unsupervised DH model into a supervised one by discovering pseudo labels; 2) the framework unifies likelihood maximization, mutual information maximization, and quantization error minimization so that the pseudo labels can maximally preserve the distribution of visual features. Extensive experiments on three popular data sets demonstrate the advantages of the proposed method, which leads to significant performance improvement over the state-of-the-art unsupervised hashing algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
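The Hamming-space constraint described in this abstract can be illustrated with a minimal sketch. It is not the paper's learned hash function: the random projection, the 128-dimensional features, and the 32-bit code length are all assumptions chosen only to show that nearby features tend to receive codes with a small Hamming distance.

```python
import numpy as np


def binarize(features: np.ndarray, projection: np.ndarray) -> np.ndarray:
    """Project features and keep only the sign: one bit per projection column."""
    return (features @ projection > 0).astype(np.uint8)


def hamming_distance(code_a: np.ndarray, code_b: np.ndarray) -> int:
    """Number of bit positions in which two binary codes differ."""
    return int(np.count_nonzero(code_a != code_b))


rng = np.random.default_rng(0)
# Hypothetical random projection (128-D feature -> 32-bit code); a deep
# hashing model would learn this mapping instead.
projection = rng.standard_normal((128, 32))

query = rng.standard_normal(128)
near = query + 0.05 * rng.standard_normal(128)  # slightly perturbed copy
far = rng.standard_normal(128)                  # unrelated feature

codes = binarize(np.stack([query, near, far]), projection)
d_near = hamming_distance(codes[0], codes[1])
d_far = hamming_distance(codes[0], codes[2])
print(d_near, d_far)
```

Comparing codes with a bit count is what makes retrieval in Hamming space cheap relative to comparing the original real-valued features.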
  • 47
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: Given a set of images that contain objects from a common category, object co-segmentation aims at automatically discovering and segmenting such common objects from each image. During the past few years, object co-segmentation has received great attention in the computer vision community. However, the existing approaches are usually designed with misleading assumptions, unscalable priors, or subjective computational models, which are not sufficiently robust for dealing with complex and unconstrained real-world image contents. This paper proposes a novel two-stage co-segmentation framework, mainly to address the robustness issue. In the proposed framework, we first introduce the concept of union background and use it to improve the robustness of suppressing the image backgrounds contained in the given image groups. Then, we weaken the requirement for strong prior knowledge by using the background prior instead. This can improve the robustness when scaling up to unconstrained image contents. Based on the weak background prior, we propose a novel MR-SGS model, i.e., manifold ranking with the self-learned graph structure, which can infer suitable graph structures in a data-driven manner rather than building a fixed graph structure that relies on subjective design. Such capacity is critical for further improving the robustness in inferring the foreground/background probability of each image pixel. Comprehensive experiments and comparisons with other state-of-the-art approaches demonstrate the effectiveness of the proposed work.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 48
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: Benefiting from multi-view video plus depth and depth-image-based-rendering technologies, only limited views of a real 3-D scene need to be captured, compressed, and transmitted. However, the quality assessment of synthesized views is very challenging, since some new types of distortions, which are inherently different from the texture coding errors, are inevitably produced by view synthesis and depth map compression, and the corresponding original views (reference views) are usually not available. Thus, full-reference quality metrics cannot be used for synthesized views. In this paper, we propose a novel no-reference image quality assessment method for 3-D synthesized views (called NIQSV+). This blind metric can evaluate the quality of synthesized views by measuring the typical synthesis distortions (blurry regions, black holes, and stretching) with access to neither the reference image nor the depth map. To evaluate the performance of the proposed method, we compare it with four full-reference 3-D (synthesized view dedicated) metrics, five full-reference 2-D metrics, and three no-reference 2-D metrics. In terms of their correlations with subjective scores, our experimental results show that the proposed no-reference metric approaches the best of the state-of-the-art full-reference and no-reference 3-D metrics and significantly outperforms the widely used no-reference and full-reference 2-D metrics. In terms of its approximation of human ranking, the proposed metric achieves the best performance in the experimental test.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 49
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-06
    Description: We propose a novel technique for detecting visual saliency in dynamic video based on video decomposition. The decomposition obtains the sparse features in a particular orientation by exploiting the spatiotemporal discontinuities present in a video cube. A weighted sum of the sparse features along three orthogonal directions determines the salient regions in the video cubes. The weights, computed using the frame correlation along the three directions, are based on the characteristic of the human visual system that identifies the sparsest feature as the most salient feature in a video. Unlike the existing methods, which detect the salient region as a blob, the proposed approach detects the exact boundaries of the salient region with minimal false detection. The experimental results confirm that the detected salient regions of a video closely resemble the salient regions detected by actual tracking of human eyes. The algorithm is tested on different types of video content and compared with several state-of-the-art methods to establish the effectiveness of the proposed method.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 50
    Publication Date: 2018-01-06
    Description: Optical tomography (OPT) is a method to capture a cross-sectional image based on the data obtained by sensors distributed around the periphery of the analyzed system. The system is based on measuring the final light attenuation or absorption of radiation after it crosses the measured objects. The number of sensor views affects the results of image reconstruction: a higher number of sensor views per projection gives a higher image quality. This research presents an application of a charge-coupled device linear sensor and a laser diode in an OPT system. Experiments in detecting solid and transparent objects in crystal clear water were conducted. Two numbers of sensor views, 160 and 320, are evaluated in this research for reconstructing the images. The image reconstruction algorithms used were filtered linear back projection algorithms. A comparison of the simulated and experimental image results shows that 320 views give less area error than 160 views. This suggests that a higher number of views results in a higher resolution of the reconstructed image.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 51
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Hyper-lapse video with a high speed-up rate is an efficient way to overview long videos, such as a human activity in first-person view. Existing hyper-lapse video creation methods produce a fast-forward video effect using only one video source. In this paper, we present a novel hyper-lapse video creation approach based on multiple spatially overlapping videos. We assume the videos share a common view or location, and find transition points where jumps from one video to another may occur. We represent the collection of videos using a hyper-lapse transition graph; the edges between nodes represent possible hyper-lapse frame transitions. To create a hyper-lapse video, a shortest path search is performed on this digraph to optimize frame sampling and assembly simultaneously. Finally, we render the hyper-lapse results using video stabilization and appearance smoothing techniques on the selected frames. Our technique can synthesize novel virtual hyper-lapse routes, which may not exist originally. We show various application results on both indoor and outdoor video collections with static scenes, moving objects, and crowds.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
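The shortest path search over a transition graph mentioned in this abstract can be sketched as follows. The abstract does not say which search algorithm or edge costs the authors use, so this is only an illustration using Dijkstra's algorithm; the node naming `(video_id, frame_index)`, the toy graph, and its costs are hypothetical.

```python
import heapq


def shortest_path(graph, source, target):
    """Dijkstra on a digraph {node: [(neighbor, cost), ...]} with non-negative costs."""
    best = {source: 0.0}
    parent = {}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the parent links back from the target to recover the frame sequence.
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return path[::-1], best[target]


# Hypothetical transition graph: nodes are (video_id, frame_index); an edge is
# an allowed frame-to-frame transition, within one video or across videos.
graph = {
    ("A", 0): [(("A", 8), 1.0), (("B", 6), 2.5)],
    ("A", 8): [(("A", 16), 1.0), (("B", 14), 0.5)],
    ("B", 6): [(("B", 14), 1.0)],
    ("A", 16): [(("B", 22), 3.0)],
    ("B", 14): [(("B", 22), 1.0)],
}
path, cost = shortest_path(graph, ("A", 0), ("B", 22))
print(path, cost)
```

In this toy graph the cheapest route switches from video A to video B partway through, which is the kind of virtual route that does not exist in any single source video.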
  • 52
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: We propose a method to remove motion blur in a single light field captured with a moving plenoptic camera. Since the motion is unknown, we resort to a blind deconvolution formulation, where one aims to identify both the blur point spread function and the latent sharp image. Even in the absence of motion, light field images captured by a plenoptic camera are affected by a non-trivial combination of both aliasing and defocus, which depends on the 3D geometry of the scene. Therefore, motion deblurring algorithms designed for standard cameras are not directly applicable. Moreover, many state-of-the-art blind deconvolution algorithms are based on iterative schemes, where blurry images are synthesized through the imaging model. However, current imaging models for plenoptic images are impractical due to their high dimensionality. We observe that plenoptic cameras introduce periodic patterns that can be exploited to obtain highly parallelizable numerical schemes to synthesize images. These schemes allow extremely efficient GPU implementations that enable the use of iterative methods. We can then cast blind deconvolution of a blurry light field image as a regularized energy minimization to recover a sharp high-resolution scene texture and the camera motion. Furthermore, the proposed formulation can handle non-uniform motion blur due to camera shake, as demonstrated on both synthetic and real light field data.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 53
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: In this paper, we present a novel two-layer video representation for human action recognition that employs a hierarchical group sparse encoding technique and spatio-temporal structure. In the first layer, a new sparse encoding method named locally consistent group sparse coding (LCGSC) is proposed to make full use of the motion and appearance information of local features. The LCGSC method not only encodes global layouts of features within the same video-level groups, but also captures local correlations between them, which yields expressive sparse representations of video sequences. Meanwhile, two kinds of efficient location estimation models, namely an absolute location model and a relative location model, are developed to incorporate spatio-temporal structure into LCGSC representations. In the second layer, an action-level group is established, where a hierarchical LCGSC encoding scheme is applied to describe videos at different levels of abstraction. On the one hand, the new layer captures higher-order dependency between video sequences; on the other hand, it takes label information into consideration to improve the discrimination of the video representations. The superiority of our hierarchical framework is demonstrated on several challenging datasets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 54
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Compressed sensing techniques have been applied to through-the-wall radar imaging (TWRI) and multipolarization TWRI for fast data acquisition and enhanced target localization. The studies so far in this area have either assumed effective wall clutter removal prior to image formation or performed signal estimation, wall clutter mitigation, and image formation independently. This paper proposes a low-rank and sparse imaging model for jointly addressing the problem of wall clutter mitigation and image formation in multichannel TWRI. The proposed model exploits two important structures of through-wall radar signals: low-rank structure of the wall reflections and jointly-sparse structure among the different polarization images. The task of removing wall clutter and reconstructing multichannel images of the same scene behind-the-wall is formulated as a regularized least squares problem, where low-rank regularization is enforced for the wall components, and joint-sparsity penalty is imposed on channel images. To solve the optimization problem, an iterative algorithm based on the proximal gradient technique is introduced, which simultaneously estimates the wall interferences and yields multichannel images of the indoor targets. Experiments on real and simulated radar data are conducted under full measurements and compressive sensing scenarios. The results show that the proposed model is very effective at removing unwanted wall clutter and enhancing the stationary targets, even under considerable reduction in measurements.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
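The two regularizers named in this abstract, a low-rank penalty on the wall component and a joint-sparsity penalty across channel images, each have a standard proximal operator that a proximal gradient iteration would apply after every gradient step. The sketch below shows those textbook operators only; the paper's actual objective, step sizes, and measurement model are not given in the abstract, and the matrix layouts used here (rows as pixels, columns as polarization channels) are assumptions.

```python
import numpy as np


def singular_value_threshold(matrix: np.ndarray, tau: float) -> np.ndarray:
    """Proximal operator of tau * nuclear norm: shrink every singular value by tau."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt


def row_group_soft_threshold(matrix: np.ndarray, tau: float) -> np.ndarray:
    """Proximal operator of tau * (sum of row l2 norms): shrink or zero whole rows."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return matrix * scale


# Assumed layout: a rank-1 "wall" matrix (the same return at every antenna).
wall = np.outer(np.ones(6), np.array([3.0, 1.0, 2.0, 0.5]))
wall_shrunk = singular_value_threshold(wall, tau=1.0)

# Assumed layout: rows are pixels, columns are polarization channels.
channels = np.array([[3.0, 4.0], [0.1, 0.1], [0.0, 0.0]])
channels_shrunk = row_group_soft_threshold(channels, tau=1.0)
print(np.linalg.matrix_rank(wall_shrunk))
print(channels_shrunk)
```

Row-wise shrinkage keeps or discards a pixel in all channels together, which is what makes the sparsity pattern joint rather than per-channel.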
  • 55
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: In this paper, we develop a new low-rank matrix recovery algorithm for image denoising. We incorporate the total variation (TV) norm and the pixel range constraint into the existing reweighted low-rank matrix analysis to achieve structural smoothness and to significantly improve quality in the recovered image. Our proposed mathematical formulation of the low-rank matrix recovery problem combines the nuclear norm, TV norm, and $l_{1}$ norm, thereby allowing us to exploit the low-rank property of natural images, enhance the structural smoothness, and detect and remove large sparse noise. Using the iterative alternating direction and fast gradient projection methods, we develop an algorithm to solve the proposed challenging non-convex optimization problem. We conduct extensive performance evaluations on single-image denoising, hyper-spectral image denoising, and video background modeling from corrupted images. Our experimental results demonstrate that the proposed method outperforms the state-of-the-art low-rank matrix recovery methods, particularly for large random noise. For example, when the density of random sparse noise is 30%, for single-image denoising, our proposed method is able to improve the quality of the restored image by up to 4.21 dB over existing methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 56
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Change detection in heterogeneous remote sensing images remains an important and open problem for damage assessment. We propose a new change detection method for heterogeneous images (i.e., SAR and optical images) based on homogeneous pixel transformation (HPT). HPT transfers one image from its original feature space (e.g., gray space) to another space (e.g., spectral space) at the pixel level so that the pre-event and post-event images are represented in a common space for the convenience of change detection. HPT consists of two operations, i.e., the forward transformation and the backward transformation. In the forward transformation, for each pixel of the pre-event image in the first feature space, we estimate its mapping pixel in the second space, corresponding to the post-event image, based on the known unchanged pixels. A multi-value estimation method with noise tolerance is introduced to determine the mapping pixel using the $K$-nearest neighbors technique. Once the mapping pixels of the pre-event image are available, the difference values between the mapping image and the post-event image can be directly calculated. After that, we similarly perform the backward transformation to associate the post-event image with the first space, and one more difference value for each pixel is obtained. Then, the two difference values are combined to improve the robustness of detection with respect to the noise and heterogeneousness (modality difference) of the images. The fuzzy c-means clustering algorithm is employed to divide the integrated difference values into two clusters: changed pixels and unchanged pixels. These detection results may contain some noisy regions (i.e., small error detections), and we develop a spatial-neighbor-based noise filter to further reduce the false alarms and missed detections using belief function theory. Experiments on change detection with real images (e.g., SPOT, ERS, and NDVI) acquired during a flood in the U.K. are given to validate the effectiveness of the proposed method.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 57
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Thanks to the increasing number of images stored in the cloud, external image similarities can be leveraged to efficiently compress images by exploiting inter-image correlations. In this paper, we propose a novel image prediction scheme for cloud storage. Unlike current state-of-the-art methods, we use a semi-local approach to exploit inter-image correlation. The reference image is first segmented into multiple planar regions determined from matched local features and super-pixels. The geometric and photometric disparities between the matched regions of the reference image and the current image are then compensated. Finally, multiple references are generated from the estimated compensation models and organized in a pseudo-sequence to differentially encode the input image using classical video coding tools. Experimental results demonstrate that the proposed approach yields significant rate-distortion performance improvements compared with current image inter-coding solutions such as high efficiency video coding.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 58
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: 2D complex Gabor filtering has found numerous applications in the fields of computer vision and image processing. In particular, some applications need to compute a 2D complex Gabor filter bank consisting of filtering outputs at multiple orientations and frequencies. Although several approaches for fast Gabor filtering have been proposed, they focus primarily on reducing the runtime of performing filtering once at a specific orientation and frequency. To obtain the Gabor filter bank, the existing methods are applied repeatedly for multiple orientations and frequencies. In this paper, we propose a novel approach that efficiently computes the 2D complex Gabor filter bank by reducing the computational redundancy that arises when performing filtering at multiple orientations and frequencies. The proposed method first decomposes the Gabor kernel to allow a fast convolution with the Gaussian kernel in a separable manner. This makes it possible to reduce the runtime of the Gabor filter bank by reusing intermediate results computed at a specific orientation. By extending this idea, we also propose a fast approach for the 2D localized sliding discrete Fourier transform that uses the Gaussian kernel to lend spatial localization ability, as in the Gabor filter. Experimental results demonstrate that the proposed method runs faster than the state-of-the-art methods while maintaining similar filtering quality.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
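The decomposition idea in this abstract rests on a well-known identity: a complex Gabor kernel is a Gaussian multiplied by a complex exponential, so Gabor filtering can be rewritten as demodulating the signal, smoothing it with a plain Gaussian, and modulating the result back. The sketch below checks that identity in 1D with NumPy. It is not the paper's 2D filter-bank algorithm, and the function names and parameter values are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(sigma: float, radius: int):
    """Return integer offsets -radius..radius and a normalized Gaussian over them."""
    x = np.arange(-radius, radius + 1)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return x, g / g.sum()


def gabor_direct(signal, sigma, omega, radius):
    """Convolve directly with the complex Gabor kernel g(x) * exp(i * omega * x)."""
    x, g = gaussian_kernel(sigma, radius)
    kernel = g * np.exp(1j * omega * x)
    return np.convolve(signal, kernel, mode="same")


def gabor_via_gaussian(signal, sigma, omega, radius):
    """Same output using only a real Gaussian convolution plus (de)modulation."""
    _, g = gaussian_kernel(sigma, radius)
    n = np.arange(len(signal))
    demodulated = signal * np.exp(-1j * omega * n)
    smoothed = np.convolve(demodulated, g, mode="same")
    return smoothed * np.exp(1j * omega * n)


rng = np.random.default_rng(1)
signal = rng.standard_normal(64)
direct = gabor_direct(signal, sigma=2.0, omega=0.6, radius=6)
via_gaussian = gabor_via_gaussian(signal, sigma=2.0, omega=0.6, radius=6)
print(np.max(np.abs(direct - via_gaussian)))
```

Because only the modulation depends on the frequency, the Gaussian smoothing step is the part that fast recursive or separable Gaussian filters can accelerate.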
  • 59
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: We present a physics-based illumination estimation approach explicitly designed to handle natural images under ambient light. Existing physics-based color constancy methods are theoretically perfect but do not handle real-world images well because the majority of these methods assume a single illuminant. Therefore, specular pixels selected using existing methods produce estimated dichromatic lines that are thick or curvilinear in the presence of ambient light, thus generating significant errors. Based on the Phong reflection model, we show that a group of specular pixels on a uniformly colored object, although they are subject to intensity thresholding, produce a unique dichromatic line length depending on the geometry of each image path. Assuming that the longest dichromatic line is the most desirable when estimating the chromaticity of an illuminant, ambient-robust specular pixels are also found on the same path on which the longest dichromatic line segment is generated. Therefore, we propose a method to find the optimal image path in which the specular pixels produce the longest dichromatic line. Even though the number of collected specular pixels is reduced using the proposed method, they are proven to be more accurate when determining the illuminant chromaticity even in the existing methods. Experiments with an established benchmark data set and a self-produced image set find that the proposed method is better able to locate the illuminant chromaticity compared with the state-of-the-art color constancy methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 60
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Spatially or temporally corrupted action videos are impractical for recognition via vision or learning models. Such corruption usually arises when streaming data are captured from unintended moving cameras, which bring occlusion or camera vibration and accordingly result in arbitrary loss of spatiotemporal information. In reality, it is intractable to deal with both spatial and temporal corruptions at the same time. In this paper, we propose a coupled stacked denoising tensor auto-encoder (CSDTAE) model, which approaches this corruption problem in a divide-and-conquer fashion by joining the spatial and temporal schemes together. In particular, each scheme is an SDTAE designed to handle either spatial or temporal corruption. An SDTAE is composed of several blocks, each of which is a denoising tensor auto-encoder (DTAE). Therefore, CSDTAE is built from several DTAE building blocks to solve the spatiotemporal corruption problem simultaneously. In one DTAE, the video features are represented as a high-order tensor to preserve the spatiotemporal structure of data, where the temporal and spatial information are processed separately in different hidden layers via tensor unfolding. In summary, DTAE explores the spatial and temporal structure of the tensor representation, and SDTAE handles different corruption ratios progressively to extract more discriminative features. CSDTAE couples the temporal and spatial corruptions of the same data through a step-by-step procedure based on canonical correlation analysis, which integrates the two sub-problems into one problem. The key point is solving the spatiotemporal corruption in one model by considering the corruptions as noise in either the spatial or the temporal direction. Extensive experiments on three action data sets demonstrate the effectiveness of our model, especially when the video contains large volumes of corruption.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 61
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: To effectively solve the challenges in object tracking, such as large deformation and severe occlusion, many existing methods use graph-based models to capture target part relations, and adopt a sequential scheme of target part selection, part matching, and state estimation. However, such methods have two major drawbacks: 1) inaccurate part selection leads to performance deterioration of part matching and state estimation and 2) there are insufficient effective global constraints for local part selection and matching. In this paper, we propose a new object tracking method based on iterative graph seeking, which integrates target part selection, part matching, and state estimation using a unified energy minimization framework. Our method also incorporates structural information in local part variations using the global constraint. We devise an alternative iteration scheme to minimize the energy function for searching the most plausible target geometric graph. Experimental results on several challenging benchmarks (i.e., VOT2015, OTB2013, and OTB2015) demonstrate improved performance and robustness in comparison with existing algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 62
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Differential interference contrast (DIC) microscopy is widely used for observing unstained biological samples that are otherwise optically transparent. Combining this optical technique with machine vision could enable the automation of many life science experiments; however, identifying relevant features under DIC is challenging. In particular, precise tracking of cell boundaries in a thick ($>100\,\mu\text{m}$) slice of tissue has not previously been accomplished. We present a novel deconvolution algorithm that achieves state-of-the-art performance at identifying and tracking these membrane locations. Our proposed algorithm is formulated as a regularized least squares optimization that incorporates a filtering mechanism to handle organic tissue interference and a robust edge-sparsity regularizer that integrates dynamic edge tracking capabilities. As a secondary contribution, this paper also describes new community infrastructure in the form of a MATLAB toolbox for accurately simulating DIC microscopy images of in vitro brain slices. Building on existing DIC optics modeling, our simulation framework additionally contributes an accurate representation of interference from organic tissue, neuronal cell-shapes, and tissue motion due to the action of the pipette. This simulator allows us to better understand the image statistics (to improve algorithms), as well as quantitatively test cell segmentation and tracking algorithms in scenarios where ground truth data is fully known.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 63
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-20
    Description: Understanding the visual quality of a feature map plays a significant role in many active vision applications. Previous works mostly rely on object-level features, such as compactness, to estimate the quality score of a feature map. However, the compactness is leveraged on feature maps produced by salient object detection techniques where the maps tend to be compact. As a result, the compactness feature fails when the feature maps are blurry (e.g., fixation maps). In this paper, we regard the process of estimating the quality score of feature maps, specifically fixation maps, as a regression problem. After extracting several local, global, geometric, and positional characteristic features from a feature map, a model is learned using a random forest regressor to estimate the quality score of any unseen feature map. Our model is specifically tailored to estimate the quality of three types of maps: bottom-up, target, and contextual feature maps. These maps are produced for a large benchmark fixation data set of more than 900 challenging outdoor images. We demonstrate that our approach provides an accurate estimate of the quality of the abovementioned feature maps compared to the groundtruth data. In addition, we show that our proposed approach is useful in feature map integration for predicting human fixation. Instead of naively integrating all three feature maps when predicting human fixation, our proposed approach dynamically selects the best feature map with the highest estimated quality score on an individual image basis, thereby improving the fixation prediction accuracy.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 64
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-27
    Description: Hand detection is essential for many hand-related tasks, e.g., recovering hand pose and understanding gesture. However, hand detection in uncontrolled environments is challenging due to the flexibility of the wrist joint and cluttered backgrounds. We propose a convolutional neural network (CNN), which formulates in-plane rotation explicitly to solve hand detection and rotation estimation jointly. Our network architecture adopts the backbone of faster R-CNN to generate rectangular region proposals and extract local features. The rotation network takes the feature as input and estimates an in-plane rotation that aligns the hand, if any is present in the proposal, to the upward direction. A derotation layer is then designed to explicitly rotate the local spatial feature map according to the rotation network and feed the aligned feature map for detection. Experiments show that our method outperforms the state-of-the-art detection models on widely-used benchmarks, such as the Oxford and Egohands databases. Further analysis shows that rotation estimation and classification can mutually benefit each other.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 65
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-27
    Description: We develop a framework that uses a sequence of X-ray tomographic slices to virtually unroll fragile historical parchment scrolls that cannot be physically unfolded, thus providing easy access to parchments whose contents have remained hidden for centuries. The first step is to produce a topologically correct segmentation, which is challenging as the parchment layers vary significantly in thickness, contain substantial interior textures and can often stick together in places. For this purpose, our method starts with linking the broken layers in a slice using the topological structure propagated from the previously processed slice. To ensure topological correctness, we identify fused regions by detecting junction sections, and then match them using global optimization efficiently solved by the blossom algorithm, taking into account the shape energy of curves separating fused layers. The fused layers are then separated using as-parallel-as-possible curves connecting junction section pairs. To flatten the segmented parchment, pixels in different frames need to be put into alignment. This is achieved via a dynamic programming-based global optimization, which minimizes the total matching distances and penalizes stretches. Eventually, the text of the parchment is revealed by ink projection. We demonstrate the effectiveness of our approach using challenging real-world data sets, including the water-damaged fifteenth-century Bressingham scroll.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 66
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-27
    Description: Inevitable camera motion during exposure does not augur well for free-hand photography. Distortions introduced in images can be of different types and mainly depend on the structure of the scene, the nature of camera motion, and the shutter mechanism of the camera. In this paper, we address the problem of registering images taken from global shutter and rolling shutter cameras and reveal the constraints on camera motion that admit registration, change detection, and rectification. Our analysis encompasses degradations arising from camera motion during exposure and differences in shutter mechanisms. We also investigate conditions under which camera motions causing distortions in reference and target image can be decoupled to yield the underlying latent image through RS rectification. We validate our approach using several synthetic and real examples.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 67
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: Clustering a high-dimensional data set is known to be very difficult. In this paper, we show that this is not the case when the points to cluster correspond to images. More specifically, image data sets are shown to have a lot of structure, so much so that projecting the set onto a random 1D linear subspace is likely to uncover a binary grouping among the images. Based on this observation, we propose a method to quantify the clusterability of a data set. The method is based on the probability density of a measure ($S$) of clusterability (in 1D) of the projection of the data onto a random line. After comparing the clusterability of image data sets with that of synthetically generated clusters, we conclude that the intriguing structures we find in image data sets do not fit the notion of clusters in the traditional sense. Our observation further suggests a fast method for clustering high-dimensional data in a hierarchical fashion; at each stage, the data is partitioned into two based on the binary clustering found in a 1D random projection of the data. Since most of the computations are performed in 1D, this approach is extremely efficient. Despite its simplicity, it achieves overall a better quality of clustering than existing high-dimensional clustering methods, not only for data sets representing image data, but for other real data sets as well. Our results highlight the need to re-examine our assumptions about high-dimensional clustering and the geometry of real data sets such as sets of images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
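One stage of the random-projection partitioning described in the abstract above might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not define the measure $S$ or the 1D binary clustering rule, so the sketch substitutes a simple largest-gap split, and the function name `random_projection_split` is hypothetical.

```python
import numpy as np

def random_projection_split(points, rng):
    """Project points onto a random line and split them into two groups.

    Assumption: the split is placed at the largest gap between consecutive
    projected values; the paper's own 1D clustering criterion may differ.
    """
    direction = rng.normal(size=points.shape[1])
    direction /= np.linalg.norm(direction)      # unit vector defining the random line
    projected = points @ direction              # 1D coordinate of every point
    order = np.argsort(projected)
    gaps = np.diff(projected[order])            # spacing between sorted neighbours
    cut = int(np.argmax(gaps))                  # index of the widest gap
    labels = np.zeros(len(points), dtype=int)
    labels[order[cut + 1:]] = 1                 # everything past the gap is group 1
    return labels
```

Applying the same function recursively to each resulting group would give a hierarchical partition; all heavy work after the projection happens on 1D values.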
  • 68
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: In this paper, we propose the discriminative multiple canonical correlation analysis (DMCCA) for multimodal information analysis and fusion. DMCCA is capable of extracting more discriminative characteristics from multimodal information representations. Specifically, it finds the projected directions, which simultaneously maximize the within-class correlation and minimize the between-class correlation, leading to better utilization of the multimodal information. In the process, we analytically demonstrate that the optimally projected dimension by DMCCA can be quite accurately predicted, leading to both superior performance and substantial reduction in computational cost. We further verify that canonical correlation analysis (CCA), multiple canonical correlation analysis (MCCA) and discriminative canonical correlation analysis (DCCA) are special cases of DMCCA, thus establishing a unified framework for canonical correlation analysis. We implement a prototype of DMCCA to demonstrate its performance in handwritten digit recognition and human emotion recognition. Extensive experiments show that DMCCA outperforms the traditional methods of serial fusion, CCA, MCCA, and DCCA.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 69
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: This paper proposes a single-image super-resolution scheme by introducing a gradient field sharpening transform that converts the blurry gradient field of upsampled low-resolution (LR) image to a much sharper gradient field of original high-resolution (HR) image. Different from the existing methods that need to figure out the whole gradient profile structure and locate the edge points, we derive a new approach that sharpens the gradient field adaptively only based on the pixels in a small neighborhood. To maintain image contrast, image gradient is adaptively scaled to keep the integral of gradient field stable. Finally, the HR image is reconstructed by fusing the LR image with the sharpened HR gradient field. Experimental results demonstrate that the proposed algorithm can generate more accurate gradient field and produce super-resolved images with better objective and visual qualities. Another advantage is that the proposed gradient sharpening transform is very fast and suitable for low-complexity applications.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 70
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: Features that capture well the textural patterns of a certain class of images are crucial for the performance of texture segmentation methods. The manual selection of features or designing new ones can be a tedious task. Therefore, it is desirable to automatically adapt the features to a certain image or class of images. Typically, this requires a large set of training images with similar textures and ground truth segmentation. In this paper, we propose a framework to learn features for texture segmentation when no such training data is available. The cost function for our learning process is constructed to match a commonly used segmentation model, the piecewise constant Mumford-Shah model. This means that the features are learned such that they provide an approximately piecewise constant feature image with a small jump set. Based on this idea, we develop a two-stage algorithm which first learns suitable convolutional features and then performs segmentation. We note that the features can be learned from a small set of images, from a single image, or even from image patches. The proposed method achieves a competitive rank in the Prague texture segmentation benchmark, and it is effective for segmenting histological images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 71
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: The use of multiple features has been shown to be an effective strategy for visual tracking because of their complementary contributions to appearance modeling. The key problem is how to learn a fused representation from multiple features for appearance modeling. Different features extracted from the same object should share some commonalities in their representations while each feature should also have some feature-specific representation patterns which reflect its complementarity in appearance modeling. Different from existing multi-feature sparse trackers which only consider the commonalities among the sparsity patterns of multiple features, this paper proposes a novel multiple sparse representation framework for visual tracking which jointly exploits the shared and feature-specific properties of different features by decomposing multiple sparsity patterns. Moreover, we introduce a novel online multiple metric learning to efficiently and adaptively incorporate the appearance proximity constraint, which ensures that the learned commonalities of multiple features are more representative. Experimental results on tracking benchmark videos and other challenging videos demonstrate the effectiveness of the proposed tracker.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 72
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: Building up on the advances in low rank matrix completion, this paper presents a novel method for propagating the inpainting of the central view of a light field to all the other views. After generating a set of warped versions of the inpainted central view with random homographies, both the original light field views and the warped ones are vectorized and concatenated into a matrix. Because of the redundancy between the views, the matrix satisfies a low rank assumption enabling us to fill the region to inpaint with low rank matrix completion. To this end, a new matrix completion algorithm, better suited to the inpainting application than existing methods, is also developed in this paper. In its simple form, our method does not require any depth prior, unlike most existing light field inpainting algorithms. The method has then been extended to better handle the case where the area to inpaint contains depth discontinuities. In this case, a segmentation map of the different depth layers of the inpainted central view is required. This information is used to warp the depth layers with different homographies. Our experiments with natural light fields captured with plenoptic cameras demonstrate the robustness of the low rank approach to noisy data as well as large color and illumination variations between the views of the light field.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 73
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: Vast databases of billions of contact-based fingerprints have been developed to protect national borders and support e-governance programs. Emerging contactless fingerprint sensors offer better hygiene, security, and accuracy. However, the adoption and success of such contactless fingerprint technologies largely depend on an advanced capability to match contactless 2D fingerprints with legacy contact-based fingerprint databases. This paper investigates this problem and develops a new approach to accurately match such fingerprint images. A robust thin-plate spline (RTPS) is developed to more accurately model elastic fingerprint deformations using splines. In order to correct such deformations on the contact-based fingerprints, an RTPS-based generalized fingerprint deformation correction model (DCM) is proposed. The use of DCM results in accurate alignment of key minutiae features observed on the contactless and contact-based fingerprints. Further improvement in such cross-matching performance is investigated by incorporating minutiae-related ridges. We also develop a new database of 1800 contactless 2D fingerprints and the corresponding contact-based fingerprints acquired from 300 clients, which is made publicly accessible for further research. The experimental results presented in this paper, using two publicly available databases, validate our approach and show superior results for matching contactless 2D and contact-based fingerprint images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 74
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-01-31
    Description: Robust principal component analysis, which extracts low-dimensional data from high-dimensional data, can also be regarded as a source separation problem of the sparse error matrix and the low-rank matrix. Until recently, various methods have attempted to precisely predict the discrete rank function by assigning a weight to the nuclear norm. However, if the weights are not in ascending order, the algorithms will diverge and exhibit high computational complexity. Moreover, from the viewpoint of source separation, these methods overlook the fact that two components must be sufficiently different for accurate demixing. In this paper, we employ the incoherence term with convex shape, which considers that components must appear different from one another for boosting separability. Since it is intractable to directly exploit mutual incoherence defined in linear algebra, we guarantee the incoherence by indirectly making the sparse matrix lack the low-rank property by using the duality norm principle. This approach can also be associated with the null space. To analyze the results of the proposed algorithm geometrically, we measure the geodesic distance between the tangent spaces of the manifolds of two separate components. As this distance increases, the degree of dissimilarity of the two components is adequately assured; thus, separation succeeds. Furthermore, this paper is the first to provide insights into the relationship between source separation conditions and the derivatives of the nuclear norm and $L_{1}$ norm. Experiments are conducted on still image separation and background subtraction to confirm the superiority of the proposed methods both qualitatively and quantitatively.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 75
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: In the field of objective image quality assessment (IQA), Spearman’s $\rho$ and Kendall’s $\tau$, which straightforwardly assign uniform weights to all quality levels and assume that each pair of images is sortable, are the two most popular rank correlation indicators. These indicators can successfully measure the average accuracy of an IQA metric for ranking multiple processed images. However, two important perceptual properties are ignored. First, the sorting accuracy (SA) of high-quality images is usually more important than that of poor-quality images in many real-world applications, where only top-ranked images are pushed to the users. Second, due to the subjective uncertainty in making judgments, two perceptually similar images are usually barely sortable, and their ranks do not contribute to the evaluation of an IQA metric. To more accurately compare different IQA algorithms, in this paper, we explore a perceptually weighted rank correlation indicator, which rewards the capability of correctly ranking high-quality images and suppresses the attention toward insensitive rank mistakes. Specifically, we focus on activating a “valid” pairwise comparison of images whose quality difference exceeds a given sensory threshold (ST). Meanwhile, each image pair is assigned a unique weight that is determined by both the quality level and rank deviation. By modifying the perception threshold, we can illustrate the sorting accuracy with a sophisticated SA-ST curve rather than a single rank correlation coefficient. The proposed indicator offers new insight into interpreting visual perception behavior. Furthermore, the applicability of our indicator is validated for recommending robust IQA metrics for both degraded and enhanced image data.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
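The idea of counting only "valid" pairs whose quality difference exceeds a sensory threshold, as described in the abstract above, can be sketched roughly as follows. This is a simplification under stated assumptions: every valid pair gets the same weight, whereas the abstract describes pair-specific weights whose form is not given here. The function name `thresholded_sorting_accuracy` and its parameters are hypothetical.

```python
from itertools import combinations

def thresholded_sorting_accuracy(subjective, predicted, threshold):
    """Fraction of valid image pairs that a metric ranks in the right order.

    A pair is valid when its subjective quality difference exceeds `threshold`.
    Assumption: uniform pair weights (the paper's weighting is not reproduced).
    """
    valid = 0
    correct = 0
    for i, j in combinations(range(len(subjective)), 2):
        diff = subjective[i] - subjective[j]
        if abs(diff) <= threshold:
            continue                      # perceptually similar pair: ignored
        valid += 1
        if diff * (predicted[i] - predicted[j]) > 0:
            correct += 1                  # predicted order agrees with subjective order
    return correct / valid if valid else float("nan")
```

Evaluating this function over a range of thresholds would trace out a curve of sorting accuracy against sensory threshold, in the spirit of the SA-ST curve mentioned above.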
  • 76
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: Recently, micro-expression recognition has attracted lots of researchers’ attention due to its potential value in many practical applications, e.g., lie detection. In this paper, we investigate an interesting and challenging problem in micro-expression recognition, i.e., cross-database micro-expression recognition, in which the training and testing samples come from different micro-expression databases. Under this problem setting, the consistent feature distribution between the training and testing samples originally existing in conventional micro-expression recognition would be seriously broken, and hence, the performance of most current well-performing micro-expression recognition methods may sharply drop. In order to overcome it, we propose a simple yet effective framework called domain regeneration (DR) in this paper. The DR framework aims at learning a domain regenerator to regenerate the micro-expression samples from source and target databases, respectively, such that they can abide by the same or similar feature distributions. Thus, we are able to use the classifier learned based on the labeled source micro-expression samples to predict the label information of the unlabeled target micro-expression samples. To evaluate the proposed DR framework, we conduct extensive cross-database micro-expression recognition experiments designed based on the Spontaneous Micro-Expression Database and Chinese Academy of Sciences Micro-Expression II Database. Experimental results show that compared with the recent state-of-the-art cross-database emotion recognition methods, the proposed DR framework has more promising performance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 77
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: Convolutional neural networks are currently the state-of-the-art solution for a wide range of image processing tasks. Their deep architecture extracts low- and high-level features from images, thus improving the model’s performance. In this paper, we propose a method for image demosaicking based on deep convolutional neural networks. Demosaicking is the task of reproducing full color images from incomplete images formed from overlaid color filter arrays on image sensors found in digital cameras. Instead of producing the output image directly, the proposed method divides the demosaicking task into an initial demosaicking step and a refinement step. The initial step produces a rough demosaicked image containing unwanted color artifacts. The refinement step then reduces these color artifacts using deep residual estimation and multi-model fusion producing a higher quality image. Experimental results show that the proposed method outperforms several existing and state-of-the-art methods in terms of both the subjective and objective evaluations.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 78
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: This paper proposes a novel bistatic inverse synthetic aperture radar (ISAR) imaging algorithm for targets with complex motion under low signal-to-noise ratio (SNR) conditions. Note that the bistatic ISAR system generally suffers from a lower SNR than the monostatic one because of its non-mirror reflection geometry. A de-noising method, therefore, is proposed to improve the SNR of range profiles, which accumulates the aligned range profiles non-coherently to obtain a window for noise suppression. In addition, since the complex motion of the target induces non-stationary Doppler, which is destructive to ISAR imaging, an optimal coherent processing interval (CPI) selection algorithm is further proposed to find the interval where the Doppler is relatively stationary, so as to produce well-focused ISAR images. It utilizes the reassigned time-frequency method to obtain the high-resolution instantaneous Doppler spectrum, and the minimum entropy criterion to select the optimal CPI. Note that the selected CPI often contains too few pulses to produce ISAR images with high resolution. A sparse aperture ISAR imaging method within the Bayesian framework is further proposed, which introduces the Laplacian scale mixture (LSM) model as the sparse prior, so as to reconstruct well-focused ISAR images with high resolution and low side lobes from the limited data. Compared with the traditional sparse Bayesian learning method, the proposed LSM-based ISAR imaging performs better in resolution improvement and noise reduction. Experimental results based on both simulated and measured data validate the effectiveness of the proposed algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
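The minimum entropy criterion mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's implementation: the function names `spectrum_entropy` and `select_min_entropy_interval` are hypothetical, and the sketch assumes each candidate interval has already been reduced to a list of spectrum samples. A spectrum whose energy is concentrated in a few bins has low entropy, so the interval with the lowest entropy is taken as the best focused.

```python
import math


def spectrum_entropy(samples):
    """Shannon entropy of the normalized energy distribution of `samples`."""
    energy = [abs(s) ** 2 for s in samples]
    total = sum(energy)
    if total == 0:
        raise ValueError("samples carry no energy")
    # Bins with zero energy contribute nothing to the entropy.
    return -sum((e / total) * math.log(e / total) for e in energy if e > 0)


def select_min_entropy_interval(spectra):
    """Return the index of the candidate whose spectrum has the lowest entropy."""
    return min(range(len(spectra)), key=lambda k: spectrum_entropy(spectra[k]))
```

For example, given one flat candidate spectrum and one with a single dominant bin, the function returns the index of the concentrated one.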
  • 79
    Publication Date: 2018-03-06
    Description: Estimation of missing digital information is mostly addressed by 1-D or 2-D signal processing methods; however, this problem can also emerge in multi-dimensional data, including 3-D images. Examples of 3-D images with missing edge information are often found in dental micro-CT, where the natural contours of dental enamel and dentine are partially dissolved or lost through caries. In this paper, we present a novel sequential approach to estimate the missing surface of an object. First, an initial correct contour is determined, interactively or automatically, for the starting slice. This contour information defines the local search area and provides the overall estimation pattern for the edge candidates in the next slice. The search for edge candidates in the next slice is performed in the direction perpendicular to the obtained initial edge in order to find and label the corrupted edge candidates. Subsequently, the location information of both the initial and the nominated edge candidates is transformed and separated into two independent signals (X-coordinates and Y-coordinates), which turns the problem into one of error concealment. In the next step, the missing samples of these signals are estimated using a modified Tikhonov regularization model with two new terms. One term contributes to denoising the corrupted signal by defining an estimation model for a group of mildly damaged samples, and the other contributes to estimating the missing samples with the highest similarity to the samples of the signals obtained from the previous slice. Finally, the reconstructed signals are transformed back to an edge pixel representation. The estimated edges in each slice serve as the initial edge information for the next slice, and this procedure is repeated slice by slice until the entire contour of the damaged surface is estimated. The visual results as well as quantitative results (using both contour-based and area-based metrics) for seven image data sets of tooth samples with considerable destruction of the dentin-enamel junction demonstrate that the proposed method can accurately interpolate the shape and the position of the missing surfaces in computed tomography images in both 2-D and 3-D (e.g., 14.87 ± 3.87 $\mu\text{m}$ of mean distance (MD) error for the proposed method versus 7.33 ± 0.27 $\mu\text{m}$ of MD error between human experts, and 1.25 ± ~0% error rate (ER) for the proposed method versus 0.64 ± ~0% ER between human experts (~1% difference)).
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 80
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: The ubiquitous large, complex, and high-dimensional datasets in computer vision and machine learning give rise to the problem of subspace clustering, which aims to partition the data into several low-dimensional subspaces. By utilizing relatively limited labeled data and sufficient unlabeled data, semi-supervised subspace clustering is more effective and practical, and it has become more popular. In this paper, we present a new regularization term combining the labels and the affinity to ensure the coherence of the affinity between data points from the same subspace as well as the discrimination of cluster labels for data points from different subspaces. We combine it with the manifold smoothing term of the existing methods and the Gaussian fields and harmonic functions method to give a new unified optimization framework for semi-supervised subspace clustering. Analysis shows that the proposed model fully combines the affinity and the labels to guide each other so that both are discriminative between clusters and coherent within clusters. Extensive experiments show that our method outperforms the existing state-of-the-art methods, which suggests that being discriminative between clusters and coherent within clusters is advantageous for semi-supervised subspace clustering.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 81
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: Since homomorphic encryption operations have high computational complexity, image applications based on homomorphic encryption are often time-consuming, which makes them impractical. In this paper, we study efficient encrypted image applications with the encrypted domain Walsh-Hadamard transform (WHT) and parallel algorithms. We first present methods to implement real and complex WHTs in the encrypted domain. We then propose a parallel algorithm to improve the computational efficiency of the encrypted domain WHT. To compare the WHT with the discrete cosine transform (DCT), integer DCT, and Haar transform in the encrypted domain, we conduct theoretical analysis and experimental verification, which reveal that the encrypted domain WHT has lower computational complexity and a shorter running time. Our analysis shows that the encrypted WHT can accommodate plaintext data of larger values and has better energy compaction ability on dithered images. We propose two encrypted image applications using the encrypted domain WHT. To accelerate practical execution, we present two parallelization strategies for the proposed applications. The experimental results show that the speedup of the homomorphic encrypted image application exceeds 12.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
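As a point of reference for the abstract above, the following is a minimal plaintext sketch of the fast Walsh-Hadamard transform. It does not involve any encryption and is not the paper's encrypted-domain or parallel algorithm; the function names `fwht` and `ifwht` are illustrative. The butterfly uses only additions and subtractions, which is one plausible reason the transform is attractive where multiplications are expensive.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (Hadamard order, unnormalized)."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Combine pairs of entries that are h apart inside blocks of size 2h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def ifwht(coeffs):
    """Inverse transform for integer inputs: apply fwht again and divide by n."""
    n = len(coeffs)
    return [c // n for c in fwht(coeffs)]
```

Applying `fwht` to the length-8 sequence `[1, 0, 1, 0, 0, 1, 1, 0]` gives `[4, 2, 0, -2, 0, 2, 0, 2]`, and `ifwht` recovers the original integers exactly.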
  • 82
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: Gabor magnitude is known to be among the most discriminative representations for face images due to its space-frequency co-localization property. However, this property causes adverse effects even when the images are acquired under moderate head pose variations. To address this pose sensitivity issue and other moderate imaging variations, we propose an analytic Gabor feedforward network which can absorb such moderate changes. Essentially, the network works directly on the raw face images and produces directionally projected Gabor magnitude features at the hidden layer. Subsequently, several sets of magnitude features obtained from various orientations and scales are fused at the output layer for the final classification decision. The network model is analytically trained using a single sample per identity. The obtained solution is globally optimal with respect to the classification total error rate. Our empirical experiments conducted on five face data sets (six subsets) from the public domain show encouraging results in terms of identification accuracy and computational efficiency.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 83
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: Images degraded by light scattering and absorption, such as hazy, sandstorm, and underwater images, often suffer color distortion and low contrast because of light traveling through turbid media. In order to enhance and restore such images, we first estimate ambient light using the depth-dependent color change. Then, via calculating the difference between the observed intensity and the ambient light, which we call the scene ambient light differential, scene transmission can be estimated. Additionally, adaptive color correction is incorporated into the image formation model (IFM) for removing color casts while restoring contrast. Experimental results on various degraded images demonstrate the new method outperforms other IFM-based methods subjectively and objectively. Our approach can be interpreted as a generalization of the common dark channel prior (DCP) approach to image restoration, and our method reduces to several DCP variants for different special cases of ambient lighting and turbid medium conditions.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 84
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: Superpixel segmentation aims to group the pixels of an image into atomic regions whose boundaries align well with the natural object boundaries. This paper first proposes a new feature representation for superpixel segmentation that holistically embraces color, contour, texture, and spatial features. Then, we introduce a clustering-based discriminability measure to iteratively evaluate the importance of different features. Integrating the feature representation and the discriminability measure, we propose a novel content-adaptive superpixel (CAS) segmentation algorithm. CAS is able to automatically and iteratively adjust the weights of different features to fit various properties of image instances. Experiments on several challenging datasets demonstrate that the proposed CAS outperforms the state-of-the-art methods and has a low computational cost.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 85
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-06
    Description: In recent years, correlation filters have shown dominant results for visual object tracking. The types of features employed in this family of trackers significantly affect tracking performance. The ultimate goal is to use robust features that are invariant to any kind of appearance change of the object, while predicting the object location as accurately as when there is no appearance change. With the emergence of deep learning based methods, the study of learning features for specific tasks has accelerated. For instance, discriminative visual tracking methods based on deep architectures have been studied with promising performance. Nevertheless, correlation filter based (CFB) trackers confine themselves to using pre-trained networks that were trained for the object classification problem. To this end, this manuscript formulates the problem of learning deep fully convolutional features for CFB visual tracking. In order to learn the proposed model, a novel and efficient backpropagation algorithm is presented based on the loss function of the network. The proposed learning framework enables the network model to be flexible for a custom design. Moreover, it alleviates the dependency on the network trained for classification. Extensive performance analysis shows the efficacy of the proposed custom design in the CFB tracking framework. By fine-tuning the convolutional parts of a state-of-the-art network and integrating this model into a CFB tracker, which is the top performing one of VOT2016, an 18% increase is achieved in terms of expected average overlap, and tracking failures are decreased by 25%, while maintaining the superiority over the state-of-the-art methods on the OTB-2013 and OTB-2015 tracking datasets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 86
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-13
    Description: Single-image super-resolution (SR) reconstruction via sparse representation has recently attracted broad interest. It is known that a low-resolution (LR) image is susceptible to noise or blur due to the degradation of the observed image, which leads to poor SR performance. In this paper, we propose a novel robust edge-preserving smoothing SR (REPS-SR) method in the framework of sparse representation. An EPS regularization term is designed based on gradient-domain-guided filtering to preserve image edges and reduce noise in the reconstructed image. Furthermore, a smoothing-aware factor, determined adaptively from an estimate of the noise level of the LR image without manual intervention, is presented to obtain an optimal balance between the data fidelity term and the proposed EPS regularization term. An iterative shrinkage algorithm is used to obtain the SR results for LR images. The proposed adaptive smoothing-aware scheme makes our method robust to different levels of noise. Experimental results indicate that the proposed method can preserve image edges and reduce noise, and that it outperforms the current state-of-the-art methods for noisy images.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 87
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-13
    Description: In this paper, we propose YoTube, a novel deep learning framework for generating action proposals in untrimmed videos, where each action proposal corresponds to a spatial-temporal tube that potentially locates one human action. Most existing works generate proposals by clustering low-level features or linking image proposals, which ignores the interplay between long-term temporal context and short-term cues. In contrast, our method considers this interplay by designing a new recurrent YoTube detector and a static YoTube detector. The recurrent YoTube detector sequentially regresses candidate bounding boxes using long-term temporal contexts learned by a recurrent neural network. The static YoTube detector produces bounding boxes using rich appearance cues in every single frame. To fully exploit the complementary appearance, motion, and temporal context, we train the recurrent and static detectors using RGB (color) and flow information. Moreover, we fuse the corresponding outputs of the detectors to produce accurate and robust proposal boxes and obtain the final action proposals by linking the proposal boxes using dynamic programming with a novel path trimming method. Thanks to this pipeline, untrimmed videos can be handled effectively and efficiently. Extensive experiments on the challenging UCF-101, UCF-Sports, and JHMDB datasets show superior performance of the proposed method compared with the state of the art.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 88
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-13
    Description: The convolutional neural network (CNN) is of great interest in machine learning and has demonstrated excellent performance in hyperspectral image classification. In this paper, we propose a classification framework, called diverse region-based CNN, which can encode a semantic context-aware representation to obtain promising features. By merging a diverse set of discriminative appearance factors, the resulting CNN-based representation exhibits spatial-spectral context sensitivity that is essential for accurate pixel classification. Because the proposed method exploits diverse region-based inputs to learn contextual interactional features, it is expected to have more discriminative power. The joint representation containing rich spectral and spatial information is then fed to a fully connected network, and the label of each pixel vector is predicted by a softmax layer. Experimental results with widely used hyperspectral image data sets demonstrate that the proposed method can surpass conventional deep learning-based classifiers and other state-of-the-art classifiers.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 89
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: Graph-based dimensionality reduction techniques have been widely and successfully applied to clustering and classification tasks. The basis of these algorithms is the constructed graph which dictates their performance. In general, the graph is defined by the input affinity matrix. However, the affinity matrix derived from the data is sometimes suboptimal for dimension reduction as the data used are very noisy. To address this issue, we propose the projective unsupervised flexible embedding models with optimal graph (PUFE-OG). We build an optimal graph by adjusting the affinity matrix. To tackle the out-of-sample problem, we employ a linear regression term to learn a projection matrix. The optimal graph and the projection matrix are jointly learned by integrating the manifold regularizer and regression residual into a unified model. The experimental results on the public benchmark datasets demonstrate that the proposed PUFE-OG outperforms state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 90
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: In this paper, we propose a novel correlation particle filter (CPF) for robust visual tracking. Instead of a simple combination of a correlation filter and a particle filter, we exploit and complement the strength of each one. Compared with existing tracking methods based on correlation filters and particle filters, the proposed tracker has four major advantages: 1) it is robust to partial and total occlusions, and can recover from lost tracks by maintaining multiple hypotheses; 2) it can effectively handle large-scale variation via a particle sampling strategy; 3) it can efficiently maintain multiple modes in the posterior density using fewer particles than conventional particle filters, resulting in low computational cost; and 4) it can shepherd the sampled particles toward the modes of the target state distribution using a mixture of correlation filters, resulting in robust tracking performance. Extensive experimental results on challenging benchmark data sets demonstrate that the proposed CPF tracking algorithm performs favorably against the state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 91
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: We propose a new trajectory clustering method using submodular optimization for better motion segmentation in videos. A small number of representative trajectories are first selected automatically by submodular maximization. Then all the initial trajectories are segmented into fragments, with the representative trajectories as the centers of the fragments. Finally, fragments are merged into clusters by a two-stage bottom-up clustering method, and each cluster shows the motion of one moving object. The submodular energy function integrates the quality of all trajectories and their correlations. As a result, thousands of initial trajectories are replaced by only dozens of representative trajectories, which reduces the negative influence of inaccurate initial trajectories on motion segmentation. The representative trajectories have larger weights when the color or texture information of each moving entity is extracted at the motion segmentation step. Experimental results demonstrate that our method can divide trajectories into more accurate clusters. The final motion segmentation results also illustrate that our method outperforms state-of-the-art motion segmentation methods based on trajectory clustering. Our source code is available at https://github.com/shenjianbing/submodularmotion.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 92
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: This paper presents a method of modeling edge profiles with two blur parameters, and of estimating and predicting those edge parameters under varying brightness combinations and camera-to-object distances (COD). First, the validity of the edge model is proven mathematically. Then, it is verified experimentally with edges from a set of images captured for specifically designed target sheets and with edges from natural images. Estimation of the two blur parameters for each observed edge profile is performed with a brute-force method to find the parameters that produce the global minimum error. Then, using the estimated blur parameters, the actual blur parameters of edges with arbitrary brightness combinations are predicted using a surface interpolation method (i.e., kriging). The predicted surfaces show that the two blur parameters of the proposed edge model depend on both dark-side edge brightness and light-side edge brightness, following a certain global trend. This is similar across varying CODs. The proposed edge model is compared with a one-blur-parameter edge model in terms of the root mean squared error of fitting each model to each observed edge profile. The comparison results suggest that the proposed edge model is superior to the one-blur-parameter edge model in most cases where edges have varying brightness combinations.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 93
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: Noise estimation is crucial in many image processing algorithms such as image denoising. Conventionally, the noise is assumed to be a signal-independent additive white Gaussian process. However, for real raw data from an image sensor, the noise should in practice be modeled as signal dependent. In this paper, we propose an effective and fast image sensor noise estimation method for a single raw image. The noise model parameters are estimated via constrained weighted least squares (WLS) fitting on a number of data samples, each of which is generated from a group of weakly textured patches. Specifically, we first design a fast scheme for selecting weakly textured patches, guided by the image histogram. To robustly fit the data samples, we then explicitly account for the credibility of each sample by measuring the texture strength of the grouped patches. The image sensor noise estimation is finally formulated as a constrained WLS optimization problem, which can be solved efficiently. Experimental results demonstrate that our method runs much faster than the existing schemes while retaining state-of-the-art estimation performance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
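To make the constrained WLS idea in the abstract above concrete, here is a minimal sketch under stated assumptions. The abstract does not give the noise model or the constraints, so the sketch assumes a linear signal-dependent model, variance ≈ a · mean + b, with the constraint a, b ≥ 0; both are assumptions, as is the function name `fit_noise_model`. The per-sample weights stand in for the credibility scores, which the paper derives from texture strength and which are simply passed in here.

```python
def fit_noise_model(means, variances, weights):
    """Weighted least-squares fit of variance ~= a * mean + b with a, b >= 0.

    The linear model and the nonnegativity constraint are assumptions made
    for illustration; they are not taken from the abstract.
    """
    sw = sum(weights)
    if sw <= 0:
        raise ValueError("weights must have a positive sum")
    sx = sum(w * x for w, x in zip(weights, means))
    sy = sum(w * y for w, y in zip(weights, variances))
    sxx = sum(w * x * x for w, x in zip(weights, means))
    sxy = sum(w * x * y for w, x, y in zip(weights, means, variances))

    def cost(a, b):
        return sum(w * (a * x + b - y) ** 2
                   for w, x, y in zip(weights, means, variances))

    # Unconstrained solution of the weighted normal equations.
    det = sw * sxx - sx * sx
    if det > 0:
        a = (sw * sxy - sx * sy) / det
        b = (sy - a * sx) / sw
        if a >= 0 and b >= 0:
            return a, b

    # Otherwise the optimum lies on a boundary: a = 0 or b = 0.
    candidates = [(0.0, max(sy / sw, 0.0))]
    if sxx > 0:
        candidates.append((max(sxy / sxx, 0.0), 0.0))
    return min(candidates, key=lambda p: cost(*p))
```

With samples that follow variance = 0.5 · mean + 2 exactly, the fit returns a ≈ 0.5 and b ≈ 2 regardless of the weights; when the unconstrained intercept would be negative, the result is clamped to the boundary.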
  • 94
    Publication Date: 2018-03-16
    Description: The implementation of automatic image registration is still difficult in various applications. In this paper, an automatic image registration approach through line-support region segmentation and geometrical outlier removal is proposed. This new approach is designed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations. To begin with, line-support regions, each a straight region whose points share roughly the same image gradient angle, are extracted to address the issue of inconsistent content in the images. To alleviate the incompleteness of line segments, an iterative multi-resolution strategy is employed to preserve global structures that are masked at full resolution by image details or noise. Then, geometrical outlier removal is developed to provide reliable feature point matching, based on affine-invariant geometrical classifications for corresponding matches initialized by the scale invariant feature transform. The candidate outliers are selected by comparing the disparity of accumulated classifications among all matches, unlike conventional methods, which rely only on local geometrical relations. Various image sets have been considered in this paper for the evaluation of the proposed approach, including aerial images with simulated affine deformations, remote sensing optical and synthetic aperture radar images taken under different conditions (multispectral, multisensor, and multitemporal), and map images with inconsistent annotations. Experimental results demonstrate the superior performance of the proposed method over the existing approaches for the whole data set.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 95
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: We tackle the challenge of constructing 64 pixels for each individual pixel of a thumbnail face image. We show that such an aggressive super-resolution objective can be attained by taking advantage of the global context and making the best use of the prior information portrayed by the image class. Our input image is so small (e.g., $16\times 16$ pixels) that it can be considered as a patch of itself. Thus, conventional patch-matching-based super-resolution solutions are unsuitable. In order to enhance the resolution while enforcing the global context, we incorporate a pixel-wise appearance similarity objective into a deconvolutional neural network, which allows efficient learning of mappings between low-resolution input images and their high-resolution counterparts in the training data set. Furthermore, the deconvolutional network blends the learned high-resolution constituent parts in an authentic manner, where the face structure is naturally imposed and the global context is preserved. To account for possible artifacts in the upsampled feature maps, we employ a sub-network composed of additional convolutional layers. During training, we use roughly aligned images (only eye locations), yet demonstrate that our network has the capacity to super-resolve face images regardless of pose and facial expression variations. This significantly reduces the need for precise face alignment in the data set. Owing to the network topology we apply, our method is robust to translational misalignments. In addition, our method is able to upsample rotationally unaligned faces with data augmentation. Our extensive experimental analysis shows that our method achieves more appealing and superior results than the state of the art.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 96
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-16
    Description: The problem of blind image recovery using multiple blurry images of the same scene is addressed in this paper. To perform blind deconvolution, which is also called blind image recovery, the blur kernel and the image are represented by groups of sparse domains to exploit the local and nonlocal information, yielding a novel joint deblurring approach. In the proposed approach, group sparse regularization is applied to both the blur kernel and the image, where the sparse solution is promoted by the $\ell_{1}$-norm. In addition, a reweighted data fidelity term is developed to further improve the recovery performance, where the weight is determined by the estimation error. Moreover, to reduce undesirable noise effects in the group sparse representation, distance measures are studied in the block matching process used to find similar patches. This joint deblurring approach requires a more sophisticated two-step interactive process, in which each step is solved efficiently by the well-known split Bregman iteration algorithm. Finally, numerical studies on synthetic and real images demonstrate that the performance of this joint estimation algorithm is superior to the previous state-of-the-art algorithms in terms of both objective and subjective evaluation standards. Recovery results for real images captured by unmanned aerial vehicles are also provided to further validate the effectiveness of the proposed method.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 97
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: In this paper, we address the problem of quantifying the reliability of computational saliency for videos, which can be used to improve saliency-based video processing algorithms and enable more reliable performance and objective risk assessment of saliency-based video processing applications. Our approach to quantify such reliability is twofold. First, we explore spatial correlations in both the saliency map and the eye-fixation map. Then, we learn the spatiotemporal correlations that define a reliable saliency map. We first study spatiotemporal eye-fixation data from the public CRCNS data set and investigate a common feature in human visual attention, which dictates a correlation in saliency between a pixel and its direct neighbors. Based on the study, we then develop an algorithm that estimates a pixel-wise uncertainty map that reflects our supposed confidence in the associated computational saliency map by relating a pixel’s saliency to the saliency of its direct neighbors. To estimate such uncertainties, we measure the divergence of a pixel, in a saliency map, from its local neighborhood. In addition, we propose a systematic procedure to evaluate uncertainty estimation performance by explicitly computing uncertainty ground truth as a function of a given saliency map and eye fixations of human subjects. In our experiments, we explore multiple definitions of locality and neighborhoods in spatiotemporal video signals. In addition, we examine the relationship between the parameters of our proposed algorithm and the content of the videos. The proposed algorithm is unsupervised, making it more suitable for generalization to most natural videos. Also, it is computationally efficient and flexible for customization to specific video content. 
    Experiments using three publicly available video data sets show that the proposed algorithm outperforms state-of-the-art uncertainty estimation methods, improving accuracy by up to 63%, and offers efficiency and flexibility that make it more useful in practical situations.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
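The neighborhood-divergence idea described in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `local_divergence`, the 8-neighbor window, the clamped border handling, and the use of a plain absolute difference are all assumptions made for illustration.

```python
def local_divergence(saliency):
    """Per-pixel |value - mean of 8 direct neighbors| for a 2D saliency map.

    Hypothetical illustration only: the window size and the absolute-difference
    measure are assumptions, not the definition used in the paper.
    """
    h, w = len(saliency), len(saliency[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue  # skip the pixel itself
                    # Clamp coordinates so border pixels reuse the nearest valid pixel.
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    total += saliency[ny][nx]
            out[y][x] = abs(saliency[y][x] - total / 8.0)
    return out


# A single isolated peak disagrees strongly with its neighbors,
# so it receives the largest divergence value in the map.
saliency_map = [
    [0.0, 0.0, 0.0],
    [0.0, 8.0, 0.0],
    [0.0, 0.0, 0.0],
]
uncertainty = local_divergence(saliency_map)
```

A high value marks a pixel whose saliency is poorly supported by its surroundings, which is the intuition behind treating such pixels as less reliable.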
  • 98
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: In this paper, we propose a novel matching-based tracker by investigating the relationship between template matching and the recently popular correlation-filter-based trackers (CFTs). Compared to the correlation operation in CFTs, a sophisticated similarity metric termed mutual buddies similarity is proposed to exploit the relationship of multiple reciprocal nearest neighbors for target matching. By doing so, our tracker obtains powerful discriminative ability in distinguishing target from background, as demonstrated by both empirical and theoretical analyses. Besides, instead of utilizing a single template with the improper updating scheme in CFTs, we design a novel online template updating strategy named memory, which aims to select a certain number of representative and reliable tracking results from the history to construct the current stable and expressive template set. This scheme helps the proposed tracker to comprehensively understand the target appearance variations and to recall some stable results. Both qualitative and quantitative evaluations on two benchmarks suggest that the proposed tracking method performs favorably against some recently developed CFTs and other competitive trackers.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
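The notion of reciprocal nearest neighbors mentioned in the abstract above can be sketched as follows. This is not the paper's mutual buddies similarity; the function names, the squared Euclidean distance, the parameter `k`, and the normalization are hypothetical choices made only to illustrate the idea.

```python
def k_nearest(query, candidates, k):
    """Indices of the k candidates closest to `query` (squared Euclidean distance)."""
    dists = [
        (sum((a - b) ** 2 for a, b in zip(query, c)), i)
        for i, c in enumerate(candidates)
    ]
    return {i for _, i in sorted(dists)[:k]}


def reciprocal_pairs(template, candidate, k=1):
    """Pairs (i, j) where template[i] and candidate[j] are in each other's k-NN sets."""
    t_to_c = [k_nearest(p, candidate, k) for p in template]
    c_to_t = [k_nearest(q, template, k) for q in candidate]
    return sorted(
        (i, j)
        for i, neighbors in enumerate(t_to_c)
        for j in neighbors
        if i in c_to_t[j]
    )


def reciprocal_similarity(template, candidate, k=1):
    """Number of reciprocal pairs, normalized by the smaller set (an assumed normalization)."""
    pairs = reciprocal_pairs(template, candidate, k)
    return len(pairs) / min(len(template), len(candidate))


# Hypothetical 2D feature points for a template patch and a candidate patch.
template_pts = [(0, 0), (10, 10)]
candidate_pts = [(0, 1), (10, 9), (5, 5)]
pairs = reciprocal_pairs(template_pts, candidate_pts)
score = reciprocal_similarity(template_pts, candidate_pts)
```

Requiring the match to hold in both directions discards points such as `(5, 5)` that are close to something in the template but are not themselves anyone's nearest neighbor, which is what makes reciprocal matching more selective than one-way matching.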
  • 99
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: Low-light image enhancement methods based on the classic Retinex model attempt to manipulate the estimated illumination and to project it back to the corresponding reflectance. However, the model does not consider the noise that inevitably exists in images captured in low-light conditions. In this paper, we propose the robust Retinex model, which additionally considers a noise map compared with the conventional Retinex model, to improve the performance of enhancing low-light images accompanied by intensive noise. Based on the robust Retinex model, we present an optimization function that includes novel regularization terms for the illumination and reflectance. Specifically, we use the $\ell_{1}$ norm to constrain the piece-wise smoothness of the illumination, adopt a fidelity term for gradients of the reflectance to reveal the structure details in low-light images, and make the first attempt to estimate a noise map out of the robust Retinex model. To effectively solve the optimization problem, we provide an augmented Lagrange multiplier based alternating direction minimization algorithm without logarithmic transformation. Experimental results demonstrate the effectiveness of the proposed method in low-light image enhancement. In addition, the proposed method can be generalized to handle a series of similar problems, such as image enhancement for underwater or remote sensing and in hazy or dusty conditions.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
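The models named in the abstract above can be written out as a short sketch. The classic Retinex decomposition of an image $I$ into reflectance $R$ and illumination $L$ is standard; the noise-augmented form follows the abstract's description of an added noise map $N$. The objective on the last line is only one form consistent with that description: the weights $\beta$, $\omega$, $\delta$ and the gradient guidance term $G$ are placeholder symbols assumed here, not taken from the listing.

```latex
% Classic Retinex: element-wise product of reflectance and illumination
I = R \circ L
% Robust Retinex as described in the abstract: an additive noise map N
I = R \circ L + N
% One objective consistent with the description (weights and G are assumptions)
\min_{R,\,L,\,N}\;
  \lVert R \circ L + N - I \rVert_F^2
  + \beta \lVert \nabla L \rVert_1
  + \omega \lVert \nabla R - G \rVert_F^2
  + \delta \lVert N \rVert_F^2
```

Solving in the product form, rather than taking logarithms to turn $R \circ L$ into a sum, matches the abstract's note that the algorithm works without logarithmic transformation.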
  • 100
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2018-03-23
    Description: This paper presents a new representation of skeleton sequences for 3D action recognition. Existing methods based on hand-crafted features or recurrent neural networks cannot adequately capture the complex spatial structures and the long-term temporal dynamics of the skeleton sequences, which are very important to recognize the actions. In this paper, we propose to transform each channel of the 3D coordinates of a skeleton sequence into a clip. Each frame of the generated clip represents the temporal information of the entire skeleton sequence and one particular spatial relationship between the skeleton joints. The entire clip incorporates multiple frames with different spatial relationships, which provide useful spatial structural information of the human skeleton. We also propose a multitask convolutional neural network (MTCNN) to learn the generated clips for action recognition. The proposed MTCNN processes all the frames of the generated clips in parallel to explore the spatial and temporal information of the skeleton sequences. The proposed method has been extensively tested on six challenging benchmark datasets. Experimental results consistently demonstrate the superiority of the proposed clip representation and the feature learning method for 3D action recognition compared to the existing techniques.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology