ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

  • 1
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: Histopathological grading of cancer not only offers insight into the patient’s prognosis but also helps in making individual treatment plans. Mitosis counts in histopathological slides play a crucial role in invasive breast cancer grading using the Nottingham grading system. Pathologists perform this grading by manually examining a few thousand images for each patient. Hence, finding the mitotic figures in these images is a tedious job, and one prone to observer variability due to variations in the appearance of mitotic cells. We propose a fast and accurate approach for automatic mitosis detection from histopathological images. We employ an area morphological scale space for cell segmentation. The scale space is constructed in a novel manner by restricting the scales through maximization of the relative entropy between the cells and the background. This results in precise cell segmentation. The segmented cells are classified into mitotic and non-mitotic categories using a random forest classifier. Experiments show at least a 12% improvement in $F_{1}$ score on more than 450 histopathological images at $40\times$ magnification.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
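The scale-selection criterion in entry 1 (maximizing the relative entropy between cells and background) can be illustrated with a histogram-based Kullback-Leibler divergence. This is a minimal sketch on invented data, not the paper's area-morphological construction:

```python
import numpy as np

def relative_entropy(cells, background, bins=32, eps=1e-12):
    """KL divergence between intensity histograms of two pixel sets.

    A simplified stand-in for the scale-selection criterion described
    in the abstract; the paper's exact formulation may differ.
    """
    lo = min(cells.min(), background.min())
    hi = max(cells.max(), background.max())
    p, _ = np.histogram(cells, bins=bins, range=(lo, hi))
    q, _ = np.histogram(background, bins=bins, range=(lo, hi))
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

# Well-separated intensity populations give a much larger divergence
# than overlapping ones, which is what the scale restriction exploits.
rng = np.random.default_rng(0)
sep = relative_entropy(rng.normal(0.2, 0.05, 1000), rng.normal(0.8, 0.05, 1000))
ovl = relative_entropy(rng.normal(0.5, 0.05, 1000), rng.normal(0.5, 0.05, 1000))
```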
  • 2
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: This paper proposes a fast multi-band image fusion algorithm that combines a high-spatial, low-spectral resolution image with a low-spatial, high-spectral resolution image. The well-established forward model is used to form the likelihoods of the observations. Maximizing the likelihoods leads to solving a Sylvester equation. By exploiting the properties of the circulant and downsampling matrices associated with the fusion problem, a closed-form solution of the corresponding Sylvester equation is obtained explicitly, eliminating any iterative update steps. Coupled with the alternating direction method of multipliers and the block coordinate descent method, the proposed algorithm can easily be generalized to incorporate prior information for the fusion problem, allowing a Bayesian estimator. Simulation results show that the proposed algorithm achieves the same performance as existing algorithms while significantly decreasing their computational complexity.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
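Entry 2's key step, reducing likelihood maximization to a Sylvester equation $AX + XB = C$, can be demonstrated on a toy dense system. The paper derives an explicit closed form by exploiting circulant/downsampling structure; this sketch just calls SciPy's generic solver on small random matrices:

```python
import numpy as np
from scipy.linalg import solve_sylvester

# Build a small Sylvester system A X + X B = C with a known solution,
# then recover it. Sizes and data are purely illustrative.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((3, 3))
X_true = rng.standard_normal((4, 3))
C = A @ X_true + X_true @ B

X = solve_sylvester(A, B, C)          # Bartels-Stewart, no iterations
residual = np.linalg.norm(A @ X + X @ B - C)
```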
  • 3
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: In recent years, baggage screening at airports has included the use of dual-energy X-ray computed tomography (DECT), an advanced technology for nondestructive evaluation. The main challenge remains to reliably find and identify threat objects in the bag from DECT data. This task is particularly hard due to the wide variety of objects, the high clutter, and the presence of metal, which causes streaks and shading in the scanner images. Image noise and artifacts are generally much more severe than in medical CT and can lead to splitting of objects and inaccurate object labeling. The conventional approach performs object segmentation and material identification in two decoupled processes. Dual-energy information is typically not used for the segmentation, and object localization is not explicitly used to stabilize the material parameter estimates. We propose a novel learning-based framework for joint segmentation and identification of objects directly from volumetric DECT images, which is robust to streaks, noise and variability due to clutter. We focus on segmenting and identifying a small set of objects of interest with characteristics that are learned from training images, and consider everything else as background. We include data weighting to mitigate metal artifacts and incorporate an object boundary field to reduce object splitting. The overall formulation is posed as a multilabel discrete optimization problem and solved using an efficient graph-cut algorithm. We test the method on real data and show its potential for producing accurate labels of the objects of interest without splits in the presence of metal and clutter.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 4
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: Feature point matching is a fundamental and challenging problem in many computer vision applications. In this paper, a robust feature point matching algorithm named spatial order constraints bilateral-neighbor vote (SOCBV) is proposed to remove outliers from a set of matches (including outliers) between two images. A directed $k$-nearest-neighbor (kNN) graph of the match set is generated, and the problem of feature point matching is formulated as a binary discrimination problem. In the discrimination process, the class-label matrix is built via the spatial order constraints defined on the edges that connect a point to its k nearest neighbors. Then, the posterior inlier class probability of each match is estimated with kNN density estimation and the spatial order constraints. The vote of each match is determined by averaging all posterior class probabilities that originate from its associated inlier set and is used for removing outliers. The algorithm iteratively removes outliers from the directed graph and recomputes the votes until the stopping condition is satisfied. In comparisons with other popular algorithms, such as RANSAC, RSOC, GTM, SOC, and WGTM, experiments on various test data sets demonstrate the strong robustness of the proposed algorithm.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
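The kNN-graph voting idea of entry 4 can be sketched with a much simpler surrogate: score each putative match by how well its displacement agrees with those of its nearest neighbors. The function below is a hypothetical simplification, not the SOCBV posterior-probability vote:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_motion_votes(src, dst, k=5):
    """Score each putative match by the disagreement between its
    displacement and those of its k nearest neighbours in the source
    image. Small score = locally consistent = likely inlier.
    """
    disp = dst - src
    tree = cKDTree(src)
    _, idx = tree.query(src, k=k + 1)   # nearest neighbour is the point itself
    neigh = disp[idx[:, 1:]]            # (n, k, 2) neighbour displacements
    return np.linalg.norm(neigh - disp[:, None, :], axis=2).mean(axis=1)

# 45 matches share a coherent motion; 5 are corrupted into outliers.
rng = np.random.default_rng(2)
src = rng.uniform(0, 100, (50, 2))
dst = src + np.array([5.0, -3.0])
dst[:5] += rng.uniform(-40, 40, (5, 2))
scores = knn_motion_votes(src, dst)
```

In the paper this score is replaced by a posterior inlier probability and the worst matches are removed iteratively; here a single pass already separates the two populations on average.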
  • 5
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-18
    Description: This paper presents a novel low-complexity motion estimation and mode decision algorithm for encoding multiple quality layers following the H.264 scalable video coding standard, considering both coarse grain scalability (CGS) and medium grain scalability (MGS). The proposed algorithm conducts motion estimation and mode decision only at the base layer (BL) and forces the higher layers to inherit the motion and mode decisions of the BL. In order for the decisions made at the BL to be nearly optimal for all layers, we use the highest-layer reconstructed frame as the reference frame for motion estimation and set the Lagrangian multipliers according to the quantization parameters of the current and higher layers. We also propose a simple early skip/direct decision to further boost the encoding speed: mode decision and motion estimation are conducted at a higher layer only if the layer below it uses the skip/direct mode for a block. Significant complexity reduction can be achieved because mode decision and motion estimation are performed at most once for each macroblock. Because the mode and motion information only needs to be transmitted once, we also achieve slightly better rate-distortion (R–D) performance for typical videos. Experiments have shown more than $2\times$ (up to $5\times$) speedup for a three-layer encoder against the conventional R–D optimized reference software JSVM on both CIF and HD sequences, for both CGS and MGS, at a modest cost in coding efficiency as measured by the Bjontegaard delta rate.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 6
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-18
    Description: In this paper, we propose a novel unifying framework that uses a Markov network to learn the relationships among multiple classifiers. In face recognition, we assume that several complementary classifiers are available, and assign observation nodes to the features of a query image and hidden nodes to those of gallery images. Under the Markov assumption, we connect each hidden node to its corresponding observation node and to the hidden nodes of neighboring classifiers. For each observation-hidden node pair, we collect the set of gallery candidates most similar to the observation instance, and capture the relationship between the hidden nodes in terms of a similarity matrix among the retrieved gallery images. Posterior probabilities in the hidden nodes are computed using the belief propagation algorithm, and we use the marginal probability as the new similarity value of the classifier. The novelty of the proposed framework lies in considering classifier dependence using the results of each neighboring classifier. We present extensive evaluation results for two different protocols, known and unknown image variation tests, using four publicly available databases: 1) the Face Recognition Grand Challenge ver. 2.0; 2) XM2VTS; 3) BANCA; and 4) Multi-PIE. The results show that our framework consistently yields improved recognition rates in various situations.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 7
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-18
    Description: Ellipse fitting is widely applied in the fields of computer vision and automated manufacturing. However, edge point errors (especially outliers) introduced by image edge detection cause severe performance degradation in the subsequent ellipse fitting procedure. To alleviate the influence of outliers, we develop a robust ellipse fitting method in this paper. The main contributions are as follows. First, to be robust against outliers, we introduce the maximum correntropy criterion into the constrained least-squares (CLS) ellipse fitting method, and apply the half-quadratic optimization algorithm to solve the nonlinear and nonconvex problem in an alternating manner. Second, to ensure that the obtained solution corresponds to an ellipse, we introduce a special quadratic equality constraint into the aforementioned CLS model, which results in a nonconvex quadratically constrained quadratic programming problem. Finally, we derive the semidefinite relaxation of this problem in terms of the trace operator and thus determine the ellipse parameters using semidefinite programming. Simulated and experimental examples illustrate the effectiveness of the proposed ellipse fitting approach.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
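Entry 7's half-quadratic treatment of the correntropy (Welsch) loss amounts to alternating between weight updates and a weighted least-squares solve. The sketch below applies that idea to an unconstrained conic fit on synthetic data; the paper's CLS model with its ellipse-guaranteeing constraint and SDP step is not reproduced:

```python
import numpy as np

def welsch_irls_conic(x, y, sigma=1.0, iters=30):
    """Robust fit of a x^2 + b xy + c y^2 + d x + e y = 1 by
    half-quadratic (iteratively reweighted) least squares under the
    Welsch/correntropy loss. A simplified sketch only.
    """
    D = np.column_stack([x**2, x * y, y**2, x, y])
    b = np.ones_like(x)
    w = np.ones_like(x)
    theta = None
    for _ in range(iters):
        sw = np.sqrt(w)
        theta, *_ = np.linalg.lstsq(D * sw[:, None], sw * b, rcond=None)
        r = D @ theta - b                     # algebraic residuals
        w = np.exp(-r**2 / (2 * sigma**2))    # correntropy-induced weights
    return theta, w

# 40 exact points on the unit circle plus two off-curve outliers;
# a good fit should approach (1, 0, 1, 0, 0).
t = np.linspace(0, 2 * np.pi, 40, endpoint=False)
x = np.concatenate([np.cos(t), [2.0, -0.3]])
y = np.concatenate([np.sin(t), [0.5, 1.8]])
theta, w = welsch_irls_conic(x, y)
```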
  • 8
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-18
    Description: State-of-the-art web image search frameworks are often based on the bag-of-visual-words (BoVW) model and the inverted index structure. Despite their simplicity, efficiency, and scalability, they often suffer from low precision and/or recall, due to the limited stability of local features and the considerable information loss at the quantization stage. To refine the quality of retrieved images, various postprocessing methods have been adopted after the initial search process. In this paper, we investigate the online querying process from a graph-based perspective. We introduce a heterogeneous graph model containing both image and feature nodes explicitly, and propose an efficient reranking approach consisting of two successive modules, i.e., incremental query expansion and image-feature voting, to improve the recall and precision, respectively. Compared with conventional reranking algorithms, our method does not require geometric information about visual words and therefore has low time and memory consumption. Moreover, our method is independent of the initial search process, and can cooperate with many BoVW-based image search pipelines or be adopted after other postprocessing algorithms. We evaluate our approach on large-scale image search tasks and verify its competitive search performance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 9
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-21
    Description: The study of fluid flow through solid matter by computed tomography (CT) imaging has many applications, ranging from petroleum and aquifer engineering to biomedical, manufacturing, and environmental research. To avoid motion artifacts, current experiments are often limited to slow fluid flow dynamics, which severely limits the applicability of the technique. In this paper, a new iterative CT reconstruction algorithm for improved temporal/spatial resolution in the imaging of fluid flow through solid matter is introduced. The proposed algorithm exploits prior knowledge in two ways. First, the time-varying object is assumed to consist of stationary regions (the solid matter) and dynamic regions (the fluid flow). Second, the attenuation curve of a particular voxel in the dynamic region is modeled by a piecewise constant function over time, in accordance with the actual advancing fluid/air boundary. Quantitative and qualitative results on different simulation experiments and a real neutron tomography data set show that, in comparison with state-of-the-art algorithms, the proposed algorithm allows reconstruction from substantially fewer projections per rotation without loss of image quality. The temporal resolution can therefore be substantially increased, and fluid flow experiments with faster dynamics can be performed.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
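Entry 9's temporal prior, a piecewise constant attenuation curve per dynamic voxel, can be illustrated by fitting a single step function to a 1-D curve. This is a toy post-hoc fit on synthetic data; the paper enforces the prior inside the iterative reconstruction:

```python
import numpy as np

def fit_step(curve):
    """Fit a single-step piecewise-constant function to a 1-D
    attenuation curve by exhaustive search over the change point.
    Returns (change_index, level_before, level_after).
    """
    n = len(curve)
    best, best_sse = (1, curve[0], curve[-1]), np.inf
    for k in range(1, n):
        a, b = curve[:k].mean(), curve[k:].mean()
        sse = ((curve[:k] - a) ** 2).sum() + ((curve[k:] - b) ** 2).sum()
        if sse < best_sse:
            best_sse, best = sse, (k, a, b)
    return best

# A voxel that switches from "air" (0) to "fluid" (1) at frame 30,
# observed with noise; the fit should recover the switch.
rng = np.random.default_rng(3)
curve = np.concatenate([np.zeros(30), np.ones(20)]) + rng.normal(0, 0.05, 50)
k, lo_level, hi_level = fit_step(curve)
```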
  • 10
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-21
    Description: Most existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There have been some attempts to learn features directly from raw RGB-D data, but the performance has not been satisfactory. In this paper, we propose an unsupervised joint feature learning and encoding (JFLE) framework for RGB-D scene labeling. The main novelty of our learning framework lies in the joint optimization of feature learning and feature encoding in a coherent way, which significantly boosts the performance. By stacking the basic learning structure, higher-level features are derived and combined with lower-level features to better represent RGB-D data. Moreover, to explore the nonlinear intrinsic characteristic of the data, we further propose a more general joint deep feature learning and encoding (JDFLE) framework that introduces a nonlinear mapping into JFLE. Experimental results on the benchmark NYU depth data set show that our approaches achieve competitive performance compared with state-of-the-art methods, while requiring no complex feature handcrafting or feature combination, and they can be easily applied to other data sets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 11
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-21
    Description: Out-of-focus blur occurs frequently in multispectral imaging systems when the camera is well focused at a specific (reference) imaging channel. Because the effective focal lengths of the lens are wavelength dependent, the blurriness levels of the images at individual channels differ. This paper proposes a multispectral image deblurring framework to restore out-of-focus spectral images based on interchannel correlation (ICC). The ICC is motivated by the fact that a high-dimensional color spectrum can be linearly approximated using only a few intrinsic spectra. In the method, the spectral images are classified into an out-of-focus set and a well-focused set via blurriness computation. For each out-of-focus image, a guiding image is derived from the well-focused spectral images and used as the image prior in the deblurring framework. The out-of-focus blur is modeled as a Gaussian point spread function, which is further employed as the blur kernel prior. The regularization parameters in the deblurring framework are determined using generalized cross validation, so the proposed method needs no parameter tuning. The experimental results validate that the method performs well on multispectral image deblurring and outperforms the state of the art.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 12
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: We present a novel spatiotemporal saliency detection method to estimate salient regions in videos based on the gradient flow field and energy optimization. The proposed gradient flow field incorporates two distinctive features: 1) intra-frame boundary information and 2) inter-frame motion information together for indicating the salient regions. Based on the effective utilization of both intra-frame and inter-frame information in the gradient flow field, our algorithm is robust enough to estimate the object and background in complex scenes with various motion patterns and appearances. Then, we introduce local as well as global contrast saliency measures using the foreground and background information estimated from the gradient flow field. These enhanced contrast saliency cues uniformly highlight an entire object. We further propose a new energy function to encourage the spatiotemporal consistency of the output saliency maps, which is seldom explored in previous video saliency methods. The experimental results show that the proposed algorithm outperforms state-of-the-art video saliency detection methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 13
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: Hyperspectral unmixing is one of the crucial steps in many hyperspectral applications. It has proved to be a difficult task in unsupervised settings, where the endmembers and abundances are both unknown, and it becomes more challenging when the spectral bands are degraded by noise. This paper presents a robust model for unsupervised hyperspectral unmixing. Specifically, our model is developed with a correntropy-based metric in which nonnegativity constraints on both endmembers and abundances are imposed to preserve physical significance. In addition, a sparsity prior is explicitly formulated to constrain the distribution of the abundances of each endmember. To solve the model, a half-quadratic optimization technique is developed that converts the original complex optimization problem into an iteratively reweighted nonnegative matrix factorization with sparsity constraints. As a result, the optimization can adaptively assign small weights to noisy bands and put more emphasis on noise-free bands. Moreover, with the sparsity constraints, our model naturally generates sparse abundances. Experiments on synthetic and real data demonstrate the effectiveness of our model in comparison with related state-of-the-art unmixing models.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
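The solver described in entry 13, an iteratively reweighted NMF with sparsity constraints, can be sketched with weighted multiplicative updates and an L1 penalty on the abundances. The fixed band weights below stand in for the adaptive correntropy weights; all names, sizes, and parameters are illustrative:

```python
import numpy as np

def weighted_sparse_nmf(X, p, band_w, lam=0.05, iters=200, eps=1e-9):
    """Multiplicative-update NMF minimizing
    ||W^(1/2) (X - E A)||_F^2 + lam * ||A||_1, with W a diagonal
    per-band weight. A generic stand-in for the paper's
    half-quadratic correntropy solver.
    """
    bands, pixels = X.shape
    rng = np.random.default_rng(0)
    E = rng.uniform(0.1, 1.0, (bands, p))    # endmembers
    A = rng.uniform(0.1, 1.0, (p, pixels))   # abundances
    W = band_w[:, None]
    for _ in range(iters):
        E *= ((W * X) @ A.T) / ((W * (E @ A)) @ A.T + eps)
        A *= (E.T @ (W * X)) / (E.T @ (W * (E @ A)) + lam + eps)
    return E, A

# Exactly rank-3 nonnegative data; uniform band weights here, but
# noisy bands would receive small entries in band_w.
bands, pixels, p = 20, 100, 3
rng = np.random.default_rng(4)
X = rng.uniform(0.1, 1.0, (bands, p)) @ rng.uniform(0, 1, (p, pixels))
E, A = weighted_sparse_nmf(X, p, np.ones(bands))
err = np.linalg.norm(X - E @ A) / np.linalg.norm(X)
```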
  • 14
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: We present a hierarchical grid-based, globally optimal tracking-by-detection approach to track an unknown number of targets in complex and dense scenarios, particularly addressing the challenges of complex interaction and mutual occlusion. Frame-by-frame detection is performed by hierarchical likelihood grids, matching shape templates through a fast oriented distance transform. To allow recovery from misdetections, common heuristics such as nonmaxima suppression within observations are eschewed. Within a discretized state space, the data association problem is formulated as a grid-based network flow model, resulting in a convex problem cast in integer linear programming form and giving a globally optimal solution. In addition, we show how a behavior cue (body orientation) can be integrated into our association affinity model, providing valuable hints for resolving ambiguities between crossing trajectories. Unlike traditional motion-based approaches, we estimate body orientation by a hybrid methodology that combines the merits of motion-based and 3D appearance-based orientation estimation, and is thus capable of dealing with still-standing or slowly moving targets as well. The performance of our method is demonstrated through experiments on a large variety of benchmark video sequences, including both indoor and outdoor scenarios.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 15
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: Many impulse noise (IN) reduction methods suffer from two obstacles: the improper noise detectors and the imperfect filters they use. To address this issue, a weighted couple sparse representation model is presented in this paper to remove IN. In the proposed model, the complicated relationships between the reconstructed and noisy images are exploited to make the coding coefficients more appropriate for recovering the noise-free image. Moreover, the image pixels are classified into clear, slightly corrupted, and heavily corrupted ones. Different data-fidelity regularizations are then applied to the different classes of pixels to further improve the denoising performance. In our method, the dictionary is trained directly on the noisy raw data by solving a weighted rank-one minimization problem, which can capture more features of the original data. Experimental results demonstrate that the proposed method is superior to several state-of-the-art denoising methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
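Entry 15's three-way pixel classification (clear / slightly corrupted / heavily corrupted) can be approximated with a crude median-deviation detector. The thresholds are invented for illustration; the paper couples the classes to different data-fidelity regularizers rather than hard-thresholding:

```python
import numpy as np
from scipy.ndimage import median_filter

def classify_pixels(img, t_light=0.15, t_heavy=0.4):
    """Label each pixel 0 = clear, 1 = slightly corrupted,
    2 = heavily corrupted, by its deviation from a 3x3 median.
    """
    dev = np.abs(img - median_filter(img, size=3))
    labels = np.zeros(img.shape, dtype=np.uint8)
    labels[dev > t_light] = 1
    labels[dev > t_heavy] = 2
    return labels

# A flat gray image with 200 random salt-and-pepper impulses.
rng = np.random.default_rng(5)
noisy = np.full((64, 64), 0.5)
idx = rng.choice(64 * 64, size=200, replace=False)
flat = noisy.ravel()
flat[idx[:100]] = 1.0        # salt
flat[idx[100:]] = 0.0        # pepper
labels = classify_pixels(noisy)
```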
  • 16
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: In this paper, a hierarchical multi-task structural learning algorithm is developed to support large-scale plant species identification, where a visual tree is constructed to organize large numbers of plant species in a coarse-to-fine fashion and to determine the inter-related learning tasks automatically. A given parent node on the visual tree contains a set of sibling coarse-grained categories of plant species or sibling fine-grained plant species, and a multi-task structural learning algorithm is developed to train their inter-related classifiers jointly to enhance their discrimination power. The inter-level relationship constraint, e.g., that a plant image must first be assigned correctly to a parent node (high-level non-leaf node) before it can be assigned to the most relevant child node (low-level non-leaf node or leaf node) on the visual tree, is formally defined and leveraged to learn more discriminative tree classifiers. Our experimental results demonstrate the effectiveness of the hierarchical multi-task structural learning algorithm in training more discriminative tree classifiers for large-scale plant species identification.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 17
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: This paper proposes a two-stage texture synthesis algorithm. At the first stage, a structure tensor map carrying information about the local orientation is synthesized from the exemplar’s data; at the second stage, it is used to constrain the synthesis of the texture. Keeping in mind that the algorithm should reproduce as faithfully as possible the visual aspect, statistics, and morphology of the input sample, the method is tested on various textures and compared objectively with existing methods, highlighting its strength in successfully synthesizing the output texture in many situations where traditional algorithms fail to reproduce the exemplar’s patterns. The promising results pave the way toward accurately synthesizing large, multi-scale patterns, such as those of carbon material samples showing laminar structures.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
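The structure tensor map of entry 17 carries a per-pixel dominant orientation, which can be computed as below. Only this first-stage quantity is shown; the constrained synthesis stage is not reproduced:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def structure_tensor_orientation(img, sigma=2.0):
    """Per-pixel dominant orientation (radians) from the smoothed
    structure tensor [Jxx, Jxy; Jxy, Jyy] built from image gradients.
    """
    Ix = sobel(img, axis=1, output=float)
    Iy = sobel(img, axis=0, output=float)
    Jxx = gaussian_filter(Ix * Ix, sigma)
    Jxy = gaussian_filter(Ix * Iy, sigma)
    Jyy = gaussian_filter(Iy * Iy, sigma)
    # Angle of the dominant eigenvector of the tensor.
    return 0.5 * np.arctan2(2 * Jxy, Jxx - Jyy)

# Vertical stripes: the gradient points along x everywhere, so the
# dominant orientation angle should be ~0 away from the borders.
x = np.arange(64)
stripes = np.tile(np.sin(2 * np.pi * x / 8), (64, 1))
theta = structure_tensor_orientation(stripes)
```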
  • 18
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: Image search reranking (ISR) techniques aim at refining text-based search results by mining images’ visual content. Feature extraction and ranking function design are two key steps in ISR. Inspired by the idea of the hypersphere in one-class classification, this paper proposes a feature extraction algorithm named hypersphere-based relevance preserving projection (HRPP) and a ranking function called hypersphere-based rank (H-Rank). Specifically, HRPP is a spectral embedding algorithm that transforms an original high-dimensional feature space into an intrinsically low-dimensional hypersphere space by preserving the manifold structure and the relevance relationships among the images. H-Rank is a simple but effective ranking algorithm that sorts the images by their distances to the hypersphere center. Moreover, to capture the user’s intent with minimal human interaction, a reversed $k$-nearest-neighbor (kNN) algorithm is proposed, which harvests enough pseudorelevant images while requiring only a single click from the user on the initially retrieved images. The HRPP method with reversed kNN is named one-click-based HRPP (OC-HRPP). Finally, the OC-HRPP and H-Rank algorithms form a new ISR method, H-reranking. Extensive experimental results on three large real-world data sets show that the proposed algorithms are effective. Moreover, the fact that only one relevant image needs to be labeled gives the method strong practical significance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 19
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: In this paper, we propose a novel method for image fusion with a high-resolution panchromatic image and a low-resolution multispectral (Ms) image at the same geographical location. The fusion is formulated as a convex optimization problem which minimizes a linear combination of a least-squares fitting term and a dynamic gradient sparsity regularizer. The former is to preserve accurate spectral information of the Ms image, while the latter is to keep sharp edges of the high-resolution panchromatic image. We further propose to simultaneously register the two images during the fusing process, which is naturally achieved by virtue of the dynamic gradient sparsity property. An efficient algorithm is then devised to solve the optimization problem, achieving linear computational complexity in the size of the output image in each iteration. We compare our method against six state-of-the-art image fusion methods on Ms image data sets from four satellites. Extensive experimental results demonstrate that the proposed method substantially outperforms the others in terms of both spatial and spectral qualities. We also show that our method can provide high-quality products from coarsely registered real-world IKONOS data sets. Finally, a MATLAB implementation is provided to facilitate future research.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 20
    Publication Date: 2015-08-14
    Description: Automatic fluorescent particle tracking is an essential task to study the dynamics of a large number of biological structures at a sub-cellular level. We have developed a probabilistic particle tracking approach based on multi-scale detection and two-step multi-frame association. The multi-scale detection scheme allows coping with particles in close proximity. For finding associations, we have developed a two-step multi-frame algorithm, which is based on a temporally semiglobal formulation as well as spatially local and global optimization. In the first step, reliable associations are determined for each particle individually in local neighborhoods. In the second step, the global spatial information over multiple frames is exploited jointly to determine optimal associations. The multi-scale detection scheme and the multi-frame association finding algorithm have been combined with a probabilistic tracking approach based on the Kalman filter. We have successfully applied our probabilistic tracking approach to synthetic as well as real microscopy image sequences of virus particles and quantified the performance. We found that the proposed approach outperforms previous approaches.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 21
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: In this paper, we propose a novel model, the discriminatively learned iterative shrinkage (DLIS) model, for color image denoising. DLIS generalizes wavelet shrinkage by iteratively performing shrinkage over patch groups and aggregating over the whole image. We discriminatively learn the shrinkage functions and basis from training pairs of noisy/noise-free images, which can adaptively handle different noise characteristics in the luminance/chrominance channels, as well as the unknown structured noise in real-captured color images. Furthermore, to remove splotchy noise in real color images, we design a Laplacian pyramid-based denoising framework that progressively recovers the clean image from the coarsest scale to the finest scale using the DLIS model learned from real color noise. Experiments show that the proposed approach achieves state-of-the-art denoising results on both synthetic benchmarks and real-captured color images.
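The shrinkage at the heart of DLIS generalizes classical wavelet shrinkage, in which coefficients are attenuated by a fixed function rather than a learned one. As a minimal illustrative sketch, here is the classical soft-thresholding operator (not the authors' learned shrinkage functions):

```python
import numpy as np

def soft_shrink(coeffs, threshold):
    """Classical soft-thresholding shrinkage: pull coefficients toward
    zero by `threshold`, zeroing those whose magnitude is below it."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - threshold, 0.0)

# Small coefficients (mostly noise) are removed, large ones attenuated.
c = np.array([-3.0, -0.5, 0.2, 2.0])
shrunk = soft_shrink(c, 1.0)
```

A learned model such as DLIS replaces this single fixed curve with per-group functions fitted on noisy/clean training pairs.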
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 22
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: In cross-view action recognition, what you saw in one view is different from what you recognize in another view, since the data distribution even the feature space can change from one view to another. In this paper, we address the problem of transferring action models learned in one view (source view) to another different view (target view), where action instances from these two views are represented by heterogeneous features. A novel learning method, called heterogeneous transfer discriminant-analysis of canonical correlations (HTDCC), is proposed to discover a discriminative common feature space for linking source view and target view to transfer knowledge between them. Two projection matrices are learned to, respectively, map data from the source view and the target view into a common feature space via simultaneously minimizing the canonical correlations of interclass training data, maximizing the canonical correlations of intraclass training data, and reducing the data distribution mismatch between the source and target views in the common feature space. In our method, the source view and the target view neither share any common features nor have any corresponding action instances. Moreover, our HTDCC method is capable of handling only a few or even no labeled samples available in the target view, and can also be easily extended to the situation of multiple source views. We additionally propose a weighting learning framework for multiple source views adaptation to effectively leverage action knowledge learned from multiple source views for the recognition task in the target view. Under this framework, different source views are assigned different weights according to their different relevances to the target view. Each weight represents how contributive the corresponding source view is to the target view. 
Extensive experiments on the IXMAS data set demonstrate the effectiveness of HTDCC in learning the common feature space for heterogeneous cross-view action recognition. In addition, the weighting learning framework can achieve promising results on automatically adapting multiple transferred source-view knowledge to the target view.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 23
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-14
    Description: A complete encoding solution for efficient intra-based depth map compression is proposed in this paper. The algorithm, termed predictive depth coding (PDC), was specifically developed to efficiently represent the characteristics of depth maps, which are mostly composed of smooth areas delimited by sharp edges. At its core, PDC combines a directional intra prediction framework and a straightforward residue coding method with an optimized flexible block partitioning scheme. To improve the algorithm in the presence of depth edges that cannot be efficiently predicted by the directional modes, a constrained depth modeling mode based on explicit edge representation was developed. For residue coding, a simple and low-complexity approach was investigated, using constant or linear residue modeling depending on the prediction mode. The performance of the proposed intra depth map coding approach was evaluated based on the quality of views synthesized using the encoded depth maps and the original texture views. Experimental tests under the all-intra configuration demonstrated the superior rate-distortion performance of PDC, with average bitrate savings of 6% compared with the current state-of-the-art intra depth map coding solution in the 3D extension of the High Efficiency Video Coding standard (3D-HEVC). When view synthesis optimization is used in both the PDC and 3D-HEVC encoders, the average bitrate savings increase to 14.3%. This suggests that the proposed method, which does not use transform-based residue coding, is an efficient alternative to the current 3D-HEVC algorithm for intra depth map coding.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 24
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Person re-identification aims to match people across non-overlapping camera views, which is an important but challenging task in video surveillance. To obtain a robust metric for matching, metric learning has recently been introduced. Most existing works focus on learning a Mahalanobis distance from sparse pairwise constraints, using image pairs with the same person identity as positive samples and selecting a small portion of those with different identities as negative samples. However, this training strategy discards a large amount of discriminative information and ignores relative similarities. In this paper, we propose a novel relevance metric learning method with listwise constraints (RMLLC) that adopts listwise similarities, which consist of the similarity list of each image with respect to all remaining images. By virtue of listwise similarities, RMLLC can capture all pairwise similarities and consequently learn a more discriminative metric by enforcing the metric to preserve predefined similarity lists in a low-dimensional projection subspace. Despite this performance enhancement, RMLLC with predefined similarity lists fails to capture relative relevance information, which is often unavailable in practice. To address this problem, we further introduce a rectification term to automatically exploit the relative similarities, and develop an efficient alternating iterative algorithm to jointly learn the optimal metric and the rectification term. Extensive experiments on four publicly available benchmark data sets demonstrate that the proposed method is significantly superior to state-of-the-art approaches. The results also show that the rectification term further boosts the performance of RMLLC.
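Metric learning methods of this kind score image pairs with a learned Mahalanobis distance. A minimal sketch of evaluating such a distance given a learned positive semi-definite matrix M (hypothetical inputs; the RMLLC learning procedure itself is not reproduced here):

```python
import numpy as np

def mahalanobis(x, y, M):
    """Squared Mahalanobis distance d_M(x, y) = (x - y)^T M (x - y),
    where M is a learned positive semi-definite matrix."""
    d = x - y
    return float(d @ M @ d)

# With M = I, the metric reduces to the squared Euclidean distance.
x, y = np.array([1.0, 2.0]), np.array([4.0, 6.0])
dist = mahalanobis(x, y, np.eye(2))  # 9 + 16 = 25
```

Learning replaces the identity with an M that pulls same-identity pairs together and pushes different-identity pairs apart.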
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 25
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Tomographic iterative reconstruction methods need a very thorough modeling of data. This point becomes critical when the number of available projections is limited. At the core of this issue is the projector design, i.e., the numerical model relating the representation of the object of interest to the projections on the detector. Voxel driven and ray driven projection models are widely used for their short execution time in spite of their coarse approximations. Distance driven model has an improved accuracy but makes strong approximations to project voxel basis functions. Cubic voxel basis functions are anisotropic, accurately modeling their projection is, therefore, computationally expensive. Both smoother and more isotropic basis functions better represent the continuous functions and provide simpler projectors. These considerations have led to the development of spherically symmetric volume elements, called blobs. Set apart their isotropy, blobs are often considered too computationally expensive in practice. In this paper, we consider using separable B-splines as basis functions to represent the object, and we propose to approximate the projection of these basis functions by a 2D separable model. When the degree of the B-splines increases, their isotropy improves and projections can be computed regardless of their orientation. The degree and the sampling of the B-splines can be chosen according to a tradeoff between approximation quality and computational complexity. We quantitatively measure the good accuracy of our model and compare it with other projectors, such as the distance-driven and the model proposed by Long et al. From the numerical experiments, we demonstrate that our projector with an improved accuracy better preserves the quality of the reconstruction as the number of projections decreases. Our projector with cubic B-splines requires about twice as many operations as a model based on voxel basis functions. 
Higher accuracy projectors can be used to improve the resolution of existing systems, or to reduce the number of projections required to reach a given resolution, potentially reducing the dose absorbed by the patient.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 26
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Despite important recent advances, the vulnerability of biometric systems to spoofing attacks is still an open problem. Spoofing attacks occur when impostor users present synthetic biometric samples of a valid user to the biometric system, seeking to deceive it. In the case of face biometrics, a spoofing attack consists of presenting a fake sample (e.g., a photograph, digital video, or even a 3D mask) containing the facial information of a valid user to the acquisition sensor. In this paper, we introduce a low-cost, software-based method for detecting spoofing attempts in face recognition systems. Our hypothesis is that during acquisition, inevitable artifacts are left behind in the recaptured biometric samples, allowing us to create a discriminative signature of the video generated by the biometric sensor. To characterize these artifacts, we extract time-spectral feature descriptors from the video, which can be understood as low-level feature descriptors that gather temporal and spectral information across the biometric sample, and use the visual codebook concept to compute mid-level feature descriptors from the low-level ones. Such descriptors are more robust for detecting several kinds of attacks than the low-level ones. The experimental results show the effectiveness of the proposed method for detecting different types of attacks in a variety of scenarios and data sets, including photos, videos, and 3D masks.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 27
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Target representation is a necessary component for a robust tracker. However, during tracking, many complicated factors may make the accumulated errors in the representation significantly large, leading to tracking drift. This paper aims to improve the robustness of target representation to avoid the influence of the accumulated errors, such that the tracker only acquires the information that facilitates tracking and ignores the distractions. We observe that the locally mutual relations between the feature observations of temporally obtained targets are beneficial to the subspace representation in visual tracking. Thus, we propose a novel subspace learning algorithm for visual tracking, which imposes joint row-wise sparsity structure on the target subspace to adaptively exclude distractive information. The sparsity is induced by exploiting the locally mutual relations between the feature observations during learning. To this end, we formulate tracking as a subspace sparsity inducing problem. A large number of experiments on various challenging video sequences demonstrate that our tracker outperforms many other state-of-the-art trackers.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 28
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Color-to-gray (C2G) image conversion is the process of transforming a color image into a grayscale one. Despite its wide usage in real-world applications, little work has been dedicated to compare the performance of C2G conversion algorithms. Subjective evaluation is reliable but is also inconvenient and time consuming. Here, we make one of the first attempts to develop an objective quality model that automatically predicts the perceived quality of C2G converted images. Inspired by the philosophy of the structural similarity index, we propose a C2G structural similarity (C2G-SSIM) index, which evaluates the luminance, contrast, and structure similarities between the reference color image and the C2G converted image. The three components are then combined depending on image type to yield an overall quality measure. Experimental results show that the proposed C2G-SSIM index has close agreement with subjective rankings and significantly outperforms existing objective quality metrics for C2G conversion. To explore the potentials of C2G-SSIM, we further demonstrate its use in two applications: 1) automatic parameter tuning for C2G conversion algorithms and 2) adaptive fusion of C2G converted images.
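The three similarity components above follow the structural similarity (SSIM) philosophy. A rough sketch of SSIM-style luminance, contrast, and structure comparisons computed from global statistics (the published C2G-SSIM operates on local windows and combines the components by image type, which is not reproduced here; the stabilizing constants are illustrative):

```python
import numpy as np

def ssim_components(ref, test, c1=1e-4, c2=9e-4):
    """SSIM-style luminance, contrast, and structure comparisons
    between a reference luminance image and a converted grayscale
    image, using global statistics. Each term equals 1 for
    identical inputs and decreases with mismatch."""
    mu_r, mu_t = ref.mean(), test.mean()
    sr, st = ref.std(), test.std()
    cov = ((ref - mu_r) * (test - mu_t)).mean()
    lum = (2 * mu_r * mu_t + c1) / (mu_r**2 + mu_t**2 + c1)
    con = (2 * sr * st + c2) / (sr**2 + st**2 + c2)
    struct = (cov + c2 / 2) / (sr * st + c2 / 2)
    return lum, con, struct

rng = np.random.default_rng(0)
img = rng.random((32, 32))
l, c, s = ssim_components(img, img)  # identical images score 1 on all three
```

In the full index, these per-window components would be pooled over the image and weighted according to the image type before yielding a single quality score.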
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 29
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: In this paper, we propose a skin classification method that exploits faces and bodies automatically detected in the image to adaptively initialize individual ad hoc skin classifiers. Each classifier is initialized by a face-body couple, or by a single face if no reliable body is detected. Thus, the proposed method builds an ad hoc skin classifier for each person in the image, resulting in a classifier less dependent on changes in skin color due to tan level, race, gender, and illumination conditions. The experimental results on a heterogeneous data set of labeled images show that our proposal outperforms state-of-the-art methods, and that this improvement is statistically significant.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 30
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-18
    Description: Local binary descriptors are attracting increasing attention due to their great advantages in computational speed, which enable real-time performance in numerous image/vision applications. Various methods have been proposed to learn data-dependent binary descriptors. However, most existing binary descriptors aim overly at computational simplicity at the expense of significant information loss, which causes ambiguity in similarity measurement using the Hamming distance. In this paper, by considering that multiple features might share complementary information, we present a novel local binary descriptor, referred to as the ring-based multi-grouped descriptor (RMGD), to bridge the performance gap between current binary and floating-point descriptors. Our contributions are twofold. First, we introduce a new pooling configuration based on spatial ring-region sampling, allowing binary tests over the full set of pairwise regions with different shapes, scales, and distances. This leads to a more meaningful description than existing methods, which normally apply a limited set of pooling configurations. An extended AdaBoost is then proposed for efficient bit selection by emphasizing high variance and low correlation, achieving a highly compact representation. Second, the RMGD is computed from multiple image properties from which binary strings are extracted. We cast the integration of multi-grouped features as a rankSVM or sparse support vector machine learning problem, so that different features can compensate strongly for each other, which is the key to discriminativeness and robustness. The performance of the RMGD was evaluated on a number of publicly available benchmarks, where it significantly outperforms state-of-the-art binary descriptors.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 31
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-25
    Description: Palmprint recognition (PR) is an effective technology for personal recognition. A main problem that deteriorates the performance of PR is the deformation of palmprint images. This problem becomes more severe on contactless occasions, in which images are acquired without any guiding mechanism, and hence critically limits the applications of PR. To solve the deformation problem, in this paper, a model for non-linearly deformed palmprint matching is derived by approximating non-linearly deformed palmprint images with piecewise-linearly deformed stable regions. Based on this model, a novel approach for deformed palmprint matching, named key point-based block growing (KPBG), is proposed. In KPBG, an iterative M-estimator sample consensus algorithm based on scale-invariant feature transform features is devised to compute piecewise-linear transformations that approximate the non-linear deformations of palmprints; the stable regions complying with the linear transformations are then determined using a block growing algorithm. Palmprint feature extraction and matching are performed over these stable regions to compute matching scores for decision. Experiments on several public palmprint databases show that the proposed model and the KPBG approach effectively solve the deformation problem in palmprint verification and outperform state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 32
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-25
    Description: Nonnegative Tucker decomposition (NTD) is a powerful tool for the extraction of nonnegative parts-based and physically meaningful latent components from high-dimensional tensor data while preserving the natural multilinear structure of data. However, as the data tensor often has multiple modes and is large scale, the existing NTD algorithms suffer from a very high computational complexity in terms of both storage and computation time, which has been one major obstacle for practical applications of NTD. To overcome these disadvantages, we show how low (multilinear) rank approximation (LRA) of tensors is able to significantly simplify the computation of the gradients of the cost function, upon which a family of efficient first-order NTD algorithms are developed. Besides dramatically reducing the storage complexity and running time, the new algorithms are quite flexible and robust to noise, because any well-established LRA approaches can be applied. We also show how nonnegativity incorporating sparsity substantially improves the uniqueness property and partially alleviates the curse of dimensionality of the Tucker decompositions. Simulation results on synthetic and real-world data justify the validity and high efficiency of the proposed NTD algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 33
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-25
    Description: The problem of estimating the parameters of a Rayleigh-Rice mixture density is often encountered in image analysis (e.g., remote sensing and medical image processing). In this paper, we address this general problem in the framework of change detection (CD) in multitemporal and multispectral images. One widely used approach to CD in multispectral images is based on the change vector analysis. Here, the distribution of the magnitude of the difference image can be theoretically modeled by a Rayleigh-Rice mixture density. However, given the complexity of this model, in applications, a Gaussian-mixture approximation is often considered, which may affect the CD results. In this paper, we present a novel technique for parameter estimation of the Rayleigh-Rice density that is based on a specific definition of the expectation-maximization algorithm. The proposed technique, which is characterized by good theoretical properties, iteratively updates the parameters and does not depend on specific optimization routines. Several numerical experiments on synthetic data demonstrate the effectiveness of the method, which is general and can be applied to any image processing problem involving the Rayleigh-Rice mixture density. In the CD context, the Rayleigh-Rice model (which is theoretically derived) outperforms other empirical models. Experiments on real multitemporal and multispectral remote sensing images confirm the validity of the model by returning significantly higher CD accuracies than those obtained by using the state-of-the-art approaches.
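The magnitude image whose distribution the Rayleigh-Rice mixture models comes from change vector analysis: the per-pixel Euclidean norm of the spectral difference between the two acquisitions. A minimal sketch of that step (the expectation-maximization parameter estimation itself is not shown):

```python
import numpy as np

def cva_magnitude(img_t1, img_t2):
    """Change vector analysis: per-pixel Euclidean magnitude of the
    spectral difference between two coregistered multispectral images
    of shape (H, W, bands). Unchanged pixels yield small magnitudes
    (Rayleigh-like under noise), changed pixels large ones (Rice-like)."""
    diff = img_t2.astype(float) - img_t1.astype(float)
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy example: a uniform unit change in every band of a 3-band image.
t1 = np.zeros((2, 2, 3))
t2 = np.ones((2, 2, 3))
mag = cva_magnitude(t1, t2)  # every pixel has magnitude sqrt(3)
```

The paper's contribution is then to fit a Rayleigh (no-change) plus Rice (change) mixture to the histogram of `mag` via EM, rather than the common Gaussian-mixture approximation.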
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 34
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-09-25
    Description: In this paper, we propose a very simple deep learning network for image classification that is based on very basic data processing components: 1) cascaded principal component analysis (PCA); 2) binary hashing; and 3) blockwise histograms. In the proposed architecture, PCA is employed to learn multistage filter banks. This is followed by simple binary hashing and block histograms for indexing and pooling. This architecture is thus called the PCA network (PCANet) and can be designed and learned extremely easily and efficiently. For comparison and to provide a better understanding, we also introduce and study two simple variations of PCANet: 1) RandNet and 2) LDANet. They share the same topology as PCANet, but their cascaded filters are either randomly selected or learned from linear discriminant analysis. We have extensively tested these basic networks on many benchmark visual data sets for different tasks, including Labeled Faces in the Wild (LFW) for face verification; the MultiPIE, Extended Yale B, AR, and Facial Recognition Technology (FERET) data sets for face recognition; and MNIST for handwritten digit recognition. Surprisingly, for all tasks, such a seemingly naive PCANet model is on par with state-of-the-art features, whether prefixed, highly hand-crafted, or carefully learned [by deep neural networks (DNNs)]. Even more surprisingly, the model sets new records for many classification tasks on the Extended Yale B, AR, and FERET data sets and on MNIST variations. Additional experiments on other public data sets also demonstrate the potential of PCANet to serve as a simple but highly competitive baseline for texture classification and object recognition.
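The first PCANet stage amounts to patch-based PCA: extract all overlapping patches, remove each patch's mean, and keep the leading eigenvectors of the patch covariance as convolution filters. A rough sketch under assumed patch size and filter count (binary hashing and blockwise histogram pooling are omitted):

```python
import numpy as np

def learn_pca_filters(images, k=5, n_filters=4):
    """First PCANet stage (sketch): collect all k x k patches from the
    input images, remove each patch's mean, and take the leading
    eigenvectors of the patch scatter matrix as convolution filters."""
    patches = []
    for img in images:
        h, w = img.shape
        for i in range(h - k + 1):
            for j in range(w - k + 1):
                p = img[i:i + k, j:j + k].ravel()
                patches.append(p - p.mean())  # per-patch mean removal
    X = np.array(patches)
    # Eigenvectors of X^T X with the largest eigenvalues become filters
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(X.T @ X)
    return vecs[:, -n_filters:].T.reshape(n_filters, k, k)

rng = np.random.default_rng(0)
filters = learn_pca_filters([rng.random((12, 12)) for _ in range(3)])
```

A second stage would repeat the same procedure on the filter responses of the first, before hashing and histogramming produce the final feature vector.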
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 35
    Publication Date: 2015-10-27
    Description: The segmentation of brain MR images into different tissue classes is an important task for automatic image analysis techniques, particularly due to the presence of the intensity inhomogeneity artifact in MR images. In this regard, this paper presents a novel approach for simultaneous segmentation and bias field correction in brain MR images. It judiciously integrates the concept of rough sets with the merit of a novel probability distribution, called the stomped normal (SN) distribution. The intensity distribution of a tissue class is represented by an SN distribution, where each tissue class consists of a crisp lower approximation and a probabilistic boundary region. The intensity distribution of a brain MR image is modeled as a mixture of a finite number of SN distributions and one uniform distribution. The proposed method incorporates both the expectation-maximization and hidden Markov random field frameworks to provide an accurate and robust segmentation. The performance of the proposed approach, along with a comparison with related methods, is demonstrated on a set of synthetic and real brain MR images for different bias fields and noise levels.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 36
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-10-27
    Description: People know and care about their personal objects, which differ from individual to individual. Automatically discovering personal objects is thus of great practical importance. In this paper, we pursue this task with wearable cameras, based on the common-sense observation that personal objects generally accompany us across various scenes. With this clue, we exploit a new object-scene distribution for robust detection. Two technical challenges involved in estimating this distribution, i.e., scene extraction and unsupervised object discovery, are tackled. For scene extraction, we learn a latent representation instead of simply selecting a few frames from the videos. For object discovery, we build an interaction model to select frame-level objects and use nonparametric Bayesian clustering. Experiments verify the usefulness of our approach.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 37
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-10-27
    Description: Single object tracking, in which a target is often initialized manually in the first frame and then is tracked and located automatically in the subsequent frames, is a hot topic in computer vision. The traditional tracking-by-detection framework, which often formulates tracking as a binary classification problem, has been widely applied and achieved great success in single object tracking. However, there are some potential issues in this formulation. For instance, the boundary between the positive and negative training samples is fuzzy, and the objectives of tracking and classification are inconsistent. In this paper, we attempt to address the above issues from the fuzzy system perspective and propose a novel tracking method by formulating tracking as a fuzzy classification problem. First, we introduce the fuzzy strategy into tracking and propose a novel fuzzy tracking framework, which can measure the importance of the training samples by assigning different memberships to them and offer more strict spatial constraints. Second, we develop a fuzzy least squares support vector machine (FLS-SVM) approach and employ it to implement a concrete tracker. In particular, the primal form, dual form, and kernel form of FLS-SVM are analyzed and the corresponding closed-form solutions are derived for efficient realizations. Besides, a least squares regression model is built to control the update adaptively, retaining the robustness of the appearance model. The experimental results demonstrate that our method can achieve comparable or superior performance to many state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 38
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Slow feature analysis (SFA) is a dimensionality reduction technique that has been linked to how visual brain cells work. In recent years, SFA has been adopted for computer vision tasks. In this paper, we propose an exact kernel SFA (KSFA) framework for positive definite and indefinite kernels in Krein space. We then formulate an online KSFA which employs a reduced set expansion. Finally, by utilizing a special kind of kernel family, we formulate an exact online KSFA for which no reduced set is required. We apply the proposed system to develop an SFA-based change detection algorithm for stream data. This framework is employed for temporal video segmentation and tracking. We test our setup on synthetic and real data streams. When combined with an online learning tracking system, the proposed change detection approach improves upon tracking setups that do not utilize change detection.
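For context, the classical linear SFA that the paper's kernel and online variants build on can be sketched as: whiten the input signal, then keep the directions in which the whitened signal's temporal derivative has the least variance. A minimal batch sketch (not the authors' KSFA):

```python
import numpy as np

def linear_sfa(X, n_out=1):
    """Classical batch linear slow feature analysis (sketch).
    X: (T, d) time series. Returns a (d, n_out) projection whose
    outputs have unit variance and vary as slowly as possible."""
    Xc = X - X.mean(axis=0)
    # Whiten the input so its covariance becomes the identity.
    cov = Xc.T @ Xc / len(Xc)
    evals, evecs = np.linalg.eigh(cov)
    W = evecs / np.sqrt(evals)          # whitening matrix
    Z = Xc @ W
    dZ = np.diff(Z, axis=0)             # temporal derivative
    dcov = dZ.T @ dZ / len(dZ)
    _, devecs = np.linalg.eigh(dcov)
    return W @ devecs[:, :n_out]        # slowest directions first

# Mix a slow sine with fast noise; SFA should favor the slow component.
t = np.linspace(0, 2 * np.pi, 200)
rng = np.random.default_rng(0)
X = np.column_stack([np.sin(t) + 0.05 * rng.standard_normal(200),
                     rng.standard_normal(200)])
w = linear_sfa(X)
```

The kernel variants replace the linear projection with an expansion in a (possibly indefinite) kernel feature space, which is where the Krein-space formulation enters.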
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 39
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Quality assessment of 3D images encounters more challenges than its 2D counterpart, and directly applying 2D image quality metrics is not a viable solution. In this paper, we propose a new full-reference quality assessment method for stereoscopic images that learns binocular receptive field properties so as to be more in line with human visual perception. More specifically, in the training phase, we learn a multiscale dictionary from the training database, so that the latent structure of images can be represented as a set of basis vectors. In the quality estimation phase, we compute a sparse feature similarity index based on the estimated sparse coefficient vectors, considering their phase and amplitude differences, and compute a global luminance similarity index that accounts for luminance changes. The final quality score is obtained by a binocular combination based on sparse energy and sparse complexity. Experimental results on five public 3D image quality assessment databases demonstrate that, in comparison with the most closely related existing methods, the devised algorithm achieves high consistency with subjective assessment.
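The abstract does not spell out the exact similarity index, but a plausible SSIM-style instance of "phase plus amplitude agreement" between two sparse coefficient vectors can be sketched as follows; the formula below is an illustrative reading, not the paper's definition.

```python
import numpy as np

def sparse_feature_similarity(a, b, eps=1e-6):
    """Illustrative SSIM-style similarity between two sparse coefficient
    vectors: a per-coefficient amplitude term 2|a||b|/(a^2 + b^2),
    gated by phase (sign) agreement, then averaged.  This is one
    plausible realization of comparing phase and amplitude differences.
    """
    amp = (2 * np.abs(a) * np.abs(b) + eps) / (a ** 2 + b ** 2 + eps)
    phase = (np.sign(a) * np.sign(b) + 1) / 2     # 1 if signs agree
    return float(np.mean(amp * phase))
```

The index is 1 for identical coefficient vectors and 0 when every coefficient flips sign, mirroring how SSIM-style measures bound their range.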
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 40
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Our goal is to detect and group different kinds of local symmetries in images in a scale- and rotation-invariant way. We propose an efficient wavelet-based method to determine the order of local symmetry at each location. Our algorithm relies on circular harmonic wavelets which are used to generate steerable wavelet channels corresponding to different symmetry orders. To give a measure of local symmetry, we use the F-test to examine the distribution of the energy across different channels. We provide experimental results on synthetic images, biological micrographs, and electron-microscopy images to demonstrate the performance of the algorithm.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 41
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: In this paper, we study the novel problem of classifying covert photos, whose acquisition processes are intentionally concealed from the subjects being photographed. Covert photos are often privacy invasive and, if distributed over the Internet, can cause serious consequences. Automatic identification of such photos therefore serves as an important initial step toward further privacy protection operations. The problem is, however, very challenging due to the large semantic similarity between covert and noncovert photos, the enormous diversity in the photographing processes and environments of covert photos, and the difficulty of collecting an effective data set for the study. Attacking these challenges, we make three consecutive contributions. First, we collect a large data set containing 2500 covert photos, each of which was verified rigorously and carefully. Second, we conduct a user study on how humans distinguish covert photos from noncovert ones. The user study not only provides an important evaluation baseline, but also suggests fusing heterogeneous information for an automatic solution. Our third contribution is a covert photo classification algorithm that fuses various image features and visual attributes in a multiple kernel learning framework. We evaluate the proposed approach on the collected data set in comparison with other modern image classifiers. The results show that our approach achieves an average classification rate (1-EER) of 0.8940, which significantly outperforms the other competitors as well as human performance.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 42
    Publication Date: 2015-06-13
    Description: Driven by recent vision and graphics applications such as image segmentation and object recognition, computing pixel-accurate saliency values to uniformly highlight foreground objects has become increasingly important. In this paper, we propose a unified framework called pixelwise image saliency aggregating (PISA) various bottom-up cues and priors. It generates spatially coherent yet detail-preserving, pixel-accurate, and fine-grained saliency, and overcomes the limitations of previous methods, which rely on homogeneous superpixel-based, color-only treatment. PISA aggregates multiple saliency cues in a global context, such as complementary color and structure contrast measures, together with their spatial priors in the image domain. The saliency confidence is further jointly modeled with a neighborhood consistency constraint in an energy minimization formulation, in which each pixel is evaluated with multiple hypothetical saliency levels. Instead of using global discrete optimization methods, we employ the cost-volume filtering technique to solve our formulation, assigning the saliency levels smoothly while preserving the edge-aware structure details. In addition, a faster version of PISA is developed using a gradient-driven image subsampling strategy to greatly improve the runtime efficiency while keeping comparable detection accuracy. Extensive experiments on a number of public data sets suggest that PISA convincingly outperforms other state-of-the-art approaches. With this work, we also create a new data set containing 800 commodity images for evaluating saliency detection.
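The cost-volume filtering step can be illustrated in stripped-down form: hypothesize a set of discrete saliency levels, build a per-pixel quadratic cost slice for each level, smooth each slice, and take the per-pixel argmin. The plain box filter below stands in for the edge-aware filter used in the paper, and all names are illustrative.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)^2 window via an integral image (edge padding)."""
    n = 2 * r + 1
    P = np.pad(img, r, mode='edge')
    I = np.zeros((P.shape[0] + 1, P.shape[1] + 1))
    I[1:, 1:] = P.cumsum(0).cumsum(1)
    return (I[n:, n:] - I[:-n, n:] - I[n:, :-n] + I[:-n, :-n]) / n ** 2

def pisa_costvolume_saliency(raw, levels=8, r=3):
    """Cost-volume filtering sketch: for each hypothesized saliency
    level l, the cost slice (raw - l)^2 is smoothed, and every pixel
    takes the level with the minimum filtered cost.  Filtering the
    slices (rather than the labels) is what keeps the result smooth
    yet aligned with structure boundaries.
    """
    ls = np.linspace(0.0, 1.0, levels)
    cost = np.stack([box_mean((raw - l) ** 2, r) for l in ls])
    return ls[np.argmin(cost, axis=0)]
```

On a two-region cue map, interior pixels of each region snap to the nearest hypothesized level.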
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 43
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: In this paper, we propose a new video inpainting method that applies to videos from both static and free-moving cameras. The method can be used for object removal, error concealment, and background reconstruction applications. To limit the computational time, a frame is inpainted by considering a small number of neighboring pictures which are grouped into a group of pictures (GoP). More specifically, to inpaint a frame, the method starts by aligning all the frames of the GoP. This is achieved by a region-based homography computation method which allows us to strengthen the spatial consistency of the aligned frames. Then, from the stack of aligned frames, an energy function based on both spatial and temporal coherency terms is globally minimized. This energy function is effective enough to provide high quality results even when the number of pictures in the GoP is rather small, e.g., 20 neighboring frames. This drastically reduces the algorithm complexity and makes the approach well suited for near real-time video editing as well as for loss concealment applications. Experiments with several challenging video sequences show that the proposed method provides visually pleasing results for object removal, error concealment, and background reconstruction.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 44
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Single-sensor imaging using the Bayer color filter array (CFA) and demosaicking is well established for current compact and low-cost color digital cameras. An extension from the CFA to a multispectral filter array (MSFA) enables us to acquire a multispectral image in one shot without increased size or cost. However, multispectral demosaicking for the MSFA has been a challenging problem because of the very sparse sampling of each spectral band in the MSFA. In this paper, we propose a high-performance multispectral demosaicking algorithm and, at the same time, a novel MSFA pattern that is suitable for our proposed algorithm. Our key idea is the use of the guided filter to interpolate each spectral band. To generate an effective guide image, in our proposed MSFA pattern, we maintain the sampling density of the $G$-band as high as in the Bayer CFA, and we array each spectral band so that an adaptive kernel can be estimated directly from raw MSFA data. Given these two advantages, we effectively generate the guide image from the most densely sampled $G$-band using the adaptive kernel. In the experiments, we demonstrate that our proposed algorithm with our proposed MSFA pattern outperforms existing algorithms and provides better color fidelity compared with a conventional color imaging system with the Bayer CFA. We also show some real applications using a multispectral camera prototype we built.
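The guided filter at the heart of the method is a published, generic algorithm (He et al.): the output is locally linear in the guide, q = a·I + b, with a and b fit per window and then averaged. Its single-channel form is shown below; the MSFA-specific guide construction is omitted.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)^2 window using an integral image."""
    n = 2 * r + 1
    P = np.pad(img, r, mode='edge')
    I = np.zeros((P.shape[0] + 1, P.shape[1] + 1))
    I[1:, 1:] = P.cumsum(0).cumsum(1)
    return (I[n:, n:] - I[:-n, n:] - I[n:, :-n] + I[:-n, :-n]) / n ** 2

def guided_filter(I_guide, p, r=2, eps=1e-4):
    """Guided filter: fit q = a*I + b per window by linear regression of
    p on the guide I, then average the window-wise coefficients.  The
    paper uses this to interpolate sparse spectral bands with the dense
    G-band as the guide; here it is the generic single-channel form.
    """
    m_I, m_p = box_mean(I_guide, r), box_mean(p, r)
    cov = box_mean(I_guide * p, r) - m_I * m_p
    var = box_mean(I_guide * I_guide, r) - m_I * m_I
    a = cov / (var + eps)
    b = m_p - a * m_I
    return box_mean(a, r) * I_guide + box_mean(b, r)
```

A quick sanity check: with a tiny eps, filtering an image with itself as the guide is nearly the identity, since a ≈ 1 and b ≈ 0 in every window.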
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 45
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Blind motion deblurring from a single image is a highly under-constrained problem with many degenerate solutions. A good approximation of the intrinsic image can, therefore, only be obtained with the help of prior information in the form of (often nonconvex) regularization terms for both the intrinsic image and the kernel. While the best choice of image priors is still a topic of ongoing investigation, this research is made more complicated by the fact that, historically, each new prior has required the development of a custom optimization method. In this paper, we develop a stochastic optimization method for blind deconvolution. Since this stochastic solver does not require the explicit computation of the gradient of the objective function and uses only efficient local evaluations of the objective, new priors can be implemented and tested very quickly. We demonstrate that this framework, in combination with different image priors, produces results with peak signal-to-noise ratio (PSNR) values that match or exceed those obtained by much more complex state-of-the-art blind motion deblurring algorithms.
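The gradient-free idea can be illustrated with a toy 1D deconvolution: random local perturbations of the kernel are accepted only when they decrease the objective, so swapping in a new prior only means editing the objective function. This is a simplified stand-in for the paper's solver, and all names and constants are illustrative.

```python
import numpy as np

def objective(k, x, y):
    """Data term only: squared error between blurred estimate and data.
    A prior on k or x would simply be added here, with no gradient work."""
    return np.sum((np.convolve(x, k, mode='same') - y) ** 2)

def stochastic_kernel_search(x, y, ksize=5, iters=3000, step=0.05, seed=0):
    """(1+1)-style random search over the blur kernel: perturb, project
    back onto the simplex (nonnegative, sums to 1), and keep the
    candidate only if the objective improves."""
    rng = np.random.default_rng(seed)
    k = np.full(ksize, 1.0 / ksize)          # start from a flat kernel
    best = objective(k, x, y)
    for _ in range(iters):
        cand = np.clip(k + step * rng.standard_normal(ksize), 0, None)
        if cand.sum() == 0:
            continue
        cand /= cand.sum()
        val = objective(cand, x, y)
        if val < best:
            k, best = cand, val
    return k, best
```

Because only objective values are compared, nonconvex and even nondifferentiable priors drop in without any new derivation.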
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 46
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Recent advances in object detection have led to the development of segmentation-by-detection approaches that integrate top-down geometric priors for multiclass object segmentation. A key yet under-addressed issue in utilizing top-down cues for multiclass object segmentation by detection is efficiently generating robust and accurate geometric priors. In this paper, we propose a random geometric prior forest scheme to obtain object-adaptive geometric priors efficiently and robustly. In the scheme, a testing object first searches for training neighbors with similar geometries using the random geometric prior forest, and the geometry of the testing object is then reconstructed by linearly combining the geometries of its neighbors. Our scheme enjoys several favorable properties compared with conventional methods. First, it is robust and very fast because its inference does not suffer from bad initializations, poor local minima, or complex optimization. Second, the figure/ground geometries of the training samples are utilized in a multitask manner. Third, our scheme is object-adaptive but does not require the labeling of parts or poselets, and is thus quite easy to implement. To demonstrate the effectiveness of the proposed scheme, we integrate the obtained top-down geometric priors with conventional bottom-up color cues within a graph cut framework. The proposed random geometric prior forest achieves the best segmentation results of all the methods tested on VOC2010/2012 and is 90 times faster than the current state-of-the-art method.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 47
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Tone mapping operators (TMOs) aim to compress high dynamic range (HDR) images to low dynamic range (LDR) ones so as to visualize HDR images on standard displays. Most existing TMOs were demonstrated on specific examples without being thoroughly evaluated using well-designed and subject-validated image quality assessment models. A recently proposed tone mapped image quality index (TMQI) made one of the first attempts at objective quality assessment of tone mapped images. Here, we propose a substantially different approach to TMO design. Instead of using a predefined systematic computational structure for tone mapping (such as analytic image transformations and/or explicit contrast/edge enhancement), we directly navigate in the space of all images, searching for the image that optimizes an improved TMQI. In particular, we first improve the two building blocks of TMQI, the structural fidelity and statistical naturalness components, leading to a TMQI-II metric. We then propose an iterative algorithm that alternately improves the structural fidelity and statistical naturalness of the resulting image. Numerical and subjective experiments demonstrate that the proposed algorithm consistently produces better quality tone mapped images, even when the initial images of the iteration are created by the most competitive TMOs. Meanwhile, these results also validate the superiority of TMQI-II over TMQI. (Preliminary results of this work were presented at ICASSP 2013 and ICME 2014.)
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 48
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Extracting the pixel-level 3D layout from a single image is important for different applications, such as object localization and image and video categorization. Traditionally, the 3D layout is derived by solving a pixel-level classification problem. However, the image-level 3D structure can be very beneficial for extracting the pixel-level 3D layout, since it implies how the pixels in the image are organized. In this paper, we propose an approach that first predicts the global image structure and then uses the global structure for fine-grained pixel-level 3D layout extraction. In particular, image features are extracted based on multiple layout templates. We then learn a discriminative model for classifying the global layout at the image level. Using latent variables, we implicitly model the sublevel semantics of the image, which enriches the expressiveness of our model. After the image-level structure is obtained, it is used as prior knowledge to infer the pixel-wise 3D layout. Experiments show that the results of our model outperform the state-of-the-art methods by 11.7% for 3D structure classification. Moreover, we show that employing the 3D structure prior information yields accurate 3D scene layout segmentation.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 49
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-13
    Description: Recognizing human activities from videos is a fundamental research problem in computer vision. Recently, there has been growing interest in analyzing human behavior from data collected with wearable cameras. First-person cameras continuously record several hours of their wearers' lives. To cope with this vast amount of unlabeled and heterogeneous data, novel algorithmic solutions are required. In this paper, we propose a multitask clustering framework for analyzing activities of daily living from visual data gathered with wearable cameras. Our intuition is that, even if the data are not annotated, it is possible to exploit the fact that the tasks of recognizing the everyday activities of multiple individuals are related, since people typically perform the same actions in similar environments (e.g., people working in an office often read and write documents). In our framework, rather than clustering data from different users separately, we propose to look for clustering partitions that are coherent among related tasks. In particular, two novel multitask clustering algorithms, derived from a common optimization problem, are introduced. Our experimental evaluation, conducted both on synthetic data and on publicly available first-person vision data sets, shows that the proposed approach outperforms several single-task and multitask learning methods.
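The shared-structure idea can be sketched with a toy variant of k-means in which each task's centroid update is shrunk toward the across-task mean centroid, so related tasks are nudged toward coherent partitions. This illustrates only the multitask principle, not either of the paper's two algorithms; the value of lam and the initialization are arbitrary.

```python
import numpy as np

def multitask_kmeans(tasks, k=2, lam=5.0, iters=50):
    """Toy multitask clustering on 1-D tasks: each task runs k-means,
    but its centroid update is regularized toward the mean centroid
    across tasks (the shared structure), with strength lam.
    """
    # init: spread centroids between each task's min and max
    cents = [np.linspace(t.min(), t.max(), k) for t in tasks]
    for _ in range(iters):
        labels = [np.abs(t[:, None] - c[None, :]).argmin(1)
                  for t, c in zip(tasks, cents)]
        cbar = np.mean(cents, axis=0)        # shared centroid prior
        for i, (t, lab) in enumerate(zip(tasks, labels)):
            for j in range(k):
                pts = t[lab == j]
                cents[i][j] = (pts.sum() + lam * cbar[j]) / (len(pts) + lam)
    return labels, cents
```

With lam = 0 this decouples into independent per-task k-means; a large lam forces all tasks to share one set of centroids.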
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 50
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: Obtaining robust and efficient rotation-invariant texture features for content-based image retrieval is a challenging task. We propose three efficient rotation-invariant methods for texture image retrieval using copula models in the domains of the Gabor wavelet (GW) and the circularly symmetric GW (CSGW). The proposed copula models use a copula function to capture the scale dependence of the GW/CSGW, improving the retrieval performance. The Kullback–Leibler distance (KLD) is the commonly used similarity measure between probability models; however, it is difficult to derive a closed form of the KLD between two copula models due to their complexity. We therefore put forward a retrieval scheme that combines the KLDs of the marginal distributions and the KLD of the copula function to calculate the KLD of the copula model. The proposed texture retrieval method has low computational complexity and high retrieval precision. The experimental results on the VisTex and Brodatz data sets show that the proposed retrieval method is more effective than the state-of-the-art methods.
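The decomposition the scheme relies on can be made concrete: approximate the KLD between two copula models as the sum of the marginal KLDs plus the KLD between the copula functions. Gaussian marginals and Gaussian copulas are assumed below purely so that every term has a closed form; the paper's actual marginals differ.

```python
import numpy as np

def kld_gaussian_1d(m1, s1, m2, s2):
    """Closed-form KLD between two univariate Gaussian marginals."""
    return np.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def kld_gaussian_copula(R1, R2):
    """Closed-form KLD between two Gaussian copulas with correlation
    matrices R1, R2 (the zero-mean Gaussian KLD with unit variances)."""
    d = R1.shape[0]
    return 0.5 * (np.log(np.linalg.det(R2) / np.linalg.det(R1))
                  + np.trace(np.linalg.inv(R2) @ R1) - d)

def kld_copula_model(marg1, marg2, R1, R2):
    """Approximate copula-model KLD = sum of marginal KLDs + copula KLD.
    marg1/marg2 are lists of (mean, std) tuples, one per dimension."""
    marg_term = sum(kld_gaussian_1d(*p, *q) for p, q in zip(marg1, marg2))
    return marg_term + kld_gaussian_copula(R1, R2)
```

This keeps retrieval cheap: each term is evaluated in closed form instead of integrating the full joint densities.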
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 51
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: In this paper, we propose an efficient semidefinite programming (SDP) approach to worst case linear discriminant analysis (WLDA). Compared with traditional LDA, WLDA considers the dimensionality reduction problem from the worst case viewpoint, which is in general more robust for classification. However, the original WLDA problem is nonconvex and difficult to optimize. In this paper, we reformulate the optimization problem of WLDA into a sequence of semidefinite feasibility problems. To solve these efficiently, we design a new scalable optimization method with a quasi-Newton method and eigen-decomposition as the core components. The proposed method is orders of magnitude faster than standard interior-point SDP solvers. Experiments on a variety of classification problems demonstrate that our approach achieves better performance than standard LDA and is much faster and more scalable than WLDA solved with standard interior-point SDP solvers. The computational complexity for an SDP with $m$ constraints and $d \times d$ matrices is roughly reduced from $\mathcal{O}(m^{3}+md^{3}+m^{2}d^{2})$ to $\mathcal{O}(d^{3})$ ($m>d$ in our case).
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 52
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: It is now well established that good mid-level features can greatly enhance the performance of image classification, but how to efficiently learn such features is still an open question. In this paper, we present an efficient unsupervised mid-level feature learning approach (MidFea) that only involves simple operations, such as $k$-means clustering, convolution, pooling, vector quantization, and random projection. We show that these simple features can achieve good performance in traditional classification tasks. To further boost the performance, we model the neuron selectivity (NS) principle by building an additional layer over the mid-level features prior to the classifier. The NS-layer learns category-specific neurons in a supervised manner, with both bottom-up inference and top-down analysis, and thus supports fast inference for a query image. Through extensive experiments, we demonstrate that this higher-level NS-layer notably improves the classification accuracy with our simple MidFea, achieving comparable performance for face recognition, gender classification, age estimation, and object categorization. In particular, our approach runs faster in inference, by an order of magnitude, than sparse coding-based feature learning methods. In conclusion, we argue that not only do carefully learned features (MidFea) bring improved performance, but a sophisticated mechanism (the NS-layer) at a higher level boosts the performance further.
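The clustering and quantization steps can be made concrete with a dictionary learned by Lloyd's k-means and a soft "triangle" encoding in the style of Coates and Ng. This is one plausible realization of the simple operations listed, not the paper's exact pipeline.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means returning the centroid (dictionary) matrix."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        d = ((X[:, None] - C[None]) ** 2).sum(-1)
        lab = d.argmin(1)
        for j in range(k):
            if (lab == j).any():
                C[j] = X[lab == j].mean(0)
    return C

def triangle_encode(X, C):
    """Soft 'triangle' encoding over a k-means dictionary: activation is
    how much closer a centroid is than the average centroid distance,
    floored at zero, giving a sparse nonnegative feature per sample."""
    d = np.sqrt(((X[:, None] - C[None]) ** 2).sum(-1))
    mu = d.mean(1, keepdims=True)
    return np.maximum(0.0, mu - d)
```

Each sample ends up with a k-dimensional sparse code; pooling such codes over image regions yields the kind of mid-level feature the paper builds on.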
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 53
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: Face alignment has been well studied in recent years; however, when a face alignment model is applied to facial images with heavy partial occlusion, its performance deteriorates significantly. In this paper, instead of training an occlusion-aware model with visibility annotations, we address this issue via a model adaptation scheme that uses the result of a local regression forest (RF) voting method. In the proposed scheme, the consistency of the votes of the local RF in each of several oversegmented regions is used to determine the reliability of predicting the locations of the facial landmarks; we call the latter the regional predictive power (RPP). Subsequently, we adapt a holistic voting method (cascaded pose regression based on random ferns) by weighting the votes of each fern according to the RPP of the regions used in the fern tests. The proposed method shows superior performance over existing face alignment models on the most challenging data sets (COFW and 300-W). Moreover, it can also estimate with high accuracy (72.4% overlap ratio) which image areas belong to the face or to nonface objects on the heavily occluded images of the COFW data set, without explicit occlusion modeling.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 54
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: In this paper, we propose a sparse coding approach to background modeling. The obtained model is based on dictionaries that we learn and keep up to date as new data are provided by a video camera. We observe that, in the absence of dynamic events, video frames may be seen as noisy data belonging to the background. Over time, such a background is subject to local and global changes due to variable illumination conditions, camera jitter, stable scene changes, and intermittent motion of background objects. To capture the locality of some changes, we propose a space-variant analysis in which we learn a dictionary of atoms for each image patch, the size of which depends on the background variability. At run time, each patch is represented by a linear combination of the atoms learnt online. A change is detected when the atoms are not sufficient to provide an appropriate representation, and stable changes over time trigger an update of the current dictionary. Even if the overall procedure is carried out at a coarse level, a pixel-wise segmentation can be obtained by comparing the atoms with the patch corresponding to the dynamic event. Experiments on benchmarks indicate that the proposed method achieves very good performance in a variety of scenarios. An assessment on long video streams confirms that our method incorporates periodic changes, such as those caused by variations in natural illumination. The model, being fully data driven, is suitable as the main component of a change detection system.
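The detection rule can be sketched per patch location: learn a small linear basis from background-only frames (a truncated SVD below, standing in for the paper's online dictionary learning), and flag a change when a new patch's reconstruction residual over that basis is large. Names and thresholds are illustrative.

```python
import numpy as np

def learn_patch_basis(patches, n_atoms=4):
    """Learn a small linear 'dictionary' for one patch location from
    background-only frames via SVD of the centered patch stack."""
    P = patches - patches.mean(0)
    U, S, Vt = np.linalg.svd(P, full_matrices=False)
    return patches.mean(0), Vt[:n_atoms]      # mean patch, top atoms

def reconstruction_residual(patch, mean, basis):
    """Project a new patch onto the learned atoms; a large residual
    means the atoms cannot represent it, which flags a change."""
    v = patch - mean
    coef = basis @ v
    return np.linalg.norm(v - basis.T @ coef)
```

A background-like patch reconstructs almost perfectly, while a patch containing a foreground object leaves a residual an order of magnitude larger, which is the signal that also drives the dictionary update in the full method.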
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 55
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: We consider a pipeline for image classification or search based on coding approaches such as bag of words or Fisher vectors. In this context, the most common approach is to extract the image patches regularly and densely at several scales. This paper proposes and evaluates alternative choices for extracting patches densely. Beyond simple strategies derived from regular interest region detectors, we propose approaches based on superpixels, edges, and a bank of Zernike filters used as detectors. The different approaches are evaluated on recent image retrieval and fine-grained classification benchmarks. Our results show that the regular dense detector is outperformed by other methods in most situations, leading us to improve the state-of-the-art in comparable setups on standard retrieval and fine-grained benchmarks. As a byproduct of our study, we show that existing methods for blob and superpixel extraction achieve high accuracy if the patches are extracted along the edges rather than around the detected regions.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 56
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: There is growing interest in multilabel image classification due to its critical role in web-based image analytics applications, such as large-scale image retrieval and browsing. Matrix completion (MC) has recently been introduced as a method for transductive (semisupervised) multilabel classification and has several distinct advantages, including robustness to missing data and background noise in both the feature and label spaces. However, it is limited by considering only data represented by a single-view feature, which cannot precisely characterize images containing several semantic concepts. To utilize multiple features taken from different views, we would have to concatenate the different features into a long vector, but this concatenation is prone to over-fitting and often leads to very high time complexity in MC-based image classification. We therefore propose to weightedly combine the MC outputs of different views, and present the multiview MC (MVMC) framework for transductive multilabel image classification. To learn the view combination weights effectively, we apply a cross-validation strategy on the labeled set. In particular, MVMC splits the labeled set into two parts and predicts the labels of one part using the known labels of the other part. The predicted labels are then used to learn the view combination coefficients. In the learning process, we adopt the average precision (AP) loss, which is particularly suitable for multilabel image classification, since ranking-based criteria are critical for evaluating a multilabel classification system. A least squares loss formulation is also presented for the sake of efficiency, and the robustness of the algorithm based on the AP loss, compared with the other losses, is investigated. Experimental evaluation on two real-world data sets (PASCAL VOC'07 and MIR Flickr) demonstrates the effectiveness of MVMC for transductive (semisupervised) multilabel image classification and shows that MVMC can exploit the complementary properties of different features and output consistent labels for improved multilabel image classification.
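The view-weight learning step can be sketched with the squared loss the paper offers for efficiency: obtain per-view label scores on a held-out split, then fit the combination weights by least squares against the known labels. Variable names are illustrative, and the per-view MC predictors themselves are abstracted away as score vectors.

```python
import numpy as np

def learn_view_weights(scores_per_view, labels):
    """Fit linear combination weights for per-view label scores by
    least squares against labels known on a held-out split -- the
    cross-validation idea behind MVMC's view combination (with the
    AP loss replaced here by the squared loss)."""
    S = np.stack(scores_per_view, axis=1)         # (n, n_views)
    w, *_ = np.linalg.lstsq(S, labels, rcond=None)
    return w

def combine(scores_per_view, w):
    """Apply the learned weights to fuse the views' score vectors."""
    return np.stack(scores_per_view, axis=1) @ w
```

An informative view should receive nearly all the weight, while a noise view is driven toward zero, which is exactly the behavior the held-out fit enforces.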
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 57
    Institute of Electrical and Electronics Engineers (IEEE)
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: Face sketch synthesis has wide applications in digital entertainment and law enforcement. Although there is much research on face sketch synthesis, most existing algorithms cannot handle nonfacial factors, such as hair styles, hairpins, and glasses, if these factors are excluded from the training set. In addition, previous methods only work under well-controlled conditions and fail on images whose backgrounds and sizes differ from those of the training set. To this end, this paper presents a novel method that combines both the similarity between different image patches and prior knowledge to synthesize face sketches. Given training photo-sketch pairs, the proposed method learns a photo patch feature dictionary from the training photo patches and replaces the photo patches with their sparse coefficients during the search process. For a test photo patch, we first obtain its sparse coefficient via the learnt dictionary and then search for its nearest neighbors (candidate patches) among all training photo patches using the sparse coefficients. After purifying the nearest neighbors with prior knowledge, the final sketch corresponding to the test photo is obtained by Bayesian inference. The contributions of this paper are as follows: 1) we relax the nearest neighbor search area from a local region to the whole image without excessive computational cost and 2) our method can produce nonfacial factors that are not contained in the training set, is robust against varying image backgrounds, and can even ignore the alignment and size of test photos. Our experimental results show that the proposed method outperforms several state-of-the-art methods in terms of perceptual and objective metrics.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 58
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: We design a system framework for streaming scalable internet protocol television (IPTV) content to heterogeneous clients. The backbone bandwidth is optimally allocated between source and parity data layers that are delivered to the client population. The assignment of stream layers to clients is based on their access-link data rate and packet-loss characteristics, and is part of the optimization. We design three techniques for jointly computing the optimal number of multicast sessions, their respective source and parity rates, and client membership, either exactly or approximately, at lower complexity. The latter is achieved via an iterative coordinate-descent algorithm that only marginally underperforms relative to the exact analytic solution. Through experiments, we study the advantages of our framework over common IPTV systems that deliver the same source and parity streams to every client. We observe substantial gains in video quality in terms of both its average value and standard deviation over the client population. In addition, for energy efficiency, we propose to move parity data generation to the edge of the backbone network, where each client connects to its IPTV stream. We analytically study the conditions under which such an approach delivers energy savings relative to the conventional case of source and parity data generation at the IPTV streaming server. Finally, we demonstrate that our system enables more consistent streaming performance, when the clients' access-link packet-loss distribution is varied, relative to the two baseline methods used in our investigation, and maintains the same performance as an ideal system that serves each client independently.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 59
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: We describe an active contour framework with accurate shape and size constraints on the vessel cross-sectional planes to produce the vessel segmentation. It starts with multiscale vessel-axis tracing in 3D computed tomography (CT) data, followed by vessel boundary delineation on the cross-sectional planes derived from the extracted axis. The vessel boundary surface is deformed under constrained movements on the cross sections and is voxelized to produce the final vascular segmentation. The novelty of this paper lies in the accurate contour-point detection of thin vessels based on the CT scanning model, in the efficient recovery of missing contour points in the problematic regions, and in the active contour model with accurate shape and size constraints. The main advantage of our framework is that it avoids disconnected and incomplete segmentation of the vessels in the problematic regions that contain touching vessels (vessels in close proximity to each other), diseased portions (pathologic structures attached to a vessel), and thin vessels. It is particularly suitable for accurate segmentation of thin and low-contrast vessels. Our method is evaluated and demonstrated on CT data sets from our partner site, and its results are compared with three related methods. Our method is also tested on two publicly available databases and its results are compared with a recently published method. The applicability of the proposed method to some challenging clinical problems, namely the segmentation of vessels in the problematic regions, is demonstrated with good results in both quantitative and qualitative experiments; our segmentation algorithm can delineate vessel boundaries with a level of variability similar to that obtained manually.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 60
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-02
    Description: Spoofing using photographs or videos is one of the most common methods of attacking face recognition and verification systems. In this paper, we propose a real-time and nonintrusive method based on the diffusion speed of a single image to address this problem. In particular, inspired by the observation that the difference in surface properties between a live face and a fake one is efficiently revealed in the diffusion speed, we exploit antispoofing features by utilizing the total variation flow scheme. More specifically, we propose defining the local patterns of the diffusion speed, the so-called local speed patterns, as our features, which are fed into a linear SVM classifier to determine whether the given face is fake or not. One important advantage of the proposed method is that, in contrast to previous approaches, it accurately identifies diverse malicious attacks regardless of the medium of the image, e.g., paper or screen. Moreover, the proposed method does not require any specific user action. Experimental results on various data sets show that the proposed method is effective for face liveness detection compared with previous approaches in the literature.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
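    The diffusion-speed cue above can be illustrated numerically. A minimal sketch, assuming simple isotropic diffusion in place of the paper's total variation flow, and toy 16x16 patches rather than real face images:

```python
import numpy as np

def diffuse(u, iters=10, lam=0.2):
    # Isotropic diffusion stand-in for the paper's total variation flow
    # (assumption: illustrative only; periodic boundaries via np.roll).
    u = u.astype(float).copy()
    for _ in range(iters):
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
               np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
        u += lam * lap
    return u

def diffusion_speed(img, iters=10):
    # Per-pixel diffusion speed: how far each intensity moves under
    # diffusion; live and fake surfaces are claimed to differ in this cue.
    return np.abs(img.astype(float) - diffuse(img, iters))

flat = np.full((16, 16), 100.0)                        # flat, print-like patch
checker = 10.0 * ((np.indices((16, 16)).sum(0) % 2) * 2 - 1)
textured = flat + checker                              # high-frequency texture
```

    High-frequency texture diffuses quickly (large speed) while a flat region does not move at all, which is the kind of statistic the local speed patterns summarize.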
  • 61
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-09
    Description: We propose that the dynamics of an action in video data forms a sparse self-similar manifold in the space-time volume, which can be fully characterized by a linear rank decomposition. Inspired by the recurrence plot theory, we introduce the concept of Joint Self-Similarity Volume (Joint-SSV) to model this sparse action manifold, and hence propose a new optimized rank-1 tensor approximation of the Joint-SSV to obtain compact low-dimensional descriptors that very accurately characterize an action in a video sequence. We show that these descriptor vectors make it possible to recognize actions without explicitly aligning the videos in time in order to compensate for speed of execution or differences in video frame rates. Moreover, we show that the proposed method is generic, in the sense that it can be applied using different low-level features, such as silhouettes, tracked points, histogram of oriented gradients, and so forth. Therefore, our method does not necessarily require explicit tracking of features in the space-time volume. Our experimental results on five public data sets demonstrate that our method produces promising results and outperforms many baseline methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
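    The self-similarity structure underlying the Joint-SSV can be sketched with a recurrence-plot style distance matrix over per-frame features; a minimal illustration (the paper builds a volume from richer low-level features such as silhouettes or tracked points):

```python
import numpy as np

def self_similarity(frames):
    # Pairwise-distance (recurrence-plot style) matrix over per-frame
    # feature vectors; the Joint-SSV stacks such similarities into a volume.
    n = len(frames)
    S = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            S[i, j] = np.linalg.norm(frames[i] - frames[j])
    return S

# Toy 'action': a 1D feature tracing one period of a sine over 10 frames.
feats = np.sin(2 * np.pi * np.linspace(0, 1, 10))[:, None]
S = self_similarity(feats)
```

    The matrix is symmetric with a zero diagonal; a slower execution of the same action stretches this pattern rather than changing its shape, which is why a rank decomposition of the volume can be insensitive to execution speed.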
  • 62
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-09
    Description: Principal component analysis (PCA) is widely used to extract features and reduce dimensionality in various computer vision and image/video processing tasks. Conventional approaches either lack robustness to outliers and corrupted data or are designed for one-dimensional signals. To address this problem, we propose a robust PCA model for two-dimensional images incorporating structured sparse priors, referred to as structured sparse 2D-PCA. This robust model considers the prior of structured and grouped pixel values in two dimensions. As the proposed formulation is jointly nonconvex and nonsmooth, and thus difficult to tackle by joint optimization, we develop a two-stage alternating minimization approach to solve the problem. This approach iteratively learns the projection matrices by bidirectional decomposition and utilizes the proximal method to obtain the structured sparse outliers. By considering the structured sparsity prior, the proposed model becomes less sensitive to noisy data and outliers in two dimensions. Moreover, the computational cost indicates that the robust two-dimensional model is capable of processing quarter common intermediate format (QCIF) video in real time, as well as handling large images and videos, which is often intractable for other robust PCA approaches that involve image-to-vector conversion. Experimental results on robust face reconstruction, video background subtraction data sets, and real-world videos show the effectiveness of the proposed model compared with conventional 2D-PCA and other robust PCA algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
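    For context, baseline 2D-PCA (which the robust structured sparse model extends) learns its projection from image matrices directly, with no image-to-vector conversion. A minimal sketch on random data standing in for face images:

```python
import numpy as np

def twod_pca(images, k):
    # Baseline 2D-PCA: eigenvectors of the image covariance matrix
    # computed from the image matrices themselves.
    A_bar = np.mean(images, axis=0)
    G = sum((A - A_bar).T @ (A - A_bar) for A in images) / len(images)
    eigvals, eigvecs = np.linalg.eigh(G)      # ascending eigenvalues
    return eigvecs[:, ::-1][:, :k]            # keep the top-k directions

rng = np.random.default_rng(2)
images = rng.standard_normal((5, 8, 6))       # 5 toy 'images' of size 8x6
W = twod_pca(images, k=2)
features = images[0] @ W                      # 8x2 feature matrix per image
```

    Each image is reduced along one dimension only, which is what keeps 2D-PCA (and its robust variants) cheap enough for video-rate processing.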
  • 63
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-09
    Description: In propagating edits for image editing, some edits are intended to affect limited local regions, while others act globally over the entire image. However, the ambiguity problem in propagating edits is not adequately addressed in existing methods. Thus, tedious user input requirements remain since the user must densely or repeatedly input control samples to suppress ambiguity. In this paper, we address this challenge to propagate edits suitably by marking edits for local or global propagation and determining their reasonable propagation scopes automatically. Thus, our approach avoids propagation conflicts, effectively resolving the ambiguity problem. With the reduction of ambiguity, our method allows fewer and less-precise control samples than existing methods. Furthermore, we provide a uniform framework to propagate local and global edits simultaneously, helping the user to quickly obtain the intended results with reduced labor. With our unified framework, the potentially ambiguous interaction between local and global edits (evident in existing methods that propagate these two edit types in series) is resolved. We experimentally demonstrate the effectiveness of our method compared with existing methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 64
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-09
    Description: Low-rank and sparse representation based methods, which make few specific assumptions about the background, have recently attracted wide attention in background modeling. With these methods, moving objects in the scene are modeled as pixel-wise sparse outliers. However, in many practical scenarios, the distributions of these moving parts are not truly pixel-wise sparse but structurally sparse. Meanwhile, a robust analysis mechanism is required to handle background regions or foreground movements with varying scales. Based on these two observations, we first introduce a class of structured sparsity-inducing norms to model moving objects in videos. In our approach, we regard the observed sequence as constituted of two terms: a low-rank matrix (background) and a structured sparse outlier matrix (foreground). Next, by virtue of adaptive parameters for dynamic videos, we propose a saliency measurement to dynamically estimate the support of the foreground. Experiments on challenging, well-known data sets demonstrate that the proposed approach outperforms state-of-the-art methods and works effectively on a wide range of complex videos.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
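    The two proximal updates at the heart of such low-rank plus sparse decompositions can be sketched as follows; note this is a generic illustration in which a plain entrywise l1 soft threshold stands in for the paper's structured sparsity-inducing norms:

```python
import numpy as np

def svt(M, tau):
    # Singular value thresholding: proximal operator of the nuclear
    # norm, used for the low-rank (background) update.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def soft(M, lam):
    # Entrywise soft threshold: proximal operator of the l1 norm,
    # standing in here for the structured-sparse (foreground) update.
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)

# Rank-1 'background' plus one large 'foreground' outlier.
rng = np.random.default_rng(1)
D = np.outer(rng.standard_normal(20), rng.standard_normal(15))
D[3, 4] += 10.0
S = soft(D, 5.0)       # small entries vanish, large outliers survive
L = svt(D - S, 0.5)    # near-low-rank background estimate
```

    Alternating these two steps (with structured instead of plain sparsity) is the standard way such models separate foreground support from background.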
  • 65
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-13
    Description: A novel saliency detection algorithm for video sequences based on the random walk with restart (RWR) is proposed in this paper. We adopt RWR to detect spatially and temporally salient regions. More specifically, we first find a temporal saliency distribution using the features of motion distinctiveness, temporal consistency, and abrupt change. Among them, the motion distinctiveness is derived by comparing the motion profiles of image patches. Then, we employ the temporal saliency distribution as a restarting distribution of the random walker. In addition, we design the transition probability matrix for the walker using the spatial features of intensity, color, and compactness. Finally, we estimate the spatiotemporal saliency distribution by finding the steady-state distribution of the walker. The proposed algorithm detects foreground salient objects faithfully, while suppressing cluttered backgrounds effectively, by incorporating the spatial transition matrix and the temporal restarting distribution systematically. Experimental results on various video sequences demonstrate that the proposed algorithm outperforms conventional saliency detection algorithms qualitatively and quantitatively.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
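    The steady-state computation described above reduces to a simple fixed-point iteration; a minimal sketch with a toy 3-patch transition matrix (the matrix and restart values are assumptions for illustration, not the paper's learned quantities):

```python
import numpy as np

def rwr_steady_state(P, r, restart=0.15, iters=200):
    # Steady state of a random walk with restart:
    #   pi = (1 - restart) * P^T pi + restart * r
    # P : row-stochastic patch-to-patch transition matrix
    # r : restarting distribution (here: the temporal saliency)
    pi = r.copy()
    for _ in range(iters):
        pi = (1.0 - restart) * P.T @ pi + restart * r
    return pi

P = np.array([[0.8, 0.1, 0.1],
              [0.1, 0.8, 0.1],
              [0.1, 0.1, 0.8]])
r = np.array([0.7, 0.2, 0.1])   # temporal saliency as restart distribution
pi = rwr_steady_state(P, r)
```

    The restart weight trades off spatial propagation (through P) against the temporal prior (through r), which is exactly how the spatial and temporal cues are combined.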
  • 66
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-13
    Description: This paper introduces a novel classification strategy based on the monogenic scale space for target recognition in synthetic aperture radar (SAR) images. The proposed method exploits monogenic signal theory, a multidimensional generalization of the analytic signal, to capture the characteristics of SAR images, e.g., broad spectral information and simultaneous spatial localization. The components derived from the monogenic signal at different scales are then applied within a recently developed framework, sparse representation-based classification (SRC). Moreover, to deal with data sets whose target classes are not linearly separable, classification via kernel combination is proposed, where the multiple components of the monogenic signal are jointly considered in a unifying framework for target recognition. The novelty of this paper comes from: 1) the development of a monogenic feature via uniform downsampling, normalization, and concatenation of the components at various scales; 2) the development of score-level fusion for SRCs; and 3) the development of composite kernel learning for classification. In particular, comparative experimental studies are performed under nonliteral operating conditions, e.g., structural modifications, random noise corruption, and variations in depression angle, and against various algorithms, including the linear support vector machine and its kernel version, as well as the SRC and its variants (kernel SRC, kernel linear representation, and sparse representation of the monogenic signal). The feasibility of the proposed method has been successfully verified using the Moving and Stationary Target Acquisition and Recognition (MSTAR) database. The experimental results demonstrate that significant improvement in recognition accuracy can be achieved by the proposed method in comparison with the baseline algorithms.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 67
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-13
    Description: Texture characterization is a central element in many image processing applications. Multifractal analysis is a useful signal and image processing tool, yet the accurate estimation of multifractal parameters for image texture remains a challenge. This is mainly due to the fact that current estimation procedures consist of performing linear regressions across frequency scales of the 2D dyadic wavelet transform, for which only a few such scales are computable for images. The strongly non-Gaussian nature of multifractal processes, combined with their complicated dependence structure, makes it difficult to develop suitable models for parameter estimation. Here, we propose a Bayesian procedure that addresses the difficulties in the estimation of the multifractality parameter. The originality of the procedure is threefold: the construction of a generic semiparametric statistical model for the logarithm of wavelet leaders; the formulation of Bayesian estimators that are associated with this model and with the set of parameter values admitted by multifractal theory; and the exploitation of a suitable Whittle approximation within the Bayesian model, which enables the otherwise infeasible evaluation of the posterior distribution associated with the model. Performance is assessed numerically for several 2D multifractal processes, for several image sizes, and for a large range of process parameters. The procedure yields significant benefits over current benchmark estimators in terms of estimation performance and ability to discriminate between the two most commonly used classes of multifractal process models. The gains in performance are particularly pronounced for small image sizes, notably enabling for the first time the analysis of image patches as small as $64 \times 64$ pixels.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 68
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-13
    Description: Existing blind image quality assessment (BIQA) methods are mostly opinion-aware: they learn regression models from training images with associated human subjective scores to predict the perceptual quality of test images. Such opinion-aware methods, however, require a large number of training samples with associated human subjective scores, spanning a variety of distortion types. The BIQA models learned by opinion-aware methods often have weak generalization capability, limiting their usability in practice. By comparison, opinion-unaware methods do not need human subjective scores for training, and thus have greater potential for good generalization capability. Unfortunately, thus far no opinion-unaware BIQA method has shown consistently better quality prediction accuracy than the opinion-aware methods. Here, we aim to develop an opinion-unaware BIQA method that can compete with, and perhaps outperform, the existing opinion-aware methods. By integrating features of natural image statistics derived from multiple cues, we learn a multivariate Gaussian model of image patches from a collection of pristine natural images. Using the learned multivariate Gaussian model, a Bhattacharyya-like distance is used to measure the quality of each image patch, and an overall quality score is obtained by average pooling. The proposed BIQA method needs neither distorted sample images nor subjective quality scores for training, yet extensive experiments demonstrate its superior quality-prediction performance compared with state-of-the-art opinion-aware BIQA methods. The MATLAB source code of our algorithm is publicly available at www.comp.polyu.edu.hk/~cslzhang/IQA/ILNIQE/ILNIQE.htm.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
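    The per-patch quality measure (a Bhattacharyya-like distance between the pristine multivariate Gaussian and a patch's Gaussian fit) can be sketched as below. This is a common form for such scores and is an assumption for illustration, not necessarily the paper's exact formula:

```python
import numpy as np

def gaussian_quality_distance(mu1, cov1, mu2, cov2):
    # Bhattacharyya-like distance between two multivariate Gaussians;
    # larger values indicate a patch further from the pristine model.
    pooled = (cov1 + cov2) / 2.0
    diff = mu1 - mu2
    return np.sqrt(diff @ np.linalg.pinv(pooled) @ diff)

# Pristine model vs. a patch model that drifted away from it.
mu_pristine = np.zeros(4)
cov_pristine = np.eye(4)
mu_patch = np.array([0.5, -0.3, 0.2, 0.1])
d = gaussian_quality_distance(mu_pristine, cov_pristine, mu_patch, np.eye(4))
```

    Averaging such per-patch distances over the image yields an overall quality score without any human opinion labels.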
  • 69
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-05-13
    Description: Brightness and color are two basic visual features integrated by the human visual system (HVS) to gain a better understanding of color natural scenes. Aiming to combine these two cues to maximize the reliability of boundary detection in natural scenes, we propose a new framework based on the color-opponent mechanisms of a certain type of color-sensitive double-opponent (DO) cell in the primary visual cortex (V1) of the HVS. This type of DO cell has an oriented receptive field with both a chromatically and a spatially opponent structure. The proposed framework is a feedforward hierarchical model with direct counterparts to the color-opponent mechanisms involved from the retina to V1. In addition, we employ a spatial sparseness constraint (SSC) on neural responses to further suppress the unwanted edges of texture elements. Experimental results show that the modeled DO cells can flexibly capture both the structured chromatic and achromatic boundaries of salient objects in complex scenes when the cone inputs to the DO cells are unbalanced. Meanwhile, the SSC operator further improves performance by suppressing redundant texture edges. With competitive contour-detection accuracy, the proposed model has the additional advantage of a simple implementation with low computational cost.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 70
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-01-24
    Description: The Hough transform is a popular technique in image processing and computer vision. With a Hough transform technique, not only the normal angle and distance of a line but also the line segment's length and midpoint (centroid) can be extracted by analysing the voting distribution around a peak in the Hough space. In this paper, a method based on minimum-entropy analysis is proposed to extract the full set of parameters of a line segment. In each column around a peak in Hough space, the voting values specify a probabilistic distribution, and the corresponding entropies and statistical means are computed. The line segment's normal angle and length are simultaneously computed by fitting a quadratic polynomial curve to the voting entropies; its midpoint and normal distance are computed by fitting and interpolating a linear curve to the voting means. The proposed method is tested on simulated images, with comparative results demonstrating its detection accuracy, and experiments on real-world images verify the method as well. The proposed line-segment detector is both accurate and robust in the presence of quantization error, background noise, and pixel disturbances.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
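    The minimum-entropy idea can be seen directly: each Hough-space column's votes define a distribution whose Shannon entropy is lowest where the votes concentrate. A small sketch with made-up voting columns:

```python
import numpy as np

def column_entropy(votes):
    # Shannon entropy of one Hough-space column, treating the
    # non-negative voting values as an unnormalised distribution.
    p = votes / votes.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A concentrated voting column (sharp peak) vs. a diffuse one.
peaked  = np.array([0.0, 0.0, 9.0, 1.0, 0.0])
diffuse = np.array([2.0, 2.0, 2.0, 2.0, 2.0])
```

    Fitting a quadratic to these per-column entropies, as the abstract describes, locates the angle where the voting is most concentrated.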
  • 71
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-01-24
    Description: This paper presents a concept of noise similarity (NS), which can be used to refine the comparison of noisy patches and enhance the denoising power of the nonlocal means (NLM) filter. The idea behind this concept is that the similarity of noisy patches should depend not only on the underlying signal (noise-free patches) but also on the noise. Based on the concept of noise similarity, we derive a double-NS (DNS) model, which converts the denoising problem into the problem of reducing two kinds of noise: one is the superimposed additive noise; the other is the deviation error, defined as another kind of noise denoting the difference between similar pixels in their true intensities. The former corresponds to noise suppression, while the latter corresponds to the restoration of image details. To evaluate the effectiveness of the DNS model, we propose an iterative version of the NLM filter, in which the two noise similarities work collaboratively in a maximum a posteriori framework. Finally, experimental results demonstrate that the proposed approach provides competitive performance compared with other state-of-the-art NLM filters.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
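    For reference, the classic NLM weight that the noise-similarity concept refines is an exponential of the squared patch distance; a minimal sketch with toy patches:

```python
import numpy as np

def nlm_weight(patch_a, patch_b, h):
    # Classic NLM similarity weight: exponential of the mean squared
    # patch difference, with bandwidth h controlling the decay.
    d2 = np.mean((patch_a - patch_b) ** 2)
    return np.exp(-d2 / (h * h))

p  = np.full((3, 3), 10.0)
q1 = p + 0.1          # nearly identical patch -> weight close to 1
q2 = p + 5.0          # very different patch   -> weight close to 0
w1 = nlm_weight(p, q1, h=1.0)
w2 = nlm_weight(p, q2, h=1.0)
```

    Because this weight mixes signal differences with noise, the paper's point is that the noise's own contribution to the patch distance should be modeled separately.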
  • 72
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-01-24
    Description: We present an adaptive figure-ground segmentation algorithm that is capable of extracting foreground objects in a generic environment. Starting from an interactively assigned background mask, an initial background prior is defined and multiple soft-label partitions are generated from different foreground priors by progressive patch merging. These partitions are fused to produce a foreground probability map. The probability map is then binarized via threshold sweeping to create multiple hard-label candidates. A set of segmentation hypotheses is formed using different evaluation scores. From this set, the hypothesis with maximal local stability is propagated as the new background prior, and the segmentation process is repeated until convergence. Similarity voting is used to select a winner set, and the corresponding hypotheses are fused to yield the final segmentation result. Experiments indicate that our method performs at or above the current state-of-the-art on several data sets, with particular success on challenging scenes that contain irregular or multiply connected foregrounds.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
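    The threshold-sweeping step that turns the fused probability map into hard-label candidates is straightforward; a minimal sketch on a 2x2 toy map:

```python
import numpy as np

def threshold_sweep(prob_map, thresholds):
    # Binarise a foreground-probability map at several thresholds to
    # generate multiple hard-label segmentation candidates.
    return [(prob_map >= t).astype(np.uint8) for t in thresholds]

prob = np.array([[0.1, 0.4],
                 [0.6, 0.9]])
candidates = threshold_sweep(prob, [0.25, 0.5, 0.75])
```

    Each candidate is then scored and the most locally stable hypothesis is fed back as the new background prior, per the abstract.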
  • 73
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-01-24
    Description: Discrete Fourier transform (DFT) is the most widely used method for determining the frequency spectra of digital signals. In this paper, a 2D sliding DFT (2D SDFT) algorithm is proposed for fast implementation of the DFT on 2D sliding windows. The proposed 2D SDFT algorithm directly computes the DFT bins of the current window using the precalculated bins of the previous window. Since the proposed algorithm is designed to accelerate the sliding transform process of a 2D input signal, it can be directly applied to computer vision and image processing applications. The theoretical analysis shows that the computational requirement of the proposed 2D SDFT algorithm is the lowest among existing 2D DFT algorithms. Moreover, the output of the 2D SDFT is mathematically equivalent to that of the traditional DFT at all pixel positions.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
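    The bin-update rule is easiest to see in 1D: when the window slides by one sample, every DFT bin is corrected by the leaving and entering samples and rotated by a twiddle factor; the 2D algorithm applies the same idea along both axes. A 1D sketch, checked against a direct FFT:

```python
import numpy as np

def sliding_dft_step(X, x_out, x_in):
    # Update all DFT bins of a length-N window after it slides by one
    # sample: drop x_out, append x_in, rotate by the twiddle factors.
    N = len(X)
    k = np.arange(N)
    return (X - x_out + x_in) * np.exp(2j * np.pi * k / N)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
N = 8
X = np.fft.fft(x[:N])                    # DFT of the first window
X = sliding_dft_step(X, x[0], x[N])      # slide the window by one sample
X_direct = np.fft.fft(x[1:N + 1])        # direct DFT of the new window
```

    The update costs O(N) per slide instead of O(N log N) for a fresh FFT, which is the source of the 2D SDFT's efficiency.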
  • 74
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-01-24
    Description: Recently, a new probability model dubbed the Laplacian transparent composite model (LPTCM) was developed for DCT coefficients; it identifies outlier coefficients in addition to providing superior modeling accuracy. In this paper, we explore its applications to image compression. To this end, we propose an efficient nonpredictive image compression system, where quantization (both hard-decision quantization (HDQ) and soft-decision quantization (SDQ)) and entropy coding are completely redesigned based on the LPTCM. When tested on standard test images, the proposed system achieves overall coding results that are among the best and similar to those of H.264 or HEVC intra (predictive) coding, in terms of rate versus visual quality. In terms of rate versus objective quality, it significantly outperforms baseline JPEG by more than 4.3 dB in PSNR on average, with a moderate increase in complexity, and outperforms ECEB, the state-of-the-art nonpredictive image codec, by 0.75 dB when SDQ is off (the HDQ case) at the same level of computational complexity, and by 1 dB when SDQ is on, at the cost of a slight increase in complexity. In comparison with H.264 intracoding, our system provides an overall gain of about 0.4 dB with dramatically reduced computational complexity; in comparison with HEVC intracoding, it offers comparable coding performance in the high-rate region or for complicated images, but with less than 5% of the HEVC intracoding complexity. In addition, the proposed system offers multiresolution capability, which, together with its comparatively high coding efficiency and low complexity, makes it a good alternative for real-time image processing applications.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 75
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-03-27
    Description: This paper presents estimation of head pose angles from a single 2D face image using a 3D face model morphed from a reference face model. A reference model refers to a 3D face of a person of the same ethnicity and gender as the query subject. The proposed scheme minimizes the disparity between a set of prominent facial features on the query face image and the corresponding points on the 3D face model to estimate the head pose angles. The 3D face model is morphed from a reference model to be more specific to the query face in terms of the depth error at the feature points. The morphing process produces a 3D face model more specific to the query subject when multiple 2D face images of that subject are available for training. The proposed morphing process is computationally efficient since the depth of the 3D face model is adjusted by a scalar depth parameter at each feature point. Optimal depth parameters are found by minimizing the disparity between the 2D features of the query face image and the corresponding features of the morphed 3D model projected onto 2D space. The proposed head pose estimation technique was evaluated on two benchmark databases: 1) the USF Human-ID database for depth estimation and 2) the Pointing'04 database for head pose estimation. Experimental results demonstrate that head pose estimation errors in nodding and shaking angles are as low as 7.93° and 4.65° on average for a single 2D input face image.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 76
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-03-27
    Description: The rapid development of 3D technology and computer vision applications has motivated a thrust of methodologies for depth acquisition and estimation. However, existing hardware and software acquisition methods have limited performance due to poor depth precision, low resolution, and high computational cost. In this paper, we present a computationally efficient method to estimate dense depth maps from sparse measurements. There are three main contributions. First, we provide empirical evidence that depth maps can be encoded much more sparsely than natural images using common dictionaries, such as wavelets and contourlets. We also show that a combined wavelet–contourlet dictionary achieves better performance than using either dictionary alone. Second, we propose an alternating direction method of multipliers (ADMM) for depth map reconstruction. A multiscale warm start procedure is proposed to speed up the convergence. Third, we propose a two-stage randomized sampling scheme to optimally choose the sampling locations, thus maximizing the reconstruction performance for a given sampling budget. Experimental results show that the proposed method produces high-quality dense depth estimates, and is robust to noisy measurements. Applications to real data in stereo matching are demonstrated.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 77
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-03-27
    Description: This paper presents a complete proof that the bilateral filter can be implemented recursively, as long as: 1) the spatial filter can be implemented recursively and 2) the range filter can be decomposed into a recursive product. As a result, an $O(ND)$ solution can be obtained for bilateral filtering, where $N$ is the image size and $D$ is the dimensionality.
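The recursion the proof licenses can be illustrated with a toy 1-D sketch (a minimal illustration of the principle, not the paper's algorithm; the single left-to-right pass, the function name, and the parameters are assumptions for the example). The spatial kernel is realized by exponential feedback, and the range kernel enters as a per-step product modulating that feedback, so each output sample costs O(1):

```python
import math

def recursive_bilateral_1d(signal, sigma_s=2.0, sigma_r=0.1):
    """One left-to-right pass of a recursive bilateral filter (sketch).

    The spatial filter is the recursion itself (exponential decay with
    rate a = exp(-1/sigma_s)); the range filter is folded in by
    modulating the feedback coefficient with the similarity of adjacent
    samples, i.e., the range kernel decomposes into a recursive product.
    """
    a = math.exp(-1.0 / sigma_s)               # spatial decay per step
    out = [float(signal[0])]
    for i in range(1, len(signal)):
        d = float(signal[i]) - float(signal[i - 1])
        r = math.exp(-(d * d) / (2.0 * sigma_r * sigma_r))  # range weight
        w = a * r                               # combined feedback weight
        out.append((1.0 - w) * signal[i] + w * out[-1])
    return out
```

A sharp edge makes the range weight collapse to zero, cutting the feedback and preserving the edge, while a constant signal passes through unchanged; a full implementation would add a symmetric right-to-left pass and one recursion per dimension, giving the stated $O(ND)$ cost.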
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 78
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-03-27
    Description: A new approach to blind image quality assessment (BIQA), requiring no training, is proposed in this paper. The approach, named the blind image quality evaluator based on scales, works by evaluating the global difference between the query image analyzed at different scales and the query image at its original resolution. The approach is based on the ability of natural images to exhibit redundant information over various scales. A distorted image is considered a deviation from the natural image, bereft of the redundancy present in the original image. The similarity of the original-resolution image with its down-scaled version decreases as the image is distorted more. Therefore, the dissimilarities of an image with its low-resolution versions are accumulated in the proposed method. We decompose the query image into its scale space and measure the global dissimilarity with the co-occurrence histograms of the original and its scaled images. These scaled images are the low-pass versions of the original image. The dissimilarity, called the low-pass error, is calculated by comparing the low-pass versions across scales with the original image. The high-pass versions of the image at different scales are obtained by wavelet decomposition, and their dissimilarity from the original image is also calculated. This dissimilarity, called the high-pass error, is computed from the variance and gradient histograms and weighted by the contrast sensitivity function to make it perceptually effective. These two kinds of dissimilarities are combined to derive the quality score of the query image. The method requires no training with distorted images, pristine images, or subjective human scores to predict perceptual quality; it uses only the intrinsic global change of the query image across scales.
The performance of the proposed method is evaluated across six publicly available databases and found to be competitive with the state-of-the-art techniques.
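The accumulation of cross-scale dissimilarity can be sketched as follows (a hedged toy version: 2x2 block averaging stands in for the scale space, and a plain intensity histogram with a chi-square distance stands in for the co-occurrence-histogram comparison used in the paper; pixel values are assumed to lie in [0, 1]):

```python
def downscale2(img):
    """Halve resolution by 2x2 block averaging (a simple low-pass scale step)."""
    h, w = len(img), len(img[0])
    return [[(img[2 * i][2 * j] + img[2 * i][2 * j + 1] +
              img[2 * i + 1][2 * j] + img[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(w // 2)] for i in range(h // 2)]

def hist(img, bins=8):
    """Normalized intensity histogram; assumes values in [0, 1]."""
    counts = [0] * bins
    n = 0
    for row in img:
        for v in row:
            counts[min(int(v * bins), bins - 1)] += 1
            n += 1
    return [c / n for c in counts]

def chi_square(p, q, eps=1e-12):
    return sum((a - b) ** 2 / (a + b + eps) for a, b in zip(p, q))

def scale_dissimilarity(img, levels=3):
    """Accumulate histogram dissimilarity between an image and its scales."""
    ref = hist(img)
    score, cur = 0.0, img
    for _ in range(levels):
        cur = downscale2(cur)
        score += chi_square(ref, hist(cur))
    return score
```

A constant (fully redundant) image scores 0, while structured content loses redundancy under downscaling and accumulates a positive score; the paper additionally computes the wavelet high-pass errors and weights them by the contrast sensitivity function.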
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 79
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-03-27
    Description: Robust principal component analysis (RPCA) is an emerging method for the exact recovery of corrupted low-rank matrices. It assumes that the real data matrix has low rank and the error matrix is sparse. This paper presents a method called double nuclear norm-based matrix decomposition (DNMD) for dealing with image data corrupted by continuous occlusion. The method uses a unified low-rank assumption to characterize the real image data and the continuous occlusion. Specifically, we assume that all image vectors form a low-rank matrix and that each occlusion-induced error image is a low-rank matrix as well. Compared with RPCA, the low-rank assumption of DNMD is more intuitive for describing occlusion. Moreover, DNMD is solved by the alternating direction method of multipliers (ADMM). Our algorithm involves only one operator: the singular value shrinkage operator. DNMD, as a transductive method, is further extended into inductive DNMD (IDNMD). Both DNMD and IDNMD use the nuclear norm for measuring the continuous occlusion-induced error, whereas many previous methods use $L_{1}$, $L_{2}$, or other M-estimators. Extensive experiments on removing occlusion from face images and on background modeling from surveillance videos demonstrate the effectiveness of the proposed methods.
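The single operator the DNMD iterations rely on, the singular value shrinkage operator (the proximal operator of the nuclear norm), is standard and can be sketched in a few lines (a NumPy sketch; the function name and threshold value are illustrative):

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: soft-threshold the singular values of M.

    This is the proximal operator of the nuclear norm, the one operator
    an ADMM iteration of a DNMD-style decomposition repeatedly applies.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)    # shrink each singular value
    return U @ np.diag(s_shrunk) @ Vt
```

Applied to diag(3, 1) with tau = 2, the singular values shrink to (1, 0): the operator pulls the matrix toward low rank, which is exactly how the nuclear-norm penalty acts on the occlusion-induced error inside the ADMM loop.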
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 80
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-04-11
    Description: This paper presents a unified variational formulation for joint object segmentation and stereo matching that takes both accuracy and efficiency into account. In our approach, the depth map consists of compact objects, and each object is represented through three different aspects: 1) its perimeter in image space; 2) a slanted object depth plane; and 3) a planar bias, which adds a level of detail on top of each object plane in order to model depth variations within an object. In contrast to traditional high-quality low-level solvers, we use a convex formulation of the multilabel Potts model with PatchMatch stereo techniques to generate an object-level depth map for each image, and we show that accurate multiple-view reconstruction can be achieved with our formulation by means of induced homography, without discretization or staircasing artifacts. Our model is formulated as an energy minimization that is optimized via a fast primal-dual algorithm, which can handle several hundred object depth segments efficiently. Performance evaluations on the Middlebury benchmark data sets show that our method outperforms the traditional integer-valued disparity strategy as well as the original PatchMatch algorithm and its variants in subpixel-accurate disparity estimation. The proposed algorithm is also evaluated and shown to produce consistently good results on various real-world data sets (KITTI benchmark data sets and multiview benchmark data sets).
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 81
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-04-11
    Description: We propose a data-dependent denoising procedure to restore noisy images. Different from existing denoising algorithms which search for patches from either the noisy image or a generic database, the new algorithm finds patches from a database that contains relevant patches. We formulate the denoising problem as an optimal filter design problem and make two contributions. First, we determine the basis function of the denoising filter by solving a group sparsity minimization problem. The optimization formulation generalizes existing denoising algorithms and offers systematic analysis of the performance. Improvement methods are proposed to enhance the patch search process. Second, we determine the spectral coefficients of the denoising filter by considering a localized Bayesian prior. The localized prior leverages the similarity of the targeted database, alleviates the intensive Bayesian computation, and links the new method to the classical linear minimum mean squared error estimation. We demonstrate applications of the proposed method in a variety of scenarios, including text images, multiview images, and face images. Experimental results show the superiority of the new algorithm over existing methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 82
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-04-15
    Description: Watermarking algorithms have recently been widely applied in the field of image forensics. One such forensic application is the protection of images against tampering. For this purpose, we need to design a watermarking algorithm fulfilling two purposes in the case of image tampering: 1) detecting the tampered area of the received image and 2) recovering the lost information in the tampered zones. State-of-the-art techniques accomplish these tasks using watermarks consisting of check bits and reference bits. Check bits are used for tampering detection, whereas reference bits carry information about the whole image. The problem of recovering lost reference bits still stands. This paper shows that, once the tampering location is known, image tampering can be modeled and dealt with as an erasure error. Therefore, an appropriately designed channel code can protect the reference bits against tampering. In the proposed method, the total watermark bit budget is divided among three groups: 1) source encoder output bits; 2) channel code parity bits; and 3) check bits. In the watermark embedding phase, the original image is source coded and the output bit stream is protected using an appropriate channel encoder. For image recovery, erasure locations detected by the check bits help the channel erasure decoder retrieve the original source-encoded image. Experimental results show that our proposed scheme significantly outperforms recent techniques in terms of image quality for both the watermarked and the recovered image. The watermarked image quality gain is achieved by spending less of the bit budget on the watermark, while the image recovery quality is considerably improved as a consequence of the consistent performance of the designed source and channel codes.
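The erasure-channel view can be made concrete with a deliberately tiny code (an illustrative single-parity sketch, not the channel code used in the paper): the check bits reveal where tampering occurred, turning an unknown error into an erasure at a known position, which even one parity block can fill.

```python
def encode_with_parity(data_blocks):
    """Append one XOR-parity block: any single erased block is recoverable."""
    parity = 0
    for b in data_blocks:
        parity ^= b
    return list(data_blocks) + [parity]

def recover_erasure(blocks, erased_index):
    """Recover the block at a KNOWN erased position.

    Tampering detection (check bits) tells the decoder WHERE the loss
    is, so it faces an erasure rather than an error: XOR-ing all the
    surviving blocks restores the missing one.
    """
    parity = 0
    for i, b in enumerate(blocks):
        if i != erased_index:
            parity ^= b
    restored = list(blocks)
    restored[erased_index] = parity
    return restored[:-1]   # drop the parity block, return the data
```

With one XOR-parity block, a single block erased at a known location is always recoverable; the paper's design uses a proper channel code so that many erased reference bits can be recovered in the same spirit.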
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 83
    Publication Date: 2015-04-15
    Description: In this paper, we propose a machine learning-based fast coding unit (CU) depth decision method for High Efficiency Video Coding (HEVC), which optimizes the complexity allocation at the CU level under given rate-distortion (RD) cost constraints. First, we analyze the quad-tree CU depth decision process in HEVC and model it as a three-level hierarchical binary decision problem. Second, a flexible CU depth decision structure is presented, which allows the performance of each CU depth decision to be smoothly traded off between coding complexity and RD performance. Then, a three-output joint classifier consisting of multiple binary classifiers with different parameters is designed to control the risk of false prediction. Finally, a sophisticated RD-complexity model is derived to determine the optimal parameters for the joint classifier, minimizing the complexity at each CU depth under given RD degradation constraints. Comparative experiments over various sequences show that the proposed CU depth decision algorithm reduces the computational complexity by 28.82% to 70.93%, and by 51.45% on average, compared with the original HEVC test model. The Bjøntegaard delta peak signal-to-noise ratio and Bjøntegaard delta bit rate are −0.061 dB and 1.98% on average, which is negligible. The overall performance of the proposed algorithm outperforms that of the state-of-the-art schemes.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 84
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-12-25
    Description: This paper proposes a local descriptor called quaternionic local ranking binary pattern (QLRBP) for color images. Different from traditional descriptors that are extracted from each color channel separately or from vector representations, QLRBP works on the quaternionic representation (QR) of the color image that encodes a color pixel using a quaternion. QLRBP is able to handle all color channels directly in the quaternionic domain and include their relations simultaneously. Applying a Clifford translation to QR of the color image, QLRBP uses a reference quaternion to rank QRs of two color pixels, and performs a local binary coding on the phase of the transformed result to generate local descriptors of the color image. Experiments demonstrate that the QLRBP outperforms several state-of-the-art methods.
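For reference, the classical scalar local binary pattern that QLRBP generalizes can be sketched as follows (this is the per-channel baseline the abstract contrasts against; the quaternionic ranking itself is not reproduced here):

```python
def lbp_code(patch):
    """Classical 3x3 local binary pattern.

    Threshold the 8 neighbours against the centre pixel and pack the
    resulting bits clockwise from the top-left. QLRBP keeps this
    binary-coding step but replaces the scalar comparison with a
    Clifford-translation-based ranking of quaternionic pixel values.
    """
    c = patch[1][1]
    neighbours = [patch[0][0], patch[0][1], patch[0][2],
                  patch[1][2], patch[2][2], patch[2][1],
                  patch[2][0], patch[1][0]]
    code = 0
    for bit, v in enumerate(neighbours):
        if v >= c:              # 1 if neighbour is at least the centre
            code |= 1 << bit
    return code
```

On a flat patch every neighbour ties with the centre, giving the all-ones code 255; a bright centre surrounded by darker pixels gives 0. QLRBP performs the analogous coding on the phase of the Clifford-translated quaternionic representation, so all three color channels contribute jointly rather than channel by channel.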
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 85
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-12-25
    Description: Fine-grained image categorization is a challenging task that aims at distinguishing objects belonging to the same basic-level category, e.g., leaf or mushroom. It is a useful technique that can be applied to species recognition, face verification, and so on. Most existing methods either have difficulty detecting discriminative object components automatically or suffer from the limited amount of training data in each sub-category. To solve these problems, this paper proposes a new fine-grained image categorization model. The key is a dense graph mining algorithm that hierarchically localizes discriminative object parts in each image. More specifically, to mimic the human hierarchical perception mechanism, a superpixel pyramid is generated for each image, and graphlets are constructed from each layer to seamlessly capture object components. Intuitively, graphlets representative of each super-/sub-category are densely distributed in the feature space, so a dense graph mining algorithm is developed to discover them. Finally, the discovered graphlets from pairwise images are integrated into an image kernel for fine-grained recognition. Theoretically, the learned kernel generalizes several state-of-the-art image kernels. Experiments on nine image sets demonstrate the advantage of our method. Moreover, the discovered graphlets from each sub-category accurately capture tiny discriminative object components, e.g., bird claws, heads, and bodies.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 86
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-19
    Description: This paper develops a human action recognition method for human silhouette sequences based on supervised temporal t-stochastic neighbor embedding (ST-tSNE) and incremental learning. Inspired by SNE and its variants, ST-tSNE is proposed to learn the underlying relationship between action frames in a manifold, where class label information and temporal information are introduced to better represent frames from the same action class. For incremental learning, an important step in action recognition, we introduce three methods to perform the low-dimensional embedding of new data. Two of them are motivated by local methods: locally linear embedding and locality preserving projection. These two techniques learn explicit linear representations following the local neighbor relationship, and their effectiveness in preserving the intrinsic action structure is investigated. The third is based on manifold-oriented stochastic neighbor projection, which finds a linear projection from the high-dimensional to the low-dimensional space that captures the underlying pattern manifold. Extensive experimental results and comparisons with state-of-the-art methods demonstrate the effectiveness and robustness of the proposed ST-tSNE and incremental learning methods in human action silhouette analysis.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 87
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-27
    Description: Objective image quality assessment (IQA) plays an important role in the development of multimedia applications. The predictions of an IQA metric should be consistent with human perception. The release of the newest IQA database (TID2013) challenges most of the widely used quality metrics (e.g., the peak signal-to-noise ratio and the structural similarity index). We propose a new methodology to build the metric model using a regression approach. The new IQA score is a nonlinear combination of features extracted from several difference-of-Gaussian (DOG) frequency bands, which mimics the human visual system (HVS). Experimental results show that the random forest regression model trained on the proposed DOG features corresponds closely to the HVS and is also robust when tested on the available databases.
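The DOG band decomposition the features are drawn from can be sketched in 1-D (a NumPy sketch with illustrative sigmas; the paper works on 2-D images and feeds per-band statistics to a random forest):

```python
import numpy as np

def gaussian_blur_1d(x, sigma):
    """Blur a 1-D signal by convolving with a truncated Gaussian kernel."""
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t**2 / (2 * sigma**2))
    k /= k.sum()
    return np.convolve(x, k, mode='same')

def dog_bands(x, sigmas=(1.0, 2.0, 4.0)):
    """Split a signal into difference-of-Gaussian frequency bands.

    Each band is the difference of two successive blurs, mimicking the
    band-pass channels of the human visual system.
    """
    blurred = [np.asarray(x, float)] + [gaussian_blur_1d(x, s) for s in sigmas]
    return [blurred[i] - blurred[i + 1] for i in range(len(sigmas))]
```

By construction, the bands plus the coarsest blur reconstruct the signal exactly, so the decomposition loses nothing; quality features are then pooled per band before the regression step.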
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 88
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-27
    Description: Exploring multimedia techniques to assist scientists in their research is an interesting and meaningful topic. In this paper, we focus on large-scale aurora image retrieval by leveraging the bag-of-visual-words (BoVW) framework. To refine the unsuitable representation and improve retrieval performance, the BoVW model is modified by embedding polar information. The superiority of the proposed polar embedding method lies in two aspects. On the one hand, a polar meshing scheme is conducted to determine the interest points, which is more suitable for images captured by a circular fisheye lens. Especially for aurora images, the extracted polar scale-invariant feature transform (polar-SIFT) feature also reflects the geomagnetic longitude and latitude, and thus facilitates further data analysis. On the other hand, a binary polar deep local binary pattern (polar-DLBP) descriptor is proposed to enhance the discriminative power of visual words. Together with the 64-bit polar-SIFT code obtained via Hamming embedding, multifeature indexing is performed to reduce the impact of false positive matches. Extensive experiments are conducted on a large-scale aurora image data set. The experimental results indicate that the proposed method improves retrieval accuracy significantly with acceptable efficiency and memory cost. In addition, the effectiveness of the polar-SIFT scheme and of the polar-DLBP integration is demonstrated separately.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 89
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-27
    Description: Numerous recent approaches attempt to remove image blur due to camera shake, with either one or multiple input images, by explicitly solving an inverse and inherently ill-posed deconvolution problem. If the photographer takes a burst of images, a modality available in virtually all modern digital cameras, we show that it is possible to combine them to get a clean, sharp version. This is done without explicitly solving any blur estimation or subsequent inverse problem. The proposed algorithm is strikingly simple: it performs a weighted average in the Fourier domain, with weights depending on the Fourier spectrum magnitude. The method can be seen as a generalization of the align-and-average procedure, with a weighted average, motivated by hand-shake physiology and theoretically supported, taking place in the Fourier domain. The method’s rationale is that camera shake has a random nature, and therefore each image in the burst is generally blurred differently. Experiments with real camera data, and extensive comparisons, show that the proposed Fourier burst accumulation algorithm achieves state-of-the-art results an order of magnitude faster, and is simple enough for on-board implementation in camera phones. Finally, we also present experiments in real high dynamic range (HDR) scenes, showing how the method can be straightforwardly extended to HDR photography.
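The core of such a Fourier-domain weighted average fits in a few lines (a NumPy sketch; the exponent p and the small stabilizer are illustrative choices, not the paper's tuned values):

```python
import numpy as np

def fourier_burst_accumulation(burst, p=11):
    """Weighted average of a registered burst in the Fourier domain.

    Frequencies with large magnitude in some frame were least attenuated
    by that frame's blur, so weights proportional to |F|^p favour the
    sharpest content per frequency; p = 0 degenerates to plain averaging.
    """
    specs = [np.fft.fft2(img) for img in burst]
    mags = [np.abs(s) for s in specs]
    total = sum(m**p for m in mags) + 1e-12   # stabilizer avoids 0/0
    fused = sum((m**p / total) * s for m, s in zip(mags, specs))
    return np.real(np.fft.ifft2(fused))
```

If all frames are identical the weights are uniform and the input is returned unchanged; when frames are blurred differently, each frequency is taken mostly from the frame that preserved it best, with no blur kernel ever estimated.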
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 90
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-27
    Description: Salient region detection is a challenging problem and an important topic in computer vision, with a wide range of applications such as object recognition and segmentation. Many approaches have been proposed to detect salient regions using different visual cues, such as compactness, uniqueness, and objectness. However, each visual-cue-based method has its own limitations. After analyzing the advantages and limitations of different visual cues, we found that compactness and local contrast are complementary to each other; in particular, local contrast can very effectively recover salient regions incorrectly suppressed by compactness cues. Motivated by this, we propose a bottom-up salient region detection method that integrates compactness and local contrast cues. Furthermore, to produce a pixel-accurate saliency map that more uniformly covers the salient objects, we propagate the saliency information using a diffusion process. Our experimental results on four benchmark data sets demonstrate the effectiveness of the proposed method: it produces more accurate saliency maps, with better precision-recall curves and higher F-measure than 19 other state-of-the-art approaches, on the ASD, CSSD, and ECSSD data sets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 91
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-27
    Description: Multi-exposure image fusion (MEF) is considered an effective quality enhancement technique widely adopted in consumer electronics, but little work has been dedicated to the perceptual quality assessment of multi-exposure fused images. In this paper, we first build an MEF database and carry out a subjective user study to evaluate the quality of images generated by different MEF algorithms. There are several useful findings. First, considerable agreement has been observed among human subjects on the quality of MEF images. Second, no single state-of-the-art MEF algorithm produces the best quality for all test images. Third, the existing objective quality models for general image fusion are very limited in predicting the perceived quality of MEF images. Motivated by the lack of appropriate objective models, we propose a novel objective image quality assessment (IQA) algorithm for MEF images based on the principle of the structural similarity approach and a novel measure of patch structural consistency. Our experimental results on the subjective database show that the proposed model correlates well with subjective judgments and significantly outperforms the existing IQA models for general image fusion. Finally, we demonstrate a potential application of the proposed model by automatically tuning the parameters of MEF algorithms. The subjective database and the MATLAB code of the proposed model will be made available online. Preliminary results of Section III were presented at the 6th International Workshop on Quality of Multimedia Experience, Singapore, 2014.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 92
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred; such a representation can be obtained from local features by means of, e.g., the bag-of-visual-words model. Several applications, including, for example, visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that reduce the required bit budget while attaining a target level of efficiency. In this paper, we investigate a coding scheme tailored to both local and global binary features, which exploits both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can conveniently be adopted to support the analyze-then-compress (ATC) paradigm: visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs the visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed, and then sent to a central unit for further processing, according to the compress-then-analyze (CTA) paradigm. We experimentally compare ATC and CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: 1) homography estimation and 2) content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with CTA, especially in bandwidth-limited scenarios.
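The intra/inter idea for binary features can be sketched with plain bit lists (illustrative functions only; the actual scheme adds an entropy coder and a per-feature mode decision):

```python
def intra_code(desc):
    """Intra mode: send the descriptor bits as-is (baseline cost)."""
    return desc

def inter_code(desc, ref):
    """Inter mode: send only the XOR residual against the reference
    frame's matched descriptor; temporally stable features yield a
    sparse, cheap residual."""
    return [a ^ b for a, b in zip(desc, ref)]

def inter_decode(residual, ref):
    """Invert inter coding: XOR the residual back onto the reference."""
    return [a ^ b for a, b in zip(residual, ref)]

def residual_cost(bits):
    """Crude proxy for the entropy coder: number of set bits to send."""
    return sum(bits)
```

Temporally stable descriptors give near-zero residuals, so inter-frame coding spends far fewer bits than re-sending the descriptor, mirroring the temporal redundancy the proposed primitives exploit for ATC.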
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 93
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: One potentially effective means of large-scale 3D scene reconstruction is to reconstruct the scene in a global manner, rather than incrementally, by fully exploiting available auxiliary information on the imaging conditions, such as camera location from the Global Positioning System (GPS), orientation from an inertial measurement unit (or compass), focal length from EXIF data, and so on. However, such auxiliary information, though informative and valuable, is usually too noisy to be directly usable. In this paper, we present an approach that takes advantage of such noisy auxiliary information to improve structure-from-motion solving. More specifically, we introduce two effective iterative global optimization algorithms initialized with the noisy auxiliary information. One is a robust rotation averaging algorithm that deals with a contaminated epipolar graph; the other is a robust scene reconstruction algorithm that deals with noisy GPS data for camera center initialization. We found that by focusing exclusively on the estimated inliers at the current iteration, the optimization process initialized with such noisy auxiliary information converges well and efficiently. Our proposed method is evaluated on real images captured by an unmanned aerial vehicle, a StreetView car, and conventional digital cameras. Extensive experimental results show that our method performs similarly to or better than many state-of-the-art reconstruction approaches in terms of reconstruction accuracy and completeness, and is more efficient and scalable for large-scale image data sets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 94
    Publication Date: 2015-07-18
    Description: While surveillance video is the biggest source of unstructured big data today, the emergence of the High Efficiency Video Coding (HEVC) standard is poised to play a huge role in lowering the costs associated with transmission and storage. Among the benefits of HEVC over the legacy MPEG-4 Advanced Video Coding (AVC) is a staggering bitrate reduction of 40 percent or more at the same visual quality. Given bandwidth limitations, video data are compressed essentially by removing the spatial and temporal correlations that exist in the uncompressed form. This causes compressed data, which are already de-correlated, to serve as a vital resource for machine learning with significantly fewer samples for training. In this paper, an efficient approach to foreground extraction/segmentation is proposed using novel spatio-temporal de-correlated block features extracted directly from HEVC compressed video. Most related techniques, in contrast, work on uncompressed images, demanding significant storage and computational resources not only for the decoding process prior to initialization but also for the feature selection/extraction and background modeling stages that follow. The proposed approach has been qualitatively and quantitatively evaluated against several other state-of-the-art methods.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 95
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: Modeling the relationship between visual words in feature encoding is important for image classification. Recent methods consider this relationship in either image or feature space, and most of them incorporate only pairwise relationships between visual words. However, in situations involving large variability among images, low-order pairwise relationships cannot capture the intrinsic invariance of intra-class images, and the result is not robust to larger image variations. In addition, as the number of potential pairings grows exponentially with the number of visual words, learning becomes computationally expensive. To overcome these two limitations, we propose an efficient classification framework that exploits the high-order topology of visual words in the feature space, as follows. First, we propose a search algorithm that seeks dependencies between the visual words. These dependencies are used to construct a higher-order topology in the feature space. Then, the local features are encoded according to this higher-order topology to improve image classification. Experiments involving four common data sets, namely PASCAL VOC 2007, 15 Scenes, Caltech 101, and UIUC Sport Event, demonstrate that the dependence search significantly improves the efficiency of the higher-order topological construction and consequently improves image classification accuracy on all of these data sets.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 96
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: Motion estimation, i.e., optical flow, of fluid-like and dynamic texture (DT) images/videos is an important challenge, particularly for understanding outdoor scene changes created by objects and/or natural phenomena. Most optical flow models use smoothness-based constraints with terms such as fluidity from the fluid dynamics framework, the typical constraints being incompressibility and low Reynolds numbers ($Re$). Such constraints are assumed to impede the clear capture of locally abrupt changes in image intensity and motion over time, i.e., discontinuities and/or high $Re$. This paper exploits novel physics-based optical flow models/constraints for both smooth and discontinuous changes using a wave generation theory that imposes no constraint on $Re$ or on the compressibility of an image sequence. An iterated two-step optimization alternating between local and global optimization is also used: first, an objective function with multiple varying sine/cosine bases, new local image properties (i.e., orientation and frequency), and a novel transformed dispersion relationship equation is used. Second, the statistical properties of image features are used to globally optimize the model parameters. Experiments on synthetic and real DT image sequences with smooth and discontinuous motions demonstrate that the proposed locally and globally varying models outperform previous optical flow models.
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 97
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: In this paper, we address the problem of recovering degraded images using a multivariate Gaussian mixture model (GMM) as a prior. The GMM framework in our image restoration method is based on the assumption that accumulations of similar patches in a neighborhood are derived from a multivariate Gaussian probability distribution with a specific covariance and mean. Previous GMM-based image restoration methods have not considered the spatial (geometric) distance between patches in clustering. Our experiments show that when the Gaussian estimates are constrained to finite-sized windows, the patch clusters are more likely to be derived from the estimated multivariate Gaussian distributions, i.e., the proposed statistical patch-based model provides a better goodness-of-fit to the statistical properties of natural images. A novel approach for computing aggregation weights for image reconstruction from recovered patches is introduced, based on the degree of similarity of each patch to the estimated Gaussian clusters. The results show that for image denoising our method is highly competitive with the state-of-the-art methods, and our image interpolation method outperforms previous state-of-the-art methods.
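    One natural way to realize "aggregation weights based on the similarity of each patch to the estimated Gaussian clusters" is to softmax the per-cluster Gaussian log-likelihoods. The sketch below is a hedged illustration under that assumption (the cluster parameters and softmax choice are not taken from the paper):

    ```python
    import numpy as np

    def gaussian_loglik(patch, mean, cov):
        """Log-density of a flattened patch under a multivariate Gaussian."""
        d = patch - mean
        sign, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d @ np.linalg.solve(cov, d)
                       + logdet + len(d) * np.log(2 * np.pi))

    def aggregation_weights(patch, means, covs):
        """Hypothetical weighting rule: normalize per-cluster
        log-likelihoods into aggregation weights with a softmax."""
        logp = np.array([gaussian_loglik(patch, m, c)
                         for m, c in zip(means, covs)])
        logp -= logp.max()               # numerical stability
        w = np.exp(logp)
        return w / w.sum()

    # Two toy clusters; the patch lies close to the first one.
    means = [np.zeros(4), np.full(4, 5.0)]
    covs = [np.eye(4), np.eye(4)]
    patch = np.array([0.1, -0.2, 0.05, 0.0])
    w = aggregation_weights(patch, means, covs)
    ```

    The weights sum to one and concentrate on the cluster whose Gaussian best explains the patch, so well-explained patches dominate the reconstruction.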
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 98
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: This paper presents a texture flow estimation method that uses appearance-space clustering and a correspondence search in the space of deformed exemplars. To estimate the underlying texture flow, such as scale, orientation, and texture label, most existing approaches require a certain amount of user interaction. Strict assumptions on a geometric model further limit flow estimation to near-regular textures such as gradient-like patterns. We address these problems by extracting distinct texture exemplars in an unsupervised way and using an efficient search strategy over a deformation parameter space. This enables estimating a coherent flow in a fully automatic manner, even when an input image contains multiple textures of different categories. A set of texture exemplars that describes the input texture image is first extracted via medoid-based clustering in appearance space. The texture exemplars are then matched with the input image to infer deformation parameters. In particular, we define a distance function for measuring the similarity between a texture exemplar and a deformed target patch centered at each pixel of the input image, and propose a randomized search strategy to estimate these parameters efficiently. The deformation flow field is further refined by adaptively smoothing it under the guidance of a matching confidence score. We show that local visual similarity, measured directly in appearance space, explains the local behavior of the flow very well, and that the flow field can be estimated very efficiently when the matching criterion is combined with the randomized search strategy. Experimental results on synthetic and natural images show that the proposed method outperforms existing methods.
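    The randomized search over deformation parameters can be illustrated with a minimal sketch (the parameter ranges, sample count, and synthetic distance function below are assumptions for illustration; the paper's actual distance compares an exemplar to a deformed patch):

    ```python
    import numpy as np

    def random_search(dist, n_samples=500, seed=0):
        """Toy stand-in for the randomized parameter search: sample
        candidate (scale, orientation) pairs uniformly and keep the
        best-scoring one under the supplied distance function."""
        rng = np.random.default_rng(seed)
        best_p, best_d = None, np.inf
        for _ in range(n_samples):
            p = (rng.uniform(0.5, 2.0),      # candidate scale
                 rng.uniform(0.0, np.pi))    # candidate orientation
            d = dist(p)
            if d < best_d:
                best_p, best_d = p, d
        return best_p, best_d

    # Synthetic distance with its minimum at scale 1.2, angle pi/4.
    target = (1.2, np.pi / 4)
    dist = lambda p: (p[0] - target[0]) ** 2 + (p[1] - target[1]) ** 2
    p, d = random_search(dist)
    ```

    Even this naive sampler lands near the true deformation; PatchMatch-style propagation between neighboring pixels (which the paper's per-pixel setting invites) would reduce the number of samples needed further.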
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 99
    Publication Date: 2015-07-18
    Description: This paper presents a novel preprocessing method for color-to-gray document image conversion. In contrast to conventional methods designed for natural images, which aim to preserve the contrast between different classes in the converted gray image, the proposed conversion method reduces as much as possible the contrast (i.e., intensity variance) within the text class. It is based on learning a linear filter from a predefined data set of text and background pixels that: 1) minimizes the output response when applied to background pixels and 2) maximizes the output response when applied to text pixels, while minimizing the intensity variance within the text class. Our proposed method (called learning-based color-to-gray) is conceived as a preprocessing step for document image binarization. A data set of 46 historical document images is created and used to evaluate the proposed method both subjectively and objectively. The method demonstrates a substantial positive impact on the performance of state-of-the-art binarization methods. Four other Web-based image data sets are created to evaluate the scalability of the proposed method.
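    A minimal sketch of the filter-learning idea, under simplifying assumptions (synthetic pixel data, a bias term, and a plain least-squares objective with target 0 for background and 1 for text stand in for the paper's actual min/max-variance formulation):

    ```python
    import numpy as np

    # Synthetic training pixels: bright "paper" background, dark "ink" text.
    rng = np.random.default_rng(1)
    background = rng.normal([0.8, 0.75, 0.7], 0.05, size=(200, 3))
    text = rng.normal([0.2, 0.15, 0.1], 0.05, size=(200, 3))

    # Stack pixels, append a bias column, and set per-class targets.
    X = np.vstack([background, text])
    X = np.column_stack([X, np.ones(len(X))])
    y = np.concatenate([np.zeros(200), np.ones(200)])

    # Closed-form least-squares fit of the linear filter weights.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    bg_resp = np.column_stack([background, np.ones(200)]) @ w
    text_resp = np.column_stack([text, np.ones(200)]) @ w
    ```

    Applying the learned filter to every pixel of a document image yields a gray image in which text responses cluster tightly near one value and background responses near another, which is exactly the property that simplifies subsequent binarization.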
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology
  • 100
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-07-18
    Description: In order to improve 3D video coding efficiency, we propose methods to estimate rendered view distortion in synthesized views as a function of the depth map quantization error. Our approach starts by calculating the geometric error caused by the depth map error based on the camera parameters. Then, we estimate the rendered view distortion based on the local video characteristics. The estimated rendered view distortion is used in the rate-distortion optimized mode selection for depth map coding. A Lagrange multiplier is derived using the proposed distortion metric, which is estimated based on an autoregressive model. Experimental results show the efficiency of the proposed methods, with average savings of 43% in depth map bitrate as compared with encoding the depth maps using the same coding tools but with the rate-distortion optimization based on the conventional distortion metric.
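    The first step, mapping a depth error to a geometric error in the synthesized view, can be illustrated numerically for the rectified two-camera case (the symbols f, b, Z, dZ and their values below are illustrative assumptions, not the paper's camera setup): disparity is d = f*b/Z, so a depth error dZ shifts a rendered pixel by |f*b/Z - f*b/(Z + dZ)|.

    ```python
    # Hedged numeric illustration of depth-error-to-pixel-shift mapping.
    f = 1000.0   # focal length in pixels (assumed)
    b = 0.1      # camera baseline in metres (assumed)
    Z = 5.0      # true depth in metres (assumed)
    dZ = 0.2     # depth error introduced by quantization (assumed)

    disparity_true = f * b / Z          # disparity from the true depth
    disparity_noisy = f * b / (Z + dZ)  # disparity from the quantized depth
    pixel_shift = abs(disparity_true - disparity_noisy)
    ```

    Note that the same dZ causes a much larger pixel shift at small Z, which is why weighting the depth-map rate-distortion cost by local geometry and video characteristics, as the abstract describes, pays off.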
    Print ISSN: 1057-7149
    Electronic ISSN: 1941-0042
    Topics: Electrical Engineering, Measurement and Control Technology