ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Filter
  • Collection: Articles (3,196)
  • Publisher: Elsevier (3,196)
  • Years: 2015-2019 (2,362), 1995-1999 (834), 1945-1949
  • Journal: Pattern Recognition (502)
  • Topic: Computer Science (3,196)
  • 1
    Publication Date: 2018-04-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 2
    Publication Date: 2018-01-01
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 3
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Rameswar Panda, Amran Bhuiyan, Vittorio Murino, Amit K. Roy-Chowdhury. Abstract: Existing approaches for person re-identification have concentrated on either designing the best feature representation or learning optimal matching metrics in a static setting where the number of cameras is fixed in a network. Most approaches have neglected the dynamic and open-world nature of the re-identification problem, where one or multiple new cameras may be temporarily on-boarded into an existing system to get additional information, or added to expand an existing network. To address this very practical problem, we propose a novel approach for adapting existing multi-camera re-identification frameworks with limited supervision. First, we formulate a domain-perceptive re-identification method based on the geodesic flow kernel that can effectively find the best source camera (already installed) to adapt to newly introduced target camera(s), without requiring a very expensive training phase. Second, we introduce a transitive inference algorithm for re-identification that can exploit the information from the best source camera to improve the accuracy across other camera pairs in a network of multiple cameras. Third, we develop a target-aware sparse prototype selection strategy for finding an informative subset of source camera data for data-efficient learning in resource-constrained environments. Our approach can greatly increase the flexibility and reduce the deployment cost of new cameras in many real-world dynamic camera networks. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art unsupervised alternatives whilst being extremely efficient to compute.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 4
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Chengzu Bai, Ren Zhang, Zeshui Xu, Rui Cheng, Baogang Jin, Jian Chen. Abstract: Kernel entropy component analysis (KECA) is a recently proposed dimensionality reduction approach, which has shown superiority in many pattern analysis algorithms previously based on principal component analysis (PCA). The optimized KECA (OKECA) is a state-of-the-art extension of KECA and can return projections retaining more expressive power than KECA. However, OKECA is not robust to outliers and has high computational complexity attributable to its inherent reliance on the L2-norm. To tackle these two problems, we propose a new variant of KECA, namely L1-norm-based KECA (L1-KECA), for data transformation and feature extraction. L1-KECA attempts to find a new kernel decomposition matrix such that the extracted features store the maximum information potential, measured by the L1-norm. Accordingly, we present a greedy iterative algorithm with much faster convergence than OKECA's. Additionally, L1-KECA retains OKECA's capability to obtain accurate density estimation with very few features (just one or two). Moreover, a new semi-supervised L1-KECA classifier is developed and employed for data classification. Extensive experiments on different real-world datasets validate that our model is superior to most existing KECA-based and PCA-based approaches. Code has also been made publicly available.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
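The entropy-ranked component selection that KECA performs (and that L1-KECA modifies with an L1-norm objective) can be sketched generically. This is a minimal reconstruction of standard KECA with a Gaussian kernel, not the authors' L1 variant, and the kernel width is a placeholder:

```python
import numpy as np

def keca_components(X, sigma=1.0, k=2):
    """Rank kernel PCA axes by their Renyi-entropy contribution (standard KECA)."""
    # Gaussian kernel matrix
    sq = np.sum(X**2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / (2 * sigma**2))
    lam, E = np.linalg.eigh(K)             # eigenvalues in ascending order
    lam, E = lam[::-1], E[:, ::-1]         # reorder to descending
    # Entropy contribution of axis i: (sqrt(lam_i) * sum of eigenvector entries)^2
    contrib = (np.sqrt(np.maximum(lam, 0)) * E.sum(axis=0))**2
    order = np.argsort(contrib)[::-1][:k]  # keep the k most "entropic" axes
    # Project data onto the selected axes
    return E[:, order] * np.sqrt(np.maximum(lam[order], 0))
```

Unlike plain kernel PCA, the axes kept are those contributing most to the Renyi entropy estimate, not those with the largest eigenvalues.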
  • 5
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Samitha Herath, Basura Fernando, Mehrtash Harandi. Abstract: In this paper we raise two important questions: "1. Is temporal information beneficial in recognizing actions from still images? 2. Do we know how to take maximum advantage of it?" To answer these questions we propose a novel transfer learning problem, Temporal To Still Image Learning (i.e., T2SIL), where we learn to derive temporal information from still images. Thereafter, we use a two-stream model where still image action predictions are fused with derived temporal predictions. In T2SIL, the knowledge transfer occurs from temporal representations of videos (e.g., optical flow, Dynamic Image representations) to still action images. Along with T2SIL we propose a new still image action dataset and a video dataset sharing the same set of classes. We explore three well-established transfer learning frameworks (i.e., GANs, embedding learning and Teacher-Student Networks (TSNs)) in place of the temporal knowledge transfer method. The use of derived temporal information from our TSN and embedding learning improves still image action recognition.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 6
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Pooya Ashtari, Fateme Nateghi Haredasht, Hamid Beigy. Abstract: Centroid-based methods, including k-means and fuzzy c-means, are known as effective and easy-to-implement approaches to clustering in many applications. However, these algorithms cannot be directly applied to supervised tasks. This paper thus presents a generative model extending the centroid-based clustering approach to classification and regression tasks. Given an arbitrary loss function, the proposed approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk. Entropy-based regularization is also employed to fuzzify the partition and to weight features, enabling the method to capture more complex patterns, identify significant features, and yield better performance on high-dimensional data. An iterative algorithm based on a block coordinate descent scheme is formulated to efficiently find a local optimum. Extensive classification experiments on synthetic, real-world, and high-dimensional datasets demonstrate that the predictive performance of SFP is competitive with state-of-the-art algorithms such as SVM and random forest. SFP has a major advantage over such methods in that it not only leads to a flexible, nonlinear model but can also exploit any convex loss function in the training phase without compromising computational efficiency.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 7
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Younghoon Kim, Hyungrok Do, Seoung Bum Kim. Abstract: Graph-based clustering is an efficient method for identifying clusters in local and nonlinear data patterns. Among existing methods, spectral clustering is one of the most prominent algorithms. However, this method is vulnerable to noise and outliers. This study proposes a robust graph-based clustering method that removes data nodes of relatively low density. The proposed method calculates the pseudo-density from a similarity matrix and reconstructs it using a sparse regularization model. In this process, noise and outlier points are identified and removed. Unlike previous edge-cutting-based methods, the proposed method is robust to noise while detecting clusters because it cuts out irrelevant nodes. We use simulation and real-world data to demonstrate the usefulness of the proposed method by comparing it to existing methods in terms of clustering accuracy and robustness to noisy data. The comparison results confirm that the proposed method outperforms the alternatives.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
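The low-density node removal step can be illustrated with a toy sketch. The Gaussian similarity, row-sum pseudo-density, and quantile threshold here are illustrative assumptions, not the paper's exact sparse-regularization formulation:

```python
import numpy as np

def filter_low_density(X, sigma=1.0, keep_frac=0.9):
    """Drop points whose pseudo-density (row sum of the similarity matrix) is lowest."""
    sq = np.sum(X**2, axis=1)
    W = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / (2 * sigma**2))
    density = W.sum(axis=1)                    # pseudo-density of each node
    thresh = np.quantile(density, 1 - keep_frac)
    keep = density >= thresh                   # remove relatively low-density nodes
    return X[keep], keep
```

An isolated outlier far from any dense cluster has a row sum near 1 (only its self-similarity), so it is dropped first; the surviving nodes can then be clustered by any graph-based method.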
  • 8
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Qiong Wang, Lu Zhang, Wenbin Zou, Kidiyo Kpalma. Abstract: In this paper, we present a novel method for salient object detection in videos. Salient object detection methods based on a background prior may miss the salient region when the salient object touches the frame borders. To solve this problem, we propose to detect the whole salient object via the adjunction of virtual borders. A guided filter is then applied to the temporal output to integrate spatial edge information for better detection of the salient object edges. Finally, a global spatio-temporal saliency map is obtained by combining the spatial saliency map and the temporal saliency map according to their entropy. The proposed method is assessed on three popular datasets (Fukuchi, FBMS and VOS) and compared to several state-of-the-art methods. The experimental results show that the proposed approach outperforms the tested methods.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 9
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Zhuoyao Zhong, Lei Sun, Qiang Huo. Abstract: Although Faster R-CNN based text detection approaches have achieved promising results, their localization accuracy is not satisfactory in certain cases due to their sub-optimal bounding-box-regression based localization modules. In this paper, we address this problem and propose replacing the bounding box regression module with a novel LocNet based localization module to improve the localization accuracy of a Faster R-CNN based text detector. Given a proposal generated by a region proposal network (RPN), instead of directly predicting the bounding box coordinates of the text instance concerned, the proposal is enlarged to create a search region, and an "In-Out" conditional probability is assigned to each row and column of this search region, which can then be used to accurately infer the bounding box. Furthermore, we present a simple yet effective two-stage approach to convert the difficult multi-oriented text detection problem into a relatively easier horizontal text detection problem, which makes our approach able to robustly detect multi-oriented text instances with accurate bounding box localization. Experiments demonstrate that the proposed approach boosts the localization accuracy of Faster R-CNN based text detectors significantly. Consequently, our new text detector achieves superior performance on both horizontal (ICDAR-2011, ICDAR-2013 and MULTILINGUAL) and multi-oriented (MSRA-TD500, ICDAR-2015) text detection benchmark tasks.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 10
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Chunfeng Song, Yongzhen Huang, Yan Huang, Ning Jia, Liang Wang. Abstract: Gait recognition is one of the most important techniques for human identification at a distance. Most current gait recognition frameworks consist of several separate steps: silhouette segmentation, feature extraction, feature learning, and similarity measurement. These modules are mutually independent, with each part fixed, resulting in suboptimal performance in challenging conditions. In this paper, we integrate those steps into one framework, i.e., an end-to-end network for gait recognition, named GaitNet. It is composed of two convolutional neural networks: one for gait segmentation and the other for classification. The two networks are combined in one joint learning procedure and can be trained jointly. This strategy greatly simplifies the traditional step-by-step manner and is thus much more efficient for practical applications. Moreover, joint learning can automatically adjust each part to fit the global objective, leading to a clear performance improvement over separate learning. We evaluate our method on three large-scale gait datasets, including CASIA-B, SZU RGB-D Gait and a newly built database with complex dynamic outdoor backgrounds. Extensive experimental results show that the proposed method is effective and achieves state-of-the-art results. The code and data will be released upon request.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 11
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Chuan-Xian Ren, Xiao-Lin Xu, Zhen Lei. Abstract: Person re-identification (re-ID) is the task of matching different images of the same pedestrian. It has attracted increasing research interest in pattern recognition and machine learning. Traditionally, person re-ID is formulated as a metric learning problem with binary classification output. However, higher-order relationships, such as triplet closeness among instances, are ignored by such pair-wise metric learning methods. Thus, the discriminative information hidden in the data is insufficiently explored. This paper proposes a new structured loss function to push the frontier of person re-ID performance in realistic scenarios. The new loss function introduces two margin parameters. They operate as bounds that remove positive pairs of very small distance and negative pairs of large distance. A trade-off coefficient is assigned to the loss term of negative pairs to alleviate the class-imbalance problem. By using a linear function with the margin-based objectives, the gradients w.r.t. the weight matrices are no longer dependent on the iterative loss values in a multiplicative manner. This makes the weight update process robust to large iterative loss values. The new loss function is compatible with many deep learning architectures; thus, it induces a new deep network with pair-pruning regularization for metric learning. To evaluate the performance of the proposed model, extensive experiments are conducted on benchmark datasets. The results indicate that the new loss together with a ResNet-50 backbone has excellent feature representation ability for person re-ID.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
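The two-margin, pair-pruning idea behind the structured loss can be sketched as a simple hinge objective. The margin values and trade-off coefficient below are placeholders, and the paper's exact objective may differ:

```python
import numpy as np

def pair_pruning_loss(d_pos, d_neg, m_pos=0.5, m_neg=2.0, beta=0.5):
    """Hinge-style pair loss: positive pairs already closer than m_pos and
    negative pairs already farther than m_neg contribute nothing (are 'pruned')."""
    pos_term = np.maximum(0.0, d_pos - m_pos).sum()  # pull similar pairs together
    neg_term = np.maximum(0.0, m_neg - d_neg).sum()  # push dissimilar pairs apart
    return pos_term + beta * neg_term                # beta counters class imbalance
```

Because the loss is linear outside the margins, its gradient w.r.t. each distance is a constant ±1 rather than scaling with the loss value, which is the robustness property the abstract describes.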
  • 12
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Shuzhao Li, Huimin Yu, Roland Hu. Abstract: Person attributes are often exploited as mid-level human semantic information to help promote the performance of the person re-identification task. In this paper, unlike most existing methods that simply treat attribute learning as a classification problem, we perform it in a different way, motivated by the observation that attributes are related to specific local regions, which refers to the perceptual ability of attributes. We utilize the process of attribute detection to generate corresponding attribute-part detectors, whose invariance to many influences such as poses and camera views can be guaranteed. With detected local part regions, our model extracts local part features to handle the body-part misalignment problem, another major challenge for person re-identification. The local descriptors are further refined by fused attribute information to eliminate interference caused by detection deviation. Finally, the refined local feature works together with a holistic-level feature to constitute our final feature representation. Extensive experiments on two popular benchmarks with attribute annotations demonstrate the effectiveness of our model and competitive performance compared with state-of-the-art algorithms. Graphical abstract: figure available with the article.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 13
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Xin Wei, Hui Wang, Bryan Scotney, Huan Wan. Abstract: Face recognition has achieved great success owing to the fast development of deep neural networks in the past few years. Different loss functions can be used in a deep neural network, resulting in different performance. Most recently, some loss functions have been proposed which have advanced the state of the art. However, they cannot solve the problem of margin bias, which is present in class-imbalanced datasets with so-called long-tailed distributions. In this paper, we propose to solve the margin bias problem by setting a minimum margin for all pairs of classes. We present a new loss function, Minimum Margin Loss (MML), which is aimed at enlarging the margin of overly close class-centre pairs so as to enhance the discriminative ability of the deep features. MML, together with Softmax Loss and Centre Loss, supervises the training process to balance the margins of all classes irrespective of their class distributions. We implemented MML in Inception-ResNet-v1 and conducted extensive experiments on seven face recognition benchmark datasets: MegaFace, FaceScrub, LFW, SLLFW, YTF, IJB-B and IJB-C. Experimental results show that the proposed MML loss function leads to a new state of the art in face recognition, reducing the negative effect of margin bias.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
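The minimum-margin idea can be sketched as a penalty on class-centre pairs that sit closer than a chosen margin. This is a simplified stand-in for MML, with an arbitrary margin value:

```python
import numpy as np

def min_margin_penalty(centres, min_margin=1.0):
    """Penalise every pair of class centres closer than min_margin."""
    diff = centres[:, None, :] - centres[None, :, :]
    dist = np.sqrt((diff**2).sum(-1))            # pairwise centre distances
    iu = np.triu_indices(len(centres), k=1)      # count each pair once
    return np.maximum(0.0, min_margin - dist[iu]).sum()
```

Only pairs inside the margin are penalised, so head classes with already well-separated centres are unaffected regardless of how many samples they have, which is the intuition behind countering margin bias on long-tailed data.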
  • 14
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Mahsa Taheri, Zahra Moslehi, Abdolreza Mirzaei, Mehran Safayani. Abstract: Measuring the distance between data point pairs is a necessary step in numerous algorithms in machine learning, pattern recognition and data mining. From the local perspective, the emphasis of all existing supervised metric learning algorithms is to shrink similar data points and to separate dissimilar ones in local neighborhoods. This enables learning a more appropriate distance metric for within-class multimodal data. In this article, a new supervised local metric learning method named Self-Adaptive Local Metric Learning Method (SA-LM2) is proposed. The contribution of this method lies in five aspects. First, learning an appropriate metric and defining the radius of the local neighborhood are integrated in a joint formulation. Second, unlike traditional approaches, SA-LM2 learns the local neighborhood parameter automatically through its formulation; as a result, it is a parameter-free method that does not require any parameters to be tuned. Third, SA-LM2 is formulated as a SemiDefinite Program (SDP) with a global convergence guarantee. Fourth, this method does not need the similar set S; the focus here is on local areas' data points and their separation from dissimilar ones. Finally, the results of SA-LM2 are less influenced by noisy input data points than the other compared global and local algorithms. Results obtained from different experiments indicate that this algorithm outperforms its counterparts.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 15
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Zheng Ma, Jun Cheng, Dapeng Tao. Abstract: Wearable/portable brain-computer interfaces (BCIs) for long-term end use are a focus of recent BCI research. One challenge is how to update the BCI to meet changes in electroencephalography (EEG) signals, since resources are so limited that retraining traditional well-performing models, such as a support vector machine, is nearly impossible. To cope with this challenge, less demanding adaptive online learning can be considered. We investigated an adaptive projected subgradient method (APSM) that originates from the set-theoretic estimation formulation and the theory of projections onto convex sets. APSM provides a unifying framework for both adaptive classification and regression tasks. Coefficients of APSM are adjusted online as data arrive sequentially, with a regularization constraint imposed by projections onto a fixed closed ball. We extended the general APSM to a shrinkage form, where shrinking closed balls are used instead of the original fixed one, expecting a more controllable fading effect and better adaptability. The convergence of shrinkage APSM is proved. It is also demonstrated that as the shrinkage factor approaches 1, the limit point of shrinkage APSM approaches the optimal solution with the least norm, which can be especially beneficial for the generalization of the classifier. The performance of the proposed method was evaluated and compared with those of the general APSM, the incremental support vector machine, and the passive-aggressive algorithm, through an event-related potential-based BCI experiment. Results showed the advantage of the proposed method over the others in both online classification performance and ease of tuning. Our study revealed the effectiveness of the proposed method for adaptive EEG classification, making it a promising tool for on-device training and updating of wearable/portable BCIs, as well as for application in other related fields, such as EEG-based biometrics.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
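The regularizing projection onto a closed ball that APSM applies after each online update has a simple closed form. A minimal sketch (the radius is a placeholder; the shrinkage variant would shrink it over iterations):

```python
import numpy as np

def project_to_ball(w, radius=1.0):
    """Project w onto the closed L2 ball of the given radius
    (identity if w is already inside the ball)."""
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)
```

In an online loop, each subgradient step on the newly arrived sample is followed by this projection, which keeps the coefficient norm bounded and produces the fading effect the abstract mentions.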
  • 16
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Franco Manessi, Alessandro Rozza, Mario Manzo. Abstract: In many different classification tasks it is required to manage structured data, which are usually modeled as graphs. Moreover, these graphs can be dynamic, meaning that the vertices/edges of each graph may change over time. The goal is to exploit existing neural network architectures to model datasets that are best represented with graph structures that change over time. To the best of the authors' knowledge, this task has not been addressed using these kinds of architectures. Two novel approaches are proposed, which combine Long Short-Term Memory networks and Graph Convolutional Networks to learn long short-term dependencies together with graph structure. The advantage provided by the proposed methods is confirmed by the results achieved on four real-world datasets: an increase of up to 12 percentage points in accuracy and F1 score for vertex-based semi-supervised classification and up to 2 percentage points in accuracy and F1 score for graph-based supervised classification.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 17
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97. Author(s): Ying Liu, Konstantinos Tountas, Dimitris A. Pados, Stella N. Batalama, Michael J. Medley. Abstract: High-dimensional data usually exhibit intrinsic low-rank structure. With the tremendous amount of streaming data generated by ubiquitous sensors in the world of the Internet of Things, fast detection of such low-rank patterns is of utmost importance to a wide range of applications. In this work, we present an L1-subspace tracking method to capture the low-rank structure of streaming data. The method is based on L1-norm principal-component analysis (L1-PCA) theory, which offers outlier resistance in subspace calculation. The proposed method updates the L1-subspace as new data are acquired by sensors. In each time slot, the conformity of each datum is measured by the L1-subspace calculated in the previous time slot and used to weigh the datum. Iterative weighted L1-PCA is then executed through a refining function. The superiority of the proposed L1-subspace tracking method over existing approaches is demonstrated through experimental studies in various application fields.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
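A widely used fixed-point iteration for computing a single L1-norm principal component (in the style of Kwak's L1-PCA) gives the flavour of the weighted L1-PCA step. This is a generic illustration, not the authors' exact tracking update:

```python
import numpy as np

def l1_pc(X, iters=100, seed=0):
    """Fixed-point iteration maximising sum_i |x_i . w| subject to ||w|| = 1."""
    rng = np.random.RandomState(seed)
    w = rng.randn(X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(iters):
        s = np.sign(X @ w)              # conformity sign of each sample
        s[s == 0] = 1.0
        w_new = X.T @ s                 # sign-weighted sample sum
        w_new /= np.linalg.norm(w_new)
        if np.allclose(w_new, w):       # converged to a fixed point
            break
        w = w_new
    return w
```

Because each sample enters through its sign rather than its magnitude, a single gross outlier cannot dominate the direction estimate the way it does in L2-norm PCA.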
  • 18
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96. Author(s): Yuming Fang, Xiaoqiang Zhang, Feiniu Yuan, Nevrez Imamoglu, Haiwen Liu. Abstract: Image saliency detection has been widely explored in recent decades, but computational modeling of visual attention for video sequences remains limited due to complicated temporal saliency extraction and the fusion of spatial and temporal saliency. Inspired by Gestalt theory, we introduce a novel spatiotemporal saliency detection model in this study. First, we compute spatial and temporal saliency maps from low-level visual features. We then merge these two saliency maps for spatiotemporal saliency prediction of video sequences. The spatial saliency map is calculated by extracting three kinds of features (color, luminance, and texture), while the temporal saliency map is computed by extracting motion features estimated from video sequences. A novel adaptive entropy-based uncertainty weighting method is designed to fuse the spatial and temporal saliency maps into the final spatiotemporal saliency map according to Gestalt theory. The Gestalt principle of similarity is used to estimate spatial uncertainty from spatial saliency, while temporal uncertainty is computed from temporal saliency by the Gestalt principle of common fate. Experimental results on three large-scale databases show that our method predicts visual saliency more accurately than state-of-the-art spatiotemporal saliency detection algorithms.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
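Entropy-based uncertainty weighting of a spatial and a temporal saliency map can be sketched generically. The histogram entropy and inverse-entropy weights below are illustrative assumptions rather than the paper's Gestalt-derived formulation:

```python
import numpy as np

def entropy(p_map, bins=16):
    """Shannon entropy of a saliency map's intensity histogram
    (higher entropy = more uncertain map)."""
    hist, _ = np.histogram(p_map, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

def fuse(spatial, temporal, bins=16):
    """Weight each map inversely to its entropy, then blend."""
    ws = 1.0 / (entropy(spatial, bins) + 1e-6)
    wt = 1.0 / (entropy(temporal, bins) + 1e-6)
    return (ws * spatial + wt * temporal) / (ws + wt)
```

A confident map concentrated on one region has a peaked histogram and low entropy, so it receives the larger fusion weight; a noisy map spreads across many bins and is down-weighted.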
  • 19
    Publication Date: 2018
    Description: Publication date: March 2019. Source: Pattern Recognition, Volume 87. Author(s): Jimin Xiao, Yanchun Xie, Tammam Tillo, Kaizhu Huang, Yunchao Wei, Jiashi Feng. Abstract: Person search in real-world scenarios is a new and challenging computer vision task with many meaningful applications. The challenge of this task mainly comes from: (1) bounding boxes for pedestrians are unavailable, and the model needs to search for the person over whole gallery images; (2) there is huge variance in the visual appearance of a particular person owing to varying poses, lighting conditions, and occlusions. To address these two critical issues in modern person search applications, we propose a novel Individual Aggregation Network (IAN) that can accurately localize persons by learning to minimize intra-person feature variations. IAN is built upon a state-of-the-art object detection framework, i.e., Faster R-CNN, so that high-quality region proposals for pedestrians can be produced in an online manner. In addition, to relieve the negative effect caused by varying visual appearances of the same individual, IAN introduces a novel center loss that can increase the intra-class compactness of feature representations. The center loss encourages persons with the same identity to have similar feature characteristics. Extensive experimental results on two benchmarks, i.e., CUHK-SYSU and PRW, demonstrate the superiority of the proposed model. In particular, IAN achieves 77.23% mAP and 80.45% top-1 accuracy on CUHK-SYSU, which outperform the state of the art by 1.7% and 1.85%, respectively.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 20
    Publication Date: 2019
    Description: Publication date: Available online 28 March 2019. Source: Pattern Recognition. Author(s): Rui Ye, Qun Dai, Mei Ling Li. Abstract: Traditional machine learning is generally committed to obtaining classifiers that perform well on unlabeled test data. This usually relies on two critical assumptions: first, that sufficient labeled training data are available; second, that training and testing data are drawn from the same distribution and the same feature space. Unfortunately, in most cases it is difficult to meet these conditions in practice. The transfer learning scheme is naturally proposed to alleviate this problem. In order to obtain robust classifiers with relatively low computational cost, we incorporate the rationale of the Support Vector Machine (SVM) into the transfer learning scheme and propose a novel SVM-based transfer learning model, abbreviated as TrSVM. In this method, support vector sets are extracted to represent the source domain. New training datasets are constructed by combining each support vector set with the target labeled dataset. On the basis of these training datasets, a number of new base classifiers can be acquired. Since the performance of a classifier ensemble is generally superior to that of individual classifiers, ensemble selection is utilized in our work. A hybrid transfer learning algorithm integrating the Genetic Algorithm based Selective Ensemble (GASEN) with TrSVM is proposed, naturally abbreviated as TrGASVM. GASEN is a genetic-algorithm-based heuristic for solving combinatorial optimization problems. It can not only enhance the generalization ability of an ensemble, but also alleviate the local-minimum problem of greedy ensemble pruning methods. Since TrGASVM is built on the framework of TrSVM and GASEN, it inherits the advantages of both algorithms. The reasonable combination of TrSVM with GASEN endows TrGASVM with favorable transfer learning capability, whose effectiveness is demonstrated by experimental results on three real-world text classification datasets.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
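The dataset-construction step at the heart of TrSVM — summarize the source domain by its support vectors, then retrain on those plus the labeled target data — can be sketched with a toy 2-D linear SVM. Everything here (the hinge-loss subgradient solver, the Gaussian toy data, all hyperparameters) is illustrative and not taken from the paper:

```python
import random

def train_linear_svm(data, epochs=200, lr=0.01, lam=0.01):
    """Toy 2-D linear SVM trained by hinge-loss subgradient descent."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            if y * (w[0] * x[0] + w[1] * x[1] + b) < 1:   # margin violated
                w = [w[i] + lr * (y * x[i] - lam * w[i]) for i in range(2)]
                b += lr * y
            else:                                          # only regularize
                w = [w[i] * (1 - lr * lam) for i in range(2)]
    return w, b

def support_vectors(data, w, b, tol=1.1):
    """Points on or near the margin: a compact summary of the source domain.
    A small slack (tol) compensates for this crude solver."""
    return [(x, y) for x, y in data if y * (w[0] * x[0] + w[1] * x[1] + b) <= tol]

random.seed(0)
# Source domain: two well-separated classes; target: the same concept, shifted.
source = [([random.gauss(2, 0.5), random.gauss(0, 0.5)], 1) for _ in range(50)] \
       + [([random.gauss(-2, 0.5), random.gauss(0, 0.5)], -1) for _ in range(50)]
target = [([random.gauss(1.5, 0.5), random.gauss(0, 0.5)], 1) for _ in range(5)] \
       + [([random.gauss(-1.5, 0.5), random.gauss(0, 0.5)], -1) for _ in range(5)]

w, b = train_linear_svm(source)
sv = support_vectors(source, w, b)
# TrSVM-style training set: source support vectors + labeled target data.
w2, b2 = train_linear_svm(sv + target)
acc = sum((w2[0] * x[0] + w2[1] * x[1] + b2) * y > 0 for x, y in target) / len(target)
```

The point of the construction is that `sv` is much smaller than the full source set, so each base classifier in the ensemble trains cheaply while still carrying source-domain knowledge.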
  • 21
    Publication Date: 2019
    Description: Publication date: July 2019. Source: Pattern Recognition, Volume 91. Author(s):
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 22
    Publication Date: 2019
    Description: Publication date: Available online 26 March 2019. Source: Pattern Recognition. Author(s): Peizhen Bai, Yan Ge, Fangling Liu, Haiping Lu
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 23
    Publication Date: 2019
    Description: Publication date: August 2019. Source: Pattern Recognition, Volume 92. Author(s): Hongsen Liu, Yang Cong, Chenguang Yang, Yandong Tang. Abstract: Accurate 3D object recognition and 6-DOF pose estimation are pervasively applied in a variety of settings, such as unmanned warehouses, cooperative robots, and the manufacturing industry. Extracting robust and representative features from point clouds is therefore an unavoidable and important issue. In this paper, an unsupervised feature learning network is introduced to extract 3D keypoint features from point clouds directly, rather than transforming point clouds into voxel grids or projected RGB images, which saves computational time while preserving the object's geometric information. Specifically, the proposed network features a stacked point feature encoder, which stacks the local discriminative features within each point's neighborhood onto the original point-wise features. The framework consists of an offline training phase and an online testing phase. In the offline training phase, the stacked point feature encoder is trained and then used to generate a feature database of all keypoints, which are sampled from synthetic point clouds of multiple model views. In the online testing phase, each feature extracted from the unknown testing scene is matched against the database using a K-D tree voting strategy. Afterwards, the matching results are refined using a hypothesis-and-verification strategy. The proposed method is extensively evaluated on four public datasets, and the results show that it delivers performance comparable or even superior to the state of the art in terms of F1-score, Average of the 3D distance (ADD) and recognition rate.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
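The online matching stage described above — look up each scene feature's nearest neighbour in the keypoint database and let the matches vote for an object — can be sketched as follows. A brute-force search stands in for the paper's K-D tree, and the tiny 2-D descriptors and object labels are made up for illustration:

```python
import math
from collections import Counter

def nearest(db, query):
    """Brute-force stand-in for a K-D tree nearest-neighbour lookup."""
    return min(db, key=lambda entry: math.dist(entry[0], query))

# Hypothetical keypoint database: (descriptor, object id) pairs from training views.
db = [((0.1, 0.9), "mug"), ((0.2, 0.8), "mug"),
      ((0.9, 0.1), "box"), ((0.8, 0.2), "box")]
scene = [(0.15, 0.85), (0.12, 0.88), (0.85, 0.15)]   # features from a test scene

# Each matched scene feature casts one vote for the object its match belongs to.
votes = Counter(nearest(db, f)[1] for f in scene)
best, count = votes.most_common(1)[0]
print(best, count)  # the object gathering the most keypoint matches
```

In the actual pipeline the vote winners become pose hypotheses, which the hypothesis-and-verification stage then accepts or rejects.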
  • 24
    Publication Date: 2019
    Description: Publication date: Available online 28 March 2019. Source: Pattern Recognition. Author(s): Siyue Xie, Haifeng Hu, Yongbo Wu. Abstract: Facial Expression Recognition (FER) has long been a challenging task in the field of computer vision. In this paper, we present a novel model, named Deep Attentive Multi-path Convolutional Neural Network (DAM-CNN), for FER. Different from most existing models, DAM-CNN can automatically locate expression-related regions in an expressional image and yield a robust image representation for FER. The proposed model contains two novel modules: an attention-based Salient Expressional Region Descriptor (SERD) and the Multi-Path Variation-Suppressing Network (MPVS-Net). SERD can adaptively estimate the importance of different image regions for the FER task, while MPVS-Net disentangles expressional information from irrelevant variations. By jointly combining SERD and MPVS-Net, DAM-CNN is able to highlight expression-relevant features and generate a variation-robust representation for expression classification. Extensive experimental results on both constrained datasets (CK+, JAFFE, TFEID) and unconstrained datasets (SFEW, FER2013, BAUM-2i) demonstrate the effectiveness of our DAM-CNN model.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 25
    Publication Date: 2019
    Description: Publication date: Available online 27 March 2019. Source: Pattern Recognition. Author(s): Wei-Hong Li, Zhuowei Zhong, Wei-Shi Zheng. Abstract: Person re-identification (re-id) is the task of matching people across disjoint camera views in a multi-camera system, and it has become an important technology for smart cities in recent years. However, the majority of existing person re-id methods assume that all data samples are available in advance for training. In a real-world scenario, person images detected by a multi-camera system arrive sequentially, and these methods are not designed to process sequential data in an online way. While a few works discuss online re-id, most of them require considerable storage for all previously observed labelled samples. In this work, we present a one-pass person re-id model that adapts the re-id model to each newly observed sample, with no past data required for each update. More specifically, we develop Sketch online Discriminant Analysis (SoDA) by embedding sketch processing into Fisher discriminant analysis (FDA). SoDA can efficiently keep the main data variations of all past samples in a low-rank matrix when processing sequential data, and estimate the approximate within-class variance (i.e. the within-class covariance matrix) from the sketched data. We provide theoretical analysis of the effect of the estimated approximate within-class covariance matrix. In particular, we derive upper and lower bounds on the Fisher discriminant score (i.e. the quotient between between-class variation and within-class variation after feature transformation) in order to investigate how the optimal feature transformation learned by SoDA sequentially approximates the offline FDA learned on all observed data. Extensive experimental results show the effectiveness of our SoDA and empirically support our theoretical analysis.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
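The Fisher discriminant score that the bounds above concern — between-class variation over within-class variation — reduces, for a single projected feature, to a simple ratio. A toy 1-D illustration (not the paper's sketch-based estimator):

```python
def fisher_score(a, b):
    """1-D Fisher score: squared gap between class means over the
    summed within-class variances. Higher means better separated."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    va = sum((x - ma) ** 2 for x in a) / len(a)
    vb = sum((x - mb) ** 2 for x in b) / len(b)
    return (ma - mb) ** 2 / (va + vb)

print(fisher_score([1, 2, 3], [7, 8, 9]))   # well separated: roughly 27
print(fisher_score([1, 5, 3], [2, 6, 4]))   # overlapping classes: small
```

SoDA's theoretical question is how much this score degrades when the within-class variance in the denominator is estimated from a low-rank sketch instead of from all past samples.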
  • 26
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Shengcong Chen, Changxing Ding, Minfeng Liu. Abstract: Brain tumor segmentation from Magnetic Resonance Imaging scans is vital for both the diagnosis and treatment of brain cancers. It is widely accepted that accurate segmentation depends on multi-level information. However, existing deep architectures for brain tumor segmentation fail to explicitly encourage the models to learn high-quality hierarchical features. In this paper, we propose a series of approaches to enhance the quality of the learnt hierarchical features. Our contributions have four aspects. First, we extend the popular DeepMedic model to Multi-Level DeepMedic to make use of multi-level information for more accurate segmentation. Second, we propose a novel dual-force training scheme to promote the quality of multi-level features learnt from deep models. It is a general training scheme and can be applied to many existing architectures, e.g., DeepMedic and U-Net. Third, we design a label distribution-based loss function as an auxiliary classifier to encourage the high-level layers of deep models to learn more abstract information. Finally, we propose a novel Multi-Layer Perceptron-based post-processing approach to refine the prediction results of deep models. Extensive experiments are conducted on the two most recent brain tumor segmentation datasets, i.e., the BRATS 2017 and BRATS 2015 datasets. Results on the two databases indicate that the proposed approaches consistently improve the segmentation performance of the two popular deep models.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 27
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Peizhen Bai, Yan Ge, Fangling Liu, Haiping Lu. Abstract: In recommender systems, the classical matrix factorization model for collaborative filtering only considers joint interactions between users and items. In contrast, context-aware recommender systems (CARS) use contexts to improve recommendation performance. Some early CARS models treat user, item and context equally, unable to capture contextual impact accurately. More recent models perform context operations on users and items separately, leading to “double-counting” of contextual information. This paper proposes a new model, Joint Interaction with Context Operation (JICO), to integrate the joint interaction model with the context operation model, via two layers. The joint interaction layer models interactions between users and items via an interaction tensor. The context operation layer captures contextual information via a contextual operating tensor. We evaluate JICO on four datasets and conduct novel studies, including varying contextual influence and time split recommendation. JICO consistently outperforms competing methods, while providing many useful insights to assist further analysis.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 28
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Mingyuan Jiu, Hichem Sahbi. Abstract: Deep kernel learning aims at designing nonlinear combinations of multiple standard elementary kernels by training deep networks. This scheme has proven to be effective, but intractable when handling large-scale datasets, especially as the depth of the trained networks increases; indeed, the complexity of evaluating these networks scales quadratically w.r.t. the size of training data and linearly w.r.t. the depth of the trained networks. In this paper, we address the issue of efficient computation in Deep Kernel Networks (DKNs) by designing effective maps in the underlying Reproducing Kernel Hilbert Spaces (RKHS). Given a pretrained DKN, our method builds its associated Deep Map Network (DMN) whose inner product approximates the original network while being far more efficient. The design principle of our method is greedy and achieved layer-wise, by finding maps that approximate DKNs at different (input, intermediate and output) layers. This design also considers an extra fine-tuning step based on unsupervised learning, which further enhances the generalization ability of the trained DMNs. When plugged into SVMs, these DMNs turn out to be as accurate as the underlying DKNs while being at least an order of magnitude faster on large-scale datasets, as shown through extensive experiments on the challenging ImageCLEF and COREL5k benchmarks and the Banana dataset.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
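The design principle of a Deep Map Network — replace kernel evaluations by an explicit map whose inner product reproduces the kernel — is easiest to see on a kernel that admits an exact finite-dimensional map. The degree-2 polynomial kernel below is our illustrative choice, not one of the paper's deep kernels:

```python
import math

def poly_kernel(x, y):
    """Degree-2 polynomial kernel on R^2."""
    return (x[0] * y[0] + x[1] * y[1] + 1.0) ** 2

def explicit_map(x):
    """Exact 6-D feature map: <map(x), map(y)> equals poly_kernel(x, y)."""
    a, b = x
    r = math.sqrt(2.0)
    return [a * a, b * b, r * a * b, r * a, r * b, 1.0]

x, y = (1.0, 2.0), (3.0, 0.5)
k = poly_kernel(x, y)
m = sum(u * v for u, v in zip(explicit_map(x), explicit_map(y)))
print(k, m)  # identical up to floating-point rounding
```

Once such a map exists, a linear SVM on the mapped features behaves like the kernel SVM but costs a fixed number of operations per evaluation instead of scaling with the training set; DMNs pursue the same trade for deep, learned kernels where the map must be approximated layer by layer.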
  • 29
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Yu Zhang, Han Zhang, Xiaobo Chen, Mingxia Liu, Xiaofeng Zhu, Seong-Whan Lee, Dinggang Shen. Abstract: Sparse representation-based brain functional network modeling often results in large inter-subject variability in the network structure. This can reduce statistical power in group comparisons, or even deteriorate the generalization capability of individualized diagnosis of brain diseases. Although group sparse representation (GSR) can alleviate this limitation by increasing network similarity across subjects, it can in turn fail to provide satisfactory separability between subjects from different groups (e.g., patients vs. controls). In this study, we propose to integrate individual functional connectivity (FC) information into the GSR-based network construction framework to achieve higher between-group separability while maintaining the merit of within-group consistency. Our method is based on the observation that subjects from the same group generally have more similar FC patterns than those from different groups. To this end, we propose a new method, namely “strength and similarity guided GSR (SSGSR)”, which exploits both BOLD signal temporal correlation-based “low-order” FC (LOFC) and inter-subject LOFC-profile similarity-based “high-order” FC (HOFC) as two priors to jointly guide GSR-based network modeling. Extensive experimental comparisons are carried out on rs-fMRI data from mild cognitive impairment (MCI) subjects and healthy controls, between the proposed algorithm and other state-of-the-art brain network modeling approaches. Individualized MCI identification results show that our method achieves a balance between individually consistent brain functional network construction and adequately maintained inter-group network distinctions, leading to more accurate classification. Our method also provides a promising and generalized solution for future connectome-based individualized diagnosis of brain disease.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 30
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Haishan Ye, Guangzeng Xie, Luo Luo, Zhihua Zhang. Abstract: Optimization is an important issue in machine learning because many machine learning models are reformulated as optimization problems. Many kinds of machine learning algorithms, such as deep learning, logistic regression, and support vector machines, mainly focus on minimizing their empirical loss. Because data volumes are growing explosively, large-scale optimization problems are challenging to solve. Recently, stochastic second-order methods have attracted much attention due to their efficiency at each iteration. These methods perform well when training machine learning algorithms such as logistic regression and support vector machines. However, the computational complexity of existing stochastic second-order methods depends heavily on the condition number of the Hessian. In this paper, we propose a new Newton-like method called Preconditioned Newton Conjugate Gradient with Sketched Hessian (PNCG). The runtime complexity of PNCG is at most logarithmic in the condition number of the Hessian. PNCG exhibits advantages over existing subsampled Newton methods, especially when the Hessian matrix in question is ill-conditioned. We also show empirically that our method performs well when training machine learning models, with consistent improvements in computational efficiency.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
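Each Newton-type iteration solves a linear system H d = −g in the Hessian, and conjugate gradient does this using only matrix-vector products; preconditioning (PNCG's sketched-Hessian step, omitted here) reduces the iteration count when H is ill-conditioned. A minimal, unpreconditioned CG sketch on a tiny stand-in system:

```python
def conjugate_gradient(A, b, iters=50, tol=1e-12):
    """Solve A x = b for symmetric positive definite A (plain CG)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x, with x = 0
    p = list(r)
    rs = sum(v * v for v in r)
    for _ in range(iters):
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_next = sum(v * v for v in r)
        if rs_next < tol:
            break
        p = [r[i] + (rs_next / rs) * p[i] for i in range(n)]
        rs = rs_next
    return x

# Toy, well-conditioned stand-in for a Hessian system.
H = [[4.0, 1.0], [1.0, 3.0]]
g = [-1.0, -2.0]
d = conjugate_gradient(H, [-v for v in g])   # Newton step solves H d = -g
print(d)  # close to (1/11, 7/11)
```

Plain CG needs a number of iterations that grows with the square root of the condition number; the sketched-Hessian preconditioner is what lets PNCG cut that dependence to a logarithm.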
  • 31
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Pattathal V. Arun, Ittai Herrmann, Krishna M. Budhiraju, Arnon Karnieli. Abstract: Spatial resolution enhancement is a prerequisite for integrating unmanned aerial vehicle (UAV) datasets with data from other sources. However, the mobility of UAV platforms, along with radiometric and atmospheric distortions, makes the task difficult. In this paper, various convolutional neural network (CNN) architectures are explored for resolving the issues related to sub-pixel classification and super-resolution of drone-derived datasets. The main contributions of this work are: 1) network-inversion based architectures for super-resolution and sub-pixel mapping of drone-derived images, taking into account their spectral-spatial characteristics and the distortions prevalent in them; 2) a feature-guided transformation for regularizing the inversion problem; 3) loss functions for improving the spectral fidelity and inter-label compatibility of coarser- to finer-scale mapping; and 4) the use of multi-size kernel units to avoid over-fitting. The proposed approach is the first of its kind in using neural network inversion for super-resolution and sub-pixel mapping. Experiments indicate that the proposed super-resolution approach gives better results than sparse-code based approaches, which generally yield corrupted dictionaries and sparse codes for multispectral aerial images. Also, the proposed use of neural network inversion, for projecting spatial affinities to sub-pixel maps, facilitates the consideration of coarser-scale texture and color information in modeling the finer-scale spatial correlation. The simultaneous consideration of spectral bands, as proposed in this study, gives better super-resolution results than individual band enhancements. The proposed use of different data-augmentation strategies, for emulating the distortions, improves the generalization capability of the framework. Sensitivity of the proposed super-resolution and sub-pixel mapping frameworks to the network parameters is thoroughly analyzed. Experiments over various standard datasets, as well as datasets collected from known locations, indicate that the proposed frameworks perform better than prominent published approaches. Graphical abstract: [image]
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 32
    Publication Date: 2019
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89. Author(s): Farnoosh Ghadiri, Robert Bergevin, Guillaume-Alexandre Bilodeau. Abstract: Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with a high carried object probability and a strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 33
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Chao Gou, Hui Zhang, Kunfeng Wang, Fei-Yue Wang, Qiang Ji. Abstract: Image-based pupil detection, which aims to find the pupil location in an image, has been an active research topic in the computer vision community. Learning-based approaches can achieve preferable results given large amounts of training data with eye center annotations. However, there are few publicly available datasets with accurate eye center annotations, and manually labeling large amounts of training data is unreliable and time-consuming. In this paper, inspired by learning from synthetic data in the Parallel Vision framework, we introduce a parallel imaging step built upon Generative Adversarial Networks (GANs) to generate adversarial synthetic images. In particular, we refine the synthetic eye images with an improved SimGAN using an adversarial training scheme. For the computational experiments, we further propose a coarse-to-fine pupil detection framework based on shape-augmented cascade regression models learned from the adversarial synthetic images. Experiments on the benchmark databases BioID, GI4E, and LFW show that the proposed work significantly outperforms other state-of-the-art methods by leveraging the power of cascade regression and adversarial image synthesis.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 34
    Publication Date: 2019
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Kun-Yi Huang, Chung-Hsien Wu, Ming-Hsiang Su. Abstract: Mood disorders, including unipolar depression (UD) and bipolar disorder (BD), have become some of the most common mental health disorders. The absence of diagnostic markers for BD can cause the disorder to be misdiagnosed as UD on initial presentation. Short-term detection, which could be used in early detection and intervention, is therefore desirable. This study proposes an approach for short-term detection of mood disorders based on elicited speech responses. Speech responses of participants were obtained through interviews by a clinician after participants viewed six emotion-eliciting videos. A domain adaptation method based on a hierarchical spectral clustering algorithm was proposed to adapt a labeled emotion database to a collected unlabeled mood database, alleviating the data bias problem in the emotion space. To model the local variation of emotions in each response, a convolutional neural network (CNN) with an attention mechanism was used to generate an emotion profile (EP) of each elicited speech response. Finally, long short-term memory (LSTM) was employed to characterize the temporal evolution of the EPs of all six speech responses. Moreover, an attention model was applied to the LSTM network to highlight pertinent speech responses and improve detection performance, instead of treating all responses equally. For evaluation, this study elicited emotional speech data from 15 people with BD, 15 people with UD, and 15 healthy controls. Leave-one-group-out cross-validation was employed for the compiled database and the proposed method. CNN- and LSTM-based attention models improved the mood disorder detection accuracy of the proposed method by approximately 11%. Furthermore, the proposed method achieved an overall detection accuracy of 75.56%, outperforming support-vector-machine-based (62.22%) and CNN-based (66.67%) methods.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 35
    Publication Date: 2018
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89. Author(s): Chen-Lin Zhang, Jianxin Wu. Abstract: Nowadays, the Convolutional Neural Network (CNN) has achieved great success in various computer vision tasks. However, in classic CNN models, convolution and fully connected (FC) layers perform only linear transformations of their inputs; non-linearity is added by activation and pooling layers. It is natural to explore non-linear extensions of convolution and FC layers at affordable cost. In this paper, we first investigate the power mean function, which has proven effective and efficient in SVM kernel learning. We then investigate the power mean kernel, a non-linear kernel with linear computational complexity thanks to an asymmetric kernel approximation function. Motivated by this scalable kernel, we propose the Power Mean Transformation, which nonlinearizes both convolution and FC layers. It requires only a small modification to current CNNs, and improves performance with a negligible increase in model size and running time. Experiments on various tasks show that the Power Mean Transformation can improve classification accuracy, improve generalization, and add different non-linearity to CNN models. Large performance gains on tiny models show that the Power Mean Transformation is especially effective in resource-restricted deep learning scenarios such as mobile applications. Finally, we add visualization experiments to illustrate why the Power Mean Transformation works.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
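The power mean underlying the transformation interpolates between min, harmonic, arithmetic, and max as the exponent p varies. A direct sketch of the function itself (the network-level wiring from the paper is not reproduced here):

```python
def power_mean(xs, p):
    """Generalized power mean of positive numbers: p = 1 gives the
    arithmetic mean, p = -1 the harmonic mean; p -> +inf approaches
    max(xs) and p -> -inf approaches min(xs)."""
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

print(power_mean([2.0, 8.0], 1))    # 5.0  (arithmetic mean)
print(power_mean([2.0, 8.0], -1))   # 3.2  (harmonic mean)
print(power_mean([2.0, 8.0], 20))   # close to max = 8
```

Varying p is what lets a single parametric form inject different kinds of non-linearity into layers that would otherwise be purely linear.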
  • 36
    Publication Date: 2018
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89. Author(s): Kewei Tang, Zhixun Su, Yang Liu, Wei Jiang, Jie Zhang, Xiyan Sun. Abstract: Spectral-clustering based methods have recently attracted considerable attention in the field of subspace segmentation. The approximately block-diagonal graphs produced by such methods usually contain some noise, i.e., nonzero elements in the off-diagonal region, due to outlier contamination or the complex intrinsic structure of the dataset. In the experiments of most previous work, the number of subspaces is often no more than 10; in this situation, such noise has almost no influence on the segmentation results. However, segmentation performance can be negatively affected by the noise when the number of subspaces is large, which is quite common in real-world applications. In this paper, we address the problem of LSN subspace segmentation, i.e., large-subspace-number subspace segmentation. We first show that an approximately block-diagonal graph with smaller differences among its diagonal blocks is more robust to the off-diagonal noise mentioned above. Then, by using the infinity norm to bound the difference among the diagonal blocks, we propose infinity-norm minimization for LSN subspace segmentation. Experimental results demonstrate the effectiveness of our method.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
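The infinity norm used above to bound the diagonal-block differences is the maximum absolute row sum. A quick sketch of the quantity being minimized, on made-up toy blocks:

```python
def inf_norm(M):
    """Matrix infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)

# Two diagonal blocks of a hypothetical segmentation graph; the infinity
# norm of their difference measures how unbalanced the blocks are.
block_a = [[1.0, 0.2], [0.2, 1.0]]
block_b = [[1.0, 0.9], [0.9, 1.0]]
diff = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(block_a, block_b)]
print(inf_norm(diff))  # about 0.7: a large row sum flags dissimilar blocks
```

Driving this quantity down during graph construction is what makes the resulting block-diagonal structure robust to off-diagonal noise when the number of subspaces is large.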
  • 37
    Publication Date: 2018
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89. Author(s): Eliezer Flores, Maciel Zortea, Jacob Scharcanski. Abstract: Land-use classification in very high spatial resolution images is critical in the remote sensing field. Consequently, remarkable efforts have been conducted towards developing increasingly accurate approaches for this task. In recent years, deep learning has emerged as a dominant paradigm for machine learning, and methodologies based on deep convolutional neural networks have received particular attention from the remote sensing community. These methods typically utilize transfer learning and/or data augmentation to accommodate a small number of labeled images in the publicly available datasets in this field. However, they typically require powerful computers and/or a long time for training. In this work, we propose a simple and novel method for land-use classification in very high spatial resolution images, which efficiently combines transfer learning with a sparse representation. Specifically, the proposed method performs the classification of land-use scenes using a modified version of the well-known sparse representation-based classification method. While this method directly uses the training images to form dictionaries, which are employed to classify test images, our method utilizes a pre-trained deep convolutional neural network and the Gaussian mixture model to generate more robust and compact “dictionaries of deep features.” The effectiveness of the proposed method was evaluated on two publicly available datasets: UC Merced and Brazilian Cerrado–Savana. The experimental results suggest that our method can potentially outperform state-of-the-art techniques for land-use classification in very high spatial resolution images.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 38
    Publication Date: 2018
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88. Author(s): Shima Kashef, Hossein Nezamabadi-pour. Abstract: In multi-label data, each instance is associated with a set of labels instead of a single label. As with single-label data, feature selection plays an important role in improving classification performance. In multi-label classification, each class label might be specified by particular characteristics of its own, called label-specific features. In this paper, a fast, accurate filter-based feature selection method is designed exclusively for multi-label datasets to find label-specific features. It maps the features to a multi-dimensional space based on a filter method, and selects the most salient features with the help of Pareto-dominance concepts from the multi-objective optimization domain. The proposed method can be used for online feature selection, dealing with problems in which features arrive sequentially while the number of data samples is fixed. In this method, the number of features to be selected is determined during the feature selection process. However, it is sometimes desirable to predefine the number of features; an extension of the proposed method is therefore presented to solve this problem. To demonstrate the performance of the proposed methods, several experiments are conducted on multi-label datasets and the results are compared to five well-established multi-label feature selection methods. The results show the superiority of the proposed methods in terms of different multi-label classification criteria and execution time.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 39
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Cheng Liu, Chu-Tao Zheng, Sheng Qian, Si Wu, Hau-San Wong〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Multi-task learning (MTL) aims to enhance generalization performance by exploring the inherent structures across tasks. Most existing MTL methods are based on the assumption that the tasks are positively correlated, and utilize the shared structures among tasks to improve learning performance. By contrast, there also exist competitive structures (negative relationships) among tasks in some real-world applications, and conventional MTL methods which explore shared structures across tasks may lead to unsatisfactory performance in this setting. Another challenge, especially in a high dimensional setting, is to exclude irrelevant features (sparse structure) from the final model. For this purpose, this work proposes a new method, referred to as Sparse Exclusive Lasso (SpEL), for multi-task learning. The proposed SpEL is able to capture the competitive relationship among tasks (competitive structure), while removing unimportant features that are common across the tasks from the final model (sparse structure). Experimental studies on synthetic and real data indicate that the proposed method can significantly improve learning performance by identifying sparse and task-competitive structures simultaneously.〈/p〉〈/div〉
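To give a feel for the "exclusive" penalty family the method's name refers to: the classical exclusive-lasso term squares the l1-norm of each feature's weights across tasks, so that tasks compete for a feature rather than share it. The sketch below illustrates only that penalty, with a hypothetical weight matrix; it is not the paper's full SpEL objective.

```python
def exclusive_lasso_penalty(W):
    """Exclusive-lasso term: for each feature (row of W), square the
    l1-norm of its weights across tasks. For a fixed amount of weight
    energy, spreading a feature across tasks costs more than
    concentrating it in one task."""
    return sum(sum(abs(w) for w in row) ** 2 for row in W)

# One feature, three tasks, equal l2 energy in both cases:
shared = [[1.0, 1.0, 1.0]]            # feature used by all tasks
concentrated = [[3 ** 0.5, 0.0, 0.0]]  # feature used by one task
print(exclusive_lasso_penalty(shared), exclusive_lasso_penalty(concentrated))
```

The shared layout is penalized three times as heavily (9 vs. ≈3), which is the sense in which the penalty encodes competitive rather than shared structure.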
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 40
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Daniel Gribel, Thibaut Vidal〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉 〈p〉Minimum sum-of-squares clustering (MSSC) is a widely used clustering model, of which the popular 〈span〉K-means〈/span〉 algorithm constitutes a local minimizer. It is well known that the solutions of 〈span〉K-means〈/span〉 can be arbitrarily distant from the true MSSC global optimum, and dozens of alternative heuristics have been proposed for this problem. However, no other algorithm has been predominantly adopted in the literature. This may be related to differences of computational effort, or to the assumption that a near-optimal solution of the MSSC has only a marginal impact on clustering validity.〈/p〉 〈p〉In this article, we dispute this belief. We introduce an efficient population-based metaheuristic that uses 〈span〉K-means〈/span〉 as a local search in combination with problem-tailored crossover, mutation, and diversification operators. This algorithm can be interpreted as a multi-start 〈span〉K-means〈/span〉, in which the initial center positions are carefully sampled based on the search history. The approach is scalable and accurate, outperforming all recent state-of-the-art algorithms for MSSC in terms of solution quality, measured by the depth of local minima. This enhanced accuracy leads to clusters which are significantly closer to the ground truth than those of other algorithms, for overlapping Gaussian-mixture datasets with a large number of features. Therefore, improved global optimization methods appear to be essential to better exploit the MSSC model in high dimension.〈/p〉 〈/div〉
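The claim that K-means is only a local minimizer of the MSSC objective, and that restarting it can find deeper minima, can be demonstrated directly. The sketch below is a crude stand-in for the paper's metaheuristic: plain Lloyd iterations with independent random restarts, keeping the deepest local minimum, rather than the history-guided sampling and crossover operators the authors actually propose. All data are hypothetical.

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centers, sum-of-squared-errors)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assignment step
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        for j, cl in enumerate(clusters):  # update step (skip empty clusters)
            if cl:
                centers[j] = tuple(sum(col) / len(cl) for col in zip(*cl))
    sse = sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
              for p in points)
    return centers, sse

def multistart_kmeans(points, k, restarts=20):
    # keep the restart with the deepest local minimum of the MSSC objective
    return min((kmeans(points, k, seed=s) for s in range(restarts)),
               key=lambda t: t[1])

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.1), (9.1, 0.0)]
centers, sse = multistart_kmeans(points, 3)
print(round(sse, 3))
```

On these three well-separated pairs the global MSSC optimum has SSE 0.07; single unlucky initializations converge to minima above 30, which is the gap the paper's population-based search targets.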
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 41
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Pingping Zhang, Wei Liu, Hongyu Wang, Yinjie Lei, Huchuan Lu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Street-level scene segmentation aims to label each pixel of street-view images with specific semantic categories. It has been attracting growing interest due to various real-world applications, especially in the area of autonomous driving. However, this pixel-wise labeling task is very challenging under complex street-level scenes and large-scale object categories. Motivated by the scene layout of street-view images, in this work we propose a novel Spatial Gated Attention (SGA) module, which automatically highlights the attentive regions for pixel-wise labeling, resulting in effective street-level scene segmentation. The proposed module takes as input the multi-scale feature maps based on a Fully Convolutional Network (FCN) backbone, and produces the corresponding attention mask for each feature map. The learned attention masks can neatly highlight the regions of interest while suppressing background clutter. Furthermore, we propose an efficient multi-scale feature interaction mechanism which is able to adaptively aggregate the hierarchical features. Based on the proposed mechanism, the features of different levels are adaptively re-weighted according to the local spatial structure and the surrounding contextual information. Consequently, the proposed modules are able to boost standard FCN architectures and result in an enhanced pixel-wise segmentation for street-level scene images. Extensive experiments on three publicly available street-level benchmarks demonstrate that the proposed Gated Attention Network (GANet) approach achieves consistently superior performance and outperforms the very recent state-of-the-art methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 42
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Ya Ju Fan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The autoencoder is an artificial neural network that performs nonlinear dimension reduction and learns hidden representations of unlabeled data. With a linear transfer function it is similar to principal component analysis (PCA). While both methods use weight vectors for linear transformations, the autoencoder does not come with any indication similar to the eigenvalues in PCA that are paired with eigenvectors. We propose a novel autoencoder node saliency method that examines whether the features constructed by autoencoders exhibit properties related to known class labels. The supervised node saliency ranks the nodes based on their capability of performing a learning task. It is coupled with the normalized entropy difference (NED). We establish a property for NED values to verify classifying behaviors among the top ranked nodes. By applying our methods to real datasets, we demonstrate their ability to provide indications on the performing nodes and explain the learned tasks in autoencoders.〈/p〉〈/div〉
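The abstract does not spell out the NED formula, so the sketch below uses a simple proxy in the same spirit: a normalized information-gain score over a node's binned activations, where a value near 1 means the node's activation histogram separates the classes cleanly. The binning scheme and function names here are assumptions, not the paper's exact definition.

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def normalized_entropy_difference(activations, labels, bins=4):
    """Proxy saliency score for one autoencoder node: normalized
    information gain of its binned activations about the class labels
    (1 = perfectly class-separating node, 0 = uninformative node)."""
    lo, hi = min(activations), max(activations)
    width = (hi - lo) / bins or 1.0
    buckets = {}
    for a, y in zip(activations, labels):
        b = min(int((a - lo) / width), bins - 1)
        buckets.setdefault(b, []).append(y)
    h_cond = sum(len(v) / len(labels) * entropy(v) for v in buckets.values())
    h = entropy(labels)
    return (h - h_cond) / h if h else 0.0

# A node whose activations split the two classes cleanly scores 1.0.
acts = [0.1, 0.2, 0.15, 0.9, 0.95, 0.85]
labs = [0, 0, 0, 1, 1, 1]
print(normalized_entropy_difference(acts, labs))
```

Ranking all hidden nodes by such a score is the kind of "supervised node saliency" ordering the abstract describes: it plays a role analogous to PCA's eigenvalues in indicating which learned directions matter for the task.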
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 43
    Publication Date: 2019
    Description: 〈p〉Publication date: May 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 89〈/p〉 〈p〉Author(s): Debasrita Chakraborty, Vaasudev Narayanan, Ashish Ghosh〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉It is obvious that most datasets do not have an exactly equal number of samples for each class. However, there are some tasks, such as the detection of fraudulent transactions, for which the class imbalance is overwhelming and one of the classes has very few samples (even less than 10% of the entire data). These tasks often fall under outlier detection. Moreover, there are some scenarios where there may be multiple subsets of the outlier class. In such cases, the task should be treated as a multiple-outlier-type detection scenario. In this article, we propose a system that can efficiently handle all the aforementioned problems. We use stacked autoencoders to extract features and then an ensemble of probabilistic neural networks to perform a majority vote and detect the outliers. Such a system is seen to have better and more reliable performance compared to other outlier detection systems on most of the datasets tested. The use of autoencoders clearly enhanced the outlier detection performance.〈/p〉〈/div〉
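The final voting stage is the simplest part of the pipeline to illustrate. The sketch below shows only the majority vote over per-model verdicts; the ensemble members, the "inlier"/"outlier" labels, and the conservative tie-break are all hypothetical, and the stacked-autoencoder feature extraction and probabilistic neural networks are not reproduced.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model outlier verdicts for one sample by majority
    vote; ties fall back to 'outlier' to stay conservative."""
    top = Counter(predictions).most_common()
    if len(top) > 1 and top[0][1] == top[1][1]:
        return "outlier"
    return top[0][0]

# Verdicts from a hypothetical ensemble of five detectors for one transaction:
print(majority_vote(["inlier", "outlier", "outlier", "inlier", "outlier"]))  # → outlier
```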
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 44
    Publication Date: 2019
    Description: 〈p〉Publication date: May 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 89〈/p〉 〈p〉Author(s): Marcel Nwali, Simon Liao〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this research, we have developed a new algorithm to compute the moments defined in a rectangular region. By applying recurrence formulas, symmetry properties, and particularly parallelized matrix operations, our proposed computational method can greatly improve the efficiency of computing Legendre, Gegenbauer, and Jacobi moments with highly satisfactory accuracy. To verify this new computational algorithm, image reconstructions from the higher orders of Legendre, Gegenbauer, and Jacobi moments are performed on a testing image sized at 1024 × 1024, with very encouraging results. It took only a few seconds to compute the moments and conduct the image reconstructions from the 1000-th order of the Legendre, Gegenbauer, and Jacobi moments, with PSNR values up to 45. By utilizing our new algorithm, image analysis and recognition applications using the higher orders of moments defined in a rectangular region will be possible in the range of milliseconds.〈/p〉〈/div〉
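The recurrence idea at the heart of such algorithms can be shown for the Legendre case. The sketch below evaluates Legendre polynomials with Bonnet's recurrence, which is the standard stable way to reach high orders without explicit polynomial coefficients; the paper additionally parallelizes this with matrix operations and exploits symmetry, which this sketch does not attempt. Moments then follow by weighting pixel values with these polynomial values over the image grid.

```python
def legendre_values(order, x):
    """Evaluate P_0..P_order at x via Bonnet's recurrence:
    (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)."""
    vals = [1.0, float(x)]
    for n in range(1, order):
        vals.append(((2 * n + 1) * x * vals[n] - n * vals[n - 1]) / (n + 1))
    return vals[:order + 1]

# P_2(0.5) = (3 * 0.25 - 1) / 2 = -0.125
print(legendre_values(2, 0.5)[2])  # → -0.125
```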
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 45
    Publication Date: 2018
    Description: 〈p〉Publication date: May 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 89〈/p〉 〈p〉Author(s): Yazhou Yang, Marco Loog〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Standard active learning assumes that human annotations are always obtainable whenever new samples are selected. This, however, is unrealistic in many real-world applications where human experts are not readily available at all times. In this paper, we consider the single shot setting: all the required samples should be chosen in a single shot and no human annotation can be exploited during the selection process. We propose a new method, Active Learning through Random Labeling (ALRL), which replaces the single human annotator with multiple, what we will refer to as, pseudo annotators. These pseudo annotators always provide uniform and random labels whenever new unlabeled samples are queried. This random labeling enables standard active learning algorithms to also exhibit the exploratory behavior needed for single shot active learning. The exploratory behavior is further enhanced by selecting the most representative sample via minimizing the nearest neighbor distance between unlabeled samples and queried samples. Experiments on real-world datasets demonstrate that the proposed method outperforms several state-of-the-art approaches.〈/p〉〈/div〉
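The representativeness criterion in the last step can be illustrated as a greedy coverage rule: among the unqueried pool, pick the sample that, once added to the queried set, minimizes the summed nearest-neighbor distance from every remaining sample to that set. This is a plausible reading of "minimizing nearest neighbor distance between unlabeled samples and queried samples"; the aggregation (sum vs. max) and the 2-D toy data below are assumptions.

```python
def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def next_query(pool, queried):
    """Greedy coverage step: pick the pool sample whose addition to
    the queried set minimizes the summed nearest-neighbor distance
    from every remaining sample to the queried set."""
    def coverage_cost(candidate):
        chosen = queried + [candidate]
        return sum(min(dist(p, q) for q in chosen)
                   for p in pool if p != candidate)
    return min((p for p in pool if p not in queried), key=coverage_cost)

pool = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
# With the origin already queried, the best next query sits in the
# uncovered cluster, positioned to also shorten the path to (10, 0):
print(next_query(pool, queried=[(0.0, 0.0)]))  # → (5.1, 5.0)
```

Repeating this step budget-many times yields the single-shot batch without ever consulting a human annotator.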
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 46
    Publication Date: 2019
    Description: 〈p〉Publication date: May 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 89〈/p〉 〈p〉Author(s): Jun Shi, Xiao Zheng, Jinjie Wu, Bangming Gong, Qi Zhang, Shihui Ying〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Histopathological image analysis works as the ‘gold standard’ for cancer diagnosis. Its computer-aided approach has attracted considerable attention in the field of digital pathology, which highly depends on the feature representation for histopathological images. The principal component analysis network (PCANet) is a novel unsupervised deep learning framework that has shown its effectiveness for feature representation learning. However, PCA is susceptible to noise and outliers, which affects the performance of PCANet. The Grassmann average (GA) is superior to PCA in robustness. In this work, a GA network (GANet) algorithm is proposed by embedding the GA algorithm into the PCANet framework. Moreover, since quaternion algebra is an excellent tool to represent color images, a quaternion-based GANet (QGANet) algorithm is further developed to learn effective feature representations containing color information for histopathological images. The experimental results based on three histopathological image datasets indicate that the proposed QGANet achieves the best performance on the classification of color histopathological images among all the compared algorithms.〈/p〉〈/div〉
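To make the PCA-vs-GA contrast concrete, the sketch below shows one common formulation of the Grassmann average iteration for a single robust leading component: repeatedly average the samples with sign flips that keep each sample on the same side of the current estimate, then renormalize. This is a rough sketch under that assumed formulation; the paper's GANet/QGANet filter-learning pipeline and the quaternion color representation are far beyond this fragment.

```python
def normalize(v):
    n = sum(x * x for x in v) ** 0.5
    return [x / n for x in v]

def grassmann_average(samples, iters=20):
    """One robust leading component: iteratively average the 1-D
    subspaces spanned by the samples (sign-flipping makes the
    average insensitive to each sample's orientation)."""
    q = normalize(samples[0])
    for _ in range(iters):
        acc = [0.0] * len(q)
        for x in samples:
            s = 1.0 if sum(a * b for a, b in zip(x, q)) >= 0 else -1.0
            for i, a in enumerate(x):
                acc[i] += s * a
        q = normalize(acc)
    return q

# Samples spread along the x-axis with mixed signs: the average
# subspace should align with (1, 0) despite the sign flips.
data = [[2.0, 0.1], [-1.9, 0.05], [2.1, -0.1], [-2.0, 0.0]]
q = grassmann_average(data)
print(q)
```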
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 47
    Publication Date: 2018
    Description: 〈p〉Publication date: May 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 89〈/p〉 〈p〉Author(s): Tao Yao, Gang Wang, Lianshan Yan, Xiangwei Kong, Qingtang Su, Caiming Zhang, Qi Tian〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Hashing-based cross-media methods have become an increasingly popular technique for facilitating large-scale multimedia retrieval tasks, owing to their effectiveness and efficiency. Most existing cross-media hashing methods learn hash functions in a batch-based mode. However, in practical applications, data points often emerge in a streaming manner, which makes batch-based hashing methods lose their efficiency. In this paper, we propose an Online Latent Semantic Hashing (OLSH) method to address this issue. Only newly arriving multimedia data points are utilized to retrain the hash functions efficiently, while the semantic correlations in old data points are preserved. Specifically, to learn discriminative hash codes, discrete labels are mapped to a continuous latent semantic space where the relative semantic distances between data points can be measured more accurately. We then propose an online optimization scheme for the challenging task of learning hash functions efficiently on streaming data points, whose computational complexity and memory cost are much less than the size of the training dataset at each round. Extensive experiments on several real-world datasets, 〈em〉e.g.〈/em〉 Wiki, Mir-Flickr25K and NUS-WIDE, show the effectiveness and efficiency of the proposed method.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 48
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Rudolf Haraksim, Javier Galbally, Laurent Beslay〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Nowadays, the majority of fingerprint quality, matching and feature extraction algorithms are developed and trained on fingerprints of adults. Accordingly, the processing of children’s fingerprints presents performance issues derived for the most part from: (1) their smaller size and finer ridge structure; (2) their higher variability over time due to the displacement of minutiae induced by growth. The present article is focused on the second factor. The rapid growth of children’s fingerprints causes a significant displacement of the minutiae points between samples of the same finger acquired a few years apart. This displacement results in a decrease in the accuracy of fingerprint recognition systems when the reference and probe sample drift apart in time. This effect is known as biometric ageing. In the present study we propose to address this issue by developing and validating a minutiae-based growth model, derived from a database of over 60,000 children’s fingerprints, acquired in real operational conditions from children between 5 and 16 years of age, with a time difference between re-enrolments of fingerprint pairs of up to 6 years. We analyze two potential application scenarios for the developed growth model. On one hand, we use the model to grow children’s fingerprints in order to spread out the minutiae points to attain sizes similar to those of a sample captured at a later point in time. On the other hand, we apply the model to rejuvenate fingerprints enrolled at a later stage by contracting the minutiae points so that their locations are more similar to those of a sample acquired earlier. 
In both scenarios, the application of the growth model to produce artificially grown/rejuvenated fingerprint minutiae templates results in a significant improvement of the matching scores compared to the ones produced by original fingerprints.〈/p〉〈/div〉
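The grow/rejuvenate duality can be pictured as a radial displacement of minutiae about a reference point, with a factor above 1 for growing and below 1 for rejuvenating. This is only a geometric caricature with hypothetical coordinates; the paper's model is fitted to 60,000 real templates and is not a simple uniform scaling.

```python
def grow_minutiae(minutiae, growth_factor, center=(0.0, 0.0)):
    """Radially displace (x, y, theta) minutiae away from a reference
    point to mimic finger growth; growth_factor < 1 'rejuvenates'
    a later template instead. Orientations are left unchanged."""
    cx, cy = center
    return [(cx + (x - cx) * growth_factor,
             cy + (y - cy) * growth_factor,
             theta)
            for x, y, theta in minutiae]

template = [(10.0, 0.0, 90.0), (0.0, 20.0, 45.0)]
grown = grow_minutiae(template, 1.1)       # spread out, as if years later
rejuvenated = grow_minutiae(grown, 1 / 1.1)  # contract back
print(grown)
```

Matching a probe against a template transformed to the same "age" in this way is what improves the scores in both scenarios of the abstract.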
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 49
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Zhihui Li, Lina Yao, Xiaojun Chang, Kun Zhan, Jiande Sun, Huaxiang Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Zero-shot complex event detection has been an emerging task in coping with the scarcity of labeled training videos in practice. Aiming to progress beyond the state-of-the-art zero-shot event detection, we propose a new zero-shot event detection approach, which exploits the semantic correlation between an event and concepts. Based on the concept detectors pre-trained from external sources, our method learns the semantic correlation from the concept vocabulary and emphasizes the most related concepts for zero-shot event detection. In particular, a novel Event-Adaptive Concept Integration algorithm is introduced to estimate the effectiveness of semantically related concepts by assigning different weights to them. As opposed to assigning weights by an invariable strategy, we compute the weights of concepts using the area under the score curve. The assigned weights are incorporated into the confidence score vector statistically to better characterize the event-concept correlation. Our algorithm is proved to be able to harness the related concepts discriminatively, tailored for a target event. Extensive experiments are conducted on the challenging TRECVID event video datasets, which demonstrate the advantage of our approach over the state-of-the-art methods.〈/p〉〈/div〉
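The abstract leaves the exact area-under-score-curve weighting to the paper; as a hedged illustration only, one natural instantiation is the trapezoidal area under a concept detector's ranked score curve over the test clips, which rewards concepts that fire confidently on many clips over spiky or uniformly weak ones. The scores below are hypothetical.

```python
def concept_weight(scores):
    """Weight a concept by the trapezoidal area under its sorted
    score curve (unit spacing between ranked clips)."""
    curve = sorted(scores, reverse=True)
    return sum((a + b) / 2.0 for a, b in zip(curve, curve[1:]))

# A broadly confident concept earns a larger weight than a spiky one
# with the same peak score:
broad = [0.9, 0.8, 0.7]
spiky = [0.9, 0.1, 0.0]
print(concept_weight(broad), concept_weight(spiky))
```

Normalizing such weights and folding them into the concept confidence vector is the event-adaptive integration step the abstract describes.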
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 50
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Jun Xu, Wangpeng An, Lei Zhang, David Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The use of sparse representation (SR) and collaborative representation (CR) for pattern classification has been widely studied in tasks such as face recognition and object categorization. Despite the success of SR/CR based classifiers, it is still arguable whether it is the ℓ〈sub〉1〈/sub〉-norm sparsity or the ℓ〈sub〉2〈/sub〉-norm collaborative property that brings the success of SR/CR based classification. In this paper, we investigate the use of nonnegative representation (NR) for pattern classification, which is largely ignored by previous work. Our analyses reveal that NR can boost the representation power of homogeneous samples while limiting the representation power of heterogeneous samples, making the representation sparse and discriminative simultaneously and thus providing a more effective solution to representation based classification than SR/CR. Our experiments demonstrate that the proposed NR based classifier (NRC) outperforms previous representation based classifiers. With deep features as inputs, it also achieves state-of-the-art performance on various visual classification tasks.〈/p〉〈/div〉
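The representation-based classification recipe behind NRC can be sketched end to end on toy data: code the query nonnegatively over the training samples, then assign it to the class with the smallest class-wise reconstruction residual. The projected-gradient solver, step size, and two-class toy dictionary below are illustrative assumptions; the paper's actual optimization is not reproduced here.

```python
def nrc_classify(dictionary, labels, y, steps=500, lr=0.01):
    """Nonnegative coding of y over the training samples by projected
    gradient descent, then classification by class-wise residual."""
    d = len(dictionary)                    # number of atoms (training samples)
    c = [0.0] * d                          # nonnegative coefficients
    for _ in range(steps):
        # residual r = D c - y
        r = [sum(c[j] * dictionary[j][i] for j in range(d)) - y[i]
             for i in range(len(y))]
        for j in range(d):
            grad = sum(r[i] * dictionary[j][i] for i in range(len(y)))
            c[j] = max(0.0, c[j] - lr * grad)   # gradient step + projection
    def residual(cls):
        # reconstruct y using only this class's coefficients
        rec = [sum(c[j] * dictionary[j][i] for j in range(d) if labels[j] == cls)
               for i in range(len(y))]
        return sum((a - b) ** 2 for a, b in zip(rec, y))
    return min(set(labels), key=residual)

D = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]   # two atoms per class
labs = ["a", "a", "b", "b"]
print(nrc_classify(D, labs, y=[1.0, 0.05]))  # → a
```

The nonnegativity constraint is what concentrates the coefficients on homogeneous (same-class) atoms, which is the discriminative effect the abstract attributes to NR.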
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 51
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Xin Liu, Jiajia Geng, Haibin Ling, Yiu-ming Cheung〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Speaker naming has recently received considerable attention in identifying the active speaking character in a movie video, and the face cue alone is generally insufficient to achieve reliable performance due to its significant appearance variations. In this paper, we treat the speaker naming task as a group of matched audio-face pair finding problems, and present an efficient attention-guided deep audio-face fusion approach to detect the active speakers. First, we start with VGG-encoding of face images and extract the Mel-Frequency Cepstrum Coefficients from audio signals. Then, two efficient audio encoding modules, namely two-layer Long Short-Term Memory encoding and two-dimensional convolution encoding, are introduced to extract discriminative high-level audio features. Meanwhile, we train an end-to-end audio-face common attention model to learn the face attention vector, which adapts to accommodate various face variations. Further, an efficient factorized bilinear model is presented to deeply fuse the paired audio-face features, whereby the joint audio-face representation can be reliably obtained for speaker naming. Extensive experiments highlight the superiority of the proposed approach and show its very competitive performance compared with the state of the art.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 52
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Raymond Ptucha, Felipe Petroski Such, Suhas Pillai, Frank Brockler, Vatsala Singh, Paul Hutkowski〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The recognition of handwritten text is challenging as there are virtually infinite ways a human can write the same message. Deep learning approaches for handwriting analysis have recently demonstrated breakthrough performance using both lexicon-based architectures and recurrent neural networks. This paper presents a fully convolutional network architecture which outputs arbitrary length symbol streams from handwritten text. A preprocessing step normalizes input blocks to a canonical representation which negates the need for costly recurrent symbol alignment correction. When a lexicon is known, we further introduce a probabilistic character error rate to correct errant word blocks. Our multi-state convolutional method is the first to demonstrate state-of-the-art results on both lexicon-based and arbitrary symbol based handwriting recognition benchmarks.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 53
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Yong Shi, Minglong Lei, Hong Yang, Lingfeng Niu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In network embedding, random walks play a fundamental role in preserving network structures. However, random walk methods have two limitations. First, they are unstable when either the sampling frequency or the number of node sequences changes. Second, in highly biased networks, random walks are likely to be biased towards high-degree nodes and neglect the global structure information. To address these limitations, we present in this paper a network diffusion embedding method. For the first limitation, our method uses a diffusion-driven process to capture both depth and breadth information in networks. Temporal information is also included in the node sequences to strengthen information preservation. For the second limitation, our method uses a network inference method based on information diffusion cascades to capture the global network information. Experiments show that the proposed method is more robust to highly unbalanced networks and performs well when samples under each node are rare.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 54
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Tomislav Pribanić, Tomislav Petković, Matea Đonlić〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉3D registration is a very active topic, spanning research areas such as computational geometry, computer graphics and pattern recognition. It aims to solve spatial transformation that aligns two point clouds. In this work we propose the use of a single direction sensor, such as an accelerometer or a magnetometer, commonly available on contemporary mobile platforms, such as tablets and smartphones. Both sensors have been heavily investigated earlier, but only for joint use with other sensors, such as gyroscopes and GPS. We show a time-efficient and accurate 3D registration method that takes advantage of only either an accelerometer or a magnetometer. We demonstrate a 3D reconstruction of individual point clouds and the proposed 3D registration method on a tablet equipped with an accelerometer or a magnetometer. However, we point out that the proposed method is not restricted to mobile platforms. Indeed, it can easily be applied in any 3D measurement system that is upgradable with some ubiquitous direction sensor, for example by adding a smartphone equipped with either an accelerometer or a magnetometer. We compare the proposed method against several state-of-the-art methods implemented in the open source Point Cloud Library (PCL). The proposed method outperforms the PCL methods tested, both in terms of processing time and accuracy.〈/p〉〈/div〉
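The key simplification a gravity (or magnetic-field) direction buys is dimensional: once both clouds are levelled so the measured direction coincides with the vertical axis, the unknown rotation reduces to a single angle about that axis. The brute-force 1-D search below illustrates only that reduced problem, on hypothetical already-levelled, centroid-aligned 2-D projections; it is not the paper's registration pipeline.

```python
import math

def best_yaw(src, dst, steps=360):
    """After both clouds are levelled with the measured gravity
    direction, registration reduces to one rotation about the
    vertical axis; a coarse 1-D search over that angle finds it."""
    def err(theta):
        c, s = math.cos(theta), math.sin(theta)
        return sum((c * x - s * y - u) ** 2 + (s * x + c * y - v) ** 2
                   for (x, y), (u, v) in zip(src, dst))
    return min((2 * math.pi * i / steps for i in range(steps)), key=err)

src = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
dst = [(0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]  # src rotated by 90 degrees
print(round(math.degrees(best_yaw(src, dst))))  # → 90
```

Searching one angle instead of a full 3-D rotation is what makes such sensor-assisted registration time-efficient.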
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 55
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Danyu Lai, Wei Tian, Long Chen〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we propose a novel and efficient multi-stage approach, which combines both semi-supervised learning and fine-grained learning to improve the performance of a classification model learned from only a few samples. The fine-grained category recognition process utilized in our method is dubbed MSR. In this process, we cut images into multi-scale parts to feed into the network to learn more fine-grained features. By assigning these image cuts dynamic weights, we can reduce the negative impact of background information and thus achieve a more accurate prediction. Furthermore, we present the voted pseudo label (VPL), an efficient method of semi-supervised learning. In this approach, for unlabeled data, VPL picks up the classes with non-confused labels verified by the consensus prediction of different classification models. These two methods can be applied to most neural network models and training methods. Inspired by classifier-based adaptation, we also propose a mixed deep CNN architecture (MixDCNN). Both VPL and MSR are integrated with the MixDCNN. Comprehensive experiments demonstrate the effectiveness of VPL and MSR. Without bells and whistles, we achieve state-of-the-art or even better performance in two fine-grained recognition tasks on the Stanford Dogs and CUB Birds datasets, with accuracies of 95.6% and 85.2%, respectively.〈/p〉〈/div〉
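The weighted fusion of multi-scale crops can be sketched as a weighted average of per-crop class probabilities, with low weights down-playing background-dominated crops. The crop probabilities and weights below are hypothetical, and the paper learns its weights dynamically rather than fixing them as done here.

```python
def fuse_crop_predictions(crop_probs, weights):
    """Weighted average of per-crop class-probability vectors."""
    total = sum(weights)
    n_classes = len(crop_probs[0])
    return [sum(w * p[c] for w, p in zip(weights, crop_probs)) / total
            for c in range(n_classes)]

# Two classes; the full image and two finer crops, with the third
# crop (mostly background) given a low weight:
probs = [[0.6, 0.4], [0.9, 0.1], [0.2, 0.8]]
weights = [1.0, 1.0, 0.2]
fused = fuse_crop_predictions(probs, weights)
print([round(p, 3) for p in fused])  # ≈ [0.7, 0.3]
```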
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 56
    Publication Date: 2018
    Description: 〈p〉Publication date: April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 88〈/p〉 〈p〉Author(s): Qiang Wang, Huijie Fan, Gan Sun, Yang Cong, Yandong Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Recently, generative adversarial networks (GANs) have demonstrated high-quality reconstruction in face completion. There is still much room for improvement over the conventional GAN models that do not explicitly address the texture details problem. In this paper, we propose a Laplacian-pyramid-based generative framework for face completion. This framework can produce more realistic results (1) by deriving precise content information of missing face regions in a coarse-to-fine fashion and (2) by propagating the high-frequency details from the surrounding area via a modified residual learning model. Specifically, for the missing regions, we design a Laplacian-pyramid-based convolutional network framework that can predict missing regions under different resolutions; this framework takes advantage of multiscale features shared from low levels and extracted from middle layers for the next finer level. For high-frequency details, we construct a new residual learning network to eliminate color discrepancies between the missing and surrounding regions progressively. Furthermore, a multiloss function is proposed to supervise the generative process. To optimize the model, we train the entire generative model with deep supervision using a joint reconstruction loss, which ensures that the generated image is as realistic as the original. Extensive experiments on benchmark datasets show that the proposed framework exhibits superior performance over state-of-the-art methods in terms of predictive accuracy, both quantitatively and qualitatively.〈/p〉〈/div〉
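The Laplacian-pyramid decomposition underlying the framework's coarse-to-fine generation can be shown on a 1-D signal: each level stores the residual between the signal and an upsampled coarser version, and summing upsampled bases with residuals reconstructs the finer levels exactly. This is a generic pyramid sketch (nearest-neighbor up/downsampling, hypothetical data), not the paper's generative network.

```python
def downsample(sig):
    # average adjacent pairs (assumes even length)
    return [(sig[i] + sig[i + 1]) / 2.0 for i in range(0, len(sig), 2)]

def upsample(sig):
    # nearest-neighbor expansion back to double length
    return [v for v in sig for _ in (0, 1)]

def laplacian_pyramid(sig, levels):
    """Decompose into band-pass residuals plus one coarse base."""
    pyramid, cur = [], sig
    for _ in range(levels):
        coarse = downsample(cur)
        pyramid.append([a - b for a, b in zip(cur, upsample(coarse))])
        cur = coarse
    pyramid.append(cur)  # coarsest base
    return pyramid

def reconstruct(pyramid):
    cur = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        cur = [a + b for a, b in zip(upsample(cur), residual)]
    return cur

sig = [1.0, 3.0, 2.0, 6.0, 4.0, 4.0, 0.0, 2.0]
pyr = laplacian_pyramid(sig, 2)
print(reconstruct(pyr) == sig)  # exact reconstruction → True
```

Predicting the coarse base first and then filling in each residual band is the coarse-to-fine strategy the abstract describes for completing missing face regions.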
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 57
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Weiwei Qian, Shunming Li, Xingxing Jiang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Machine learning-based intelligent fault diagnosis methods have gained extensive popularity and been widely investigated. However, in previous works, a major assumption accepted by default is that the training and testing datasets share the same distribution. Unfortunately, this assumption is mostly invalid in real-world applications, because working condition variations of a rotating machine can easily cause a distribution discrepancy between datasets, which results in performance degeneration of traditional diagnosis methods. Although some deep learning and transfer learning-based methods have recently been proposed and validated as effective, their dataset distribution alignments mainly focus on marginal distributions, which are not powerful enough in some scenarios. Hence, a novel distribution discrepancy evaluating method called auto-balanced high-order Kullback–Leibler (AHKL) divergence is proposed, which can evaluate both the first and higher-order moment discrepancies and adapt the weights between them dimensionally and automatically. Meanwhile, smooth conditional distribution alignment (SCDA) is also developed, which performs excellently in aligning the conditional distributions by introducing soft labels instead of adopting the widely-used pseudo labels. Furthermore, based on AHKL divergence and SCDA, weighted joint distribution alignment (WJDA) is developed for comprehensive joint distribution alignment. Finally, built on WJDA, we construct a novel deep transfer network (DTN) for rotating machine fault diagnosis under working condition variation. Extensive experimental evaluations on 18 transfer learning cases demonstrate its validity, and further comparisons with the state of the art also validate its superiority.〈/p〉〈/div〉
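The distinction between first-order and higher-order moment discrepancies can be made concrete with a toy stand-in (not the paper's AHKL divergence, which balances the terms automatically and works on learned features): compare the first few central moments of a source and a target sample.

```python
def moment(xs, k):
    """k-th moment: the mean for k = 1, central moments for k >= 2."""
    m = sum(xs) / len(xs)
    if k == 1:
        return m
    return sum((x - m) ** k for x in xs) / len(xs)

def moment_discrepancy(src, tgt, orders=(1, 2, 3)):
    """Toy distribution-discrepancy term: summed absolute differences
    of the first few moments between two sample sets."""
    return sum(abs(moment(src, k) - moment(tgt, k)) for k in orders)

a = [0.0, 1.0, 2.0, 3.0]
b = [0.5, 1.5, 2.5, 3.5]   # same shape, shifted mean
print(moment_discrepancy(a, b))  # only the first moment differs → 0.5
```

A purely mean-based (first-order) criterion would miss shape differences between distributions with equal means, which is the motivation for including higher-order terms.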
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 58
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 26 September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Gary P.T. Choi, Hei Long Chan, Robin Yong, Sarbin Ranjitkar, Alan Brook, Grant Townsend, Ke Chen, Lok Ming Lui〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Shape analysis is important in anthropology, bioarchaeology and forensic science for interpreting useful information from human remains. In particular, teeth are morphologically stable and hence well-suited for shape analysis. In this work, we propose a framework for tooth morphometry using quasi-conformal theory. Landmark-matching Teichmüller maps are used for establishing a 1-1 correspondence between tooth surfaces with prescribed anatomical landmarks. Then, a quasi-conformal statistical shape analysis model based on the Teichmüller mapping results is proposed for building a tooth classification scheme. We deploy our framework on a dataset of human premolars to analyze the tooth shape variation among genders and ancestries. Experimental results show that our method achieves much higher classification accuracy with respect to both gender and ancestry when compared to the existing methods. Furthermore, our model reveals the underlying tooth shape difference between different genders and ancestries in terms of the local geometric distortion and curvatures. In particular, our experiment suggests that the shape difference between genders is mostly captured by the conformal distortion but not the curvatures, while that between ancestries is captured by both of them.〈/p〉〈/div〉
  • 59
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 27 September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Marco Fiorucci, Francesco Pelosin, Marcello Pelillo〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemerédi’s Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called “regular pairs”. Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-〈em〉k〈/em〉 graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.〈/p〉〈/div〉
  • 60
    Publication Date: 2019
    Description: 〈p〉Publication date: February 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 98〈/p〉 〈p〉Author(s): Xuhong Li, Yves Grandvalet, Franck Davoine〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which are at least partially relevant for solving the target task, but would be difficult to extract from the limited amount of data available on the target task. However, besides initialization with the pre-trained model and early stopping, there is no mechanism in fine-tuning for retaining the features learned on the source task. In this paper, we investigate several regularization schemes that explicitly promote the similarity of the final solution with the initial model. We show the benefit of having an explicit inductive bias towards the initial model. We eventually recommend that the baseline protocol for transfer learning should rely on a simple 〈em〉L〈/em〉〈sup〉2〈/sup〉 penalty using the pre-trained model as a reference.〈/p〉〈/div〉
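    The recommended baseline — an L2 penalty anchored at the pre-trained weights rather than at zero — can be sketched as follows. This is a minimal illustration; the function name `l2_sp_penalty` and the strength value are ours, not the paper's code:

```python
import numpy as np

def l2_sp_penalty(weights, pretrained, strength=0.01):
    """L2 regularizer measuring distance to the pre-trained weights,
    so fine-tuning is pulled toward the source-task solution, not zero."""
    return strength * sum(float(np.sum((w - w0) ** 2))
                          for w, w0 in zip(weights, pretrained))

w_pre = [np.ones((2, 2))]        # pre-trained layer weights
w_now = [np.ones((2, 2)) * 1.5]  # weights after some fine-tuning steps
print(l2_sp_penalty(w_now, w_pre))  # 0.01 * 4 * 0.25 = 0.01
```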
  • 61
    Publication Date: 2019
    Description: 〈p〉Publication date: April 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 100〈/p〉 〈p〉Author(s): Bailin Yang, Yulong Zhang, Zhenguang Liu, Xiaoheng Jiang, Mingliang Xu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Writing is an important basic skill for humans. To acquire such a skill, pupils often have to practice writing for several hours each day. However, different pupils usually possess distinct writing postures. Bad postures not only affect the speed and quality of writing, but also severely harm the healthy development of pupils’ spine and eyesight. Therefore, it is of key importance to identify or predict pupils’ writing postures and accordingly correct bad ones. In this paper, we formulate the problem of handwriting posture prediction for the first time. Further, we propose a neural network constructed with small convolution kernels to extract features from handwriting, and incorporate unsupervised learning and handwriting data analysis to predict writing postures. Extensive experiments reveal that our approach achieves an accuracy rate of 93.3%, which is significantly higher than the 76.67% accuracy of human experts.〈/p〉〈/div〉
  • 62
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Shichao Kan, Linna Zhang, Zhihai He, Yigang Cen, Shiming Chen, Jikun Zhou〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Feature fusion is an important technique for improving performance in computer vision; the difficult problem in feature fusion is how to learn the complementary properties of different features. We recognize that feature fusion can benefit from kernel metric learning. Thus, a metric learning-based kernel transformer method for feature fusion is proposed in this paper. First, we propose a kernel transformer to convert data from data space to kernel space, so that feature fusion and metric learning can be performed in the transformed kernel space. Second, to realize supervised learning, both triplet and label constraints are embedded into our model. Third, to solve for the unknown kernel matrices, LogDet divergence is also introduced into our model. Finally, a complete optimization objective function is formed. Based on an alternating direction method of multipliers (ADMM) solver and the Karush-Kuhn-Tucker (KKT) theorem, the proposed optimization problem is solved with rigorous theoretical analysis. Experimental results on image retrieval demonstrate the effectiveness of the proposed methods.〈/p〉〈/div〉
  • 63
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Zhen Wang, Jianmin Gao, Rongxi Wang, Zhiyong Gao, Yanjie Liang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Aiming at the poor performance of individual classifiers in the field of fault recognition, in this paper a new ensemble classifier is constructed that improves classification accuracy by combining multiple classifiers based on Dempster–Shafer Theory (DST). However, in some specific cases, especially when combining conflicting evidence, DST may produce counter-intuitive results and lose its advantages in combining classifiers. To solve this problem, a new improved combination method is proposed to alleviate the conflicts between evidences, and a new ensemble technique is developed for the combination of individual classifiers, which can be well used in the design of accurate classifier ensembles. The main advantage of the proposed combination method is that it makes the combination process more efficient and accurate by defining objective and subjective weights for the member classifiers’ outputs. To verify the effectiveness of the proposed combination method, four individual classifiers are selected for constructing the ensemble classifier and tested on Tennessee-Eastman Process (TEP) datasets and UCI machine learning datasets. The experimental results show that the ensemble classifier can significantly improve the classification accuracy and outperforms all the selected individual classifiers. By comparison with other DST-based combination methods and some state-of-the-art ensemble methods, the proposed combination method shows better abilities in combining individual classifiers and outperforms the others in multiple performance measurements. Finally, the proposed ensemble classifier is applied to fault recognition in a real chemical plant.〈/p〉〈/div〉
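    For background on the combination rule this abstract starts from, here is a minimal sketch of Dempster's rule restricted to singleton hypotheses (masses on compound sets are omitted, and the `fault_A`/`fault_B` labels are illustrative, not from the paper). High conflict between evidences is exactly the case where the plain rule behaves counter-intuitively, which motivates the paper's improved method:

```python
def dempster_combine(m1, m2):
    """Dempster's rule of combination, restricted to singleton hypotheses:
    agreeing masses multiply, and the conflicting mass is renormalized away."""
    keys = set(m1) | set(m2)
    joint = {k: m1.get(k, 0.0) * m2.get(k, 0.0) for k in keys}
    conflict = 1.0 - sum(joint.values())      # mass on contradictory pairs
    if conflict >= 1.0:
        raise ValueError("total conflict: combination is undefined")
    return {k: v / (1.0 - conflict) for k, v in joint.items()}

# Two classifiers both favouring fault_A reinforce each other.
combined = dempster_combine({"fault_A": 0.8, "fault_B": 0.2},
                            {"fault_A": 0.6, "fault_B": 0.4})
print(round(combined["fault_A"], 3))  # 0.857
```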
  • 64
    Publication Date: 2019
    Description: 〈p〉Publication date: February 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 98〈/p〉 〈p〉Author(s): Sergio Muñoz-Romero, Arantza Gorostiaga, Cristina Soguero-Ruiz, Inmaculada Mora-Jiménez, José Luis Rojo-Álvarez〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉There is nowadays an increasing interest in discovering relationships among input variables (also called features) from data to provide better interpretability, which yields more confidence in the solution and provides novel insights about the nature of the problem at hand. We propose a novel feature selection method, called Informative Variable Identifier (IVI), capable of identifying the informative variables and their relationships. It transforms the input-variable space distribution into a coefficient-feature space using existing linear classifiers or a more efficient weight generator that we also propose, Covariance Multiplication Estimator (CME). Informative features and their relationships are determined by analyzing the joint distribution of these coefficients with resampling techniques. IVI and CME select the informative variables and then pass them on to any linear or nonlinear classifier. Experiments show that the proposed approach can outperform state-of-the-art algorithms in terms of feature identification capabilities, and even in classification performance when subsequent classifiers are used.〈/p〉〈/div〉
  • 65
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Shizhe Hu, Xiaoqiang Yan, Yangdong Ye〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉To exploit the complementary information of multi-view data, many weighted multi-view clustering methods have been proposed and have demonstrated impressive performance. However, most of these methods learn the view weights by introducing additional parameters, which cannot be easily obtained in practice. Moreover, they all simply apply the learned weights on the original feature representation of each view, which may deteriorate the clustering performance in the case of high-dimensional data with redundancy and noise. In this paper, we extend information bottleneck co-clustering into a multi-view framework and propose a novel dynamic auto-weighted multi-view co-clustering algorithm to learn a group of weights for views with no need for extra weight parameters. By defining the new concept of the discrimination-compression rate, we quantify the importance of each view by evaluating the discriminativeness of the compact features (i.e., feature-wise clusters) of the views. Unlike existing weighted methods that impose weights on the original feature representations of multiple views, we apply the learned weights on the discriminative ones, which can reduce the negative impact of noisy features in high-dimensional data. To solve the optimization problem, a new two-step sequential method is designed. Experimental results on several datasets show the advantages of the proposed algorithm. To our knowledge, this is the first work to incorporate a weighting scheme into a multi-view co-clustering framework.〈/p〉〈/div〉
  • 66
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Yu Zhang, Yin Wang, Xu-Ying Liu, Siya Mi, Min-Ling Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we investigate the large-scale multi-label image classification problem when images with unknown novel classes arrive as a stream during the training stage. This matches the practical setting in which novel classes are detected and used to update an existing image recognition system. Most existing multi-label image classification methods cannot be directly applied in this scenario, since they require the training and testing stages to share the same label set. We propose to learn a multi-label classifier and a novel-class detector alternately to solve this problem. The multi-label classifier is learned using a convolutional neural network (CNN) from the images in the known classes. We propose a recurrent novel-class detector that is learned in a supervised manner to detect novel classes by encoding image features with the multi-label information. In the experiments, our method is evaluated on several large-scale multi-label benchmarks, including MS COCO. The results show the proposed method is comparable to most existing multi-label image classification methods, validating its efficacy when encountering streaming images with unknown classes.〈/p〉〈/div〉
  • 67
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): A. Sasithradevi, S. Mohamed Mansoor Roomi〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The rise in the availability of video content for access via the Internet and the medium of television has resulted in the development of automatic search procedures to retrieve the desired video. Searches can be simplified and hastened by employing automatic classification of videos. This paper proposes a descriptor called the Spatio-Temporal Histogram of Radon Projections (STHRP) for representing the temporal pattern of the contents of a video and demonstrates its application to video classification and retrieval. The first step in STHRP pattern computation is to represent any video as Three Orthogonal Planes (TOPs), i.e., XY, XT and YT, signifying the spatial and temporal contents. Frames corresponding to each plane are partitioned into overlapping blocks. Radon projections are obtained over these blocks at different orientations, resulting in weighted transform coefficients that are normalized and grouped into bins. Linear Discriminant Analysis (LDA) is performed over these coefficients of the TOPs to arrive at a compact description of the STHRP pattern. Compared to existing classification and retrieval approaches, the proposed descriptor is highly robust to translation, rotation and illumination variations in videos. To evaluate the capabilities of the invariant STHRP pattern, we analyse the performance by conducting experiments on the UCF-101, HMDB51, 10contexts and TRECVID data sets for classification and retrieval using a bagged tree model. Experimental evaluation of video classification reveals that the STHRP pattern can achieve classification rates of 96.15%, 71.7%, 93.24% and 97.3% for the UCF-101, HMDB51, 10contexts and TRECVID 2005 data sets respectively. We conducted retrieval experiments on the TRECVID 2005, JHMDB and 10contexts data sets, and the results revealed that the STHRP pattern is able to provide the videos relevant to the user's query in minimal time (0.05 s) with a good precision rate.〈/p〉〈/div〉
  • 68
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 29 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Chenchen Zhao, Yeqiang Qian, Ming Yang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Accurate pedestrian orientation estimation in autonomous driving helps the ego vehicle infer the intentions of pedestrians in the surrounding environment, which is the basis of safety measures such as collision avoidance and pre-warning. However, because of the relatively small size and high-level deformation of pedestrians, common pedestrian orientation estimation models fail to extract sufficient and comprehensive information from them and thus have restricted performance, especially monocular models, which cannot obtain depth information about objects and the related environment. In this paper, a novel monocular pedestrian orientation estimation model, called FFNet, is proposed. Apart from camera captures, the model adds the 2D and 3D dimensions of pedestrians as two other inputs, according to the logical relationship between these dimensions and orientation. The 2D and 3D dimensions of pedestrians are determined from the camera captures and further utilized through two feedforward links connected to the orientation estimator. The feedforward links strengthen the logicality and interpretability of the network structure of the proposed model. Experiments show that the proposed model achieves at least a 1.72% AOS improvement over most state-of-the-art models after identical training processes. The model also achieves competitive results in the orientation estimation evaluation on the KITTI dataset.〈/p〉〈/div〉
  • 69
    Publication Date: 2019
    Description: 〈p〉Publication date: April 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 100〈/p〉 〈p〉Author(s): Peipei Li, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Age estimation of unknown persons is a challenging pattern analysis task due to the lack of training data and various ageing mechanisms for different individuals. Label distribution learning-based methods usually make distribution assumptions to simplify age estimation. However, since different genders, races and/or any other characteristics may influence facial ageing, age-label distributions are often complicated and difficult to model parametrically. In this paper, we propose a label refinery network (LRN) with two concurrent processes: label distribution refinement and slack regression refinement. The label refinery network aims to learn age-label distributions progressively in an iterative manner. In this way, we can adaptively obtain the specific age-label distributions for different facial images without making strong assumptions on the fixed distribution formulations. To further utilize the correlations among age labels, we propose a slack regression refinery to convert the age-label regression model into an age-interval regression model. Extensive experiments on three popular datasets, namely, MORPH Album2, ChaLearn15 and MegaAge-Asian, demonstrate the superiority of our method.〈/p〉〈/div〉
  • 70
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Yafu Xiao, Jing Li, Bo Du, Jia Wu, Jun Chang, Wenfan Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Despite great success in the computer vision field, visual tracking is still a challenging task. The main obstacle is that the target object often suffers from interference, such as occlusion. As most Siamese network-based trackers sample only image patches of target objects for training, the tracking algorithm lacks sufficient information about the surrounding environment. Besides, many Siamese network-based tracking algorithms build a regression using only the target object samples, without considering the relationship between target and background, which may deteriorate tracker performance. In this paper, we propose a metric correlation Siamese network with a multi-class negative sampling tracking method. For the first time, we explore a sampling approach that includes three different kinds of negative samples: virtual negative samples for pre-learning potential occlusion situations, boundary negative samples to cope with potential tracking drift, and context negative samples to cope with potential incorrect positioning. With these three kinds of negative samples, we also propose a metric correlation method to train a correlation filter that contains metric information for better discrimination. Furthermore, we design a Siamese network-based architecture that embeds the metric correlation filter module mentioned above, in order to benefit from the powerful representation ability of deep learning. Extensive experiments on the challenging OTB100 and VOT2017 datasets demonstrate that the proposed algorithm performs favorably compared with state-of-the-art approaches.〈/p〉〈/div〉
  • 71
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Dongming Wei, Lichi Zhang, Zhengwang Wu, Xiaohuan Cao, Gang Li, Dinggang Shen, Qian Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Deformable brain MR image registration is challenging due to large inter-subject anatomical variation. For example, the highly complex cortical folding pattern makes it hard to accurately align corresponding cortical structures of individual images. In this paper, we propose a novel deep learning approach that simplifies the difficult registration problem of brain MR images. Specifically, we train a morphological simplification network (MS-Net), which can generate a 〈em〉simple〈/em〉 image with less anatomical details based on the 〈em〉complex〈/em〉 input. With MS-Net, the complexity of the fixed image or the moving image under registration can be reduced gradually, thus building an individual (simplification) trajectory represented by MS-Net outputs. Since the generated images at the ends of the two trajectories (of the fixed and moving images) are so simple and very similar in appearance, they are easy to register. Thus, the two trajectories can act as a bridge to link the fixed and the moving images, and guide their registration. Our experiments show that the proposed method can achieve highly accurate registration performance on different datasets (〈em〉i.e.〈/em〉, NIREP, LPBA, IBSR, CUMC, and MGH). Moreover, the method can also be easily transferred across diverse image datasets and obtain superior accuracy on surface alignment. We propose MS-Net as a powerful and flexible tool to simplify brain MR images and their registration. To our knowledge, this is the first work to simplify brain MR image registration by deep learning, instead of estimating the deformation field directly.〈/p〉〈/div〉
  • 72
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Bingbing Zhang, Qilong Wang, Xiaoxiao Lu, Fasheng Wang, Peihua Li〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Feature coding is a key component of the bag of visual words (BoVW) model, which is designed to improve image classification and retrieval performance. In the feature coding process, each feature of an image is nonlinearly mapped via a dictionary of visual words to form a high-dimensional sparse vector. Inspired by the well-known locality-constrained linear coding (LLC), we present a locality-constrained affine subspace coding (LASC) method to address the limitation whereby LLC fails to consider the local geometric structure around visual words. LASC is distinguished from all the other coding methods since it constructs a dictionary consisting of an ensemble of affine subspaces. As such, the local geometric structure of a manifold is explicitly modeled by such a dictionary. In the process of coding, each feature is linearly decomposed and weighted to form the first-order LASC vector with respect to its top-k neighboring subspaces. To further boost performance, we propose the second-order LASC vector based on information geometry. We use the proposed coding method to perform both image classification and image retrieval tasks and the experimental results show that the method achieves superior or competitive performance in comparison to state-of-the-art methods.〈/p〉〈/div〉
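    For context, the classical LLC coding step that LASC generalizes can be sketched as follows. This is a simplified illustration of the standard analytical LLC solution (k nearest codewords, sum-to-one constraint); the dictionary and feature values are made up, and it does not implement the proposed affine-subspace dictionary:

```python
import numpy as np

def llc_code(x, dictionary, k=2, lam=1e-4):
    """Locality-constrained linear coding, simplified: reconstruct x
    from its k nearest codewords under a sum-to-one constraint."""
    d2 = np.sum((dictionary - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]              # k nearest visual words
    z = dictionary[idx] - x               # shift codewords to the feature
    C = z @ z.T + lam * np.eye(k)         # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))    # analytical LLC solution
    w /= w.sum()                          # enforce sum-to-one
    code = np.zeros(len(dictionary))
    code[idx] = w                         # sparse high-dimensional code
    return code

B = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
c = llc_code(np.array([0.4, 0.1]), B, k=3)
print(round(c.sum(), 6))  # 1.0 (codes sum to one by construction)
```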
  • 73
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Wan-Jin Yu, Zhen-Duo Chen, Xin Luo, Wu Liu, Xin-Shun Xu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Multi-label image classification is one of the most important and fundamental problems in computer vision. In an image with multiple labels, the objects are usually located at various positions with different scales and poses. Moreover, some labels are associated with the entire image instead of a small region. Therefore, both global and local information are important for classification. To effectively extract and make full use of this information, in this paper, we present a novel deep Dual-stream nEtwork for the muLTi-lAbel image classification task, DELTA for short. As its name indicates, it is composed of two streams, i.e., the Multi-Instance network and the Global Priors network. The former is used to extract multi-scale class-related local instance features by modeling the classification problem in a multi-instance learning framework. The latter is devised to capture global priors from the input image as the global information. These two streams are fused by the final fusion layer. In this way, DELTA can extract and make full use of both the global and local information for classification. Extensive experiments on three benchmark datasets, i.e., PASCAL VOC 2007, PASCAL VOC 2012 and Microsoft COCO, demonstrate that DELTA significantly outperforms several state-of-the-art methods. Moreover, DELTA can automatically locate the key image patterns that trigger the labels.〈/p〉〈/div〉
  • 74
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Çiğdem Sazak, Carl J. Nelson, Boguslaw Obara〈/p〉
  • 75
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Wangli Hao, Zhaoxiang Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Two-stream convolutional neural networks show great promise for action recognition tasks. However, most two-stream based approaches train the appearance and motion subnetworks independently, which may lead to a decline in performance due to the lack of interactions between the two streams. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. This network implements both knowledge distillation and dense connectivity (adapted from DenseNet). Using this STDDCN architecture, we aim to explore interaction strategies between appearance and motion streams along different hierarchies. Specifically, block-level dense connections between appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. Moreover, knowledge distillation between the two streams (each treated as a student) and their final fusion (treated as teacher) allows both streams to interact at the high-level layers. This special architecture allows STDDCN to gradually obtain effective hierarchical spatiotemporal features, and it can be trained end-to-end. Finally, numerous ablation studies validate the effectiveness and generalization of our model on two benchmark datasets, UCF101 and HMDB51, on which it achieves promising performance.〈/p〉〈/div〉
  • 76
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Hua Yang, Chenghui Huang, Feiyue Wang, Kaiyou Song, Shijiao Zheng, Zhouping Yin〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Although template matching has been widely studied in the fields of image processing and computer vision, current template matching methods still cannot address large-scale changes and rotation changes simultaneously. In this study, we propose a novel adaptive radial ring code histograms (ARRCH) image descriptor for large-scale and rotation-invariant template matching. The image descriptor is constructed by (1) identifying, inside the template, a set of concentric ring regions around a reference point, (2) detecting “stable” pixels based on the ASGO, which is tolerant with respect to large scale change, (3) extracting a rotation-invariant feature for each “stable” pixel, and (4) discretizing the features in a separate histogram for each concentric ring region in the scale space. Finally, an ARRCH image descriptor is obtained by chaining the histograms of all concentric ring regions for each scale. In matching mode, a sliding window approach is used to extract descriptors, which are compared with the template one, and a coarse-to-fine search strategy is employed to detect the scale of the target image. To demonstrate the performance of the ARRCH, several experiments are carried out, including a parameter experiment and a large-scale and rotation change matching experiment, and some applications are presented. The experimental results demonstrate that the proposed method is more resistant to large-scale and rotation differences than previous state-of-the-art matching methods.〈/p〉〈/div〉
  • 77
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Peng Wang, Lingqiao Liu, Chunhua Shen, Heng Tao Shen〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Most video-based action recognition approaches create the video-level representation by temporally pooling the features extracted at every frame. The pooling methods they adopt, however, usually completely or partially ignore the dynamic information contained in the temporal domain, which may undermine the discriminative power of the resulting video representation since the video sequence order could unveil the evolution of a specific event or action. To overcome this drawback and explore the importance of incorporating the temporal order information, in this paper we propose a novel temporal pooling approach to aggregate the frame-level features. Inspired by the capacity of Convolutional Neural Networks (CNNs) in making use of the internal structure of images for information abstraction, we propose to apply the temporal convolution operation to the frame-level representations to extract the dynamic information. However, directly implementing this idea on the original high-dimensional feature will result in parameter explosion. To handle this issue, we propose to treat the temporal evolution of the feature value at each feature dimension as a 1D signal and learn a unique convolutional filter bank for each 1D signal. By conducting experiments on three challenging video-based action recognition datasets, HMDB51, UCF101, and Hollywood2, we demonstrate that the proposed method is superior to the conventional pooling methods.〈/p〉〈/div〉
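    The per-dimension temporal filter bank described above can be sketched as follows; the frames, filter weights, and the max-over-time readout are illustrative assumptions (in the paper the filters are learned end-to-end).

    ```python
    import numpy as np

    def temporal_conv_pool(frames, filters):
        """Pool T frame-level features (T x D) into one video-level vector.

        frames:  (T, D) array, one D-dimensional feature per frame.
        filters: (D, K) array, one length-K temporal filter per feature
                 dimension (random here; learned in the paper).
        """
        pooled = np.empty(frames.shape[1])
        for d in range(frames.shape[1]):
            # treat dimension d's evolution over time as a 1D signal
            responses = np.convolve(frames[:, d], filters[d], mode="valid")
            pooled[d] = responses.max()   # max over temporal positions
        return pooled

    rng = np.random.default_rng(0)
    frames = rng.normal(size=(10, 4))    # 10 frames, 4-dim features
    filters = rng.normal(size=(4, 3))    # length-3 filter per dimension
    v = temporal_conv_pool(frames, filters)
    ```

    Unlike average or max pooling over raw frames, the convolution responses depend on the order of the frames, which is exactly the temporal information the paper aims to keep.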
  • 78
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Wei Wei, Bin Zhou, Dawid Połap, Marcin Woźniak〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉 〈p〉Improving CT images by increasing the number of scans, hence increasing the ionizing radiation dose, can increase the probability of inducing cancer in the patient. Using fewer images but improving them by accurate reconstruction is a better solution.〈/p〉 〈p〉In this paper, an adaptive variational Partial Differential Equation (PDE) model is proposed for image reconstruction. The L2 energy of the image gradient and the Total Variation (TV) are combined into a new functional, which is introduced into an optimization problem. The dynamic behavior of the model is governed by a threshold function: the L2 term is applied in lower-density regions to increase reconstruction speed, and the TV term is applied in higher-density regions to preserve the most important image features. The threshold function is asymptotically controlled by an evolutionary PDE, making it more suitable for complex images. The efficiency and accuracy of the proposed model are demonstrated in numerical experiments.〈/p〉 〈/div〉
  • 79
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 20 March 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Linjiang Huang, Yan Huang, Wangli Ouyang, Liang Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Action recognition using pose information has drawn much attention recently. However, most previous approaches treat the human pose as a whole or just use pose to extract robust features. In fact, human body parts play an important role in actions, so modeling the spatio-temporal information of body parts can effectively assist in classifying actions. In this paper, we propose a Part-aligned Pose-guided Recurrent Network (P〈sup〉2〈/sup〉RN) for action recognition. The model mainly consists of two modules, i.e., a part alignment module and a part pooling module, which are used for part representation learning and part-related feature fusion, respectively. The part alignment module incorporates an auto-transformer attention, aiming to capture the spatial configuration of body parts and predict pose attention maps, while the part pooling module exploits both the symmetry and complementarity of body parts to produce a fused body representation. The whole network is recurrent, which allows it to exploit the body representation and simultaneously model the spatio-temporal evolution of human body parts. Experiments on two publicly available benchmark datasets show state-of-the-art performance and demonstrate the power of the two proposed modules.〈/p〉〈/div〉
  • 80
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 20 March 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Xiangrui Li, Andong Wang, Jianfeng Lu, Zhenmin Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Low-rank or sparse tensor recovery finds many applications in computer vision and machine learning. The recently proposed regularized multilinear regression and selection (Remurs) model assumes the true tensor to be simultaneously low-Tucker-rank and sparse, and has been successfully applied in fMRI analysis. However, a statistical performance analysis of Remurs-like models is still lacking. To address this problem, a minimization problem based on a newly defined tensor nuclear-〈em〉l〈/em〉〈sub〉1〈/sub〉-norm is proposed to recover a simultaneously low-Tucker-rank and sparse tensor from its degraded observations. Then, an M-ADMM-based algorithm is developed to efficiently solve the problem. Further, the statistical performance is analyzed by establishing a deterministic upper bound on the estimation error for general noise. Also, under Gaussian noise, non-asymptotic upper bounds for two specific settings, i.e., noisy tensor decomposition and random Gaussian design, are given. Experiments on synthetic datasets demonstrate that the proposed theorems can precisely predict the scaling behavior of the estimation error.〈/p〉〈/div〉
  • 81
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Dengfeng Chai〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper formulates superpixel segmentation as a pixel labeling problem and proposes a quaternary labeling algorithm to generate a superpixel lattice. This is achieved by seaming overlapped patches regularly placed on the image plane. Patch seaming is formulated as a pixel labeling problem, where each label indexes one patch. Once the optimal seaming is completed, all pixels covered by one retained patch constitute one superpixel. Further, four kinds of patches are distinguished and assembled into four layers correspondingly, and the patch indexes are mapped to the quaternary layer indexes. This significantly reduces the number of labels and greatly improves labeling efficiency. Furthermore, an objective function is developed to achieve optimal segmentation. The lattice structure is guaranteed by fixing patch centers to be superpixel centers, compact superpixels are assured by horizontal and vertical constraints enforced on the smooth terms, and coherent superpixels are achieved by iteratively refining the data terms. Extensive experiments on the BSDS dataset demonstrate that the SQL algorithm significantly improves labeling efficiency, outperforms the other superpixel lattice methods, and is competitive with state-of-the-art methods without a lattice guarantee. The superpixel lattice allows contextual relationships among superpixels to be easily modeled by either MRFs or CNNs.〈/p〉〈/div〉
  • 82
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Surina Borjigin, Prasanna K. Sahoo〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we propose a multi-level thresholding model based on the gray-level & local-average histogram (GLLA) and Tsallis–Havrda–Charvát entropy for RGB color images. We validate the multi-level thresholding formulation by mathematical induction. We apply the particle swarm optimization (PSO) algorithm to obtain the optimal threshold values for each component of an RGB image. By assigning the mean value of each thresholded class, we obtain three segmented component images independently. We conduct extensive experiments on The Berkeley Segmentation Dataset and Benchmark (BSDS300) and calculate the averages of four performance indices (〈em〉BDE, PRI, GCE〈/em〉 and 〈em〉VOI〈/em〉) to show the effectiveness and reasonableness of the proposed method.〈/p〉〈/div〉
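    The final step of the method, assigning each thresholded class its mean gray level, can be sketched for one channel as below. The thresholds here are hypothetical stand-ins for the PSO-optimized values; the Tsallis–Havrda–Charvát entropy objective itself is not reproduced.

    ```python
    import numpy as np

    def apply_thresholds(channel, thresholds):
        """Replace each threshold-defined class with its mean gray level."""
        bounds = [0] + sorted(thresholds) + [256]
        out = channel.astype(float)          # float copy of the channel
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            mask = (channel >= lo) & (channel < hi)
            if mask.any():
                out[mask] = channel[mask].mean()
        return out

    channel = np.array([[0, 10], [200, 250]])   # toy 2x2 gray channel
    seg = apply_thresholds(channel, [128])      # one threshold -> two classes
    ```

    Run independently on the R, G, and B components, this mirrors the per-channel segmentation the abstract describes.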
  • 83
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Xinge You, Jiamiao Xu, Wei Yuan, Xiao-Yuan Jing, Dacheng Tao, Taiping Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Cross-view classification, which aims to classify samples from heterogeneous views, is a significant yet challenging problem in computer vision. An effective solution to this problem is multi-view subspace learning (MvSL), which intends to find a common subspace for multi-view data. Although great progress has been made, existing methods usually fail to find a suitable subspace when multi-view data lies on nonlinear manifolds, thus leading to performance deterioration. To circumvent this drawback, we propose Multi-view Common Component Discriminant Analysis (MvCCDA) to handle view discrepancy, discriminability and nonlinearity in a joint manner. Specifically, our MvCCDA incorporates supervised information and local geometric information into the common component extraction process to learn a discriminant common subspace and to discover the nonlinear structure embedded in multi-view data. Optimization and complexity analysis of MvCCDA are also presented for completeness. Our MvCCDA is competitive with the state-of-the-art MvSL-based methods on four benchmark datasets, demonstrating its superiority.〈/p〉〈/div〉
  • 84
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Alper Aksac, Tansel Özyer, Reda Alhajj〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we propose a cut-edge algorithm for spatial clustering (CutESC) based on proximity graphs. The CutESC algorithm removes edges when a cut-edge value for the edge’s endpoints is below a threshold. The cut-edge value is calculated by using statistical features and the spatial distribution of data based on its neighborhood. The algorithm works without any prior information or preliminary parameter settings, while automatically discovering clusters with non-uniform densities, arbitrary shapes, and outliers. However, there is an option that allows users to set two parameters to better adapt the clustering solution to particular problems. To assess the advantages of the CutESC algorithm, experiments have been conducted using various two-dimensional synthetic, high-dimensional real-world, and image segmentation datasets.〈/p〉〈/div〉
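    The cut-then-collect skeleton of such cut-edge clustering can be sketched as below. A fixed global length threshold stands in for CutESC's statistical, neighborhood-adaptive cut-edge value, and the proximity-graph construction is assumed given.

    ```python
    from collections import defaultdict

    def cut_edge_clustering(points, edges, threshold):
        """Drop edges longer than a threshold, then return the connected
        components of the remaining graph as clusters."""
        keep = defaultdict(set)
        for a, b in edges:
            d = sum((pa - pb) ** 2 for pa, pb in zip(points[a], points[b])) ** 0.5
            if d <= threshold:           # CutESC uses an adaptive cut-edge value here
                keep[a].add(b)
                keep[b].add(a)
        seen, clusters = set(), []
        for start in range(len(points)):
            if start in seen:
                continue
            stack, comp = [start], []    # depth-first component collection
            seen.add(start)
            while stack:
                u = stack.pop()
                comp.append(u)
                for v in keep[u]:
                    if v not in seen:
                        seen.add(v)
                        stack.append(v)
            clusters.append(sorted(comp))
        return clusters

    points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
    clusters = cut_edge_clustering(points, edges, threshold=2.0)
    ```

    The long bridge edge (2, 3) is cut, leaving two spatially coherent components; isolated nodes naturally fall out as singleton clusters (outliers).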
  • 85
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Ernesto Bribiesca〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Generally speaking, a spiral is a 2D curve which winds about a fixed point. Here, we present a new, alternative, and easy way to describe and generate spirals by means of the Slope Chain Code (SCC) [E. Bribiesca, A measure of tortuosity based on chain coding, Pattern Recognition 46 (2013) 716–724]. Thus, each spiral is represented by only one chain. The chain elements produce a finite alphabet which allows us to use grammatical techniques for spiral classification. Spirals are composed of straight-line segments of constant length, and their chain elements are obtained by calculating the slope changes between contiguous straight-line segments (angle of contingence) scaled to a continuous range from 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si43.svg"〉〈mrow〉〈mo〉−〈/mo〉〈mn〉1〈/mn〉〈/mrow〉〈/math〉 (〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si44.svg"〉〈mrow〉〈mo〉−〈/mo〉〈msup〉〈mn〉180〈/mn〉〈mo〉∘〈/mo〉〈/msup〉〈/mrow〉〈/math〉) to 1 (180〈sup〉∘〈/sup〉). The SCC notation is invariant under translation and rotation, optionally under scaling, and it does not use a grid. Other interesting properties can be derived from this notation, such as mirror-symmetric and inverse spirals, the accumulated slope, the slope-change mean, and tortuosity for spirals. We introduce the new concepts of projective polygonal paths and osculating polygons. We present a new spiral, called the SCC polygonal spiral, whose chain is described by the numerical sequence 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si45.svg"〉〈mfrac〉〈mn〉2〈/mn〉〈mi〉n〈/mi〉〈/mfrac〉〈/math〉 for 〈em〉n〈/em〉 ≥ 3; to the best of our knowledge, this is the first time that this spiral and its chain are presented. 
The importance of this spiral and its chain is that the chain covers the slope changes of all the regular polygons composed of 〈em〉n〈/em〉 edges (n-gons). Also, we describe the chain which generates the spiral of Archimedes. Finally, we present some results for different kinds of spirals from the real world, including spiral patterns in shells.〈/p〉〈/div〉
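A sketch of the SCC polygonal-spiral chain named in the abstract: element 2/n, on a scale where 1 means 180°, is a slope change of 360/n degrees, i.e. the exterior angle of a regular n-gon, which is the sense in which the chain covers all regular polygons. The accumulated-slope and tortuosity helpers follow the definitions as summarized here; the geometry of the actual curve is not reconstructed.

```python
def scc_spiral_chain(m):
    """First m elements of the SCC polygonal-spiral chain: 2/n, n >= 3."""
    return [2.0 / n for n in range(3, m + 3)]

def accumulated_slope(chain):
    """Sum of slope changes (in half-turns: 1 == 180 degrees)."""
    return sum(chain)

def tortuosity(chain):
    """SCC tortuosity: sum of the absolute slope changes."""
    return sum(abs(a) for a in chain)

chain = scc_spiral_chain(5)            # 2/3, 2/4, 2/5, 2/6, 2/7
degrees = [a * 180.0 for a in chain]   # exterior angles of the 3-, 4-, ... 7-gon
```

Because every element of this chain is positive, its tortuosity equals its accumulated slope, consistent with a curve that always turns the same way.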
  • 86
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Jun Tang, Zhibo Yang, Yongpan Wang, Qi Zheng, Yongchao Xu, Xiang Bai〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉State-of-the-art methods have achieved impressive performance on multi-oriented text detection. Yet, they usually have difficulty in handling curved and dense texts, which are common in commodity images. In this paper, we propose a network for detecting dense and arbitrary-shaped scene text by instance-aware component grouping (ICG), which is a flexible bottom-up method. To address the difficulty in separating dense text instances faced by most bottom-up methods, we propose attractive and repulsive links between text components, which force the network to focus more on close text instances, and an instance-aware loss that fully exploits context to supervise the network. The final text detection is achieved by a modified minimum spanning tree (MST) algorithm based on the learned attractive and repulsive links. To demonstrate the effectiveness of the proposed method, we introduce a dense and arbitrary-shaped scene text dataset composed of commodity images (DAST1500). Experimental results show that the proposed ICG significantly outperforms state-of-the-art methods on DAST1500 and two curved text datasets, Total-Text and CTW1500, and also achieves very competitive performance on two multi-oriented datasets: ICDAR15 (at 7.1 FPS for 1280 × 768 images) and MTWI.〈/p〉〈/div〉
  • 87
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Sandro Cumani, Pietro Laface〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This work focuses on clustering large sets of utterances collected from an unknown number of speakers. Since the number of speakers is unknown, we focus on exact hierarchical agglomerative clustering, followed by automatic selection of the number of clusters. Exact hierarchical clustering of a large number of vectors, however, is a challenging task due to memory constraints, which make it ineffective or infeasible for large datasets. We propose an exact memory-constrained and parallel implementation of average linkage clustering for large-scale datasets, showing that its computational complexity is approximately 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si2.svg"〉〈mrow〉〈mi mathvariant="bold-script"〉O〈/mi〉〈mo〉(〈/mo〉〈msup〉〈mi〉N〈/mi〉〈mn〉2〈/mn〉〈/msup〉〈mo〉)〈/mo〉〈mo〉,〈/mo〉〈/mrow〉〈/math〉 but is much faster (up to 40 times in our experiments) than the Reciprocal Nearest Neighbor chain algorithm, which also has 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si3.svg"〉〈mrow〉〈mi mathvariant="bold-script"〉O〈/mi〉〈mo〉(〈/mo〉〈msup〉〈mi〉N〈/mi〉〈mn〉2〈/mn〉〈/msup〉〈mo〉)〈/mo〉〈/mrow〉〈/math〉 complexity. We also propose a very fast silhouette computation procedure that, in linear time, determines the set of clusters. The computational efficiency of our approach is demonstrated on datasets including up to 4 million speaker vectors.〈/p〉〈/div〉
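    Exact average-linkage agglomeration, the operation the paper scales up, can be stated in a few lines on toy data. This naive version keeps the full distance matrix in memory and rescans all cluster pairs per merge, which is precisely what becomes infeasible at millions of vectors.

    ```python
    import numpy as np

    def average_linkage(X, n_clusters):
        """Naive exact average-linkage agglomerative clustering.

        Repeatedly merges the pair of clusters with the smallest mean
        pairwise distance until n_clusters remain. Purely illustrative;
        the paper's contribution is doing the exact computation
        memory-efficiently and in parallel at large scale.
        """
        clusters = [[i] for i in range(len(X))]
        D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
        while len(clusters) > n_clusters:
            best = None
            for a in range(len(clusters)):
                for b in range(a + 1, len(clusters)):
                    # average linkage: mean distance over all cross pairs
                    d = D[np.ix_(clusters[a], clusters[b])].mean()
                    if best is None or d < best[0]:
                        best = (d, a, b)
            _, a, b = best
            clusters[a] = clusters[a] + clusters[b]
            del clusters[b]
        return clusters

    X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
    parts = average_linkage(X, 2)
    ```

    Here the number of clusters is given; in the paper it is chosen automatically afterwards via the fast silhouette procedure.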
  • 88
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Yuan Zhu, Jiufeng Zhou, Hong Yan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we show that graph matching methods based on relaxation labeling, spectral graph theory and tensor theory have the same mathematical form, by employing the power iteration technique. The differences among these methods are also fully discussed, and it can be proven that these distinctions have little impact on the final matching result. Moreover, we propose a fast compatibility building procedure to accelerate the preprocessing stage, which is considered to be the main time-consuming part of graph matching. Finally, several experiments are conducted to verify our findings.〈/p〉〈/div〉
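    The shared mathematical form the authors identify is the power iteration on a nonnegative compatibility matrix; a minimal version follows, with a toy 2×2 matrix standing in for the pairwise assignment compatibilities.

    ```python
    import numpy as np

    def power_iteration(M, iters=100, tol=1e-10):
        """Principal eigenvector of a nonnegative matrix M by power iteration."""
        x = np.ones(M.shape[0]) / np.sqrt(M.shape[0])
        for _ in range(iters):
            y = M @ x
            y /= np.linalg.norm(y)           # renormalize each step
            done = np.linalg.norm(y - x) < tol
            x = y
            if done:
                break
        return x

    M = np.array([[2.0, 1.0], [1.0, 2.0]])   # toy symmetric compatibilities
    v = power_iteration(M)
    ```

    Spectral matching reads candidate-assignment scores off this leading eigenvector; relaxation-labeling and tensor formulations reduce, per the paper, to variants of the same iteration.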
  • 89
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 17 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Inpyo Hong, Youngbae Hwang, Daeyoung Kim〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Image denoising is a fundamental task in the computer vision and image processing domain. In recent years, the task has been tackled with deep neural networks that learn the patterns of noise and image patches. However, because of the high diversity of natural image patches and noise distributions, a huge network with a large amount of training data is necessary to obtain state-of-the-art performance. In this paper, we propose a novel ensemble strategy that exploits multiple deep neural networks for efficient deep learning of image denoising. We divide the task of image denoising into several local subtasks according to the complexity of clean image patches and conquer each subtask using a network trained on its local space. Then, we combine the local subtasks at test time by applying the set of networks to each noisy patch as a weighted mixture, where the mixture weights are determined by the likelihood of each network for each noisy patch. Our methodology of using locally learned networks based on patch complexity effectively decreases the diversity of image patches seen by each single network, and their adaptively weighted mixture combines the local subtasks efficiently. Extensive experimental results on the Berkeley segmentation dataset and standard test images demonstrate that our strategy significantly boosts denoising performance in comparison to using a single network of the same total capacity. Furthermore, our method outperforms previous methods with far fewer training samples and trainable parameters, and thus much lower time complexity in both training and inference.〈/p〉〈/div〉
  • 90
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Myungjun Kim, Dong-gi Lee, Hyunjung Shin〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉A set of data can be obtained from different hierarchical levels in diverse domains, such as multiple levels of genome data in omics, domestic/global indicators in finance, and ancestors/descendants in phylogenetics, genealogy, and sociology. Such layered structures are often represented as a hierarchical network. If a set of different data is arranged in such a way, one can naturally devise a network-based learning algorithm in which information in one layer is propagated to other layers through interlayer connections. Incorporating individual networks in layers can be considered integration in a serial/vertical manner, in contrast with parallel integration of multiple independent networks. The hierarchical integration induces several problems in computational complexity, sparseness, and scalability because of the huge matrices involved. In this paper, we propose two versions of an algorithm, based on semi-supervised learning, for hierarchically structured networks. The naïve version utilizes an existing method for matrix sparseness to solve label propagation problems. In the approximate version, the loss in accuracy versus the gain in complexity is examined through analyses of error bounds and complexity. The experimental results show that the proposed algorithms perform well with hierarchically structured data and outperform an ordinary semi-supervised learning algorithm.〈/p〉〈/div〉
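    Within a single layer, the label propagation that the hierarchical algorithm builds on can be sketched as the classic normalized-adjacency iteration. The chain graph, α, and iteration count below are illustrative; the interlayer coupling that is the paper's actual contribution is not shown.

    ```python
    import numpy as np

    def propagate_labels(W, Y, alpha=0.5, iters=200):
        """Graph label propagation: F <- alpha * S @ F + (1 - alpha) * Y,
        with S the symmetrically normalized adjacency D^{-1/2} W D^{-1/2}.

        W: (N, N) symmetric nonnegative adjacency matrix.
        Y: (N, C) initial labels; one-hot rows for labeled nodes,
           zero rows for unlabeled ones.
        """
        d = W.sum(axis=1)
        Dinv = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        S = Dinv @ W @ Dinv
        F = Y.astype(float).copy()
        for _ in range(iters):
            F = alpha * (S @ F) + (1.0 - alpha) * Y
        return F.argmax(axis=1)

    # chain graph 0-1-2-3; node 0 labeled class 0, node 3 labeled class 1
    W = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], float)
    Y = np.array([[1, 0], [0, 0], [0, 0], [0, 1]], float)
    labels = propagate_labels(W, Y)
    ```

    Each unlabeled node inherits the label of its nearer labeled endpoint, the behavior a hierarchical version propagates across layers as well.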
  • 91
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Xuefei Zhe, Shifeng Chen, Hong Yan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉L2-normalization is an effective method to enhance the discriminative power of deep representation learning. However, without exploiting the geometric properties of the feature space, commonly used gradient-based optimization methods fail to track global information during training. In this paper, we propose a novel deep metric learning model based on the directional distribution. By defining the loss function based on the von Mises–Fisher distribution, we propose an effective alternating learning algorithm that periodically updates the class centers. The proposed metric learning not only captures the global information about the embedding space but also yields an approximate representation of the class distribution during training. Considering classification and retrieval tasks, our experiments on benchmark datasets demonstrate the improvement achieved by the proposed algorithm. In particular, with a small number of convolutional layers, a significant accuracy gain can be observed compared to widely used gradient-based methods.〈/p〉〈/div〉
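    The periodic class-center update can be sketched directly: with L2-normalized embeddings, the maximum-likelihood mean direction of a von Mises–Fisher component is the renormalized sum of its members. The features and labels below are synthetic, and the alternating gradient steps on the network are omitted.

    ```python
    import numpy as np

    def update_class_centers(features, labels, n_classes):
        """vMF-style center update: normalized mean direction per class."""
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        centers = np.zeros((n_classes, feats.shape[1]))
        for c in range(n_classes):
            s = feats[labels == c].sum(axis=0)   # resultant vector of the class
            centers[c] = s / np.linalg.norm(s)   # back onto the unit sphere
        return centers

    feats = np.array([[3.0, 0.1], [2.0, -0.1],    # class 0: roughly along +x
                      [0.1, 4.0], [-0.1, 5.0]])   # class 1: roughly along +y
    labels = np.array([0, 0, 1, 1])
    centers = update_class_centers(feats, labels, 2)
    ```

    Because the centers live on the unit hypersphere alongside the embeddings, cosine similarity to a center doubles as an (unnormalized) class score.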
  • 92
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Swarnendu Ghosh, Nibaran Das, Mita Nasipuri〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Convolutional neural networks have become very common in the field of computer vision in recent years, but they come with a severe restriction regarding the size of the input image. Most convolutional neural networks are designed so that they can only accept images of a fixed size. This creates several challenges during data acquisition and model deployment. The common practice to overcome this limitation is to reshape the input images so that they can be fed into the networks. Many standard pre-trained networks and datasets come with a provision for working with square images. In this work we analyze 25 different reshaping methods across 6 datasets from different domains, trained on three well-known architectures, namely Inception-V3 (an extension of GoogLeNet), Residual Networks (ResNet-18), and the 121-layer-deep DenseNet. While some of the reshaping methods like “interpolation” and “cropping” have been commonly used with convolutional neural networks, some uncommon techniques like “containing”, “tiling” and “mirroring” have also been demonstrated. In total, 450 neural networks were trained from scratch to provide various analyses regarding the convergence of the validation loss and the accuracy obtained on the test data. Statistical measures have been provided to demonstrate the dependence between parameter choices and datasets. Several key observations were noted, such as the benefits of using randomized processes, the poor performance of the commonly used “cropping” techniques, and so on. The paper intends to provide empirical evidence to guide the reader in choosing a proper technique for reshaping inputs for their convolutional neural networks. 
The official code is available at https://github.com/DVLP-CMATERJU/Reshaping-Inputs-for-CNN.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319301505-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
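Three of the reshaping families compared in the paper (cropping, interpolation, and padding by mirroring) can be sketched with NumPy indexing alone. Nearest-neighbor resampling stands in for the paper's interpolation variants, and the function names are ours, not the paper's.

```python
import numpy as np

def center_crop(img, size):
    """Crop a size x size window from the image center."""
    h, w = img.shape[:2]
    top, left = (h - size) // 2, (w - size) // 2
    return img[top:top + size, left:left + size]

def nearest_resize(img, size):
    """Nearest-neighbor resample to size x size (a stand-in for the
    interpolation-based reshaping variants)."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

def mirror_pad_to_square(img):
    """Pad the short side with a reflected copy of the image edge."""
    h, w = img.shape[:2]
    if h == w:
        return img
    if h < w:
        return np.concatenate([img, img[h - (w - h):][::-1]], axis=0)
    return np.concatenate([img, img[:, w - (h - w):][:, ::-1]], axis=1)

img = np.arange(6 * 8).reshape(6, 8)   # toy 6x8 "image"
c = center_crop(img, 4)
r = nearest_resize(img, 4)
m = mirror_pad_to_square(img)
```

Cropping discards content, resizing distorts aspect ratio, and mirroring preserves both at the cost of synthetic pixels, which is the trade-off space the paper's 450 trained networks explore empirically.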
  • 93
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Yibao Li, Jing Wang, Bingheng Lu, Darae Jeong, Junseok Kim〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉We propose an efficient and robust algorithm to reconstruct the volumes of multi-labeled objects from sets of cross sections without overlapping regions, artificial gaps, or mismatched interfaces. The algorithm can handle cross sections wherein different regions have different labels. The present study represents a multicomponent extension of our previous work (Li et al. (2015), [1]), wherein we modified the original Cahn–Hilliard (CH) equation by adding a fidelity term to keep the solution close to the single-labeled slice data. The classical CH equation possesses desirable properties, such as smoothing and conservation. The key idea of the present work is to employ a multicomponent CH system to reconstruct multicomponent volumes without self-intersections. We utilize the linearly stabilized convex splitting scheme introduced by Eyre with the Fourier-spectral method so that we can use a large time step and solve the discrete equation quickly. The proposed algorithm is simple and produces smooth volumes that closely preserve the original volume data and do not self-intersect. Numerical results demonstrate the effectiveness and robustness of the proposed method.〈/p〉〈/div〉
  • 94
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Tomas Björklund, Attilio Fiandrotti, Mauro Annarumma, Gianluca Francini, Enrico Magli〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this work, we describe a License Plate Recognition (LPR) system designed around convolutional neural networks (CNNs) trained on synthetic images, avoiding the need to collect and annotate the thousands of images required to train a CNN. First, we propose a framework for generating synthetic license plate images, accounting for the key variables required to model the wide range of conditions affecting the aspect of real plates. Then, we describe a modular LPR system designed around two CNNs for plate and character detection that share a common training procedure, and we train the CNNs and experiment on three different datasets of real plate images collected from different countries. Our synthetically trained system outperforms multiple competing systems trained on real images, showing that synthetic images are effective for training CNNs for LPR if the training images have sufficient variance in the key variables controlling the plate aspect.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319301475-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
  • 95
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Mark Brown, David Windridge, Jean-Yves Guillemaut〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉We present a family of methods for 2D–3D registration spanning both deterministic and non-deterministic branch-and-bound approaches. Critically, the methods exhibit invariance to the underlying scene primitives, enabling e.g. points and lines to be treated on an equivalent basis, potentially enabling a broader range of problems to be tackled while maximising available scene information, all scene primitives being simultaneously considered. Being a branch-and-bound based approach, the method furthermore enjoys intrinsic guarantees of global optimality; while branch-and-bound approaches have been employed in a number of computer vision contexts, the proposed method represents the first time that this strategy has been applied to the 2D–3D correspondence-free registration problem from points and lines. Within the proposed procedure, deterministic and probabilistic procedures serve to speed up the nested branch-and-bound search while maintaining optimality. Experimental evaluation with synthetic and real data indicates that the proposed approach significantly increases both accuracy and robustness compared to the state of the art.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 96
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Juan M. Górriz, Javier Ramirez, MRC AIMS Consortium, John Suckling〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper we derive practical and novel upper bounds for the resubstitution error estimate by assessing the number of linear decision functions within the problem of pattern recognition in neuroimaging. Linear classifiers and regressors have been considered in many fields where the number of predictors far exceeds the number of training samples, to overcome the limitations of high-complexity models in terms of computation, interpretability and overfitting. In neuroimaging this is typically the rule rather than the exception, since the dimensionality of each observation (millions of voxels) in relation to the number of available samples (hundreds of scans) implies a high risk of overfitting. Based on classical combinatorial geometry, we estimate the number of hyperplanes or linear decision rules and the corresponding distribution-independent performance bounds, comparing them to those obtained using the VC-dimension concept. Experiments on synthetic and neuroimaging data demonstrate the performance of resubstitution error estimators, which are often overlooked in heterogeneous scenarios where their performance is similar to that obtained by cross-validation methods.〈/p〉〈/div〉
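    The combinatorial object behind such counting arguments is the number of dichotomies a linear classifier can realise. A quick sketch of Cover's function-counting theorem, the classical result this line of work builds on (the paper's bounds are more refined):

    ```python
    from math import comb

    def cover_dichotomies(n: int, d: int) -> int:
        """Cover's function-counting theorem: the number of dichotomies of n
        points in general position in R^d that a homogeneous linear
        classifier (a hyperplane through the origin) can realise."""
        return 2 * sum(comb(n - 1, k) for k in range(d))

    # With n <= d, every one of the 2^n dichotomies is linearly separable.
    assert cover_dichotomies(3, 3) == 2 ** 3
    # Far beyond n = 2d almost no dichotomy is separable, which is what
    # makes distribution-independent bounds on linear rules possible.
    separable_fraction = cover_dichotomies(40, 5) / 2 ** 40
    ```

    In the neuroimaging regime the roles are reversed (d far exceeds n), so every labelling is realisable and the count itself, rather than separability, drives the bound.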
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 97
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Ying Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper shows that pairwise PageRank orders emerge from two-hop walks. The main tool used here is a specially designed sign-mirror function and a parameter curve, whose low-order derivative information implies pairwise PageRank orders with high probability. We study the pairwise correct rate by placing the Google matrix 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si11.svg"〉〈mi mathvariant="bold"〉G〈/mi〉〈/math〉 in a probabilistic framework, where 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si11.svg"〉〈mi mathvariant="bold"〉G〈/mi〉〈/math〉 may be equipped with different random ensembles for model-generated or real-world networks with sparse, small-world, scale-free features; the proof combines mathematical and numerical evidence. We believe that the underlying spectral distribution of the aforementioned networks is responsible for the high pairwise correct rate. Moreover, the perspective of this paper naturally leads to an 〈em〉O〈/em〉(1) algorithm for any single pairwise PageRank comparison, assuming that both 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si61.svg"〉〈mrow〉〈mi mathvariant="bold"〉A〈/mi〉〈mo linebreak="goodbreak"〉=〈/mo〉〈mi mathvariant="bold"〉G〈/mi〉〈mo linebreak="goodbreak"〉−〈/mo〉〈msub〉〈mi mathvariant="bold"〉I〈/mi〉〈mi〉n〈/mi〉〈/msub〉〈mo〉,〈/mo〉〈/mrow〉〈/math〉 where 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si62.svg"〉〈msub〉〈mi mathvariant="bold"〉I〈/mi〉〈mi〉n〈/mi〉〈/msub〉〈/math〉 denotes the identity matrix of order 〈em〉n〈/em〉, and 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si63.svg"〉〈msup〉〈mi mathvariant="bold"〉A〈/mi〉〈mn〉2〈/mn〉〈/msup〉〈/math〉 are available in advance (e.g., constructed offline in an incremental manner), based on which it is easy to extract the top 〈em〉k〈/em〉 list in 〈em〉O〈/em〉(〈em〉kn〈/em〉), making it possible for the PageRank algorithm to handle very large-scale datasets in real time.〈/p〉〈/div〉
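    For reference, the baseline the O(1) pairwise test is contrasted with is computing the full PageRank vector by power iteration on the Google matrix, at O(n) memory and one matrix-vector product per iteration. A minimal sketch on a toy graph (standard formulation with uniform teleportation and no dangling nodes; this is not the paper's sign-mirror construction):

    ```python
    import numpy as np

    def pagerank(adj: np.ndarray, alpha: float = 0.85, iters: int = 200) -> np.ndarray:
        """PageRank by power iteration; adj[i, j] = 1 encodes an edge j -> i.
        Assumes every node has at least one out-link (no dangling nodes)."""
        n = adj.shape[0]
        out_degree = adj.sum(axis=0)
        S = adj / out_degree                 # column-stochastic link matrix
        G = alpha * S + (1 - alpha) / n      # Google matrix, uniform teleport
        x = np.full(n, 1.0 / n)
        for _ in range(iters):
            x = G @ x
        return x / x.sum()

    # Toy graph: nodes 1, 2, 3 all link to node 0; node 0 links back to 1.
    adj = np.zeros((4, 4))
    for src, dst in [(1, 0), (2, 0), (3, 0), (0, 1)]:
        adj[dst, src] = 1.0
    pr = pagerank(adj)
    ```

    On this graph the hub (node 0) ranks first and its sole out-link target (node 1) second; a pairwise method answers exactly such order queries without ever forming the full vector.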
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 98
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Xianglin Guo, Xingyu Xie, Guangcan Liu, Mingqiang Wei, Jun Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Subspace segmentation or clustering remains a challenge of interest in computer vision when handling the complex noise present in high-dimensional data. Most current sparse-representation or minimum-rank based techniques are built on ℓ〈sub〉1〈/sub〉-norm or ℓ〈sub〉2〈/sub〉-norm losses, which are sensitive to outliers. Finite mixture models, a class of powerful and flexible tools for modeling complex noise, are therefore a natural choice. Among the available choices, a mixture of exponential power distributions is extremely useful in practice due to its universal approximation ability for any continuous distribution, and it hence covers a broad range of noise characteristics. Equipped with this modeling idea, this paper addresses the complex-noise-contaminated subspace clustering problem using a finite mixture of exponential power (MoEP) distributions. We harness a penalized likelihood function to perform automatic model selection and hence avoid over-fitting. Moreover, we introduce a novel prior on the singular values of the representation matrix, which leads to a novel penalty in our nonconvex and nonsmooth optimization. The parameters of the MoEP model can be estimated with a Maximum A Posteriori (MAP) method, while the subspace is computed with joint weighted ℓ〈sub〉〈em〉p〈/em〉〈/sub〉-norm and Schatten-〈em〉q〈/em〉 quasi-norm minimization. Both theoretical and experimental results show the effectiveness of our method.〈/p〉〈/div〉
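    The Schatten-q quasi-norm appearing in the optimisation is simply the l_q (quasi-)norm of the singular values: q = 1 recovers the nuclear norm, q = 2 the Frobenius norm, and q &lt; 1 gives the nonconvex low-rank surrogates used in the paper. A minimal sketch:

    ```python
    import numpy as np

    def schatten_q(M: np.ndarray, q: float) -> float:
        """Schatten-q (quasi-)norm of a matrix: the l_q (quasi-)norm of its
        singular values. q = 1 is the nuclear norm, q = 2 the Frobenius
        norm; 0 < q < 1 yields the nonconvex quasi-norms used as
        low-rank-promoting penalties."""
        s = np.linalg.svd(M, compute_uv=False)
        return float((s ** q).sum() ** (1.0 / q))

    M = np.diag([3.0, 4.0])
    nuclear = schatten_q(M, 1)     # sum of singular values
    frob = schatten_q(M, 2)        # matches np.linalg.norm(M, "fro")
    ```

    Smaller q penalises small singular values more aggressively relative to large ones, which is why q &lt; 1 promotes low-rank solutions more strongly than the nuclear norm.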
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 99
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Jacopo Cavazza, Pietro Morerio, Vittorio Murino〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Despite the recent deep learning (DL) revolution, kernel machines remain powerful methods for action recognition. DL has brought the use of large datasets, which is typically a problem for kernel approaches, as they do not scale up efficiently due to the cost of kernel Gram matrices. Nevertheless, kernel methods are still attractive and more generally applicable, since they can handle datasets of different sizes equally well, including cases where DL techniques show limitations. This work investigates these issues by proposing an explicit approximated representation that, together with a linear model, is an equivalent, yet scalable, implementation of a kernel machine. Our approximation is directly inspired by the exact feature map induced by an RBF Gaussian kernel but, unlike the latter, it is finite dimensional and very compact. We justify the soundness of our idea with a theoretical analysis which proves the unbiasedness of the approximation and provides a vanishing bound for its variance, which is shown to decrease much more rapidly than in alternative methods in the literature. In a broad experimental validation, we assess the superiority of our approximation in terms of (1) ease and speed of training, (2) compactness of the model, and (3) improvements with respect to the state-of-the-art performance.〈/p〉〈/div〉
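    The paper's compact map is derived from the exact RBF feature expansion; the general flavour of finite-dimensional kernel approximations can be seen with the classic random Fourier features of Rahimi and Recht, a different, randomised construction shown here only to illustrate how an explicit map plus a linear model can stand in for a Gaussian kernel machine:

    ```python
    import numpy as np

    def rff_map(X: np.ndarray, D: int, gamma: float, rng) -> np.ndarray:
        """Random Fourier features: an explicit D-dimensional map z such
        that z(x) . z(y) approximates exp(-gamma * ||x - y||^2) in
        expectation (Rahimi and Recht's construction, not the paper's)."""
        d = X.shape[1]
        W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
        b = rng.uniform(0.0, 2.0 * np.pi, size=D)
        return np.sqrt(2.0 / D) * np.cos(X @ W + b)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 4))
    gamma = 0.5
    Z = rff_map(X, D=5000, gamma=gamma, rng=rng)

    # Compare the approximate Gram matrix against the exact RBF kernel.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K_exact = np.exp(-gamma * sq_dists)
    max_err = np.abs(Z @ Z.T - K_exact).max()
    ```

    A linear model trained on Z then behaves like a kernel machine without ever materialising the n x n Gram matrix, which is the scalability argument the abstract makes; the Monte Carlo error here shrinks as 1/sqrt(D), whereas the paper proves a faster-vanishing variance for its deterministic map.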
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 100
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Danyang Zhang, Huadong Ma, Linqiang Pan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper proposes a novel connected components labeling (CCL) approach that introduces a gamma signal to record certain mask pixels’ values, eliminating duplicated pixel checking and regulating the labeling process for higher efficiency. A new block-based two-scan CCL algorithm, the Eight-Connected Gamma-Signal-regulated (ECGS) algorithm, is designed and developed by applying this approach: it evaluates a block of 2 × 2 pixels (with just 6 mask pixels) in each iteration, so the total number of operations is considerably reduced and labeling efficiency is significantly improved. The experiments conducted on a public benchmark, YACCLAB (Yet Another Connected Components Labeling Benchmark), demonstrate that the proposed ECGS algorithm outperforms current state-of-the-art CCL algorithms on a wide range of digital images.〈/p〉〈/div〉
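    ECGS itself adds block decisions and the gamma signal, but the two-scan scheme it accelerates is classical: a first scan assigns provisional labels from already-visited mask neighbours and records equivalences in a union-find structure, and a second scan resolves them. A minimal pixel-wise 8-connected sketch of that baseline (without the 2 × 2 block evaluation):

    ```python
    import numpy as np

    def two_scan_label(img: np.ndarray) -> np.ndarray:
        """Classic two-scan 8-connected components labeling with union-find."""
        h, w = img.shape
        labels = np.zeros((h, w), dtype=int)
        parent = [0]                       # parent[i] for union-find; 0 = background

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path halving
                x = parent[x]
            return x

        def union(a, b):
            ra, rb = find(a), find(b)
            if ra != rb:
                parent[max(ra, rb)] = min(ra, rb)

        # First scan: provisional labels from the 4 already-visited mask pixels.
        for y in range(h):
            for x in range(w):
                if not img[y, x]:
                    continue
                neigh = [labels[y + dy, x + dx]
                         for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1))
                         if 0 <= y + dy and 0 <= x + dx < w and labels[y + dy, x + dx]]
                if neigh:
                    labels[y, x] = min(neigh)
                    for n in neigh:
                        union(labels[y, x], n)
                else:
                    parent.append(len(parent))     # fresh provisional label
                    labels[y, x] = len(parent) - 1
        # Second scan: resolve equivalences to representative labels.
        for y in range(h):
            for x in range(w):
                if labels[y, x]:
                    labels[y, x] = find(labels[y, x])
        return labels

    img = np.array([[1, 1, 0, 0],
                    [0, 1, 0, 1],
                    [0, 0, 0, 1]])
    lab = two_scan_label(img)
    ```

    The mask of already-visited neighbours here is the redundancy ECGS targets: evaluating 2 × 2 blocks with a shared 6-pixel mask, and skipping pixels the gamma signal has already decided, cuts the per-pixel checks substantially.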
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier