ALBERT

All Library Books, journals and Electronic Records Telegrafenberg


Filter
  • Articles  (2,799)
  • 2015-2019  (2,799)
  • Pattern Recognition  (502)
  • IEEE Internet Computing Online  (437)
  • Computer Science  (2,799)
  • 1
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Rameswar Panda, Amran Bhuiyan, Vittorio Murino, Amit K. Roy-Chowdhury
    Abstract: Existing approaches for person re-identification have concentrated on either designing the best feature representation or learning optimal matching metrics in a static setting where the number of cameras in a network is fixed. Most approaches have neglected the dynamic and open-world nature of the re-identification problem, where one or multiple new cameras may be temporarily on-boarded into an existing system to get additional information or added to expand an existing network. To address this very practical problem, we propose a novel approach for adapting existing multi-camera re-identification frameworks with limited supervision. First, we formulate a domain-perceptive re-identification method based on the geodesic flow kernel that can effectively find the best source camera (already installed) to adapt to newly introduced target camera(s), without requiring a very expensive training phase. Second, we introduce a transitive inference algorithm for re-identification that can exploit the information from the best source camera to improve the accuracy across other camera pairs in a network of multiple cameras. Third, we develop a target-aware sparse prototype selection strategy for finding an informative subset of source camera data for data-efficient learning in resource-constrained environments. Our approach can greatly increase the flexibility and reduce the deployment cost of new cameras in many real-world dynamic camera networks. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art unsupervised alternatives whilst being extremely efficient to compute.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 2
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Chengzu Bai, Ren Zhang, Zeshui Xu, Rui Cheng, Baogang Jin, Jian Chen
    Abstract: Kernel entropy component analysis (KECA) is a recently proposed dimensionality reduction approach, which has shown superiority in many pattern analysis algorithms previously based on principal component analysis (PCA). The optimized KECA (OKECA) is a state-of-the-art extension of KECA and can return projections retaining more expressive power than KECA. However, OKECA is not robust to outliers and has high computational complexity attributed to its inherent L2-norm properties. To tackle these two problems, we propose a new variant of KECA, namely L1-norm-based KECA (L1-KECA), for data transformation and feature extraction. L1-KECA attempts to find a new kernel decomposition matrix such that the extracted features store the maximum information potential, which is measured by the L1-norm. Accordingly, we present a greedy iterative algorithm which has much faster convergence than OKECA's. Additionally, L1-KECA retains OKECA's capability to obtain accurate density estimation with very few features (just one or two). Moreover, a new semi-supervised L1-KECA classifier is developed and employed for data classification. Extensive experiments on different real-world datasets validate that our model is superior to most existing KECA-based and PCA-based approaches. Code has also been made publicly available. (A brief code sketch of the plain KECA baseline follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
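The record above builds on kernel entropy component analysis (KECA). As a point of reference only, here is a minimal sketch of the plain KECA baseline (not the paper's L1-KECA); an RBF kernel is assumed, and the function name, sigma and n_components are illustrative choices, not taken from the paper.

```python
import numpy as np

def keca_features(X, n_components=2, sigma=1.0):
    """Baseline kernel entropy component analysis (KECA).

    Projects data onto the kernel eigen-directions that contribute most to the
    Renyi quadratic entropy estimate V = (1/N^2) * 1^T K 1, i.e. the terms
    (sqrt(lambda_i) * e_i^T 1)^2.
    """
    n = X.shape[0]
    # RBF (Gaussian) kernel matrix
    sq_dists = np.sum(X**2, 1)[:, None] + np.sum(X**2, 1)[None, :] - 2 * X @ X.T
    K = np.exp(-sq_dists / (2 * sigma**2))

    # Eigendecomposition of the symmetric kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K)

    # Entropy contribution of each component
    ones = np.ones(n)
    contrib = eigvals * (eigvecs.T @ ones) ** 2

    # Keep the components with the largest entropy contributions
    idx = np.argsort(contrib)[::-1][:n_components]
    # KECA projection: selected eigenvectors scaled by sqrt(eigenvalue)
    return eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    Z = keca_features(X, n_components=2, sigma=2.0)
    print(Z.shape)  # (100, 2)
```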
  • 3
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Samitha Herath, Basura Fernando, Mehrtash Harandi
    Abstract: In this paper we raise two important questions: "1. Is temporal information beneficial in recognizing actions from still images? 2. Do we know how to take the maximum advantage of it?". To answer these questions we propose a novel transfer learning problem, Temporal To Still Image Learning (i.e., T2SIL), where we learn to derive temporal information from still images. Thereafter, we use a two-stream model where still image action predictions are fused with derived temporal predictions. In T2SIL, knowledge transfer occurs from temporal representations of videos (e.g., optical-flow and dynamic image representations) to still action images. Along with T2SIL we propose a new still-image action dataset and a video dataset sharing the same set of classes. We explore three well-established transfer learning frameworks (i.e., GANs, embedding learning and Teacher-Student Networks (TSNs)) in place of the temporal knowledge transfer method. The use of derived temporal information from our TSN and embedding learning improves still image action recognition.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 4
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Pooya Ashtari, Fateme Nateghi Haredasht, Hamid Beigy
    Abstract: Centroid-based methods including k-means and fuzzy c-means are known as effective and easy-to-implement approaches to clustering in many applications. However, these algorithms cannot be directly applied to supervised tasks. This paper thus presents a generative model extending the centroid-based clustering approach to be applicable to classification and regression tasks. Given an arbitrary loss function, the proposed approach, termed Supervised Fuzzy Partitioning (SFP), incorporates label information into its objective function through a surrogate term penalizing the empirical risk. Entropy-based regularization is also employed to fuzzify the partition and to weight features, enabling the method to capture more complex patterns, identify significant features, and yield better performance on high-dimensional data. An iterative algorithm based on a block coordinate descent scheme is formulated to efficiently find a local optimum. Extensive classification experiments on synthetic, real-world, and high-dimensional datasets demonstrate that the predictive performance of SFP is competitive with state-of-the-art algorithms such as SVM and random forest. SFP has a major advantage over such methods, in that it not only leads to a flexible, nonlinear model but also can exploit any convex loss function in the training phase without compromising computational efficiency.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 5
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Younghoon Kim, Hyungrok Do, Seoung Bum Kim
    Abstract: Graph-based clustering is an efficient method for identifying clusters in local and nonlinear data patterns. Among the existing methods, spectral clustering is one of the most prominent algorithms. However, this method is vulnerable to noise and outliers. This study proposes a robust graph-based clustering method that removes the data nodes of relatively low density. The proposed method calculates a pseudo-density from a similarity matrix and reconstructs it using a sparse regularization model. In this process, noise and outlying points are identified and removed. Unlike previous edge-cutting-based methods, the proposed method is robust to noise while detecting clusters because it cuts out irrelevant nodes. We use simulated and real-world data to demonstrate the usefulness of the proposed method by comparing it to existing methods in terms of clustering accuracy and robustness to noisy data. The comparison results confirm that the proposed method outperforms the alternatives. (An illustrative sketch of the general filter-then-cluster idea follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
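The record above filters out low-density nodes before clustering. The sketch below illustrates that general idea only, under simplifying assumptions: pseudo-density is taken as the row sum of an RBF similarity matrix and a fixed quantile is dropped, whereas the paper reconstructs the density with a sparse regularization model; all names and parameters are illustrative.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def density_filtered_spectral(X, n_clusters=3, drop_frac=0.1, gamma=1.0):
    """Remove low pseudo-density nodes, then run spectral clustering.

    Pseudo-density here is simply the row sum of the RBF similarity matrix;
    points whose density falls below the `drop_frac` quantile are treated as
    noise/outliers and left unlabelled (-1).
    """
    S = rbf_kernel(X, gamma=gamma)
    density = S.sum(axis=1)
    keep = density >= np.quantile(density, drop_frac)

    labels = np.full(X.shape[0], -1, dtype=int)
    sc = SpectralClustering(n_clusters=n_clusters, affinity="precomputed",
                            random_state=0)
    # Cluster only the retained nodes, using the precomputed affinity submatrix
    labels[keep] = sc.fit_predict(S[np.ix_(keep, keep)])
    return labels
```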
  • 6
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Qiong Wang, Lu Zhang, Wenbin Zou, Kidiyo Kpalma
    Abstract: In this paper, we present a novel method for salient object detection in videos. Salient object detection methods based on a background prior may miss the salient region when the salient object touches the frame borders. To solve this problem, we propose to detect the whole salient object via the adjunction of virtual borders. A guided filter is then applied to the temporal output to integrate the spatial edge information for a better detection of the salient object edges. Finally, a global spatio-temporal saliency map is obtained by combining the spatial saliency map and the temporal saliency map according to their entropy. The proposed method is assessed on three popular datasets (Fukuchi, FBMS and VOS) and compared to several state-of-the-art methods. The experimental results show that the proposed approach outperforms the tested methods.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 7
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Zhuoyao Zhong, Lei Sun, Qiang Huo
    Abstract: Although Faster R-CNN based text detection approaches have achieved promising results, their localization accuracy is not satisfactory in certain cases due to their sub-optimal bounding box regression based localization modules. In this paper, we address this problem and propose replacing the bounding box regression module with a novel LocNet based localization module to improve the localization accuracy of a Faster R-CNN based text detector. Given a proposal generated by a region proposal network (RPN), instead of directly predicting the bounding box coordinates of the concerned text instance, the proposal is enlarged to create a search region, and an "In-Out" conditional probability is assigned to each row and column of this search region, which can then be used to accurately infer the concerned bounding box. Furthermore, we present a simple yet effective two-stage approach to convert the difficult multi-oriented text detection problem into a relatively easier horizontal text detection problem, which makes our approach able to robustly detect multi-oriented text instances with accurate bounding box localization. Experiments demonstrate that the proposed approach boosts the localization accuracy of Faster R-CNN based text detectors significantly. Consequently, our new text detector has achieved superior performance on both horizontal (ICDAR-2011, ICDAR-2013 and MULTILINGUAL) and multi-oriented (MSRA-TD500, ICDAR-2015) text detection benchmark tasks.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 8
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Chunfeng Song, Yongzhen Huang, Yan Huang, Ning Jia, Liang Wang
    Abstract: Gait recognition is one of the most important techniques for human identification at a distance. Most current gait recognition frameworks consist of several separate steps: silhouette segmentation, feature extraction, feature learning, and similarity measurement. These modules are mutually independent with each part fixed, resulting in suboptimal performance in challenging conditions. In this paper, we integrate those steps into one framework, i.e., an end-to-end network for gait recognition, named GaitNet. It is composed of two convolutional neural networks: one for gait segmentation and the other for classification. The two networks are modeled in one joint learning procedure and can be trained jointly. This strategy greatly simplifies the traditional step-by-step manner and is thus much more efficient for practical applications. Moreover, joint learning can automatically adjust each part to fit the global objective, leading to an obvious performance improvement over separate learning. We evaluate our method on three large-scale gait datasets, including CASIA-B, SZU RGB-D Gait and a newly built database with complex dynamic outdoor backgrounds. Extensive experimental results show that the proposed method is effective and achieves state-of-the-art results. The code and data will be released upon request.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 9
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Chuan-Xian Ren, Xiao-Lin Xu, Zhen Lei
    Abstract: Person re-identification (re-ID) is to match different images of the same pedestrian. It has attracted increasing research interest in pattern recognition and machine learning. Traditionally, person re-ID is formulated as a metric learning problem with binary classification output. However, higher-order relationships, such as triplet closeness among the instances, are ignored by such pair-wise metric learning methods. Thus, the discriminative information hidden in these data is insufficiently explored. This paper proposes a new structured loss function to push the frontier of person re-ID performance in realistic scenarios. The new loss function introduces two margin parameters. They operate as bounds that remove positive pairs of very small distances and negative pairs of large distances. A trade-off coefficient is assigned to the loss term of negative pairs to alleviate the class-imbalance problem. By using a linear function with the margin-based objectives, the gradients w.r.t. weight matrices are no longer dependent on the iterative loss values in a multiplicative manner. This makes the weight update process robust to large iterative loss values. The new loss function is compatible with many deep learning architectures; thus, it induces a new deep network with pair-pruning regularization for metric learning. To evaluate the performance of the proposed model, extensive experiments are conducted on benchmark datasets. The results indicate that the new loss together with the ResNet-50 backbone has excellent feature representation ability for person re-ID. (An illustrative sketch of a two-margin pair loss follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
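The loss described above uses two margin parameters and a trade-off coefficient for negative pairs. The following is a hedged, illustrative sketch of a generic two-margin pairwise loss of that kind; the exact functional form, names and default margins are assumptions, not the authors' definition.

```python
import numpy as np

def pair_pruning_loss(d_pos, d_neg, m_pos=0.5, m_neg=1.5, beta=0.3):
    """Two-margin pairwise loss for metric learning (illustrative only).

    d_pos : distances of positive (same-identity) pairs
    d_neg : distances of negative (different-identity) pairs
    Positive pairs closer than m_pos and negative pairs farther than m_neg are
    'pruned' (their hinge term is zero); beta trades off the negative-pair term
    to counter class imbalance.
    """
    pos_term = np.maximum(d_pos - m_pos, 0.0).mean()
    neg_term = np.maximum(m_neg - d_neg, 0.0).mean()
    return pos_term + beta * neg_term

# Usage sketch with random pair distances
rng = np.random.default_rng(0)
print(pair_pruning_loss(rng.uniform(0, 1, 100), rng.uniform(1, 3, 400)))
```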
  • 10
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Shuzhao Li, Huimin Yu, Roland Hu
    Abstract: Person attributes are often exploited as mid-level human semantic information to help promote the performance of the person re-identification task. In this paper, unlike most existing methods that simply take attribute learning as a classification problem, we perform it in a different way with the motivation that attributes are related to specific local regions, which refers to the perceptual ability of attributes. We utilize the process of attribute detection to generate corresponding attribute-part detectors, whose invariance to many influences like poses and camera views can be guaranteed. With detected local part regions, our model extracts local part features to handle the body part misalignment problem, which is another major challenge for person re-identification. The local descriptors are further refined by fused attribute information to eliminate interferences caused by detection deviation. Finally, the refined local feature works together with a holistic-level feature to constitute our final feature representation. Extensive experiments on two popular benchmarks with attribute annotations demonstrate the effectiveness of our model and competitive performance compared with state-of-the-art algorithms.
    Graphical abstract: https://ars.els-cdn.com/content/image/1-s2.0-S003132031930319X-fx1.jpg
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 11
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Xin Wei, Hui Wang, Bryan Scotney, Huan Wan
    Abstract: Face recognition has achieved great success owing to the fast development of deep neural networks in the past few years. Different loss functions can be used in a deep neural network, resulting in different performance. Most recently some loss functions have been proposed which have advanced the state of the art. However, they cannot solve the problem of margin bias, which is present in class-imbalanced datasets having so-called long-tailed distributions. In this paper, we propose to solve the margin bias problem by setting a minimum margin for all pairs of classes. We present a new loss function, Minimum Margin Loss (MML), which is aimed at enlarging the margin of those overclose class centre pairs so as to enhance the discriminative ability of the deep features. MML, together with Softmax Loss and Centre Loss, supervises the training process to balance the margins of all classes irrespective of their class distributions. We implemented MML in Inception-ResNet-v1 and conducted extensive experiments on seven face recognition benchmark datasets: MegaFace, FaceScrub, LFW, SLLFW, YTF, IJB-B and IJB-C. Experimental results show that the proposed MML loss function has led to a new state of the art in face recognition, reducing the negative effect of margin bias. (An illustrative sketch of a minimum-margin penalty over class centres follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
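The Minimum Margin Loss above penalizes "overclose" class-centre pairs. Below is a minimal illustrative sketch of one way such a minimum-margin penalty over class centres can be written; the squared-hinge form and all names are assumptions, not the paper's exact loss.

```python
import numpy as np

def minimum_margin_loss(centres, min_margin=2.0):
    """Penalise pairs of class centres that are closer than `min_margin`.

    centres : (C, d) array of per-class feature centres.
    Only overclose centre pairs contribute, which pushes them apart until
    every pairwise distance reaches the minimum margin.
    """
    diffs = centres[:, None, :] - centres[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1) + 1e-12)
    iu = np.triu_indices(len(centres), k=1)        # count each pair once
    gaps = np.maximum(min_margin - dists[iu], 0.0)
    return (gaps ** 2).mean()

# Usage sketch with random class centres
rng = np.random.default_rng(0)
print(minimum_margin_loss(rng.normal(size=(10, 128)), min_margin=2.0))
```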
  • 12
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Mahsa Taheri, Zahra Moslehi, Abdolreza Mirzaei, Mehran Safayani
    Abstract: Measuring the distance between data point pairs is a necessary step in numerous algorithms in machine learning, pattern recognition and data mining. From the local perspective, the emphasis of all existing supervised metric learning algorithms is to shrink similar data points and to separate dissimilar ones in local neighborhoods. This enables learning a more appropriate distance metric when dealing with within-class multimodal data. In this article, a new supervised local metric learning method named Self-Adaptive Local Metric Learning Method (SA-LM2) is proposed. The contribution of this method lies in five aspects. First, in this method, learning an appropriate metric and defining the radius of the local neighborhood are integrated in a joint formulation. Second, unlike traditional approaches, SA-LM2 learns the local neighborhood parameter automatically through its formulation. As a result, it is a parameter-free method that does not require any parameters to be tuned. Third, SA-LM2 is formulated as a SemiDefinite Program (SDP) with a global convergence guarantee. Fourth, this method does not need the similarity set S; the focus here is on local areas' data points and their separation from dissimilar ones. Finally, the results of SA-LM2 are less influenced by noisy input data points than those of the other compared global and local algorithms. Results obtained from different experiments indicate that this algorithm outperforms its counterparts.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 13
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Zheng Ma, Jun Cheng, Dapeng Tao
    Abstract: Wearable/portable brain-computer interfaces (BCIs) for long-term end use are a focus of recent BCI research. A challenge is how to update the BCI to meet changes in electroencephalography (EEG) signals, since resources are so limited that retraining of traditional well-performing models, such as a support vector machine, is nearly impossible. To cope with this challenge, less-demanding adaptive online learning can be considered. We investigated an adaptive projected sub-gradient method (APSM) that originates from the set-theoretic estimation formulation and the theory of projections onto convex sets. APSM provides a unifying framework for both adaptive classification and regression tasks. Coefficients of APSM are adjusted online as data arrive sequentially, with a regularization constraint imposed by projections onto a fixed closed ball. We extended the general APSM to a shrinkage form, where shrinking closed balls are used instead of the original fixed one, expecting a more controllable fading effect and better adaptability. The convergence of shrinkage APSM was proved. It was also demonstrated that as the shrinkage factor approaches 1, the limit point of shrinkage APSM approaches the optimal solution with the least norm, which can be especially beneficial for generalization of the classifier. The performance of the proposed method was evaluated, and compared with those of the general APSM, the incremental support vector machine, and the passive aggressive algorithm, through an event-related potential-based BCI experiment. Results showed the advantage of the proposed method over the others in both online classification performance and ease of tuning. Our study revealed the effectiveness of the proposed method for adaptive EEG classification, making it a promising tool for on-device training and updating of wearable/portable BCIs, as well as for application in other related fields, such as EEG-based biometrics.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 14
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Franco Manessi, Alessandro Rozza, Mario Manzo
    Abstract: In many different classification tasks it is required to manage structured data, which are usually modeled as graphs. Moreover, these graphs can be dynamic, meaning that the vertices/edges of each graph may change over time. The goal is to exploit existing neural network architectures to model datasets that are best represented with graph structures that change over time. To the best of the authors' knowledge, this task has not been addressed using these kinds of architectures. Two novel approaches are proposed, which combine Long Short-Term Memory networks and Graph Convolutional Networks to learn long short-term dependencies together with graph structure. The advantage provided by the proposed methods is confirmed by the results achieved on four real-world datasets: an increase of up to 12 percentage points in Accuracy and F1 scores for vertex-based semi-supervised classification and up to 2 percentage points in Accuracy and F1 scores for graph-based supervised classification.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 15
    Publication Date: 2019
    Description: Publication date: January 2020. Source: Pattern Recognition, Volume 97.
    Author(s): Ying Liu, Konstantinos Tountas, Dimitris A. Pados, Stella N. Batalama, Michael J. Medley
    Abstract: High-dimensional data usually exhibit intrinsic low-rank structures. With the tremendous amount of streaming data generated by ubiquitous sensors in the world of the Internet-of-Things, fast detection of such low-rank patterns is of utmost importance to a wide range of applications. In this work, we present an L1-subspace tracking method to capture the low-rank structure of streaming data. The method is based on L1-norm principal-component analysis (L1-PCA) theory, which offers outlier resistance in subspace calculation. The proposed method updates the L1-subspace as new data are acquired by sensors. In each time slot, the conformity of each datum is measured by the L1-subspace calculated in the previous time slot and used to weigh the datum. Iterative weighted L1-PCA is then executed through a refining function. The superiority of the proposed L1-subspace tracking method compared to existing approaches is demonstrated through experimental studies in various application fields.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 16
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Yuming Fang, Xiaoqiang Zhang, Feiniu Yuan, Nevrez Imamoglu, Haiwen Liu
    Abstract: Image saliency detection has been widely explored in recent decades, but computational modeling of visual attention for video sequences is limited due to complicated temporal saliency extraction and fusion of spatial and temporal saliency. Inspired by Gestalt theory, we introduce a novel spatiotemporal saliency detection model in this study. First, we compute spatial and temporal saliency maps from low-level visual features. We then merge these two saliency maps for spatiotemporal saliency prediction of video sequences. The spatial saliency map is calculated by extracting three kinds of features including color, luminance, and texture, while the temporal saliency map is computed by extracting motion features estimated from video sequences. A novel adaptive entropy-based uncertainty weighting method is designed to fuse the spatial and temporal saliency maps into the final spatiotemporal saliency map according to Gestalt theory. The Gestalt principle of similarity is used to estimate spatial uncertainty from spatial saliency, while temporal uncertainty is computed from temporal saliency by the Gestalt principle of common fate. Experimental results on three large-scale databases show that our method can predict visual saliency more accurately than state-of-the-art spatiotemporal saliency detection algorithms. (A loosely related illustrative sketch of entropy-weighted map fusion follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
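The record above fuses spatial and temporal saliency maps with entropy-based uncertainty weights. The sketch below shows one plausible, greatly simplified instantiation (weights inversely proportional to the Shannon entropy of each map); it is not the paper's Gestalt-based formulation, and all names and parameters are assumptions.

```python
import numpy as np

def entropy_weighted_fusion(spatial, temporal, eps=1e-12, bins=64):
    """Fuse two saliency maps with simple entropy-based uncertainty weights.

    A map whose value distribution has lower Shannon entropy is treated as
    less uncertain and therefore receives a larger weight.
    """
    def normalise(m):
        m = m.astype(float)
        return (m - m.min()) / (m.max() - m.min() + eps)

    def entropy(m):
        hist, _ = np.histogram(m, bins=bins, range=(0.0, 1.0))
        p = hist / max(hist.sum(), 1)
        p = p[p > 0]
        return -(p * np.log(p)).sum()

    s, t = normalise(spatial), normalise(temporal)
    w_s, w_t = 1.0 / (entropy(s) + eps), 1.0 / (entropy(t) + eps)
    w_s, w_t = w_s / (w_s + w_t), w_t / (w_s + w_t)   # normalise weights
    return w_s * s + w_t * t
```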
  • 17
    Publication Date: 2019
    Description: Publication date: Available online 28 March 2019. Source: Pattern Recognition.
    Author(s): Rui Ye, Qun Dai, Mei Ling Li
    Abstract: Traditional machine learning is generally committed to obtaining classifiers that perform well on unlabeled test data. This usually relies on two critical assumptions: first, sufficient labeled training data are available; second, training and testing data are drawn from the same distribution and the same feature space. Unfortunately, in most cases it is difficult to meet these conditions in practice. The transfer learning scheme is naturally proposed to alleviate this problem. In order to obtain robust classifiers with relatively low computational cost, we incorporate the rationale of the Support Vector Machine (SVM) into the transfer learning scheme and propose a novel SVM-based transfer learning model, abbreviated as TrSVM. In this method, support vector sets are extracted to represent the source domain. New training datasets are constructed by combining each support vector set with the target labeled dataset. On the basis of these training datasets, a number of new base classifiers can be acquired. Since the performance of a classifier ensemble is generally superior to that of individual classifiers, ensemble selection is utilized in our work. A hybrid transfer learning algorithm, integrating the Genetic Algorithm based Selective Ensemble (GASEN) with TrSVM, is proposed and naturally abbreviated as TrGASVM. GASEN is a genetic algorithm-based heuristic for solving combinatorial optimization problems. It can not only enhance the generalization ability of an ensemble, but also alleviate the local minimum problem of greedy ensemble pruning methods. Since TrGASVM is built on the framework of TrSVM and GASEN, it inherits the advantages of both algorithms. The reasonable incorporation of TrSVM with GASEN endows TrGASVM with favorable transfer learning capability, with its effectiveness demonstrated by experimental results on three real-world text classification datasets.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 18
    Publication Date: 2019
    Description: Publication date: July 2019. Source: Pattern Recognition, Volume 91.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 19
    Publication Date: 2019
    Description: Publication date: Available online 26 March 2019. Source: Pattern Recognition.
    Author(s): Peizhen Bai, Yan Ge, Fangling Liu, Haiping Lu
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 20
    Publication Date: 2019
    Description: Publication date: August 2019. Source: Pattern Recognition, Volume 92.
    Author(s): Hongsen Liu, Yang Cong, Chenguang Yang, Yandong Tang
    Abstract: Accurate 3D object recognition and 6-DOF pose estimation have been pervasively applied to a variety of applications, such as unmanned warehouses, cooperative robots, and the manufacturing industry. How to extract a robust and representative feature from point clouds is an unavoidable and important issue. In this paper, an unsupervised feature learning network is introduced to extract 3D keypoint features from point clouds directly, rather than transforming point clouds to voxel grids or projected RGB images, which saves computational time while preserving the object geometric information. Specifically, the proposed network features a stacked point feature encoder, which can stack the local discriminative features within its neighborhoods onto the original point-wise feature counterparts. The main framework consists of an offline training phase and an online testing phase. In the offline training phase, the stacked point feature encoder is trained first and then generates a feature database of all keypoints, which are sampled from synthetic point clouds of multiple model views. In the online testing phase, each feature extracted from the unknown testing scene is matched against the database using a K-D tree voting strategy. Afterwards, the matching results are obtained using a hypothesis-and-verification strategy. The proposed method is extensively evaluated on four public datasets and the results show that our method delivers comparable or even superior performance to the state of the art in terms of F1-score, average 3D distance (ADD) and recognition rate.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 21
    Publication Date: 2019
    Description: Publication date: Available online 28 March 2019. Source: Pattern Recognition.
    Author(s): Siyue Xie, Haifeng Hu, Yongbo Wu
    Abstract: Facial Expression Recognition (FER) has long been a challenging task in the field of computer vision. In this paper, we present a novel model, named Deep Attentive Multi-path Convolutional Neural Network (DAM-CNN), for FER. Different from most existing models, DAM-CNN can automatically locate expression-related regions in an expressional image and yield a robust image representation for FER. The proposed model contains two novel modules: an attention-based Salient Expressional Region Descriptor (SERD) and the Multi-Path Variation-Suppressing Network (MPVS-Net). SERD can adaptively estimate the importance of different image regions for the FER task, while MPVS-Net disentangles expressional information from irrelevant variations. By jointly combining SERD and MPVS-Net, DAM-CNN is able to highlight expression-relevant features and generate a variation-robust representation for expression classification. Extensive experimental results on both constrained datasets (CK+, JAFFE, TFEID) and unconstrained datasets (SFEW, FER2013, BAUM-2i) demonstrate the effectiveness of our DAM-CNN model.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 22
    Publication Date: 2019
    Description: Publication date: Available online 27 March 2019. Source: Pattern Recognition.
    Author(s): Wei-Hong Li, Zhuowei Zhong, Wei-Shi Zheng
    Abstract: Person re-identification (re-id) is to match people across disjoint camera views in a multi-camera system, and re-id has been an important technology applied in smart cities in recent years. However, the majority of existing person re-id methods assume that all data samples are available in advance for training. In a real-world scenario, person images detected from a multi-camera system arrive sequentially, and thus these methods are not designed for processing sequential data in an online way. While there is some work discussing online re-id, most of it requires considerable storage of all previously observed labelled data samples. In this work, we present a one-pass person re-id model that adapts the re-id model based on each newly observed sample, with no past data required for each update. More specifically, we develop Sketch online Discriminant Analysis (SoDA) by embedding sketch processing into Fisher discriminant analysis (FDA). SoDA can efficiently keep the main data variations of all past samples in a low-rank matrix when processing sequential data samples, and estimate the approximate within-class variance (i.e. the within-class covariance matrix) from the sketch data information. We provide theoretical analysis on the effect of the estimated approximate within-class covariance matrix. In particular, we derive upper and lower bounds on the Fisher discriminant score (i.e. the quotient between between-class variation and within-class variation after feature transformation) in order to investigate how the optimal feature transformation learned by SoDA sequentially approximates the offline FDA that is learned on all observed data. Extensive experimental results have shown the effectiveness of our SoDA and empirically support our theoretical analysis.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 23
    Publication Date: 2019
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89.
    Author(s): Farnoosh Ghadiri, Robert Bergevin, Guillaume-Alexandre Bilodeau
    Abstract: Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with a high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 24
    Publication Date: 2019
    Description: Publication date: April 2019. Source: Pattern Recognition, Volume 88.
    Author(s): Kun-Yi Huang, Chung-Hsien Wu, Ming-Hsiang Su
    Abstract: Mood disorders, including unipolar depression (UD) and bipolar disorder (BD), have become some of the commonest mental health disorders. The absence of diagnostic markers of BD can cause misdiagnosis of the disorder as UD on initial presentation. Short-term detection, which could be used in early detection and intervention, is desirable. This study proposed an approach for short-term detection of mood disorders based on elicited speech responses. Speech responses of participants were obtained through interviews by a clinician after participants viewed six emotion-eliciting videos. A domain adaptation method based on a hierarchical spectral clustering algorithm was proposed to adapt a labeled emotion database to a collected unlabeled mood database, alleviating the data bias problem in the emotion space. For modeling the local variation of emotions in each response, a convolutional neural network (CNN) with an attention mechanism was used to generate an emotion profile (EP) of each elicited speech response. Finally, long short-term memory (LSTM) was employed to characterize the temporal evolution of the EPs of all six speech responses. Moreover, an attention model was applied to the LSTM network to highlight pertinent speech responses and improve detection performance, instead of treating all responses equally. For evaluation, this study elicited emotional speech data from 15 people with BD, 15 people with UD, and 15 healthy controls. Leave-one-group-out cross-validation was employed for the compiled database and proposed method. CNN- and LSTM-based attention models improved the mood disorder detection accuracy of the proposed method by approximately 11%. Furthermore, the proposed method achieved an overall detection accuracy of 75.56%, outperforming support-vector-machine- (62.22%) and CNN-based (66.67%) methods.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 25
    Publication Date: 2019
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89.
    Author(s): Debasrita Chakraborty, Vaasudev Narayanan, Ashish Ghosh
    Abstract: Most datasets do not have an exactly equal number of samples for each class. However, there are some tasks, like detection of fraudulent transactions, for which class imbalance is overwhelming and one of the classes has a very small number of samples (even less than 10% of the entire data). These tasks often fall under outlier detection. Moreover, there are some scenarios where there may be multiple subsets of the outlier class. In such cases, the task should be treated as a multiple outlier type detection scenario. In this article, we propose a system that can efficiently handle all the aforementioned problems. We use stacked autoencoders to extract features and then an ensemble of probabilistic neural networks to perform majority voting and detect the outliers. Such a system shows better and more reliable performance compared with other outlier detection systems on most of the datasets tested. The use of autoencoders clearly enhances outlier detection performance.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 26
    Publication Date: 2019
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89.
    Author(s): Marcel Nwali, Simon Liao
    Abstract: In this research, we have developed a new algorithm to compute the moments defined in a rectangular region. By applying recurrence formulas, symmetry properties, and particularly parallelized matrix operations, our proposed computational method can greatly improve the efficiency of computing Legendre, Gegenbauer, and Jacobi moments with highly satisfactory accuracy. To verify this new computational algorithm, image reconstructions from higher orders of Legendre, Gegenbauer, and Jacobi moments were performed on a test image sized 1024 × 1024 with very encouraging results. It took only a few seconds to compute the moments and conduct the image reconstructions from the 1000th order of the Legendre, Gegenbauer, and Jacobi moments, with PSNR values up to 45. By utilizing our new algorithm, image analysis and recognition applications using higher orders of moments defined in a rectangular region will be possible in the range of milliseconds. (An illustrative sketch of Legendre moment computation via the three-term recurrence follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
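The record above computes orthogonal moments with recurrence formulas and matrix operations. As an illustration of that general recipe for the Legendre case only, here is a hedged sketch; the discretization and normalization details are simplifying assumptions, not the authors' implementation.

```python
import numpy as np

def legendre_matrix(order, n):
    """Rows are Legendre polynomials P_0..P_order sampled on n points in [-1, 1],
    built with the three-term recurrence (k+1)P_{k+1} = (2k+1)x P_k - k P_{k-1}."""
    x = np.linspace(-1.0, 1.0, n)
    P = np.zeros((order + 1, n))
    P[0] = 1.0
    if order >= 1:
        P[1] = x
    for k in range(1, order):
        P[k + 1] = ((2 * k + 1) * x * P[k] - k * P[k - 1]) / (k + 1)
    return P

def legendre_moments(image, order):
    """Approximate Legendre moments of a square image up to `order`
    using a single pair of matrix products (the parallelized formulation)."""
    n = image.shape[0]
    P = legendre_matrix(order, n)
    dx = 2.0 / (n - 1)                          # sampling step on [-1, 1]
    m = np.arange(order + 1)
    norm = np.outer(2 * m + 1, 2 * m + 1) / 4.0
    return norm * (P @ image @ P.T) * dx * dx

def reconstruct(moments, n):
    """Inverse transform: f_hat(x, y) = sum_mn lambda_mn P_m(x) P_n(y)."""
    order = moments.shape[0] - 1
    P = legendre_matrix(order, n)
    return P.T @ moments @ P
```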
  • 27
    Publication Date: 2019
    Description: Publication date: May 2019. Source: Pattern Recognition, Volume 89.
    Author(s): Jun Shi, Xiao Zheng, Jinjie Wu, Bangming Gong, Qi Zhang, Shihui Ying
    Abstract: Histopathological image analysis serves as the 'gold standard' for cancer diagnosis. Its computer-aided approach has attracted considerable attention in the field of digital pathology, and highly depends on the feature representation of histopathological images. The principal component analysis network (PCANet) is a novel unsupervised deep learning framework that has shown its effectiveness for feature representation learning. However, PCA is susceptible to noise and outliers, which affects the performance of PCANet. The Grassmann average (GA) is superior to PCA in robustness. In this work, a GA network (GANet) algorithm is proposed by embedding the GA algorithm into the PCANet framework. Moreover, since quaternion algebra is an excellent tool for representing color images, a quaternion-based GANet (QGANet) algorithm is further developed to learn effective feature representations containing color information for histopathological images. The experimental results on three histopathological image datasets indicate that the proposed QGANet achieves the best performance on the classification of color histopathological images among all the compared algorithms.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 28
    Publication Date: 2019
    Description: Publication date: December 2019. Source: Pattern Recognition, Volume 96.
    Author(s): Weiwei Qian, Shunming Li, Xingxing Jiang
    Abstract: Machine learning-based intelligent fault diagnosis methods have gained extensive popularity and been widely investigated. However, in previous works, a major assumption accepted by default is that the training and testing datasets share the same distribution. Unfortunately, this assumption is mostly invalid in real-world applications, because working condition variation of rotating machines can easily cause a distribution discrepancy between datasets, which results in performance degeneration of traditional diagnosis methods. Although some deep learning and transfer learning-based methods have recently been proposed and validated as effective, their dataset distribution alignment mainly focuses on marginal distribution alignment, which is not powerful enough in some scenarios. Hence, a novel distribution discrepancy evaluation method called auto-balanced high-order Kullback–Leibler (AHKL) divergence is proposed, which can evaluate both first- and higher-order moment discrepancies and adapt the weights between them dimensionally and automatically. Meanwhile, smooth conditional distribution alignment (SCDA) is also developed, which performs excellently in aligning the conditional distributions by introducing soft labels instead of adopting the widely used pseudo labels. Furthermore, based on AHKL divergence and SCDA, weighted joint distribution alignment (WJDA) is developed for comprehensive joint distribution alignment. Finally, built on WJDA, we construct a novel deep transfer network (DTN) for rotating machine fault diagnosis under working condition variation. Extensive experimental evaluations across 18 transfer learning cases demonstrate its validity, and further comparisons with the state of the art also validate its superiority.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 29
    Publication Date: 2019
    Description: Publication date: Available online 26 September 2019. Source: Pattern Recognition.
    Author(s): Gary P.T. Choi, Hei Long Chan, Robin Yong, Sarbin Ranjitkar, Alan Brook, Grant Townsend, Ke Chen, Lok Ming Lui
    Abstract: Shape analysis is important in anthropology, bioarchaeology and forensic science for interpreting useful information from human remains. In particular, teeth are morphologically stable and hence well-suited for shape analysis. In this work, we propose a framework for tooth morphometry using quasi-conformal theory. Landmark-matching Teichmüller maps are used for establishing a 1-1 correspondence between tooth surfaces with prescribed anatomical landmarks. Then, a quasi-conformal statistical shape analysis model based on the Teichmüller mapping results is proposed for building a tooth classification scheme. We deploy our framework on a dataset of human premolars to analyze the tooth shape variation among genders and ancestries. Experimental results show that our method achieves much higher classification accuracy with respect to both gender and ancestry when compared to the existing methods. Furthermore, our model reveals the underlying tooth shape difference between different genders and ancestries in terms of the local geometric distortion and curvatures. In particular, our experiment suggests that the shape difference between genders is mostly captured by the conformal distortion but not the curvatures, while that between ancestries is captured by both of them.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 30
    Publication Date: 2019
    Description: Publication date: Available online 27 September 2019. Source: Pattern Recognition.
    Author(s): Marco Fiorucci, Francesco Pelosin, Marcello Pelillo
    Abstract: How can we separate structural information from noise in large graphs? To address this fundamental question, we propose a graph summarization approach based on Szemerédi's Regularity Lemma, a well-known result in graph theory, which roughly states that every graph can be approximated by the union of a small number of random-like bipartite graphs called "regular pairs". Hence, the Regularity Lemma provides us with a principled way to describe the essential structure of large graphs using a small amount of data. Our paper has several contributions: (i) We present our summarization algorithm, which is able to reveal the main structural patterns in large graphs. (ii) We discuss how to use our summarization framework to efficiently retrieve from a database the top-k graphs that are most similar to a query graph. (iii) Finally, we evaluate the noise robustness of our approach in terms of the reconstruction error and the usefulness of the summaries in addressing the graph search task.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 31
    Publication Date: 2019
    Description: Publication date: February 2020. Source: Pattern Recognition, Volume 98.
    Author(s): Xuhong Li, Yves Grandvalet, Franck Davoine
    Abstract: In inductive transfer learning, fine-tuning pre-trained convolutional networks substantially outperforms training from scratch. When using fine-tuning, the underlying assumption is that the pre-trained model extracts generic features, which are at least partially relevant for solving the target task, but would be difficult to extract from the limited amount of data available on the target task. However, besides the initialization with the pre-trained model and the early stopping, there is no mechanism in fine-tuning for retaining the features learned on the source task. In this paper, we investigate several regularization schemes that explicitly promote the similarity of the final solution with the initial model. We show the benefit of having an explicit inductive bias towards the initial model. We eventually recommend that the baseline protocol for transfer learning should rely on a simple L2 penalty using the pre-trained model as a reference. (A minimal sketch of such a penalty follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
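The recommendation above is a simple L2 penalty that uses the pre-trained model as the reference point. A minimal PyTorch-style sketch of such a penalty is given below, assuming torch is available; function and variable names are illustrative, not the authors' code.

```python
import torch

def l2_sp_penalty(model, reference_params, alpha=0.01):
    """L2 penalty that pulls the fine-tuned weights towards the pre-trained
    ones (the starting point), instead of towards zero."""
    penalty = 0.0
    for name, p in model.named_parameters():
        if p.requires_grad and name in reference_params:
            penalty = penalty + ((p - reference_params[name]) ** 2).sum()
    return alpha * penalty

# Usage sketch: snapshot the pre-trained weights once, then add the penalty
# to the task loss at every training step.
# reference = {n: p.detach().clone() for n, p in model.named_parameters()}
# loss = task_loss(model(x), y) + l2_sp_penalty(model, reference, alpha=0.01)
```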
  • 32
    Publication Date: 2019
    Description: Publication date: April 2020. Source: Pattern Recognition, Volume 100.
    Author(s): Bailin Yang, Yulong Zhang, Zhenguang Liu, Xiaoheng Jiang, Mingliang Xu
    Abstract: Writing is an important basic skill for humans. To acquire such a skill, pupils often have to practice writing for several hours each day. However, different pupils usually possess distinct writing postures. Bad postures not only affect the speed and quality of writing, but also severely harm the healthy development of pupils' spine and eyesight. Therefore, it is of key importance to identify or predict pupils' writing postures and accordingly correct bad ones. In this paper, we formulate the problem of handwriting posture prediction for the first time. Further, we propose a neural network constructed with small convolution kernels to extract features from handwriting, and incorporate unsupervised learning and handwriting data analysis to predict writing postures. Extensive experiments reveal that our approach achieves an accuracy rate of 93.3%, which is significantly higher than the 76.67% accuracy of human experts.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 33
    Publication Date: 2019
    Description: Publication date: March 2020. Source: Pattern Recognition, Volume 99.
    Author(s): Shichao Kan, Linna Zhang, Zhihai He, Yigang Cen, Shiming Chen, Jikun Zhou
    Abstract: Feature fusion is an important technique for improving performance in computer vision; the difficult problem in feature fusion is how to learn the complementary properties of different features. We recognize that feature fusion can benefit from kernel metric learning. Thus, a metric learning-based kernel transformer method for feature fusion is proposed in this paper. First, we propose a kernel transformer to convert data from the data space to a kernel space, so that feature fusion and metric learning can be performed in the transformed kernel space. Second, in order to realize supervised learning, both triplet and label constraints are embedded into our model. Third, in order to solve for the unknown kernel matrices, LogDet divergence is also introduced into our model. Finally, a complete optimization objective function is formed. Based on an alternating direction method of multipliers (ADMM) solver and the Karush-Kuhn-Tucker (KKT) theorem, the proposed optimization problem is solved with rigorous theoretical analysis. Experimental results on image retrieval demonstrate the effectiveness of the proposed methods.
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
  • 34
    Publication Date: 2019
    Description: Publication date: March 2020. Source: Pattern Recognition, Volume 99.
    Author(s): Zhen Wang, Jianmin Gao, Rongxi Wang, Zhiyong Gao, Yanjie Liang
    Abstract: Aiming at the poor performance of individual classifiers in the field of fault recognition, a new ensemble classifier is constructed in this paper to improve classification accuracy by combining multiple classifiers based on Dempster–Shafer Theory (DST). However, in some specific cases, especially when dealing with the combination of conflicting evidence, DST may produce counter-intuitive results and lose its advantages in combining classifiers. To solve this problem, a new improved combination method is proposed to alleviate the conflicts between pieces of evidence, and a new ensemble technique is developed for the combination of individual classifiers, which can be well used in the design of accurate classifier ensembles. The main advantage of the proposed combination method is that it makes the combination process more efficient and accurate by defining objective and subjective weights for member classifiers' outputs. To verify the effectiveness of the proposed combination method, four individual classifiers are selected for constructing the ensemble classifier and tested on Tennessee-Eastman Process (TEP) datasets and UCI machine learning datasets. The experimental results show that the ensemble classifier can significantly improve classification accuracy and outperforms all the selected individual classifiers. By comparison with other DST-based combination methods and some state-of-the-art ensemble methods, the proposed combination method shows better ability in dealing with the combination of individual classifiers and outperforms the others on multiple performance measures. Finally, the proposed ensemble classifier is applied to fault recognition in a real chemical plant. (A sketch of the classical, unweighted combination rule follows this record.)
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 35
    Publication Date: 2019
    Description: 〈p〉Publication date: February 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 98〈/p〉 〈p〉Author(s): Sergio Muñoz-Romero, Arantza Gorostiaga, Cristina Soguero-Ruiz, Inmaculada Mora-Jiménez, José Luis Rojo-Álvarez〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉There is nowadays an increasing interest in discovering relationships among input variables (also called features) from data to provide better interpretability, which yields more confidence in the solution and provides novel insights about the nature of the problem at hand. We propose a novel feature selection method, called Informative Variable Identifier (IVI), capable of identifying the informative variables and their relationships. It transforms the input-variable space distribution into a coefficient-feature space using existing linear classifiers or a more efficient weight generator that we also propose, the Covariance Multiplication Estimator (CME). Informative features and their relationships are determined by analyzing the joint distribution of these coefficients with resampling techniques. IVI and CME select the informative variables and then pass them on to any linear or nonlinear classifier. Experiments show that the proposed approach can outperform state-of-the-art algorithms in terms of feature identification capabilities, and even in classification performance when subsequent classifiers are used.〈/p〉〈/div〉
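    The idea of judging informativeness from the joint distribution of linear-classifier coefficients under resampling can be illustrated with a plain bootstrap: features whose coefficients stay consistently away from zero across resamples are kept. This is only an illustrative stand-in (scikit-learn logistic regression plus a simple stability ratio), not the authors' IVI/CME procedure.

        import numpy as np
        from sklearn.linear_model import LogisticRegression
        from sklearn.datasets import make_classification

        X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                                   n_redundant=2, random_state=0)

        rng = np.random.default_rng(0)
        coefs = []
        for _ in range(100):                          # bootstrap resampling
            idx = rng.integers(0, len(y), len(y))
            clf = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
            coefs.append(clf.coef_.ravel())
        coefs = np.array(coefs)

        # keep features whose coefficient distribution is consistently far from zero
        stability = np.abs(coefs.mean(0)) / (coefs.std(0) + 1e-12)
        selected = np.where(stability > 2.0)[0]       # illustrative cut-off
        print("selected features:", selected)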
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 36
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Shizhe Hu, Xiaoqiang Yan, Yangdong Ye〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉To exploit the complementary information of multi-view data, many weighted multi-view clustering methods have been proposed and have demonstrated impressive performance. However, most of these methods learn the view weights by introducing additional parameters, which cannot be easily obtained in practice. Moreover, they all simply apply the learned weights to the original feature representation of each view, which may deteriorate the clustering performance in the case of high-dimensional data with redundancy and noise. In this paper, we extend information bottleneck co-clustering into a multi-view framework and propose a novel dynamic auto-weighted multi-view co-clustering algorithm to learn a group of weights for the views with no need for extra weight parameters. By defining the new concept of the discrimination-compression rate, we quantify the importance of each view by evaluating the discriminativeness of the compact features (i.e., feature-wise clusters) of the views. Unlike existing weighted methods that impose weights on the original feature representations of multiple views, we apply the learned weights to the discriminative ones, which can reduce the negative impact of noisy features in high-dimensional data. To solve the optimization problem, a new two-step sequential method is designed. Experimental results on several datasets show the advantages of the proposed algorithm. To our knowledge, this is the first work incorporating a weighting scheme into a multi-view co-clustering framework.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 37
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): Yu Zhang, Yin Wang, Xu-Ying Liu, Siya Mi, Min-Ling Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we investigate the large-scale multi-label image classification problem when images with unknown novel classes arrive as a stream during the training stage. This coincides with the practical requirement that novel classes are usually detected and used to update an existing image recognition system. Most existing multi-label image classification methods cannot be directly applied in this scenario, since they require the training and testing stages to have the same label set. In this paper, we propose to learn a multi-label classifier and a novel-class detector alternately to solve this problem. The multi-label classifier is learned using a convolutional neural network (CNN) from the images in the known classes. We propose a recurrent novel-class detector which is learned in a supervised manner to detect the novel classes by encoding image features with the multi-label information. In the experiments, our method is evaluated on several large-scale multi-label benchmarks including MS COCO. The results show the proposed method is comparable to most existing multi-label image classification methods, which validates its efficacy when encountering streaming images with unknown classes.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 38
    Publication Date: 2019
    Description: 〈p〉Publication date: March 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 99〈/p〉 〈p〉Author(s): A. Sasithradevi, S. Mohamed Mansoor Roomi〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The rise in the availability of video content for access via the Internet and the medium of television has resulted in the development of automatic search procedures to retrieve the desired video. Searches can be simplified and hastened by employing automatic classification of videos. This paper proposes a descriptor called the Spatio-Temporal Histogram of Radon Projections (STHRP) for representing the temporal pattern of the contents of a video and demonstrates its application to video classification and retrieval. The first step in STHRP pattern computation is to represent any video as Three Orthogonal Planes (TOPs), i.e., XY, XT and YT, signifying the spatial and temporal contents. Frames corresponding to each plane are partitioned into overlapping blocks. Radon projections are obtained over these blocks at different orientations, resulting in weighted transform coefficients that are normalized and grouped into bins. Linear Discriminant Analysis (LDA) is performed over these coefficients of the TOPs to arrive at a compact description of the STHRP pattern. Compared to existing classification and retrieval approaches, the proposed descriptor is highly robust to translation, rotation and illumination variations in videos. To evaluate the capabilities of the invariant STHRP pattern, we analyse the performance by conducting experiments on the UCF-101, HMDB51, 10contexts and TRECVID data sets for classification and retrieval using a bagged tree model. Experimental evaluation of video classification reveals that the STHRP pattern can achieve classification rates of 96.15%, 71.7%, 93.24% and 97.3% for the UCF-101, HMDB51, 10contexts and TRECVID 2005 data sets, respectively. We conducted retrieval experiments on the TRECVID 2005, JHMDB and 10contexts data sets and the results revealed that the STHRP pattern is able to provide the videos relevant to the user’s query in minimal time (0.05 s) with a good precision rate.〈/p〉〈/div〉
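    A rough sketch of the early STHRP steps on one slice per orthogonal plane: partition a plane into overlapping blocks, take Radon projections at a few orientations, normalize the coefficients and group them into histogram bins. Block size, step, angles and bin count are illustrative choices here, and the LDA compression used in the paper is omitted.

        import numpy as np
        from skimage.transform import radon

        def block_radon_histogram(plane, block=32, step=16, angles=(0, 45, 90, 135), bins=16):
            """Radon-projection histogram of one plane (e.g. XY, XT or YT) of a video volume."""
            feats = []
            for r in range(0, plane.shape[0] - block + 1, step):
                for c in range(0, plane.shape[1] - block + 1, step):
                    patch = plane[r:r + block, c:c + block].astype(float)
                    sino = radon(patch, theta=list(angles), circle=False)
                    sino = sino / (np.abs(sino).max() + 1e-12)      # normalise coefficients
                    hist, _ = np.histogram(sino, bins=bins, range=(0, 1))
                    feats.append(hist)
            return np.concatenate(feats)

        video = np.random.rand(64, 64, 64)                            # toy clip: T x H x W
        xy, xt, yt = video[0], video[:, 0, :], video[:, :, 0]         # one slice per plane
        desc = np.concatenate([block_radon_histogram(p) for p in (xy, xt, yt)])
        print(desc.shape)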
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 39
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 29 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Chenchen Zhao, Yeqiang Qian, Ming Yang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Accurate pedestrian orientation estimation in autonomous driving helps the ego vehicle infer the intentions of pedestrians in the surrounding environment, which is the basis of safety measures such as collision avoidance and prewarning. However, because of the relatively small size and strong deformation of pedestrians, common pedestrian orientation estimation models fail to extract sufficient and comprehensive information from them and thus have restricted performance; this is especially true for monocular models, which cannot obtain depth information about objects and the related environment. In this paper, a novel monocular pedestrian orientation estimation model, called FFNet, is proposed. Apart from the camera captures, the model takes the 2D and 3D dimensions of pedestrians as two additional inputs, exploiting the logical relationship between these dimensions and orientation. The 2D and 3D dimensions of pedestrians are determined from the camera captures and further utilized through two feedforward links connected to the orientation estimator. The feedforward links strengthen the logicality and interpretability of the network structure of the proposed model. Experiments show that the proposed model achieves an AOS increase of at least 1.72% over most state-of-the-art models after identical training processes. The model also achieves competitive results in the orientation estimation evaluation on the KITTI dataset.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 40
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: April 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 100〈/p〉 〈p〉Author(s): Peipei Li, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Age estimation of unknown persons is a challenging pattern analysis task due to the lack of training data and various ageing mechanisms for different individuals. Label distribution learning-based methods usually make distribution assumptions to simplify age estimation. However, since different genders, races and/or any other characteristics may influence facial ageing, age-label distributions are often complicated and difficult to model parametrically. In this paper, we propose a label refinery network (LRN) with two concurrent processes: label distribution refinement and slack regression refinement. The label refinery network aims to learn age-label distributions progressively in an iterative manner. In this way, we can adaptively obtain the specific age-label distributions for different facial images without making strong assumptions on the fixed distribution formulations. To further utilize the correlations among age labels, we propose a slack regression refinery to convert the age-label regression model into an age-interval regression model. Extensive experiments on three popular datasets, namely, MORPH Album2, ChaLearn15 and MegaAge-Asian, demonstrate the superiority of our method.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 41
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Yafu Xiao, Jing Li, Bo Du, Jia Wu, Jun Chang, Wenfan Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Despite great success in the computer vision field, visual tracking is still a challenging task. The main obstacle is that the target object often suffers from interference, such as occlusion. As most Siamese network-based trackers mainly sample image patches of target objects for training, the tracking algorithm lacks sufficient information about the surrounding environment. Besides, many Siamese network-based tracking algorithms build a regression only with the target object samples, without considering the relationship between the target and the background, which may deteriorate the performance of trackers. In this paper, we propose a metric correlation Siamese network and multi-class negative sampling tracking method. For the first time, we explore a sampling approach that includes three different kinds of negative samples: virtual negative samples for pre-learning the potential occlusion situation, boundary negative samples to cope with potential tracking drift, and context negative samples to cope with potential incorrect positioning. With the three kinds of negative samples, we also propose a metric correlation method to train a correlation filter that contains metric information for better discrimination. Furthermore, we design a Siamese network-based architecture to embed the metric correlation filter module mentioned above in order to benefit from the powerful representation ability of deep learning. Extensive experiments on the challenging OTB100 and VOT2017 datasets demonstrate that the proposed algorithm performs favorably compared with state-of-the-art approaches.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 42
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Dongming Wei, Lichi Zhang, Zhengwang Wu, Xiaohuan Cao, Gang Li, Dinggang Shen, Qian Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Deformable brain MR image registration is challenging due to large inter-subject anatomical variation. For example, the highly complex cortical folding pattern makes it hard to accurately align corresponding cortical structures of individual images. In this paper, we propose a novel deep learning approach that simplifies the difficult registration problem of brain MR images. Specifically, we train a morphological simplification network (MS-Net), which can generate a 〈em〉simple〈/em〉 image with less anatomical detail based on the 〈em〉complex〈/em〉 input. With MS-Net, the complexity of the fixed image or the moving image under registration can be reduced gradually, thus building an individual (simplification) trajectory represented by MS-Net outputs. Since the generated images at the ends of the two trajectories (of the fixed and moving images) are so simple and very similar in appearance, they are easy to register. Thus, the two trajectories can act as a bridge to link the fixed and the moving images, and guide their registration. Our experiments show that the proposed method can achieve highly accurate registration performance on different datasets (〈em〉i.e.〈/em〉, NIREP, LPBA, IBSR, CUMC, and MGH). Moreover, the method can also be easily transferred across diverse image datasets and obtains superior accuracy on surface alignment. We propose MS-Net as a powerful and flexible tool to simplify brain MR images and their registration. To our knowledge, this is the first work to simplify brain MR image registration by deep learning, instead of estimating the deformation field directly.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 43
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 24 December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Bingbing Zhang, Qilong Wang, Xiaoxiao Lu, Fasheng Wang, Peihua Li〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Feature coding is a key component of the bag of visual words (BoVW) model, which is designed to improve image classification and retrieval performance. In the feature coding process, each feature of an image is nonlinearly mapped via a dictionary of visual words to form a high-dimensional sparse vector. Inspired by the well-known locality-constrained linear coding (LLC), we present a locality-constrained affine subspace coding (LASC) method to address the limitation whereby LLC fails to consider the local geometric structure around visual words. LASC is distinguished from all the other coding methods since it constructs a dictionary consisting of an ensemble of affine subspaces. As such, the local geometric structure of a manifold is explicitly modeled by such a dictionary. In the process of coding, each feature is linearly decomposed and weighted to form the first-order LASC vector with respect to its top-k neighboring subspaces. To further boost performance, we propose the second-order LASC vector based on information geometry. We use the proposed coding method to perform both image classification and image retrieval tasks and the experimental results show that the method achieves superior or competitive performance in comparison to state-of-the-art methods.〈/p〉〈/div〉
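    A minimal sketch of first-order LASC-style coding: the dictionary is an ensemble of affine subspaces (mean plus orthonormal basis), each feature is linearly decomposed with respect to its top-k neighbouring subspaces (ranked by projection residual), and the weighted coordinates are placed into the corresponding blocks of the code. The locality weighting and the random dictionary below are illustrative, and the second-order vector is not shown.

        import numpy as np

        def lasc_encode(x, subspaces, k=2):
            """subspaces: list of (mu, U) with orthonormal columns U of shape (d, p)."""
            resid, coords = [], []
            for mu, U in subspaces:
                c = U.T @ (x - mu)                        # affine-subspace coordinates
                r = np.linalg.norm((x - mu) - U @ c)      # projection residual
                resid.append(r); coords.append(c)
            resid = np.array(resid)
            top = np.argsort(resid)[:k]                   # top-k neighbouring subspaces
            w = np.exp(-resid[top]); w /= w.sum()         # locality weights (illustrative)
            sizes = [U.shape[1] for _, U in subspaces]
            off = np.cumsum([0] + sizes)
            code = np.zeros(sum(sizes))
            for j, wj in zip(top, w):
                code[off[j]:off[j + 1]] = wj * coords[j]  # first-order LASC block
            return code

        rng = np.random.default_rng(0)
        d, p = 16, 3
        dict_ = [(rng.normal(size=d), np.linalg.qr(rng.normal(size=(d, p)))[0]) for _ in range(8)]
        print(lasc_encode(rng.normal(size=d), dict_).shape)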
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 44
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Wan-Jin Yu, Zhen-Duo Chen, Xin Luo, Wu Liu, Xin-Shun Xu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The multi-label image classification problem is one of the most important and fundamental problems in computer vision. In an image with multiple labels, the objects are usually located at various positions with different scales and poses. Moreover, some labels are associated with the entire image instead of a small region. Therefore, both global and local information are important for classification. To effectively extract and make full use of this information, in this paper, we present a novel deep Dual-stream nEtwork for the muLTi-lAbel image classification task, DELTA for short. As its name indicates, it is composed of two streams, i.e., the Multi-Instance network and the Global Priors network. The former is used to extract multi-scale class-related local instance features by modeling the classification problem in a multi-instance learning framework. The latter is devised to capture global priors from the input image as the global information. These two streams are fused by the final fusion layer. In this way, DELTA can extract and make full use of both the global and local information for classification. Extensive experiments on three benchmark datasets, i.e., PASCAL VOC 2007, PASCAL VOC 2012 and Microsoft COCO, demonstrate that DELTA significantly outperforms several state-of-the-art methods. Moreover, DELTA can automatically locate the key image patterns that trigger the labels.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 45
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Çiğdem Sazak, Carl J. Nelson, Boguslaw Obara〈/p〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 46
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Wangli Hao, Zhaoxiang Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Two-stream convolutional neural networks show great promise for action recognition tasks. However, most two-stream based approaches train the appearance and motion subnetworks independently, which may lead to a decline in performance due to the lack of interaction between the two streams. To overcome this limitation, we propose a Spatiotemporal Distilled Dense-Connectivity Network (STDDCN) for video action recognition. This network implements both knowledge distillation and dense connectivity (adapted from DenseNet). Using this STDDCN architecture, we aim to explore interaction strategies between appearance and motion streams along different hierarchies. Specifically, block-level dense connections between appearance and motion pathways enable spatiotemporal interaction at the feature representation layers. Moreover, knowledge distillation between the two streams (each treated as a student) and their final fusion (treated as the teacher) allows both streams to interact at the high-level layers. The special architecture of STDDCN allows it to gradually obtain effective hierarchical spatiotemporal features. Moreover, it can be trained end-to-end. Finally, numerous ablation studies validate the effectiveness and generalization of our model on two benchmark datasets, UCF101 and HMDB51, on which our model achieves promising performance.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 47
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Hua Yang, Chenghui Huang, Feiyue Wang, Kaiyou Song, Shijiao Zheng, Zhouping Yin〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Although template matching has been widely studied in the fields of image processing and computer vision, current template matching methods still cannot address large-scale changes and rotation changes simultaneously. In this study, we propose a novel adaptive radial ring code histograms (ARRCH) image descriptor for large-scale and rotation-invariant template matching. The image descriptor is constructed by (1) identifying, inside the template, a set of concentric ring regions around a reference point, (2) detecting “stable” pixels based on the ASGO, which is tolerant with respect to large scale change, (3) extracting a rotation-invariant feature for each “stable” pixel, and (4) discretizing the features in a separate histogram for each concentric ring region in the scale space. Finally, an ARRCH image descriptor is obtained by chaining the histograms of all concentric ring regions for each scale. In matching mode, a sliding window approach is used to extract descriptors, which are compared with the template one, and a coarse-to-fine search strategy is employed to detect the scale of the target image. To demonstrate the performance of the ARRCH, several experiments are carried out, including a parameter experiment and a large-scale and rotation change matching experiment, and some applications are presented. The experimental results demonstrate that the proposed method is more resistant to large-scale and rotation differences than previous state-of-the-art matching methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 48
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Peng Wang, Lingqiao Liu, Chunhua Shen, Heng Tao Shen〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Most video based action recognition approaches create the video-level representation by temporally pooling the features extracted at every frame. The pooling methods they adopt, however, usually completely or partially ignore the dynamic information contained in the temporal domain, which may undermine the discriminative power of the resulting video representation since the video sequence order could unveil the evolution of a specific event or action. To overcome this drawback and explore the importance of incorporating the temporal order information, in this paper we propose a novel temporal pooling approach to aggregate the frame-level features. Inspired by the capacity of Convolutional Neural Networks (CNN) in making use of the internal structure of images for information abstraction, we propose to apply the temporal convolution operation to the frame-level representations to extract the dynamic information. However, directly implementing this idea on the original high-dimensional feature will result in parameter explosion. To handle this issue, we propose to treat the temporal evolution of the feature value at each feature dimension as a 1D signal and learn a unique convolutional filter bank for each 1D signal. By conducting experiments on three challenging video-based action recognition datasets, HMDB51, UCF101, and Hollywood2, we demonstrate that the proposed method is superior to the conventional pooling methods.〈/p〉〈/div〉
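    The proposed pooling treats the temporal evolution of every feature dimension as a 1D signal with its own small filter bank, which maps naturally onto a grouped 1D convolution. A PyTorch sketch of that pooling shape (with an illustrative filter count, kernel size and a simple max over time) is:

        import torch
        import torch.nn as nn

        class TemporalConvPooling(nn.Module):
            """Pool T frame-level features (B, T, D) into one clip-level vector (B, D*filters)."""
            def __init__(self, dim, filters=4, kernel=5):
                super().__init__()
                # groups=dim => an independent small filter bank per feature dimension
                self.conv = nn.Conv1d(dim, dim * filters, kernel_size=kernel,
                                      groups=dim, padding=kernel // 2)

            def forward(self, x):                   # x: (batch, time, dim)
                y = self.conv(x.transpose(1, 2))    # (batch, dim*filters, time)
                return torch.amax(y, dim=-1)        # max over time -> (batch, dim*filters)

        feats = torch.randn(8, 25, 512)             # 8 clips, 25 frames, 512-d frame features
        pool = TemporalConvPooling(512)
        print(pool(feats).shape)                    # torch.Size([8, 2048])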
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 49
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Wei Wei, Bin Zhou, Dawid Połap, Marcin Woźniak〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉 〈p〉Improving CT images by increasing the number of scans, and hence the ionizing radiation dose, can increase the probability of inducing cancer in the patient. Using fewer images but improving them by accurate reconstruction is a better solution.〈/p〉 〈p〉In this paper, an adaptive variational Partial Differential Equation (PDE) model is proposed for image reconstruction. The L2 energy of the image gradient and the Total Variation (TV) are combined to form a new functional, which is introduced into an optimization problem. The dynamic behavior of the model is governed by a threshold function: the L2 term is applied in lower-density regions to increase reconstruction speed, and the TV term is applied in higher-density regions to preserve the most important image features. The threshold function is asymptotically controlled by an evolutionary PDE and is more suitable for complex images. The efficiency and accuracy of the proposed model are demonstrated in numerical experiments.〈/p〉 〈/div〉
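    A toy NumPy analogue of the hybrid idea: per pixel, either an L2-type (Laplacian) smoothing step or a TV-type step is applied, switched by a threshold function. Here the switch is a simple gradient-magnitude test, and the CT projection model and evolving threshold PDE of the paper are not modelled.

        import numpy as np

        def hybrid_smooth_step(u, tau=0.1, thresh=0.2, eps=1e-8):
            """One explicit update mixing L2 (Laplacian) and TV diffusion per pixel."""
            ux = np.gradient(u, axis=1)
            uy = np.gradient(u, axis=0)
            grad_mag = np.sqrt(ux**2 + uy**2)
            lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                   np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4 * u)
            # TV flow: div(grad u / |grad u|)
            nx, ny = ux / (grad_mag + eps), uy / (grad_mag + eps)
            tv = np.gradient(nx, axis=1) + np.gradient(ny, axis=0)
            mask = grad_mag < thresh                 # flat region -> fast L2 term
            return u + tau * np.where(mask, lap, tv)

        img = np.random.rand(64, 64)
        for _ in range(20):
            img = hybrid_smooth_step(img)
        print(img.mean())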
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 50
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 20 March 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Linjiang Huang, Yan Huang, Wangli Ouyang, Liang Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Action recognition using pose information has drawn much attention recently. However, most previous approaches treat human pose as a whole or just use pose to extract robust features. Actually, human body parts play an important role in actions, and so modeling the spatio-temporal information of body parts can effectively assist in classifying actions. In this paper, we propose a Part-aligned Pose-guided Recurrent Network (P〈sup〉2〈/sup〉RN) for action recognition. The model mainly consists of two modules, i.e., a part alignment module and a part pooling module, which are used for part representation learning and part-related feature fusion, respectively. The part alignment module incorporates an auto-transformer attention mechanism, aiming to capture the spatial configuration of body parts and predict pose attention maps, while the part pooling module exploits both the symmetry and the complementarity of body parts to produce a fused body representation. The whole network is a recurrent network which can exploit the body representation and simultaneously model the spatio-temporal evolution of human body parts. Experiments on two publicly available benchmark datasets show state-of-the-art performance and demonstrate the power of the two proposed modules.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 51
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 20 March 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Xiangrui Li, Andong Wang, Jianfeng Lu, Zhenmin Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Low-rank or sparse tensor recovery finds many applications in computer vision and machine learning. The recently proposed regularized multilinear regression and selection (Remurs) model assumes the true tensor to be simultaneously low-Tucker-rank and sparse, and has been successfully applied in fMRI analysis. However, a statistical performance analysis of Remurs-like models is still lacking. To address this problem, a minimization problem based on a newly defined tensor nuclear-〈em〉l〈/em〉〈sub〉1〈/sub〉-norm is proposed to recover a simultaneously low-Tucker-rank and sparse tensor from its degraded observations. Then, an M-ADMM-based algorithm is developed to efficiently solve the problem. Further, the statistical performance is analyzed by establishing a deterministic upper bound on the estimation error for general noise. Also, under Gaussian noise, non-asymptotic upper bounds for two specific settings, i.e., noisy tensor decomposition and random Gaussian design, are given. Experiments on synthetic datasets demonstrate that the proposed theorems can precisely predict the scaling behavior of the estimation error.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 52
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Dengfeng Chai〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper formulates superpixel segmentation as a pixel labeling problem and proposes a quaternary labeling algorithm to generate a superpixel lattice. It is achieved by seaming overlapped patches regularly placed on the image plane. Patch seaming is formulated as a pixel labeling problem, where each label indexes one patch. Once the optimal seaming is completed, all pixels covered by one retained patch constitute one superpixel. Further, four kinds of patches are distinguished and assembled into four layers correspondingly, and the patch indexes are mapped to the quaternary layer indexes. This significantly reduces the number of labels and greatly improves labeling efficiency. Furthermore, an objective function is developed to achieve optimal segmentation. The lattice structure is guaranteed by fixing patch centers to be superpixel centers, compact superpixels are assured by horizontal and vertical constraints enforced on the smooth terms, and coherent superpixels are achieved by iteratively refining the data terms. Extensive experiments on the BSDS data set demonstrate that the SQL algorithm significantly improves labeling efficiency, outperforms the other superpixel lattice methods, and is competitive with state-of-the-art methods without a lattice guarantee. The superpixel lattice allows contextual relationships among superpixels to be easily modeled by either MRFs or CNNs.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 53
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Surina Borjigin, Prasanna K. Sahoo〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we propose a multi-level thresholding model based on the gray-level & local-average histogram (GLLA) and Tsallis–Havrda–Charvát entropy for RGB color images. We validate the multi-level thresholding formulation by using the mathematical induction method. We apply the particle swarm optimization (PSO) algorithm to obtain the optimal threshold values for each component of an RGB image. By assigning the mean values from each thresholded class, we obtain three segmented component images independently. We conduct extensive experiments on the Berkeley Segmentation Dataset and Benchmark (BSDS300) and calculate the averages of four performance indices (〈em〉BDE, PRI, GCE〈/em〉 and 〈em〉VOI〈/em〉) to show the effectiveness and reasonableness of the proposed method.〈/p〉〈/div〉
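    A simplified sketch of the criterion: for candidate thresholds, the gray-level histogram is split into classes and the sum of per-class Tsallis–Havrda–Charvát entropies is maximised. The local-average dimension of the GLLA histogram and the pseudo-additive cross terms are omitted, and exhaustive search stands in for the PSO used in the paper.

        import numpy as np
        from itertools import combinations

        def tsallis_entropy(p, q=0.8):
            p = p[p > 0]
            return (1.0 - np.sum(p ** q)) / (q - 1.0)

        def objective(hist, thresholds, q=0.8):
            """Sum of Tsallis entropies of the classes induced by the thresholds."""
            edges = [0] + list(thresholds) + [len(hist)]
            total = 0.0
            for lo, hi in zip(edges[:-1], edges[1:]):
                cls = hist[lo:hi]
                w = cls.sum()
                if w > 0:
                    total += tsallis_entropy(cls / w, q)
            return total

        gray = (np.random.rand(128, 128) * 256).astype(int)    # stand-in for one RGB channel
        hist = np.bincount(gray.ravel(), minlength=256).astype(float)
        hist /= hist.sum()

        # exhaustive search over two thresholds (the paper uses PSO instead)
        best = max(combinations(range(1, 256), 2), key=lambda t: objective(hist, t))
        print("optimal thresholds:", best)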
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 54
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Xinge You, Jiamiao Xu, Wei Yuan, Xiao-Yuan Jing, Dacheng Tao, Taiping Zhang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Cross-view classification, which aims to classify samples from heterogeneous views, is a significant yet challenging problem in computer vision. An effective solution to this problem is multi-view subspace learning (MvSL), which intends to find a common subspace for multi-view data. Although great progress has been made, existing methods usually fail to find a suitable subspace when multi-view data lies on nonlinear manifolds, thus leading to performance deterioration. To circumvent this drawback, we propose Multi-view Common Component Discriminant Analysis (MvCCDA) to handle view discrepancy, discriminability and nonlinearity in a joint manner. Specifically, our MvCCDA incorporates supervised information and local geometric information into the common component extraction process to learn a discriminant common subspace and to discover the nonlinear structure embedded in multi-view data. Optimization and complexity analyses of MvCCDA are also presented for completeness. Our MvCCDA is competitive with the state-of-the-art MvSL based methods on four benchmark datasets, demonstrating its superiority.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 55
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Alper Aksac, Tansel Özyer, Reda Alhajj〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we propose a cut-edge algorithm for spatial clustering (CutESC) based on proximity graphs. The CutESC algorithm removes edges when a cut-edge value for the edge’s endpoints is below a threshold. The cut-edge value is calculated using the statistical features and the spatial distribution of the data in its neighborhood. Also, the algorithm works without any prior information or preliminary parameter settings while automatically discovering clusters with non-uniform densities, arbitrary shapes, and outliers. However, there is an option which allows users to set two parameters to better adapt clustering solutions to particular problems. To assess the advantages of the CutESC algorithm, experiments have been conducted using various two-dimensional synthetic, high-dimensional real-world, and image segmentation datasets.〈/p〉〈/div〉
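    A rough sketch of the graph mechanism: build a Delaunay proximity graph, compare each edge length against local statistics of the edge lengths incident to its endpoints, cut edges that are too long, and read clusters off the connected components. The exact cut-edge value of CutESC is not reproduced; the alpha factor below is an illustrative stand-in.

        import numpy as np
        from scipy.spatial import Delaunay
        from scipy.sparse import coo_matrix
        from scipy.sparse.csgraph import connected_components

        def proximity_cluster(points, alpha=2.0):
            tri = Delaunay(points)
            edges = set()
            for simplex in tri.simplices:                     # collect Delaunay edges
                for i in range(3):
                    a, b = sorted((simplex[i], simplex[(i + 1) % 3]))
                    edges.add((a, b))
            edges = np.array(list(edges))
            lengths = np.linalg.norm(points[edges[:, 0]] - points[edges[:, 1]], axis=1)

            # per-vertex mean and std of incident edge lengths
            n = len(points)
            mean = np.zeros(n); sq = np.zeros(n); deg = np.zeros(n)
            for (a, b), l in zip(edges, lengths):
                for v in (a, b):
                    mean[v] += l; sq[v] += l * l; deg[v] += 1
            mean /= deg
            std = np.sqrt(np.maximum(sq / deg - mean**2, 0))

            keep = lengths <= np.minimum(mean[edges[:, 0]] + alpha * std[edges[:, 0]],
                                         mean[edges[:, 1]] + alpha * std[edges[:, 1]])
            kept = edges[keep]
            graph = coo_matrix((np.ones(len(kept)), (kept[:, 0], kept[:, 1])), shape=(n, n))
            return connected_components(graph, directed=False)[1]

        pts = np.vstack([np.random.randn(100, 2), np.random.randn(80, 2) + 8])
        print(np.bincount(proximity_cluster(pts)))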
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 56
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Ernesto Bribiesca〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Generally speaking, a spiral is a 2D curve which winds about a fixed point. Now, we present a new, alternative, and easy way to describe and generate spirals by means of the use of the Slope Chain Code (SCC) [E. Bribiesca, A measure of tortuosity based on chain coding, Pattern Recognition 46 (2013) 716–724]. Thus, each spiral is represented by only one chain. The chain elements produce a finite alphabet which allows us to use grammatical techniques for spiral classification. Spirals are composed of constant straight-line segments and their chain elements are obtained by calculating the slope changes between contiguous straight-line segments (angle of contingence) scaled to a continuous range from −1 (−180〈sup〉∘〈/sup〉) to 1 (180〈sup〉∘〈/sup〉). The SCC notation is invariant under translation, rotation, optionally under scaling, and it does not use a grid. Other interesting properties can be derived from this notation, such as: the mirror symmetry and inverse spirals, the accumulated slope, the slope change mean, and tortuosity for spirals. We introduce new concepts of projective polygonal paths and osculating polygons. We present a new spiral called the SCC polygonal spiral and its chain, which is described by the numerical sequence 2/〈em〉n〈/em〉 for 〈em〉n〈/em〉 ≥ 3; to the best of our knowledge, this is the first time that this spiral and its chain have been presented. The importance of this spiral and its chain is that the chain covers the slope changes of all the regular polygons composed of 〈em〉n〈/em〉 edges (n-gons). Also, we describe the chain which generates the spiral of Archimedes. Finally, we present some results for different kinds of spirals from the real world, including spiral patterns in shells.〈/p〉〈/div〉
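    The SCC element between two contiguous constant-length segments is the angle of contingence scaled so that 180° maps to 1 and −180° maps to −1. A minimal sketch for a polyline (which, unlike the SCC definition, is not resampled here to exactly constant segment length) is:

        import numpy as np

        def slope_chain_code(points):
            """SCC elements in [-1, 1] for a polyline given as an (n, 2) array of vertices."""
            seg = np.diff(points, axis=0)                        # consecutive segment vectors
            ang = np.arctan2(seg[:, 1], seg[:, 0])
            turn = np.diff(ang)
            turn = (turn + np.pi) % (2 * np.pi) - np.pi          # wrap to (-pi, pi]
            return turn / np.pi                                  # scale: 180 degrees -> 1

        # toy spiral sampled along its parameter
        t = np.linspace(0.5, 6 * np.pi, 400)
        spiral = np.c_[t * np.cos(t), t * np.sin(t)]
        chain = slope_chain_code(spiral)
        print(chain[:5], "accumulated slope:", chain.sum())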
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 57
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Jun Tang, Zhibo Yang, Yongpan Wang, Qi Zheng, Yongchao Xu, Xiang Bai〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉State-of-the-art methods have achieved impressive performance on multi-oriented text detection. Yet, they usually have difficulty in handling curved and dense text, which is common in commodity images. In this paper, we propose a network for detecting dense and arbitrary-shaped scene text by instance-aware component grouping (ICG), which is a flexible bottom-up method. To address the difficulty in separating dense text instances faced by most bottom-up methods, we propose attractive and repulsive links between text components, which force the network to focus more on close text instances, and an instance-aware loss that fully exploits context to supervise the network. The final text detection is achieved by a modified minimum spanning tree (MST) algorithm based on the learned attractive and repulsive links. To demonstrate the effectiveness of the proposed method, we introduce a dense and arbitrary-shaped scene text dataset composed of commodity images (DAST1500). Experimental results show that the proposed ICG significantly outperforms state-of-the-art methods on DAST1500 and two curved text datasets: Total-Text and CTW1500, and also achieves very competitive performance on two multi-oriented datasets: ICDAR15 (at 7.1 FPS for 1280 × 768 images) and MTWI.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 58
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Sandro Cumani, Pietro Laface〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This work focuses on clustering large sets of utterances collected from an unknown number of speakers. Since the number of speakers is unknown, we focus on exact hierarchical agglomerative clustering, followed by automatic selection of the number of clusters. Exact hierarchical clustering of a large number of vectors, however, is a challenging task due to memory constraints, which make it ineffective or unfeasible for large datasets. We propose an exact memory-constrained and parallel implementation of average linkage clustering for large scale datasets, showing that its computational complexity is approximately 〈em〉O〈/em〉(〈em〉N〈/em〉〈sup〉2〈/sup〉), but that it is much faster (up to 40 times in our experiments) than the Reciprocal Nearest Neighbor chain algorithm, which also has 〈em〉O〈/em〉(〈em〉N〈/em〉〈sup〉2〈/sup〉) complexity. We also propose a very fast silhouette computation procedure that, in linear time, determines the set of clusters. The computational efficiency of our approach is demonstrated on datasets including up to 4 million speaker vectors.〈/p〉〈/div〉
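    The overall pipeline, exact average-linkage clustering followed by automatic selection of the number of clusters with a silhouette criterion, can be sketched with SciPy on a small toy set; the paper's contribution is a memory-constrained, parallel implementation that scales this to millions of speaker vectors, which this sketch does not attempt.

        import numpy as np
        from scipy.cluster.hierarchy import linkage, fcluster
        from sklearn.metrics import silhouette_score

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(m, 0.5, size=(60, 16)) for m in (0, 3, 6)])  # 3 "speakers"

        Z = linkage(X, method="average")                 # exact average-linkage dendrogram

        best_k, best_s = None, -1.0
        for k in range(2, 11):                           # select the number of clusters
            labels = fcluster(Z, t=k, criterion="maxclust")
            s = silhouette_score(X, labels)
            if s > best_s:
                best_k, best_s = k, s
        print("selected clusters:", best_k, "silhouette:", round(best_s, 3))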
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 59
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Yuan Zhu, Jiufeng Zhou, Hong Yan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper, we show that graph matching methods based on relaxation labeling, spectral graph theory and tensor theory have the same mathematical form by employing the power iteration technique. Besides, the differences among these methods are also fully discussed, and it is proven that these distinctions have little impact on the final matching result. Moreover, we propose a fast compatibility building procedure to accelerate the preprocessing step, which is considered to be the main time-consuming part of graph matching. Finally, several experiments are conducted to verify our findings.〈/p〉〈/div〉
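    The common form identified in the paper is a power-iteration update of an assignment vector under a pairwise compatibility matrix, followed by discretization. A generic sketch with a toy compatibility matrix and greedy discretization:

        import numpy as np

        def power_iteration_matching(W, n1, n2, iters=50):
            """W: (n1*n2, n1*n2) non-negative compatibility matrix between candidate matches."""
            x = np.ones(n1 * n2) / np.sqrt(n1 * n2)
            for _ in range(iters):
                x = W @ x
                x /= np.linalg.norm(x)            # spectral matching / power iteration step
            score = x.reshape(n1, n2)             # soft assignment
            match = {}
            while score.max() > -np.inf:          # greedy one-to-one discretization
                i, j = np.unravel_index(np.argmax(score), score.shape)
                match[i] = j
                score[i, :] = -np.inf; score[:, j] = -np.inf
            return match

        rng = np.random.default_rng(0)
        n = 5
        W = rng.random((n * n, n * n)); W = (W + W.T) / 2   # toy symmetric compatibilities
        print(power_iteration_matching(W, n, n))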
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 60
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 17 June 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Inpyo Hong, Youngbae Hwang, Daeyoung Kim〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Image denoising is a fundamental task in the computer vision and image processing domain. In recent years, the task has been tackled with deep neural networks by learning the patterns of noise and image patches. However, because of the high diversity of natural image patches and noise distributions, a huge network with a large amount of training data is necessary to obtain state-of-the-art performance. In this paper, we propose a novel ensemble strategy of exploiting multiple deep neural networks for efficient deep learning of image denoising. We divide the task of image denoising into several local subtasks according to the complexity of clean image patches and conquer each subtask using a network trained on its local space. Then, we combine the local subtasks at test time by applying the set of networks to each noisy patch as a weighted mixture, where the mixture weights are determined by the likelihood of each network for each noisy patch. Our methodology of using locally learned networks based on patch complexity effectively decreases the diversity of image patches seen by each single network, and their adaptively weighted mixture on the input combines the local subtasks efficiently. Extensive experimental results on the Berkeley segmentation dataset and standard test images demonstrate that our strategy significantly boosts denoising performance in comparison to using a single network of the same total capacity. Furthermore, our method outperforms previous methods with far fewer training samples and trainable parameters, and thus with much reduced time complexity in both training and inference.〈/p〉〈/div〉
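    The combination rule described, applying every locally trained denoiser to a noisy patch and mixing the outputs with weights derived from each network's likelihood for that patch, can be written generically. The "experts" and likelihood scores below are placeholders rather than the paper's trained networks.

        import numpy as np

        def weighted_mixture_denoise(patch, denoisers, log_likelihoods):
            """denoisers: list of callables patch -> patch; log_likelihoods: per-model scores."""
            w = np.exp(log_likelihoods - np.max(log_likelihoods))   # softmax mixture weights
            w /= w.sum()
            outputs = np.stack([d(patch) for d in denoisers])
            return np.tensordot(w, outputs, axes=1)                 # weighted average of outputs

        # placeholder "experts": a mild and a strong box smoother
        def smooth(patch, k):
            out = patch.copy()
            for _ in range(k):
                out = (out + np.roll(out, 1, 0) + np.roll(out, -1, 0)
                           + np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 5.0
            return out

        noisy = np.random.rand(16, 16)
        experts = [lambda p: smooth(p, 1), lambda p: smooth(p, 5)]
        print(weighted_mixture_denoise(noisy, experts, np.array([-1.0, -3.0])).shape)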
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 61
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Myungjun Kim, Dong-gi Lee, Hyunjung Shin〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉A set of data can be obtained from different hierarchical levels in diverse domains, such as multiple levels of genome data in omics, domestic/global indicators in finance, and ancestors/descendants in phylogenetics, genealogy, and sociology. Such layered structures are often represented as a hierarchical network. If a set of different data is arranged in such a way, then one can naturally devise a network-based learning algorithm so that information in one layer can be propagated to other layers through interlayer connections. Incorporating individual networks in layers can be considered as integration in a serial/vertical manner, in contrast with parallel integration of multiple independent networks. The hierarchical integration induces several problems concerning computational complexity, sparseness, and scalability because of the huge matrix involved. In this paper, we propose two versions of an algorithm, based on semi-supervised learning, for a hierarchically structured network. The naïve version utilizes an existing method for matrix sparseness to solve label propagation problems. In its approximate version, the loss in accuracy versus the gain in complexity is exploited by providing analyses of error bounds and complexity. The experimental results show that the proposed algorithms perform well with hierarchically structured data and outperform an ordinary semi-supervised learning algorithm.〈/p〉〈/div〉
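    The core label-propagation machinery that the paper extends to hierarchical networks can be sketched with the standard closed-form solution F = (I - alpha*S)^(-1) Y, where S is the normalised adjacency of the combined (intra-layer plus interlayer) graph. The two-layer block structure below is illustrative.

        import numpy as np

        def propagate_labels(W, Y, alpha=0.9):
            """Graph label propagation: F = (I - alpha * S)^-1 Y, with S the
            symmetrically normalised adjacency matrix."""
            d = W.sum(1)
            Dinv = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
            S = Dinv @ W @ Dinv
            n = W.shape[0]
            return np.linalg.solve(np.eye(n) - alpha * S, Y)

        # two layers of 4 nodes each: block-diagonal intra-layer edges plus interlayer links
        intra = np.array([[0, 1, 1, 0], [1, 0, 0, 1], [1, 0, 0, 1], [0, 1, 1, 0]], float)
        inter = np.eye(4)                     # each node linked to its counterpart one level up
        W = np.block([[intra, inter], [inter, intra]])
        Y = np.zeros((8, 2)); Y[0, 0] = 1; Y[3, 1] = 1      # two labelled nodes in layer 1
        print(np.argmax(propagate_labels(W, Y), axis=1))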
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 62
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Xuefei Zhe, Shifeng Chen, Hong Yan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉L2-normalization is an effective method to enhance the discriminant power of deep representation learning. However, without exploiting the geometric properties of the feature space, the generally used gradient-based optimization methods fail to track the global information during training. In this paper, we propose a novel deep metric learning model based on the directional distribution. By defining the loss function based on the von Mises–Fisher distribution, we obtain an effective alternative learning algorithm that periodically updates the class centers. The proposed metric learning not only captures the global information about the embedding space but also yields an approximate representation of the class distribution during training. Considering classification and retrieval tasks, our experiments on benchmark datasets demonstrate the improvement brought by the proposed algorithm. Particularly, with a small number of convolutional layers, a significant accuracy upsurge can be observed compared to widely used gradient-based methods.〈/p〉〈/div〉
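    A sketch of a von Mises–Fisher-style objective: embeddings are L2-normalised, logits are scaled cosine similarities to unit-norm class mean directions, and the class centers are re-estimated periodically without gradients. The concentration value and update schedule are illustrative, not the paper's settings.

        import torch
        import torch.nn.functional as F

        def vmf_softmax_loss(embeddings, labels, centers, kappa=16.0):
            """embeddings: (B, d) raw features; centers: (C, d) unit-norm class mean directions."""
            z = F.normalize(embeddings, dim=1)          # L2-normalised features (on the sphere)
            logits = kappa * z @ centers.t()            # vMF log-likelihood up to a constant
            return F.cross_entropy(logits, labels)

        @torch.no_grad()
        def update_centers(embeddings, labels, num_classes):
            """Periodic (non-gradient) re-estimation of the class mean directions."""
            z = F.normalize(embeddings, dim=1)
            centers = torch.zeros(num_classes, z.size(1))
            centers.index_add_(0, labels, z)
            return F.normalize(centers, dim=1)

        emb = torch.randn(32, 64, requires_grad=True)
        lab = torch.randint(0, 5, (32,))
        centers = update_centers(emb, lab, 5)
        loss = vmf_softmax_loss(emb, lab, centers)
        loss.backward()
        print(float(loss))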
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 63
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Swarnendu Ghosh, Nibaran Das, Mita Nasipuri〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Convolutional neural networks have become very common in the field of computer vision in recent years, but they come with a severe restriction regarding the size of the input image. Most convolutional neural networks are designed so that they can only accept images of a fixed size. This creates several challenges during data acquisition and model deployment. The common practice to overcome this limitation is to reshape the input images so that they can be fed into the networks. Many standard pre-trained networks and datasets come with a provision for working with square images. In this work we analyze 25 different reshaping methods across 6 datasets corresponding to different domains, trained on three well-known architectures, namely Inception-V3 (an extension of GoogLeNet), the Residual Network (ResNet-18) and the 121-layer-deep DenseNet. While some of the reshaping methods like “interpolation” and “cropping” have been commonly used with convolutional neural networks, some uncommon techniques like “containing”, “tiling” and “mirroring” have also been demonstrated. In total, 450 neural networks were trained from scratch to provide various analyses regarding the convergence of the validation loss and the accuracy obtained on the test data. Statistical measures have been provided to demonstrate the dependence between parameter choices and datasets. Several key observations were noted, such as the benefits of using randomized processes and the poor performance of the commonly used “cropping” techniques. The paper intends to provide empirical evidence to guide the reader in choosing a proper technique for reshaping inputs to their convolutional neural networks. The official code is available at https://github.com/DVLP-CMATERJU/Reshaping-Inputs-for-CNN.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319301505-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
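    Three of the reshaping families analysed, plain interpolation, central cropping, and "containing" (padding to a square so the whole image is kept, then resizing), can be written concisely with Pillow; these are generic re-implementations of the common operations, not the code from the linked repository.

        from PIL import Image

        def interpolate(img, size=224):
            return img.resize((size, size), Image.BILINEAR)       # plain anisotropic resize

        def center_crop(img, size=224):
            s = min(img.size)
            left, top = (img.width - s) // 2, (img.height - s) // 2
            return img.crop((left, top, left + s, top + s)).resize((size, size), Image.BILINEAR)

        def contain(img, size=224, fill=(0, 0, 0)):
            """Pad the shorter side so the whole image is kept ('containing'), then resize."""
            s = max(img.size)
            canvas = Image.new("RGB", (s, s), fill)
            canvas.paste(img, ((s - img.width) // 2, (s - img.height) // 2))
            return canvas.resize((size, size), Image.BILINEAR)

        img = Image.new("RGB", (640, 360), (120, 30, 200))        # stand-in non-square image
        print(interpolate(img).size, center_crop(img).size, contain(img).size)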
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 64
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Yibao Li, Jing Wang, Bingheng Lu, Darae Jeong, Junseok Kim〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉We propose an efficient and robust algorithm to reconstruct the volumes of multi-labeled objects from sets of cross sections without overlapping regions, artificial gaps, or mismatched interfaces. The algorithm can handle cross sections wherein different regions have different labels. The present study represents a multicomponent extension of our previous work (Li et al. (2015), [1]), wherein we modified the original Cahn–Hilliard (CH) equation by adding a fidelity term to keep the solution close to the single-labeled slice data. The classical CH equation possesses desirable properties, such as smoothing and conservation. The key idea of the present work is to employ a multicomponent CH system to reconstruct multicomponent volumes without self-intersections. We utilize the linearly stabilized convex splitting scheme introduced by Eyre with the Fourier-spectral method so that we can use a large time step and solve the discrete equation quickly. The proposed algorithm is simple and produces smooth volumes that closely preserve the original volume data and do not self-intersect. Numerical results demonstrate the effectiveness and robustness of the proposed method.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 65
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Tomas Björklund, Attilio Fiandrotti, Mauro Annarumma, Gianluca Francini, Enrico Magli〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this work, we describe a License Plate Recognition (LPR) system designed around convolutional neural networks (CNNs) trained on synthetic images to avoid collecting and annotating the thousands of images required to train a CNN. First, we propose a framework for generating synthetic license plate images, accounting for the key variables required to model the wide range of conditions affecting the aspect of real plates. Then, we describe a modular LPR system designed around two CNNs for plate and character detection that share common training procedures, train the CNNs, and experiment on three different datasets of real plate images collected from different countries. Our synthetically trained system outperforms multiple competing systems trained on real images, showing that synthetic images are effective for training a CNN for LPR if the training images have sufficient variance of the key variables controlling the plate aspect.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319301475-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 66
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Mark Brown, David Windridge, Jean-Yves Guillemaut〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉We present a family of methods for 2D–3D registration spanning both deterministic and non-deterministic branch-and-bound approaches. Critically, the methods exhibit invariance to the underlying scene primitives, enabling, e.g., points and lines to be treated on an equivalent basis; this potentially allows a broader range of problems to be tackled while maximising the available scene information, since all scene primitives are considered simultaneously. Being a branch-and-bound based approach, the method furthermore enjoys intrinsic guarantees of global optimality; while branch-and-bound approaches have been employed in a number of computer vision contexts, the proposed method represents the first time that this strategy has been applied to the 2D–3D correspondence-free registration problem from points and lines. Within the proposed procedure, deterministic and probabilistic procedures serve to speed up the nested branch-and-bound search while maintaining optimality. Experimental evaluation with synthetic and real data indicates that the proposed approach significantly increases both accuracy and robustness compared to the state of the art.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 67
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Juan M. Górriz, Javier Ramirez, MRC AIMS Consortium, John Suckling〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper we derive practical and novel upper bounds for the resubstitution error estimate by assessing the number of linear decision functions within the problem of pattern recognition in neuroimaging. Linear classifiers and regressors have been considered in many fields where the number of predictors far exceeds the number of training samples available, to overcome the limitations of high-complexity models in terms of computation, interpretability and overfitting. Typically in neuroimaging this is the rule rather than the exception, since the dimensionality of each observation (millions of voxels) in relation to the number of available samples (hundreds of scans) implies a high risk of overfitting. Based on classical combinatorial geometry, we estimate the number of hyperplanes or linear decision rules and the corresponding distribution-independent performance bounds, comparing them to those obtained by the use of the VC-dimension concept. Experiments on synthetic and neuroimaging data demonstrate the performance of resubstitution error estimators, which are often overlooked in heterogeneous scenarios where their performance is similar to that obtained by cross-validation methods.〈/p〉〈/div〉
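    The classical combinatorial-geometry result that this kind of counting argument typically rests on is Cover's function-counting theorem: n points in general position in R^d admit C(n, d) = 2 * sum_{i=0}^{d-1} C(n-1, i) dichotomies realisable by homogeneous linear decision functions, out of 2^n possible labelings. A small computation (the paper's exact bounds are not reproduced here) shows why hundreds of scans against millions of voxel predictors make every labeling separable:

        from math import comb

        def cover_dichotomies(n, d):
            """Number of dichotomies of n points in general position in R^d that are
            realisable by homogeneous linear decision functions (Cover, 1965)."""
            return 2 * sum(comb(n - 1, i) for i in range(min(d, n)))

        # hundreds of scans versus millions of voxel predictors
        for n, d in [(100, 5), (100, 50), (100, 10**6)]:
            frac = cover_dichotomies(n, d) / 2**n
            print(f"n={n}, d={d}: fraction of separable labelings = {frac:.3g}")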
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
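    The entry above counts linear decision rules via classical combinatorial geometry. A standard result of that kind is Cover's function-counting theorem, which gives the number of dichotomies of n points in general position in R^d realisable by a homogeneous linear classifier. The sketch below simply evaluates that classical count; it is offered as background and is not the specific bound derived in the paper.
# Cover's function-counting theorem: number of dichotomies of n points in
# general position in R^d that a homogeneous linear classifier can realise.
# Classical background for combinatorial-geometry bounds of this kind,
# not the specific bound derived in the entry above.
from math import comb


def cover_count(n: int, d: int) -> int:
    """C(n, d) = 2 * sum_{i=0}^{d-1} binom(n-1, i)."""
    return 2 * sum(comb(n - 1, i) for i in range(d))


def fraction_separable(n: int, d: int) -> float:
    """Fraction of all 2**n dichotomies that are linearly separable."""
    return cover_count(n, d) / 2 ** n


if __name__ == "__main__":
    # With n <= d every dichotomy is realisable; far beyond that, almost none are.
    for n in (10, 20, 40):
        print(n, cover_count(n, 5), round(fraction_separable(n, 5), 4))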
  • 68
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): Ying Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper shows that pairwise PageRank orders emerge from two-hop walks. The main tool used here refers to a specially designed sign-mirror function and a parameter curve, whose low-order derivative information implies pairwise PageRank orders with high probability. We study the pairwise correct rate by placing the Google matrix 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si11.svg"〉〈mi mathvariant="bold"〉G〈/mi〉〈/math〉 in a probabilistic framework, where 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si11.svg"〉〈mi mathvariant="bold"〉G〈/mi〉〈/math〉 may be equipped with different random ensembles for model-generated or real-world networks with sparse, small-world, scale-free features, a claim supported by a mix of mathematical and numerical evidence. We believe that the underlying spectral distribution of the aforementioned networks is responsible for the high pairwise correct rate. Moreover, the perspective of this paper naturally leads to an 〈em〉O〈/em〉(1) algorithm for any single pairwise PageRank comparison, assuming that both 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si61.svg"〉〈mrow〉〈mi mathvariant="bold"〉A〈/mi〉〈mo linebreak="goodbreak"〉=〈/mo〉〈mi mathvariant="bold"〉G〈/mi〉〈mo linebreak="goodbreak"〉−〈/mo〉〈msub〉〈mi mathvariant="bold"〉I〈/mi〉〈mi〉n〈/mi〉〈/msub〉〈mo〉,〈/mo〉〈/mrow〉〈/math〉 where 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si62.svg"〉〈msub〉〈mi mathvariant="bold"〉I〈/mi〉〈mi〉n〈/mi〉〈/msub〉〈/math〉 denotes the identity matrix of order 〈em〉n〈/em〉, and 〈math xmlns:mml="http://www.w3.org/1998/Math/MathML" altimg="si63.svg"〉〈msup〉〈mi mathvariant="bold"〉A〈/mi〉〈mn〉2〈/mn〉〈/msup〉〈/math〉 are readily available (e.g., constructed offline in an incremental manner), based on which it is easy to extract the top 〈em〉k〈/em〉 list in 〈em〉O〈/em〉(〈em〉kn〈/em〉), thus making it possible for the PageRank algorithm to deal with super large-scale datasets in real time.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
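    To convey the flavour of the claim above, the sketch below contrasts exact PageRank (power iteration on a row-stochastic Google matrix) with a naive two-hop-walk score and measures how often the two pairwise orderings agree on a random graph. This is only a loose illustration: the random-graph model, teleportation factor and two-hop score are generic assumptions, and the paper's sign-mirror function and parameter curve are not reproduced.
# Contrast exact PageRank (power iteration on the Google matrix G) with a
# naive two-hop-walk score, and measure pairwise-order agreement.
# A loose illustration of the entry above, not the paper's construction.
import numpy as np


def google_matrix(adj: np.ndarray, alpha: float = 0.85) -> np.ndarray:
    """Row-stochastic Google matrix with uniform teleportation."""
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    P = np.where(out_deg > 0, adj / np.maximum(out_deg, 1), 1.0 / n)
    return alpha * P + (1 - alpha) / n


def pagerank(G: np.ndarray, iters: int = 200) -> np.ndarray:
    pi = np.full(G.shape[0], 1.0 / G.shape[0])
    for _ in range(iters):
        pi = pi @ G
    return pi


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    adj = (rng.random((n, n)) < 0.03).astype(float)
    np.fill_diagonal(adj, 0)

    G = google_matrix(adj)
    pi = pagerank(G)
    two_hop = (np.full(n, 1.0 / n) @ G) @ G      # uniform start, two walk steps

    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    agree = sum((pi[i] > pi[j]) == (two_hop[i] > two_hop[j]) for i, j in pairs)
    print(f"pairwise order agreement: {agree / len(pairs):.3f}")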
  • 69
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Xianglin Guo, Xingyu Xie, Guangcan Liu, Mingqiang Wei, Jun Wang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Subspace segmentation or clustering remains a challenge of interest in computer vision when handling the complex noise present in high-dimensional data. Most of the current sparse representation or minimum-rank based techniques are constructed on ℓ〈sub〉1〈/sub〉-norm or ℓ〈sub〉2〈/sub〉-norm losses, which are sensitive to outliers. Finite mixture models, a class of powerful and flexible tools for modeling complex noise, therefore become a natural choice. Among all the choices, the exponential family mixture is extremely useful in practice due to its universal approximation ability for any continuous distribution and hence covers a broad range of noise characteristics. Equipped with such a modeling idea, this paper focuses on the complex noise contaminated subspace clustering problem by using finite mixtures of exponential power (MoEP) distributions. We then harness a penalized likelihood function to perform automatic model selection and hence avoid over-fitting. Moreover, we introduce a novel prior on the singular values of the representation matrix, which leads to a novel penalty in our nonconvex and nonsmooth optimization. The parameters of the MoEP model can be estimated with a Maximum A Posteriori (MAP) method. Meanwhile, the subspace is computed with joint weighted ℓ〈sub〉〈em〉p〈/em〉〈/sub〉-norm and Schatten-〈em〉q〈/em〉 quasi-norm minimization. Both theoretical and experimental results show the effectiveness of our method.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 70
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Jacopo Cavazza, Pietro Morerio, Vittorio Murino〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Despite the recent deep learning (DL) revolution, kernel machines still remain powerful methods for action recognition. DL has brought the use of large datasets, which is typically a problem for kernel approaches that do not scale up efficiently due to the kernel Gram matrices. Nevertheless, kernel methods are still attractive and more generally applicable since they can equally manage datasets of different sizes, also in cases where DL techniques show some limitations. This work investigates these issues by proposing an explicit approximated representation that, together with a linear model, is an equivalent, yet scalable, implementation of a kernel machine. Our approximation is directly inspired by the exact feature map that is induced by an RBF Gaussian kernel but, unlike the latter, it is finite dimensional and very compact. We justify the soundness of our idea with a theoretical analysis which proves the unbiasedness of the approximation, and provides a vanishing bound for its variance, which is shown to decrease much more rapidly than in alternative methods in the literature. In a broad experimental validation, we assess the superiority of our approximation in terms of (1) ease and speed of training, (2) compactness of the model, and (3) improvements with respect to the state-of-the-art performance.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
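    For context on explicit approximations of the RBF Gaussian kernel feature map, the sketch below implements the classical random Fourier features of Rahimi and Recht, one of the alternative methods in the literature that approaches like the one above are compared against. It is the standard baseline construction, not the compact approximation proposed in the entry.
# Standard random Fourier features approximating the RBF Gaussian kernel
# k(x, y) = exp(-gamma * ||x - y||^2).  The classical explicit-feature-map
# baseline, not the compact approximation proposed in the entry above.
import numpy as np


def rff_map(X: np.ndarray, n_features: int, gamma: float, seed: int = 0):
    """Map X (m, d) to an explicit feature space whose inner products
    approximate the RBF kernel."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(5, 8))
    gamma = 0.5
    Z = rff_map(X, n_features=5000, gamma=gamma)

    exact = np.exp(-gamma * np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
    approx = Z @ Z.T
    print("max abs error:", np.abs(exact - approx).max())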
  • 71
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Danyang Zhang, Huadong Ma, Linqiang Pan〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper proposes a novel connected components labeling (CCL) approach that introduces a gamma signal to record certain mask pixels’ values to eliminate duplicated pixel checking and regulate the labeling process for higher efficiency. A new block-based two-scan CCL algorithm, Eight-Connected Gamma-Signal-regulated (ECGS) algorithm, is designed and developed by applying this approach to evaluate a block of 2 × 2 pixels (with just 6 mask pixels) in each iteration such that the total number of operations is considerably reduced and the labeling efficiency is significantly improved. The experiments conducted on a public benchmark, YACCLAB (Yet Another Connected Components Labeling Benchmark), have demonstrated that the proposed ECGS algorithm can outperform current state-of-the-art CCL algorithms for a number of digital images.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
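    The ECGS method in the entry above is a block-based two-scan connected components labeling (CCL) algorithm. As background, the sketch below gives the classical pixel-based two-scan 8-connected labeling with a union-find equivalence table, i.e. the kind of baseline that block-based methods are designed to outperform; it is not the ECGS algorithm itself.
# Classical two-scan, 8-connected connected-components labeling with
# union-find -- the pixel-based baseline that block-based methods such as
# the ECGS algorithm above are designed to outperform.
import numpy as np


def find(parent, x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]   # path halving
        x = parent[x]
    return x


def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def label_8connected(binary: np.ndarray) -> np.ndarray:
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    parent = [0]                         # index 0 is the unused background
    next_label = 1

    # First scan: assign provisional labels, record label equivalences.
    for y in range(h):
        for x in range(w):
            if not binary[y, x]:
                continue
            neigh = [labels[y + dy, x + dx]
                     for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1))
                     if 0 <= y + dy < h and 0 <= x + dx < w and labels[y + dy, x + dx]]
            if not neigh:
                labels[y, x] = next_label
                parent.append(next_label)
                next_label += 1
            else:
                m = min(neigh)
                labels[y, x] = m
                for n in neigh:
                    union(parent, m, n)

    # Second scan: replace provisional labels by their equivalence-class root.
    for y in range(h):
        for x in range(w):
            if labels[y, x]:
                labels[y, x] = find(parent, labels[y, x])
    return labels


if __name__ == "__main__":
    img = np.array([[1, 1, 0, 0, 1],
                    [0, 1, 0, 1, 0],
                    [0, 0, 0, 0, 1],
                    [1, 0, 1, 1, 0]])
    print(label_8connected(img))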
  • 72
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Takoua Kefi-Fatteh, Riadh Ksantini, Mohamed-Bécha Kaâniche, Adel Bouhoula〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The low variance direction of the training dataset can carry crucial information when building a performant one-class classifier. The Covariance-guided One-Class Support Vector Machine (COSVM) emphasizes the low variance direction of the training dataset, which results in higher accuracy. However, in the case of large scale datasets, or sequentially obtained data, it shows serious performance degradation and requires a large amount of memory and a long training time. Thus, in this paper, we investigate the effectiveness of using the low variance directions in an incremental approach. In fact, incremental learning is more effective when dealing with dynamic or large amounts of data. More precisely, we control the possible changes of support vectors after the addition of new data points, while emphasizing the low variance directions of the training data, in order to improve classification performance. An extensive comparison of the incremental COSVM to contemporary batch and incremental one-class classifiers on artificial and real-world datasets demonstrates the advantage and superiority of our proposed model.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 73
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 26 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Zhao Pei, Xiaoning Qi, Yanning Zhang, Miao Ma, Yee-Hong Yang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Object tracking in crowded spaces is a challenging but very important task in computer vision applications. However, due to interactions among large-scale pedestrians and common social rules, predicting the complex human mobility in a crowded scene becomes difficult. This paper proposes a novel human trajectory prediction model in a crowded scene called the social-affinity Long Short-Term Memory (LSTM) model. Our model can learn general human mobility patterns and predict individuals’ trajectories based on their past positions, in particular, with the influence of their neighbors in the Social Affinity Map (SAM). The SAM clusters the relative positions of surrounding individuals, and represents the distribution of the relative positions by different bins with semantic descriptions. We formulate the problem of trajectory prediction together with interactions among people as a sequence generation task with social affinity. The proposed model utilizes the LSTM to learn general human moving patterns as well as the Social Affinity Map to connect neighbors with a weight matrix corresponding to SAM bins for learning the social dependencies between correlated pedestrians. By capturing the object’s past positions and connecting the hidden states of its neighbors in different SAM bins with different elements of the weight matrix, the social-affinity LSTM is able to predict the trajectory of each pedestrian with its own features and neighbors’ influence. We compare the performance of our method with the Social LSTM model on several public datasets. Our model achieves the best results on these datasets, outperforming state-of-the-art methods, especially on the datasets with more social affinity phenomena.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 74
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Xulun Ye, Jieyu Zhao〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉For multi-manifold clustering, it remains a challenging problem to learn the cluster number automatically from data. This paper presents a novel nonparametric Bayesian model to cluster the multi-manifold data and estimate the number of submanifolds simultaneously. Our model firstly assumes that every submanifold is a probability distribution defined in the manifold space. Then, we approximate the manifold distribution with a deep neural network. To maintain the similarity among data points, we regularize the data generation process with a modified k-nearest neighbor graph. Though the posterior inference is hard, our model leads to a very efficient deterministic optimization algorithm, which incorporates the mean field variational inference with the Graph regularized Variational Auto-Encoder (Graph-VAE). By applying the Graph-VAE, our model exhibits the additional advantage of realistic image generation, which conventional clustering methods cannot offer. Furthermore, we expand our proposed manifold algorithm with the Dirichlet Process Mixture (DPM) to model real datasets, in which manifold data and non-manifold data coexist. Experiments on synthetic data verify our theoretical analysis. Clustering results on motion segmentation, coil20 and 3D pedestrian show that our approach can significantly improve the clustering accuracy. The handwritten database experiment demonstrates the image generation capability.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 75
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Qiyue Yin, Junge Zhang, Shu Wu, Hexi Li〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Real world data are often represented by multiple distinct feature sets, and some prior knowledge is provided, such as labels of some examples or pairwise constraints between several sample pairs. Accordingly, the task of multi-view clustering arises from the complex aggregation of information from multiple feature sets and prior knowledge. In this paper, we propose to optimize the cluster indicator, which represents the class labels and is an intuitive reflection of the clustering structure. Besides, the prior, which conveys the same level of semantics, can be directly utilized to guide the learned clustering structure. Furthermore, feature selection is embedded into the above process to select views and features in each view, which leads to the most discriminative views and features being chosen for every single cluster. To these ends, an objective is accordingly proposed with an efficient optimization strategy and convergence analysis. Extensive experiments demonstrate that our model performs better than the state-of-the-art methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 76
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Yong Zhang, Qi Wang, Dun-wei Gong, Xian-fang Song〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Unsupervised feature selection plays an important role in machine learning and data mining, and it is very challenging because class labels are unavailable. In this paper, we propose an unsupervised feature selection framework that combines the discriminative information of class labels with subspace learning. In the proposed framework, the nonnegative Laplacian embedding is first utilized to produce pseudo labels, so as to improve the classification accuracy. Then, an optimal feature subset is selected by subspace learning guided by the discriminative information of class labels, on the premise of maintaining the local structure of the data. We develop an iterative strategy for updating the similarity matrix and pseudo labels, which yields more accurate pseudo labels, and we establish the convergence of the proposed strategy. Finally, experimental results on six real-world datasets prove the superiority of the proposed approach over seven state-of-the-art ones.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 77
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Qing Cai, Huiying Liu, Yiming Qian, Sanping Zhou, Xiaojun Duan, Yee-Hong Yang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The level set model is a popular method for object segmentation. However, most existing level set models perform poorly in color images since they only use grayscale intensity information to define their energy functions. To address this shortcoming, in this paper, we propose a new saliency-guided level set model (SLSM), which can automatically segment objects in color images guided by visual saliency. Specifically, we first define a global saliency-guided energy term to extract the color objects approximately. Then, by integrating information from different color channels, we define a novel local multichannel-based energy term to extract the color objects in detail. In addition, unlike conventional level set models that use a length regularization term, we achieve segmentation smoothness by incorporating our SLSM into a graph cuts formulation. More importantly, the proposed SLSM is automatically initialized by saliency detection. Finally, the evaluation on public benchmark databases and our collected database demonstrates that the new SLSM consistently outperforms many state-of-the-art level set models and saliency detection methods in accuracy and robustness.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 78
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): 〈/p〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 79
    Publication Date: 2019
    Description: 〈p〉Publication date: September 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 93〈/p〉 〈p〉Author(s): Dongmei Mo, Zhihui Lai〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Ridge regression (RR) and its variants are fundamental methods for multivariable data analysis, which have been widely used to deal with different problems in pattern recognition or classification. However, these methods share a common drawback: the number of learned projections is limited by the number of classes. Moreover, most of these methods do not consider the local structure of the data, which makes them less competitive when the data lie on a lower-dimensional manifold. Therefore, in this paper, we propose a robust jointly sparse regression method that integrates the local geometric structure, a generalized orthogonality constraint and joint sparsity into a regression model to address these problems. The optimization model can be solved by an alternating iterative algorithm using orthogonal matching pursuit (OMP) and singular value decomposition. Experimental results on face and non-face image databases demonstrate the superiority of the proposed method. The MATLAB code can be found at 〈a href="http://www.scholat.com/laizhihui" target="_blank"〉http://www.scholat.com/laizhihui〈/a〉.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 80
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Amalia Luque, Alejandro Carrasco, Alejandro Martín, Ana de las Heras〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉A major issue in the classification of class imbalanced datasets involves the determination of the most suitable performance metrics to be used. In previous work using several examples, it has been shown that imbalance can exert a major impact on the value and meaning of accuracy and on certain other well-known performance metrics. In this paper, our approach goes beyond simply studying case studies and develops a systematic analysis of this impact by simulating the results obtained using binary classifiers. A set of functions and numerical indicators is obtained that enables the comparison of the behaviour of several performance metrics based on the binary confusion matrix when they are faced with imbalanced datasets. Throughout the paper, a new way to measure the imbalance is defined which surpasses the Imbalance Ratio used in previous studies. From the simulation results, several clusters of performance metrics have been identified that involve the use of Geometric Mean or Bookmaker Informedness as the best null-biased metrics if their focus on classification successes (dismissing the errors) presents no limitation for the specific application where they are used. However, if classification errors must also be considered, then the Matthews Correlation Coefficient arises as the best choice. Finally, a set of null-biased multi-perspective Class Balance Metrics is proposed which extends the concept of Class Balance Accuracy to other performance metrics.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
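    The metrics discussed in the entry above all derive from the binary confusion matrix. The sketch below computes Geometric Mean, Bookmaker Informedness (Youden's J) and the Matthews Correlation Coefficient from the four confusion-matrix counts using their standard definitions; the example counts are arbitrary and are only meant to show how accuracy and the null-biased metrics can diverge under imbalance.
# Standard definitions of the null-biased metrics discussed in the entry
# above, computed from the counts of a binary confusion matrix.
import math


def imbalance_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    sensitivity = tp / (tp + fn) if tp + fn else 0.0       # true positive rate
    specificity = tn / (tn + fp) if tn + fp else 0.0       # true negative rate

    geometric_mean = math.sqrt(sensitivity * specificity)
    bookmaker_informedness = sensitivity + specificity - 1.0   # Youden's J

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"accuracy": accuracy, "gm": geometric_mean,
            "bm": bookmaker_informedness, "mcc": mcc}


if __name__ == "__main__":
    # Highly imbalanced example: accuracy looks good, the other metrics do not.
    print(imbalance_metrics(tp=5, fn=45, fp=10, tn=940))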
  • 81
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Amir Atapour-Abarghouei, Samet Akcay, Grégoire Payen de La Garanderie, Toby P. Breckon〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this work, the issue of depth filling is addressed using a self-supervised feature learning model that predicts missing depth pixel values based on the context and structure of the scene. A fully-convolutional generative model is conditioned on the available depth information and full RGB colour information from the scene and trained in an adversarial fashion to complete scene depth. Since ground truth depth is not readily available, synthetic data is instead used with a separate model developed to predict where holes would appear in a sensed (non-synthetic) depth image based on the contents of the RGB image. The resulting synthetic data with realistic holes is utilized in training the depth filling model, which makes joint use of a reconstruction loss that employs the Discrete Cosine Transform for more realistic outputs, an adversarial loss that measures the distribution distances via the Wasserstein metric, and a bottleneck feature loss that aids in better contextual feature extraction. Additionally, the model is adversarially adapted to perform well on naturally-obtained data with no available ground truth. Qualitative and quantitative evaluations demonstrate the efficacy of the approach compared to contemporary depth filling techniques. The strength of the feature learning capabilities of the resulting deep network model is also demonstrated by performing the task of monocular depth estimation using our pre-trained depth hole filling model as the initialization for subsequent transfer learning.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319300743-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 82
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Parisa Rastin, Guénaël Cabanes, Basarab Matei, Younès Bennani, Jean-Marc Marty〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Among the variety of algorithms that have been developed for clustering, prototype-based approaches are very popular due to their low computational complexity, allowing real-life applications. In such algorithms, the data set is summarized by a small set of prototypes. Each prototype usually represents a cluster of objects. However, the definition of prototypes for complex objects defined by their relations (relational data) is not an easy task. Little work has been done so far on relational prototype-based clustering. Because relational data are described by a full matrix of dissimilarities, the most important challenge is the computation and memory costs, especially when the number of objects to analyze is very large and for the analysis of data streams (data sets with a dynamic structure varying over time). The combination of these three characteristics (size, complexity and evolution) presents a major challenge and few satisfactory solutions exist at the moment, despite increasingly evident needs. This paper focuses on the development of new clustering approaches adapted to big and dynamic relational data. The main idea is to use a set of fixed support points chosen among the objects of the data set, independently from the clusters, and use these support points as a basis for the definition of a representation space, using the Barycentric Coordinates formalism. We demonstrate the qualities of the proposed approaches theoretically and experimentally on a set of artificial and real relational data. We also propose an extension adapted to relational data stream analysis, allowing a dynamic creation and suppression of prototypes to follow the dynamic of the data structure. This dynamic approach is applied to a real data set to detect and follow the dynamic of areas of interest over time in users’ web navigation. We tested different measures of similarity between URLs and different methods of automatic labeling to characterize the clusters. The results are convincing and encouraging: the clusters are homogeneous, with clear associated topics. The dynamics of users’ interest can be recorded and visualized for each cluster. Remarkable patterns can be associated to precise events or usual timing and cycles in users’ interest.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 83
    Publication Date: 2019
    Description: 〈p〉Publication date: July 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 91〈/p〉 〈p〉Author(s): Yinhui Zhang, Zifen He〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉The presence of limited spatio-temporal resolution in dynamic scenes renders segmentation of foreground objects problematic, as it can lead to missed candidate objects or over-filled motion boundaries caused by large displacements of corresponding points in consecutive frames. To alleviate these problems, our general framework introduces a novel agnostic attribute video object segmentation method that is suitable for segmenting foreground objects in dynamic scenes at low spatio-temporal resolution. We employ a fully connected network (FCN) to facilitate estimation of class-agnostic object proposals based on the semantic classification attributes. Instead of directly deriving a hard classification into objects, we propose a scheme that fuses different top-ranked soft scores in the semantic space, allowing the model to directly estimate probabilistic foreground hypotheses. A unified conditional random field model is proposed to incorporate the proposal information derived from the soft prediction scores and consequently build up a unary energy functional with additional location and appearance potentials. The pairwise energy functional imposes both spatial and temporal consistency constraints simultaneously on appearance, location and unary potentials. Our experiments on spatio-temporally subsampled video segmentation benchmarks demonstrate the effectiveness of the proposed method for robust segmentation of class-agnostic objects in dynamic scenes despite abrupt motion and large displacements caused by limited spatio-temporal resolution.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 84
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Gonzalo Safont, Addisson Salazar, Luis Vergara, Enriqueta Gómez, Vicente Villanueva〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper presents a novel method that combines coupled hidden Markov models (HMM) and non-Gaussian mixture models based on independent component analyzer mixture models (ICAMM). The proposed method models the joint behavior of a number of synchronized sequential independent component analyzer mixture models (SICAMM), thus we have named it generalized SICAMM (G-SICAMM). The generalization allows for flexible estimation of complex data densities, subspace classification, blind source separation, and accurate modeling of both local and global dynamic interactions. In this work, the structured result obtained by G-SICAMM was used in two ways: classification and interpretation. Classification performance was tested on an extensive number of simulations and a set of real electroencephalograms (EEG) from epileptic patients performing neuropsychological tests. G-SICAMM outperformed the following competitive methods: Gaussian mixture models, HMM, Coupled HMM, ICAMM, SICAMM, and a long short-term memory (LSTM) recurrent neural network. As for interpretation, the structured result returned by G-SICAMM on EEGs was mapped back onto the scalp, providing a set of brain activations. These activations were consistent with the physiological areas activated during the tests, thus proving the ability of the method to deal with different kinds of data densities and with changing, non-stationary and non-linear brain dynamics.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 85
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 26 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Rosa Altilio, Paolo Di Lorenzo, Massimo Panella〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this paper we consider the problem of distributed unsupervised clustering, where training data is partitioned over a set of agents, whose interaction happens over a sparse, but connected, communication network. To solve this problem, we recast the well-known Expectation Maximization method in a distributed setting, exploiting a recently proposed algorithmic framework for in-network non-convex optimization. The resulting algorithm, termed Expectation Maximization Consensus, exploits successive local convexifications to split the computation among agents, while hinging on dynamic consensus to diffuse information over the network in real-time. Convergence to local solutions of the distributed clustering problem is then established. Experimental results on well-known datasets illustrate that the proposed method performs better than other distributed Expectation-Maximization clustering approaches, while the method is faster than a centralized Expectation-Maximization procedure and achieves comparable performance in terms of cluster validity indexes. The latter reach good values on absolute scales, confirming the quality of the obtained clustering results, which compare favorably with other methods in the literature.〈/p〉〈/div〉 〈h5〉Graphical abstract〈/h5〉 〈div〉〈p〉〈figure〉〈img src="https://ars.els-cdn.com/content/image/1-s2.0-S0031320319301670-fx1.jpg" width="301" alt="Graphical abstract for this article" title=""〉〈/figure〉〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 86
    Publication Date: 2019
    Description: 〈p〉Publication date: Available online 25 April 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition〈/p〉 〈p〉Author(s): Yuqi Zhang, Yongzhen Huang, Liang Wang, Shiqi Yu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉This paper gives a comprehensive study on gait biometrics via a joint CNN-based method. Gait is a kind of behavioral biometric feature with unique advantages, e.g., long-distance, cross-view and non-cooperative perception and analysis. In this paper, the definition of gait analysis includes gait recognition and gait-based soft biometrics such as gender and age prediction. We propose to investigate these two problems in a joint CNN-based framework which has been seldom reported in the recent literature. The proposed method is efficient in terms of training time, testing time and storage. We achieve the state-of-the-art performance on several gait recognition and soft biometrics benchmarks. Also, we discuss which part of the human body is important and informative for a specific task by network visualization.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 87
    Publication Date: 2019
    Description: 〈p〉Publication date: August 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 92〈/p〉 〈p〉Author(s): Zhi Gao, Yuwei Wu, Xingyuan Bu, Tan Yu, Junsong Yuan, Yunde Jia〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Recent studies have shown that aggregating convolutional features of a Convolutional Neural Network (CNN) can achieve impressive performance for a variety of computer vision tasks. The Symmetric Positive Definite (SPD) matrix becomes a powerful tool due to its remarkable ability to learn an appropriate statistical representation to characterize the underlying structure of visual features. In this paper, we propose a method of aggregating deep convolutional features into a robust representation through the SPD generation and the SPD transformation under an end-to-end deep network. To this end, several new layers are introduced in our method, including a nonlinear kernel generation layer, a matrix transformation layer, and a vector transformation layer. The nonlinear kernel generation layer is employed to aggregate convolutional features into a kernel matrix which is guaranteed to be an SPD matrix. The matrix transformation layer is designed to project the original SPD representation to a more compact and discriminative SPD manifold. The vectorization and normalization operations are performed in the vector transformation layer to take the upper triangle elements of the SPD representation and carry out the power normalization and 〈em〉l〈/em〉〈sub〉2〈/sub〉 normalization to reduce the redundancy and accelerate the convergence. The SPD matrix in our network can be considered as a mid-level representation bridging convolutional features and high-level semantic features. Results of extensive experiments show that our method notably outperforms state-of-the-art methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
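    As a rough, non-trainable analogue of the SPD pipeline described above, the sketch below aggregates a set of local convolutional features into a covariance (SPD) matrix, applies the log-Euclidean mapping, keeps the upper-triangle entries, and performs power and l2 normalisation. The ridge term, power exponent and use of a plain covariance instead of the paper's learned nonlinear kernel generation and transformation layers are all assumptions made for illustration.
# A common offline SPD-aggregation pipeline: covariance of local CNN features
# (+ small ridge to stay positive definite), log-Euclidean mapping,
# upper-triangle vectorisation, then power- and l2-normalisation.
# An approximation of the kind of representation described in the entry
# above, not the paper's trainable layers.
import numpy as np


def spd_descriptor(features: np.ndarray, eps: float = 1e-3, power: float = 0.5):
    """features: (n_locations, channels) array of local convolutional features."""
    c = features.shape[1]
    cov = np.cov(features, rowvar=False) + eps * np.eye(c)   # SPD matrix

    # Matrix logarithm via eigendecomposition (log-Euclidean mapping).
    w, V = np.linalg.eigh(cov)
    log_cov = V @ np.diag(np.log(w)) @ V.T

    # Keep upper-triangle entries of the symmetric matrix.
    vec = log_cov[np.triu_indices(c)]

    # Signed power normalisation followed by l2 normalisation.
    vec = np.sign(vec) * np.abs(vec) ** power
    return vec / (np.linalg.norm(vec) + 1e-12)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(14 * 14, 64))     # e.g. a 14x14x64 feature map
    print(spd_descriptor(feats).shape)         # (64*65/2,) = (2080,)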
  • 88
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Filippo Maria Bianchi, Lorenzo Livi, Karl Øyvind Mikalsen, Michael Kampffmeyer, Robert Jenssen〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Learning compressed representations of multivariate time series (MTS) facilitates data analysis in the presence of noise and redundant information, and for a large number of variates and time steps. However, classical dimensionality reduction approaches are designed for vectorial data and cannot deal explicitly with missing values. In this work, we propose a novel autoencoder architecture based on recurrent neural networks to generate compressed representations of MTS. The proposed model can process inputs characterized by variable lengths and it is specifically designed to handle missing data. Our autoencoder learns fixed-length vectorial representations, whose pairwise similarities are aligned to a kernel function that operates in input space and that handles missing values. This makes it possible to learn 〈em〉good〈/em〉 representations, even in the presence of a significant amount of missing data. To show the effectiveness of the proposed approach, we evaluate the quality of the learned representations in several classification tasks, including those involving medical data, and we compare to other methods for dimensionality reduction. Subsequently, we design two frameworks based on the proposed architecture: one for imputing missing data and another for one-class classification. Finally, we analyze under what circumstances an autoencoder with recurrent layers can learn better compressed representations of MTS than feed-forward architectures.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 89
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: November 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 95〈/p〉 〈p〉Author(s): 〈/p〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 90
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Chenglong Li, Xinyan Liang, Yijuan Lu, Nan Zhao, Jin Tang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉RGB-Thermal (RGB-T) object tracking receives more and more attention due to the strongly complementary benefits of thermal information to visible data. However, RGB-T research is limited by lacking a comprehensive evaluation platform. In this paper, we propose a large-scale video benchmark dataset for RGB-T tracking. It has three major advantages over existing ones: 1) Its size is sufficiently large for large-scale performance evaluation (total number of frames: 234K, maximum number of frames per sequence: 8K). 2) The alignment between RGB-T sequence pairs is highly accurate, which does not need pre- or post-processing. 3) The occlusion levels are annotated for occlusion-sensitive performance analysis of different tracking algorithms. Moreover, we propose a novel graph-based approach to learn a robust object representation for RGB-T tracking. In particular, the tracked object is represented with a graph with image patches as nodes. Given initial weights of nodes, this graph including graph structure, node weights and edge weights is dynamically learned in a unified optimization framework. Extensive experiments on the large-scale dataset are executed to demonstrate the effectiveness of the proposed tracker against other state-of-the-art tracking methods. We also provide new insights and potential research directions to the field of RGB-T object tracking.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 91
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Ganggang Dong, Hongwei Liu, Gangyao Kuang, Jocelyn Chanussot〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Classical sparse modeling requires accurate alignment between the query and the training data. This precondition is disadvantageous for target recognition tasks, where, although the target is present in the images, it is infeasible to perfectly register it during training. In addition, the classical approach is less powerful under unconstrained operating conditions. To solve these problems, this paper presents a new sparse signal modeling strategy in the frequency domain. Because signal energy is mainly concentrated in a small portion of low-frequency components, this part of the spectrum carries vital information that can be used to discriminate the class of a target. We generated representations by aggregating low-frequency components. They were then used to build sparse signal models. More specifically, the spectral representations of the training data were concatenated to form an over-complete dictionary, which encodes the counterpart of the query as a linear combination of its atoms. Sparsity was harnessed to generate an optimal solution, from which an inference can be made. Multiple comparative analyses were made to demonstrate the advantages of the proposed strategy, especially in unconstrained environments.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
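    The sketch below mimics the general idea of the entry above: low-frequency spectral features of training signals are stacked into an over-complete dictionary, a query is coded with a small orthogonal matching pursuit, and the class is decided by the class-wise reconstruction residual (as in standard sparse-representation classification). The frequency cutoff, sparsity level, toy 1-D signals and residual rule are generic assumptions, not the paper's exact construction.
# Sparse-representation classification over low-frequency spectral features:
# dictionary from leading FFT magnitudes of training signals, a tiny
# orthogonal matching pursuit coder, class-wise residual decision.
# Generic choices throughout; only the overall frequency-domain
# sparse-modeling idea follows the entry above.
import numpy as np


def low_freq_features(signal: np.ndarray, k: int) -> np.ndarray:
    """Keep magnitudes of the k lowest-frequency FFT components."""
    spec = np.abs(np.fft.rfft(signal))[:k]
    return spec / (np.linalg.norm(spec) + 1e-12)


def omp(D: np.ndarray, y: np.ndarray, sparsity: int) -> np.ndarray:
    """Tiny orthogonal matching pursuit: y approx D @ x with few nonzeros."""
    residual, support = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x


def classify(D, labels, y, sparsity=5):
    x = omp(D, y, sparsity)
    residuals = {c: np.linalg.norm(y - D[:, labels == c] @ x[labels == c])
                 for c in np.unique(labels)}
    return min(residuals, key=residuals.get)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    k = 32
    t = np.linspace(0, 1, 256)

    def make(freq):
        # Toy 1-D "target signature": a noisy sinusoid of a given frequency.
        return np.sin(2 * np.pi * freq * t) + 0.1 * rng.normal(size=t.size)

    train = [make(3) for _ in range(10)] + [make(12) for _ in range(10)]
    labels = np.array([0] * 10 + [1] * 10)
    D = np.stack([low_freq_features(s, k) for s in train], axis=1)
    query = low_freq_features(make(12), k)
    print("predicted class:", classify(D, labels, query))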
  • 92
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Lingling Zhang, Jun Liu, Minnan Luo, Xiaojun Chang, Qinghua Zheng, Alexander G. Hauptmann〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Considering that humans can learn a new object successfully from just one sample, one-shot learning, where each visual class has just one labeled sample for training, has attracted more and more attention. In past years, most researchers have achieved one-shot learning by training a matching network to map a small labeled support set and an unlabeled image to its label. The support set combines one image with the same label as the unlabeled image and a few images with other labels generated by random sampling. This random sampling strategy easily generates many over-easy support sets in which most labels are barely relevant to the label of the unlabeled image, which limits the matching network in one-shot prediction over indistinguishable label sets. To address this issue, we propose a novel metric to evaluate the learning difficulty of a support set, where this metric jointly considers the semantic diversity and similarity of visual labels. Based on the metric, we introduce a scheduled sampling strategy to train the matching network from easy to difficult. Extensive experimental results on three datasets, including mini-Imagenet, Birds and Flowers, indicate that our method achieves significant improvements over other previous methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 93
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Haisong Ding, Kai Chen, Qiang Huo〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Integrated convolutional neural network (CNN) and deep bidirectional long short-term memory (DBLSTM) based character models have achieved excellent recognition accuracies on optical character recognition (OCR) tasks, along with a large number of model parameters and a massive computation cost. To deploy a CNN-DBLSTM model in products on 〈strong〉CPU〈/strong〉 servers, there is an urgent need to compress and accelerate it as much as possible, especially the CNN part, which dominates both parameters and computation. In this paper, we study teacher-student learning and Tucker decomposition methods to reduce model size and runtime latency for the CNN-DBLSTM based character model for OCR. We use teacher-student learning to transfer the knowledge of a large-size teacher model to a small-size compact student model, followed by Tucker decomposition to further compress the student model. For teacher-student learning, we design a novel learning criterion to bring in the guidance of the succeeding LSTM layer when matching the CNN-extracted feature sequences of the large teacher and small student models. Experimental results on large scale handwritten and printed OCR tasks show that using teacher-student learning alone achieves a 〈strong〉9.90〈/strong〉× footprint reduction and a 〈strong〉15.23〈/strong〉× inference speedup without degrading recognition accuracy. Combined with the Tucker decomposition method, we can compress and accelerate the model further. The decomposed model achieves an 〈strong〉11.89〈/strong〉× footprint reduction and a 〈strong〉22.16〈/strong〉× inference speedup while suffering no or only a small recognition accuracy degradation against the large-size baseline model.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
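    To illustrate the compression step mentioned above, the sketch below performs a Tucker-2 (HOSVD-style) factorisation of a convolution weight tensor along its output- and input-channel modes, so that one convolution can be replaced by a 1x1 projection, a smaller core convolution and a 1x1 expansion. The ranks are arbitrary, the factors would normally be fine-tuned afterwards, and the paper's LSTM-guided teacher-student criterion is not reproduced here.
# Tucker-2 (HOSVD-style) compression of a convolution kernel: factor the
# (out, in, kh, kw) weight tensor along its channel modes.  A rough sketch
# of the decomposition step mentioned in the entry above; ranks are
# arbitrary and chosen only for illustration.
import numpy as np


def unfold(T: np.ndarray, mode: int) -> np.ndarray:
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)


def tucker2_conv(weight: np.ndarray, rank_out: int, rank_in: int):
    """weight: (C_out, C_in, kh, kw).  Returns (U_out, core, U_in)."""
    U_out, _, _ = np.linalg.svd(unfold(weight, 0), full_matrices=False)
    U_in, _, _ = np.linalg.svd(unfold(weight, 1), full_matrices=False)
    U_out, U_in = U_out[:, :rank_out], U_in[:, :rank_in]
    # Core tensor = weight contracted with U_out^T (mode 0) and U_in^T (mode 1).
    core = np.einsum("oikl,or,is->rskl", weight, U_out, U_in)
    return U_out, core, U_in


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(128, 64, 3, 3))
    U_out, core, U_in = tucker2_conv(W, rank_out=32, rank_in=16)
    approx = np.einsum("rskl,or,is->oikl", core, U_out, U_in)
    rel_err = np.linalg.norm(approx - W) / np.linalg.norm(W)
    params = U_out.size + core.size + U_in.size
    print(f"relative error {rel_err:.3f}, params {params} vs {W.size}")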
  • 94
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Takehiro Kajihara, Takuya Funatomi, Haruyuki Makishima, Takahito Aoto, Hiroyuki Kubo, Shigehito Yamada, Yasuhiro Mukaigawa〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉In this research, we propose a novel registration method for three-dimensional (3D) reconstruction from serial section images. 3D reconstructed data from serial section images provides structural information with high resolution. However, there are three problems in 3D reconstruction: non-rigid deformation, tissue discontinuity, and accumulation of scale change. To solve the non-rigid deformation, we propose a novel non-rigid registration method using blending rigid transforms. To avoid the tissue discontinuity, we propose a target image selection method using the criterion based on the blending of transforms. To solve the scale change of tissue, we propose a scale adjustment method using the tissue area before and after registration. The experimental results demonstrate that our method can represent non-rigid deformation with a small number of control points, and is robust to a variation in staining. The results also demonstrate that our target selection method avoids tissue discontinuity and our scale adjustment reduces scale change.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 95
    Publication Date: 2019
    Description: 〈p〉Publication date: January 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 97〈/p〉 〈p〉Author(s): Nguyen Ngoc Thuy, Sartra Wongthanavasu〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Attribute reduction is a key problem in many areas such as data mining, pattern recognition and machine learning. The problems of finding all reducts as well as finding a minimal reduct in a given data table have been proved to be NP-hard. Therefore, to overcome this difficulty, many heuristic attribute reduction methods have been developed in recent years. In the process of heuristic attribute reduction, accelerating the calculation of attribute significance is very important, especially for big data cases. In this paper, we firstly propose attribute significance measures based on stripped quotient sets. Then, by using these measures, we design efficient algorithms for calculating the core and a reduct, whose time complexity is considered in detail. Additionally, we give properties directly related to efficiently computing attribute significance and significantly reducing the data size during the calculation. From both theoretical and experimental viewpoints, we show that our method performs efficiently on large-scale data sets.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 96
    Publication Date: 2019
    Description: 〈p〉Publication date: January 2020〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 97〈/p〉 〈p〉Author(s): Cristiano Cervellera, Danilo Macciò〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉We propose a method based on recursive binary Voronoi trees to learn a nonparametric model of the distribution underlying a given dataset. The obtained model can be used as a general tool both to extract good samples from the original dataset (e.g., for batch selection, bagging, or sample size reduction) or to generate new synthetic ones, also in a conditional fashion (e.g., to deal with imbalanced sets or to reconstruct corrupted points). In order to ensure that the distribution of the new sets, either sampled or generated, follows closely that of the original dataset, we design all the procedures according to a specific measure of distance between distributions. The use of binary recursive Voronoi structures enables the proposed algorithms to be simple, efficient and able to adapt to the shape of the original dataset. Simulation tests showcase the good performance and flexibility of the approach in various learning contexts.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 97
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Yu-Feng Yu, Chuan-Xian Ren, Min Jiang, Man-Yu Sun, Dao-Qing Dai, Guodong Guo〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Subspace learning for dimensionality reduction is an important topic in pattern analysis and machine learning, and it has extensive applications in feature representation and image classification. Linear discriminant analysis (LDA) is a well-known subspace learning approach for supervised dimensionality reduction due to its effectiveness and efficacy in discriminant analysis. However, LDA is not stable and suffers from the singularity problem when addressing small sample size and high-dimensional data. In this paper, we develop a novel subspace learning model, named sparse approximation to discriminant projection learning (SADPL), to learn the sparse projection matrix. Different from the traditional LDA-based methods, we learn the projection matrix based on a new objective function rather than the Fisher criterion, which avoids the matrix singularity problem. In order to distinguish which features play an important role in discriminant analysis, we embed a feature selection framework to the subspace learning model to select the informative features. Finally, we can attain a convex objective function which can be solved by an effective optimization algorithm, and theoretically prove the convergence of the proposed optimization algorithm. Extensive experiments on all sorts of image classification tasks, such as face recognition, palmprint recognition, object categorization and texture classification show that our SADPL achieves competitive performance compared to the state-of-the-art methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 98
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Jinyuan Zhao, Cunzhao Shi, Fuxi Jia, Yanna Wang, Baihua Xiao〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Binarization is often the first step in many document analysis tasks and plays a key role in the subsequent steps. In this paper, we formulate binarization as an image-to-image generation task and introduce the conditional generative adversarial networks (cGANs) to solve the core problem of multi-scale information combination in binarization task. Our generator consists of two stages: In the first stage, sub-generator 〈em〉G〈/em〉1 learns to extract text pixels from an input image. Different scales of the input image are processed by 〈em〉G〈/em〉1 and corresponding binary images are generated. In the second stage, our sub-generator 〈em〉G〈/em〉2 learns a combination of results at different scales from the first stage and produces the final binary result. We conduct comprehensive experiments of the proposed method on nine public document image binarization datasets. Experimental results show that compared with many classical and state-of-the-art approaches, our method gains promising performance in the accuracy and robustness of binarization.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 99
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Weicheng Xie, Xi Jia, Linlin Shen, Meng Yang〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉While weight sparseness-based regularization has been used to learn better deep features for image recognition problems, it introduces a large number of variables for optimization and can easily converge to a local optimum. The L2-norm regularization proposed for face recognition reduces the impact of noisy information, but expression information is also suppressed during the regularization. A feature sparseness-based regularization that learns deep features with better generalization capability is proposed in this paper. The regularization is integrated into the loss function and optimized with a deep metric learning framework. Through a toy example, it is shown that a simple network with the proposed sparseness outperforms the one with the L2-norm regularization. Furthermore, the proposed approach achieved competitive performance on four publicly available datasets, i.e., FER2013, CK+, Oulu-CASIA and MMI. The state-of-the-art cross-database performances also justify the generalization capability of the proposed approach.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
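    The general shape of a feature-sparseness regulariser, as opposed to a weight-based one, is an l1-style penalty on the deep feature activations added to the task loss. The sketch below shows only that generic objective with toy arrays; the weighting factor is arbitrary, and the deep-metric-learning integration described in the entry above is not reproduced.
# Generic form of a feature-sparseness regulariser: task loss plus an
# l1 penalty on the deep feature vector.  Toy arrays only; lambda is an
# arbitrary choice, not a value from the entry above.
import numpy as np


def softmax_cross_entropy(logits: np.ndarray, label: int) -> float:
    z = logits - logits.max()
    return float(np.log(np.exp(z).sum()) - z[label])


def regularized_loss(features: np.ndarray, logits: np.ndarray, label: int,
                     lam: float = 1e-3) -> float:
    sparseness = np.abs(features).mean()          # l1-based feature sparseness
    return softmax_cross_entropy(logits, label) + lam * sparseness


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.normal(size=512)               # penultimate-layer features
    logits = rng.normal(size=7)                   # e.g. 7 expression classes
    print(regularized_loss(features, logits, label=3))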
  • 100
    facet.materialart.
    Unknown
    Elsevier
    Publication Date: 2019
    Description: 〈p〉Publication date: December 2019〈/p〉 〈p〉〈b〉Source:〈/b〉 Pattern Recognition, Volume 96〈/p〉 〈p〉Author(s): Ayan Kumar Bhunia, Ankan Kumar Bhunia, Shuvozit Ghose, Abhirup Das, Partha Pratim Roy, Umapada Pal〈/p〉 〈h5〉Abstract〈/h5〉 〈div〉〈p〉Logo detection in real-world scene images is an important problem with applications in advertisement and marketing. Existing general-purpose object detection methods require a large amount of annotated training data for every logo class. These methods do not satisfy the incremental demand for logo classes in practical deployment, since it is practically impossible to have such annotated data for a new, unseen logo. In this work, we develop an easy-to-implement query-based logo detection and localization system by employing a one-shot learning technique using off-the-shelf neural network components. Given an image of a query logo, our model searches for the logo within a given target image and predicts the possible location of the logo by estimating a binary segmentation mask. The proposed model consists of a conditional branch and a segmentation branch. The former gives a conditional latent representation of the given query logo, which is combined with feature maps of the segmentation branch at multiple scales in order to obtain the matching location of the query logo in a target image. Feature matching between the latent query representation and the multi-scale feature maps of the segmentation branch, using a simple concatenation operation followed by a 1 × 1 convolution layer, makes our model scale-invariant. Despite its simplicity, our query-based logo retrieval framework achieved superior performance on the FlickrLogos-32 and TopLogos-10 datasets compared with different existing baseline methods.〈/p〉〈/div〉
    Print ISSN: 0031-3203
    Electronic ISSN: 1873-5142
    Topics: Computer Science
    Published by Elsevier
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
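    The matching operation described in the entry above amounts to tiling the query's latent code over the spatial grid, concatenating it with the segmentation feature map along the channel dimension, and mixing the result with a 1 × 1 convolution. The sketch below expresses that operation in plain NumPy; the shapes, channel counts and random weights are illustrative assumptions only.
# Conditioning a segmentation feature map on a query-logo latent vector:
# tile the latent code spatially, concatenate along channels, mix with a
# 1x1 convolution (a per-pixel linear map over channels).  Illustrative
# shapes and weights only.
import numpy as np


def condition_feature_map(feat: np.ndarray, query_latent: np.ndarray,
                          w_1x1: np.ndarray) -> np.ndarray:
    """feat: (C, H, W); query_latent: (Q,); w_1x1: (C_out, C + Q)."""
    c, h, w = feat.shape
    tiled = np.broadcast_to(query_latent[:, None, None], (query_latent.size, h, w))
    stacked = np.concatenate([feat, tiled], axis=0)          # (C + Q, H, W)
    return np.einsum("oc,chw->ohw", w_1x1, stacked)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=(64, 32, 32))       # segmentation-branch features
    latent = rng.normal(size=128)              # conditional-branch query code
    w = rng.normal(size=(64, 64 + 128)) * 0.01
    print(condition_feature_map(feat, latent, w).shape)   # (64, 32, 32)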