ALBERT

All Library Books, journals and Electronic Records Telegrafenberg

Your email was sent successfully. Check your inbox.

An error occurred while sending the email. Please try again.

Proceed reservation?

Export
Filter
  • Articles  (11,129)
  • 2015-2019  (11,129)
  • 1945-1949
  • Bioinformatics  (1,636)
  • Algorithms  (852)
  • IEEE Transactions on Pattern Analysis and Machine Intelligence  (791)
  • EURASIP Journal on Wireless Communications and Networking  (691)
  • IEEE/ACM Transactions on Computational Biology and Bioinformatics  (504)
  • 110151
  • 1275
  • 2184
  • 40961
  • 48794
  • Computer Science  (11,129)
Collection
  • Articles  (11,129)
Years
Year
Journal
Topic
  • 1
    Publication Date: 2015-08-08
    Description: : As sequencing becomes cheaper and more widely available, there is a greater need to quickly and effectively analyze large-scale genomic data. While the functionality of AVIA v1.0, whose implementation was based on ANNOVAR, was comparable with other annotation web servers, AVIA v2.0 represents an enhanced web-based server that extends genomic annotations to cell-specific transcripts and protein-level functional annotations. With AVIA’s improved interface, users can better visualize their data, perform comprehensive searches and categorize both coding and non-coding variants. Availability and implementation : AVIA is freely available through the web at http://avia.abcc.ncifcrf.gov . Contact : Hue.Vuong@fnlcr.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 2
    Publication Date: 2015-08-08
    Description: : As new methods for multivariate analysis of genome wide association studies become available, it is important to be able to combine results from different cohorts in a meta-analysis. The R package MultiMeta provides an implementation of the inverse-variance-based method for meta-analysis, generalized to an n -dimensional setting. Availability and implementation: The R package MultiMeta can be downloaded from CRAN. Contact: dragana.vuckovic@burlo.trieste.it ; vi1@sanger.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 3
    Publication Date: 2015-08-12
    Description: We examine a distributed detection problem in a wireless sensor network, where sensor nodes collaborate to detect a Gaussian signal with an unknown change of power, i.e., a scale parameter. Due to power/bandwidth constraints, we consider the case where each sensor quantizes its observation into a binary digit. The binary data are then transmitted through error-prone wireless links to a fusion center, where a generalized likelihood ratio test (GLRT) detector is employed to perform a global decision. We study the design of a binary quantizer based on an asymptotic analysis of the GLRT. Interestingly, the quantization threshold of the quantizer is independent of the unknown scale parameter. Numerical results are included to illustrate the performance of the proposed quantizer and GLRT in binary symmetric channels (BSCs).
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 4
    Publication Date: 2015-08-13
    Description: More and more hybrid electric vehicles are driven since they offer such advantages as energy savings and better active safety performance. Hybrid vehicles have two or more power driving systems and frequently switch working condition, so controlling stability is very important. In this work, a two-stage Kalman algorithm method is used to fuse data in hybrid vehicle stability testing. First, the RT3102 navigation system and Dewetron system are introduced. Second, a modeling of data fusion is proposed based on the Kalman filter. Then, this modeling is simulated and tested on a sample vehicle, using Carsim and Simulink software to test the results. The results showed the merits of this modeling.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 5
    Publication Date: 2015-08-17
    Description: In this paper, a comprehensive comparison analysis in terms of outage probability and average symbol error ratio (SER) is presented for cooperative cognitive multiple-input and multiple-output (CC-MIMO) multiuser systems with amplify-and-forward (AF) protocol. Specially, we consider two scenarios where the CC-MIMO multiuser systems have the perfect and imperfect channel state information (CSI). The CC-MIMO multiuser systems consist of one multi-antenna source, one single-antenna relay, and multiple multi-antenna destinations. At the secondary source and destinations, the maximal ratio transmission (MRT) and maximal ratio combining (MRC) are employed, respectively. For such CC-MIMO multiuser systems, we first obtain the exact closed-form expressions of outage probability under the two cases where the CC-MIMO multiuser systems have the perfect and imperfect CSI. Then, to reduce the implementation complexity, the tight lower bounds of outage probability and average SER are derived. Finally, to obtain insight, by using the high signal-to-noise ratio (SNR) approximation, the asymptotic estimations of outage probability are achieved. The numerical results show that the derivations are agreed with the simulations, which validate our derivations. At the same time, the results show that, for the systems without perfect CSI, the achievable diversity order reduces to one, regardless of the number of antennas at the cognitive source and destinations as well as the number of the cognitive destinations. Nevertheless, these key parameters affect the coding gain of the CC-MIMO multiuser systems. When the systems have the perfect CSI (or without feedback delay), the achievable diversity gain is determined by the minimum between the number of source’s antennas and the product of the number of destinations and the number of destination’s antennas. For the effect of PU’s parameters, our results indicate that primary systems only affect the coding gain but not the diversity gain.
    Print ISSN: 1687-1472
    Electronic ISSN: 1687-1499
    Topics: Electrical Engineering, Measurement and Control Technology , Computer Science
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 6
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: The goal of cross-domain matching (CDM) is to find correspondences between two sets of objects in different domains in an unsupervised way. CDM has various interesting applications, including photo album summarization where photos are automatically aligned into a designed frame expressed in the Cartesian coordinate system, and temporal alignment which aligns sequences such as videos that are potentially expressed using different features. In this paper, we propose an information-theoretic CDM framework based on squared-loss mutual information (SMI). The proposed approach can directly handle non-linearly related objects/sequences with different dimensions, with the ability that hyper-parameters can be objectively optimized by cross-validation. We apply the proposed method to several real-world problems including image matching, unpaired voice conversion, photo album summarization, cross-feature video and cross-domain video-to-mocap alignment, and Kinect -based action recognition, and experimentally demonstrate that the proposed method is a promising alternative to state-of-the-art CDM methods.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 7
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: The skeleton of a 2D shape is an important geometric structure in pattern analysis and computer vision. In this paper we study the skeleton of a 2D shape in a two-manifold $mathcal {M}$ , based on a geodesic metric. We present a formal definition of the skeleton $S(Omega )$ for a shape $Omega$ in $mathcal {M}$ and show several properties that make $S(Omega )$ distinct from its Euclidean counterpart in $mathbb {R}^2$ . We further prove that for a shape sequence $lbrace Omega _irbrace$ that converge to a shape $Omega$ in $mathcal {M}$ , the mapping $Omega righta- row overline{S}(Omega )$ is lower semi-continuous. A direct application of this result is that we can use a set $P$ of sample points to approximate the boundary of a 2D shape $Omega$ in $mathcal {M}$ , and the Voronoi diagram of $P$ inside $Omega subset mathcal {M}$ gives a good approximation to the skeleton $S(Omega )$ . Examples of skeleton computation in topography and brain morphometry are illustrated.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 8
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: A widely used approach for locating points on deformable objects in images is to generate feature response images for each point, and then to fit a shape model to these response images. We demonstrate that Random Forest regression-voting can be used to generate high quality response images quickly. Rather than using a generative or a discriminative model to evaluate each pixel, a regressor is used to cast votes for the optimal position of each point. We show that this leads to fast and accurate shape model matching when applied in the Constrained Local Model framework. We evaluate the technique in detail, and compare it with a range of commonly used alternatives across application areas: the annotation of the joints of the hands in radiographs and the detection of feature points in facial images. We show that our approach outperforms alternative techniques, achieving what we believe to be the most accurate results yet published for hand joint annotation and state-of-the-art performance for facial feature point detection.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 9
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: We present a novel method to recognise planar structures in a single image and estimate their 3D orientation. This is done by exploiting the relationship between image appearance and 3D structure, using machine learning methods with supervised training data. As such, the method does not require specific features or use geometric cues, such as vanishing points. We employ general feature representations based on spatiograms of gradients and colour, coupled with relevance vector machines for classification and regression. We first show that using hand-labelled training data, we are able to classify pre-segmented regions as being planar or not, and estimate their 3D orientation. We then incorporate the method into a segmentation algorithm to detect multiple planar structures from a previously unseen image.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 10
    Publication Date: 2015-08-04
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 11
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Multiple view segmentation consists in segmenting objects simultaneously in several views. A key issue in that respect and compared to monocular settings is to ensure propagation of segmentation information between views while minimizing complexity and computational cost. In this work, we first investigate the idea that examining measurements at the projections of a sparse set of 3D points is sufficient to achieve this goal. The proposed algorithm softly assigns each of these 3D samples to the scene background if it projects on the background region in at least one view, or to the foreground if it projects on foreground region in all views. Second, we show how other modalities such as depth may be seamlessly integrated in the model and benefit the segmentation. The paper exposes a detailed set of experiments used to validate the algorithm, showing results comparable with the state of art, with reduced computational complexity. We also discuss the use of different modalities for specific situations, such as dealing with a low number of viewpoints or a scene with color ambiguities between foreground and background.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 12
    Publication Date: 2015-08-05
    Description: Currently deep learning has made great breakthroughs in visual and speech processing, mainly because it draws lessons from the hierarchical mode that brain deals with images and speech. In the field of NLP, a topic model is one of the important ways for modeling documents. Topic models are built on a generative model that clearly does not match the way humans write. In this paper, we propose Event Model, which is unsupervised and based on the language processing mechanism of neurolinguistics, to model documents. In Event Model, documents are descriptions of concrete or abstract events seen, heard, or sensed by people and words are objects in the events. Event Model has two stages: word learning and dimensionality reduction. Word learning is to learn semantics of words based on deep learning. Dimensionality reduction is the process that representing a document as a low dimensional vector by a linear mode that is completely different from topic models. Event Model achieves state-of-the-art results on document retrieval tasks.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 13
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Text detection in natural scene images is an important prerequisite for many content-based image analysis tasks, while most current research efforts only focus on horizontal or near horizontal scene text. In this paper, first we present a unified distance metric learning framework for adaptive hierarchical clustering, which can simultaneously learn similarity weights (to adaptively combine different feature similarities) and the clustering threshold (to automatically determine the number of clusters). Then, we propose an effective multi-orientation scene text detection system, which constructs text candidates by grouping characters based on this adaptive clustering. Our text candidates construction method consists of several sequential coarse-to-fine grouping steps: morphology-based grouping via single-link clustering, orientation-based grouping via divisive hierarchical clustering, and projection-based grouping also via divisive clustering. The effectiveness of our proposed system is evaluated on several public scene text databases, e.g., ICDAR Robust Reading Competition data sets (2011 and 2013), MSRA-TD500 and NEOCR. Specifically, on the multi-orientation text data set MSRA-TD500, the $f$ measure of our system is $71$ percent, much better than the state-of-the-art performance. We also construct and release a practical challenging multi-orientation scene text data set (USTB-SV1K), which is available at http://prir.ustb.edu.cn/TexStar/MOMV-text-detection/.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 14
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: This study proposes a quantitative measurement of split of the second heart sound (S2) based on nonstationary signal decomposition to deal with overlaps and energy modeling of the subcomponents of S2. The second heart sound includes aortic (A2) and pulmonic (P2) closure sounds. However, the split detection is obscured due to A2-P2 overlap and low energy of P2. To identify such split, HVD method is used to decompose the S2 into a number of components while preserving the phase information. Further, A2s and P2s are localized using smoothed pseudo Wigner-Ville distribution followed by reassignment method. Finally, the split iscalculated by taking the differences between the means of time indices of A2s and P2s. Experiments on total 33 clips of S2 signals are performed for evaluation of the method. The mean ± standard deviation of the split is 34.7 ± 4.6 ms. The method measures the splitefficiently, even when A2-P2 overlap is ≤ 20 ms and the normalized peak temporal ratio of P2 to A2 is low (≥ 0.22). This proposed method thus, demonstrates its robustness by defining split detectability (SDT), the split detection aptness through detecting P2s, by measuring upto 96 percent. Such findings reveal the effectiveness of the method as competent against the other baselines, especially for A2-P2 overlaps and low energy P2.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 15
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Post-acquisition denoising of magnetic resonance (MR) images is an important step to improve any quantitative measurement of the acquired data. In this paper, assuming a Rician noise model, a new filtering method based on the linear minimum mean square error (LMMSE) estimation is introduced, which employs the self-similarity property of the MR data to restore the noise-less signal. This method takes into account the structural characteristics of images and the Bayesian mean square error (Bmse) of the estimator to address the denoising problem. In general, a twofold data processing approach is developed; first, the noisy MR data is processed using a patch-based L 2 -norm similarity measure to provide the primary set of samples required for the estimation process. Afterwards, the Bmse of the estimator is derived as the optimization function to analyze the pre-selected samples and minimize the error between the estimated and the underlying signal. Compared to the LMMSE method and also its recently proposed SNR-adapted realization (SNLMMSE), the optimized way of choosing the samples together with the automatic adjustment of the filtering parameters lead to a more robust estimation performance with our approach. Experimental results show the competitive performance of the proposed method in comparison with related state-of-the-art methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 16
    Publication Date: 2015-08-07
    Description: Large-scale ad hoc analytics of genomic data is popular using the R-programming language supported by over 700 software packages provided by Bioconductor. More recently, analytical jobs are benefitting from on-demand computing and storage, their scalability and their low maintenance cost, all of which are offered by the cloud. While biologists and bioinformaticists can take an analytical job and execute it on their personal workstations, it remains challenging to seamlessly execute the job on the cloud infrastructure without extensive knowledge of the cloud dashboard. How analytical jobs can not only with minimum effort be executed on the cloud, but also how both the resources and data required by the job can be managed is explored in this paper. An open-source light-weight framework for executing R-scripts using Bioconductor packages, referred to as ‘RBioCloud’, is designed and developed. RBioCloud offers a set of simple command-line tools for managing the cloud resources, the data and the execution of the job. Three biological test cases validate the feasibility of RBioCloud. The framework is available from http://www.rbiocloud.com .
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 17
    Publication Date: 2015-08-07
    Description: Of major interest to translational genomics is the intervention in gene regulatory networks (GRNs) to affect cell behavior; in particular, to alter pathological phenotypes. Owing to the complexity of GRNs, accurate network inference is practically challenging and GRN models often contain considerable amounts of uncertainty. Considering the cost and time required for conducting biological experiments, it is desirable to have a systematic method for prioritizing potential experiments so that an experiment can be chosen to optimally reduce network uncertainty. Moreover, from a translational perspective it is crucial that GRN uncertainty be quantified and reduced in a manner that pertains to the operational cost that it induces, such as the cost of network intervention. In this work, we utilize the concept of mean objective cost of uncertainty (MOCU) to propose a novel framework for optimal experimental design. In the proposed framework, potential experiments are prioritized based on the MOCU expected to remain after conducting the experiment. Based on this prioritization, one can select an optimal experiment with the largest potential to reduce the pertinent uncertainty present in the current network model. We demonstrate the effectiveness of the proposed method via extensive simulations based on synthetic and real gene regulatory networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 18
    Publication Date: 2015-08-07
    Description: A novel approach to Contact Map Overlap (CMO) problem is proposed using the two dimensional clusters present in the contact maps. Each protein is represented as a set of the non-trivial clusters of contacts extracted from its contact map. The approach involves finding matching regions between the two contact maps using approximate 2D-pattern matching algorithm and dynamic programming technique. These matched pairs of small contact maps are submitted in parallel to a fast heuristic CMO algorithm. The approach facilitates parallelization at this level since all the pairs of contact maps can be submitted to the algorithm in parallel. Then, a merge algorithm is used in order to obtain the overall alignment. As a proof of concept, MSVNS, a heuristic CMO algorithm is used for global as well as local alignment. The divide and conquer approach is evaluated for two benchmark data sets that of Skolnick and Ding et al. It is interesting to note that along with achieving saving of time, better overlap is also obtained for certain protein folds.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 19
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Canalizing genes possess broad regulatory power over a wide swath of regulatory processes. On the other hand, it has been hypothesized that the phenomenon of intrinsically multivariate prediction (IMP) is associated with canalization. However, applications have relied on user-selectable thresholds on the IMP score to decide on the presence of IMP. A methodology is developed here that avoids arbitrary thresholds, by providing a statistical test for the IMP score. In addition, the proposed procedure allows the incorporation of prior knowledge if available, which can alleviate the problem of loss of power due to small sample sizes. The issue of multiplicity of tests is addressed by family-wise error rate (FWER) and false discovery rate (FDR) controlling approaches. The proposed methodology is demonstrated by experiments using synthetic and real gene-expression data from studies on melanoma and ionizing radiation (IR) responsive genes. The results with the real data identified DUSP1 and p53, two well-known canalizing genes associated with melanoma and IR response, respectively, as the genes with a clear majority of IMP predictor pairs. This validates the potential of the proposed methodology as a tool for discovery of canalizing genes from binary gene-expression data. The procedure is made available through an R package.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 20
    Publication Date: 2015-08-08
    Description: Motivation: Stem cell differentiation is largely guided by master transcriptional regulators, but it also depends on the expression of other types of genes, such as cell cycle genes, signaling genes, metabolic genes, trafficking genes, etc. Traditional approaches to understanding gene expression patterns across multiple conditions, such as principal components analysis or K-means clustering, can group cell types based on gene expression, but they do so without knowledge of the differentiation hierarchy. Hierarchical clustering can organize cell types into a tree, but in general this tree is different from the differentiation hierarchy itself. Methods: Given the differentiation hierarchy and gene expression data at each node, we construct a weighted Euclidean distance metric such that the minimum spanning tree with respect to that metric is precisely the given differentiation hierarchy. We provide a set of linear constraints that are provably sufficient for the desired construction and a linear programming approach to identify sparse sets of weights, effectively identifying genes that are most relevant for discriminating different parts of the tree. Results: We apply our method to microarray gene expression data describing 38 cell types in the hematopoiesis hierarchy, constructing a weighted Euclidean metric that uses just 175 genes. However, we find that there are many alternative sets of weights that satisfy the linear constraints. Thus, in the style of random-forest training, we also construct metrics based on random subsets of the genes and compare them to the metric of 175 genes. We then report on the selected genes and their biological functions. Our approach offers a new way to identify genes that may have important roles in stem cell differentiation. Contact: tperkins@ohri.ca Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 21
    Publication Date: 2015-08-08
    Description: Motivation: Principal component analysis (PCA) is a basic tool often used in bioinformatics for visualization and dimension reduction. However, it is known that PCA may not consistently estimate the true direction of maximal variability in high-dimensional, low sample size settings, which are typical for molecular data. Assuming that the underlying signal is sparse, i.e. that only a fraction of features contribute to a principal component (PC), this estimation consistency can be retained. Most existing sparse PCA methods use L1-penalization, i.e. the lasso , to perform feature selection. But, the lasso is known to lack variable selection consistency in high dimensions and therefore a subsequent interpretation of selected features can give misleading results. Results: We present S4VDPCA, a sparse PCA method that incorporates a subsampling approach, namely stability selection. S4VDPCA can consistently select the truly relevant variables contributing to a sparse PC while also consistently estimate the direction of maximal variability. The performance of the S4VDPCA is assessed in a simulation study and compared to other PCA approaches, as well as to a hypothetical oracle PCA that ‘knows’ the truly relevant features in advance and thus finds optimal, unbiased sparse PCs. S4VDPCA is computationally efficient and performs best in simulations regarding parameter estimation consistency and feature selection consistency. Furthermore, S4VDPCA is applied to a publicly available gene expression data set of medulloblastoma brain tumors. Features contributing to the first two estimated sparse PCs represent genes significantly over-represented in pathways typically deregulated between molecular subgroups of medulloblastoma. Availability and implementation: Software is available at https://github.com/mwsill/s4vdpca . Contact: m.sill@dkfz.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 22
    Publication Date: 2015-08-08
    Description: Motivation: Glycans play critical roles in many biological processes, and their structural diversity is key for specific protein-glycan recognition. Comparative structural studies of biological molecules provide useful insight into their biological relationships. However, most computational tools are designed for protein structure, and despite their importance, there is no currently available tool for comparing glycan structures in a sequence order- and size-independent manner. Results: A novel method, GS-align, is developed for glycan structure alignment and similarity measurement. GS-align generates possible alignments between two glycan structures through iterative maximum clique search and fragment superposition. The optimal alignment is then determined by the maximum structural similarity score, GS-score, which is size-independent. Benchmark tests against the Protein Data Bank (PDB) N -linked glycan library and PDB homologous/non-homologous N -glycoprotein sets indicate that GS-align is a robust computational tool to align glycan structures and quantify their structural similarity. GS-align is also applied to template-based glycan structure prediction and monosaccharide substitution matrix generation to illustrate its utility. Availability and implementation: http://www.glycanstructure.org/gsalign . Contact: wonpil@ku.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 23
    Publication Date: 2015-08-08
    Description: Motivation: Impedance-based technologies are advancing methods for measuring proliferation of adherent cell cultures non-invasively and in real time. The analysis of the resulting data has so far been hampered by inappropriate computational methods and the lack of systematic data to evaluate the characteristics of the assay. Results: We used a commercially available system for impedance-based growth measurement (xCELLigence) and compared the reported cell index with data from microscopy. We found that the measured signal correlates linearly with the cell number throughout the time of an experiment with sufficient accuracy in subconfluent cell cultures. The resulting growth curves for various colon cancer cells could be well described with the empirical Richards growth model, which allows for extracting quantitative parameters (such as characteristic cycle times). We found that frequently used readouts like the cell index at a specific time or the area under the growth curve cannot be used to faithfully characterize growth inhibition. We propose to calculate the average growth rate of selected time intervals to accurately estimate time-dependent IC50 values of drugs from growth curves. Contact: nils.bluethgen@charite.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 24
    Publication Date: 2015-08-22
    Description: Community detection in a complex network is an important problem of much interest in recent years. In general, a community detection algorithm chooses an objective function and captures the communities of the network by optimizing the objective function, and then, one uses various heuristics to solve the optimization problem to extract the interesting communities for the user. In this article, we demonstrate the procedure to transform a graph into points of a metric space and develop the methods of community detection with the help of a metric defined for a pair of points. We have also studied and analyzed the community structure of the network therein. The results obtained with our approach are very competitive with most of the well-known algorithms in the literature, and this is justified over the large collection of datasets. On the other hand, it can be observed that time taken by our algorithm is quite less compared to other methods and justifies the theoretical findings.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 25
    Publication Date: 2015-08-25
    Description: Motivation: The storage and transmission of high-throughput sequencing data consumes significant resources. As our capacity to produce such data continues to increase, this burden will only grow. One approach to reduce storage and transmission requirements is to compress this sequencing data. Results: We present a novel technique to boost the compression of sequencing that is based on the concept of bucketing similar reads so that they appear nearby in the file. We demonstrate that, by adopting a data-dependent bucketing scheme and employing a number of encoding ideas, we can achieve substantially better compression ratios than existing de novo sequence compression tools, including other bucketing and reordering schemes. Our method, Mince, achieves up to a 45% reduction in file sizes (28% on average) compared with existing state-of-the-art de novo compression schemes. Availability and implementation : Mince is written in C++11, is open source and has been made available under the GPLv3 license. It is available at http://www.cs.cmu.edu/~ckingsf/software/mince . Contact: carlk@cs.cmu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 26
    Publication Date: 2015-08-25
    Description: : Current methods for motif discovery from chromatin immunoprecipitation followed by sequencing (ChIP-seq) data often identify non-targeted transcription factor (TF) motifs, and are even further limited when peak sequences are similar due to common ancestry rather than common binding factors. The latter aspect particularly affects a large number of proteins from the Cys 2 His 2 zinc finger (C2H2-ZF) class of TFs, as their binding sites are often dominated by endogenous retroelements that have highly similar sequences. Here, we present recognition code-assisted discovery of regulatory elements (RCADE) for motif discovery from C2H2-ZF ChIP-seq data. RCADE combines predictions from a DNA recognition code of C2H2-ZFs with ChIP-seq data to identify models that represent the genuine DNA binding preferences of C2H2-ZF proteins. We show that RCADE is able to identify generalizable binding models even from peaks that are exclusively located within the repeat regions of the genome, where state-of-the-art motif finding approaches largely fail. Availability and implementation: RCADE is available as a webserver and also for download at http://rcade.ccbr.utoronto.ca/ . Supplementary information: Supplementary data are available at Bioinformatics online. Contact: t.hughes@utoronto.ca
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 27
    Publication Date: 2015-08-25
    Description: Motivation: Phylogenetic estimates from published studies can be archived using general platforms like Dryad (Vision, 2010) or TreeBASE (Sanderson et al. , 1994). Such services fulfill a crucial role in ensuring transparency and reproducibility in phylogenetic research. However, digital tree data files often require some editing (e.g. rerooting) to improve the accuracy and reusability of the phylogenetic statements. Furthermore, establishing the mapping between tip labels used in a tree and taxa in a single common taxonomy dramatically improves the ability of other researchers to reuse phylogenetic estimates. As the process of curating a published phylogenetic estimate is not error-free, retaining a full record of the provenance of edits to a tree is crucial for openness, allowing editors to receive credit for their work and making errors introduced during curation easier to correct. Results : Here, we report the development of software infrastructure to support the open curation of phylogenetic data by the community of biologists. The backend of the system provides an interface for the standard database operations of creating, reading, updating and deleting records by making commits to a git repository. The record of the history of edits to a tree is preserved by git’s version control features. Hosting this data store on GitHub ( http://github.com/ ) provides open access to the data store using tools familiar to many developers. We have deployed a server running the ‘phylesystem-api’, which wraps the interactions with git and GitHub. The Open Tree of Life project has also developed and deployed a JavaScript application that uses the phylesystem-api and other web services to enable input and curation of published phylogenetic statements. Availability and implementation : Source code for the web service layer is available at https://github.com/OpenTreeOfLife/phylesystem-api . The data store can be cloned from: https://github.com/OpenTreeOfLife/phylesystem . A web application that uses the phylesystem web services is deployed at http://tree.opentreeoflife.org/curator . Code for that tool is available from https://github.com/OpenTreeOfLife/opentree . Contact : mtholder@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 28
    Publication Date: 2015-08-25
    Description: Motivation: Chromatin Immunoprecipitation followed by sequencing (ChIP-seq) detects genome-wide DNA–protein interactions and chromatin modifications, returning enriched regions (ERs), usually associated with a significance score. Moderately significant interactions can correspond to true, weak interactions, or to false positives; replicates of a ChIP-seq experiment can provide co-localised evidence to decide between the two cases. We designed a general methodological framework to rigorously combine the evidence of ERs in ChIP-seq replicates, with the option to set a significance threshold on the repeated evidence and a minimum number of samples bearing this evidence. Results : We applied our method to Myc transcription factor ChIP-seq datasets in K562 cells available in the ENCODE project. Using replicates, we could extend up to 3 times the ER number with respect to single-sample analysis with equivalent significance threshold. We validated the ‘rescued’ ERs by checking for the overlap with open chromatin regions and for the enrichment of the motif that Myc binds with strongest affinity; we compared our results with alternative methods (IDR and jMOSAiCS), obtaining more validated peaks than the former and less peaks than latter, but with a better validation. Availability and implementation : An implementation of the proposed method and its source code under GPLv3 license are freely available at http://www.bioinformatics.deib.polimi.it/MSPC/ and http://mspc.codeplex.com/ , respectively. Contact : marco.morelli@iit.it Supplementary information: Supplementary Material are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 29
    Publication Date: 2015-08-25
    Description: : We announce the release of kSNP3.0, a program for SNP identification and phylogenetic analysis without genome alignment or the requirement for reference genomes. kSNP3.0 is a significantly improved version of kSNP v2. Availability and implementation : kSNP3.0 is implemented as a package of stand-alone executables for Linux and Mac OS X under the open-source BSD license. The executable packages, source code and a full User Guide are freely available at https://sourceforge.net/projects/ksnp/files/ Contact: barryghall@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 30
    Publication Date: 2015-08-25
    Description: Motivation: We have created an R package named phylogeo that provides a set of geographic utilities for sequencing-based microbial ecology studies. Although the geographic location of samples is an important aspect of environmental microbiology, none of the major software packages used in processing microbiome data include utilities that allow users to map and explore the spatial dimension of their data. phylogeo solves this problem by providing a set of plotting and mapping functions that can be used to visualize the geographic distribution of samples, to look at the relatedness of microbiomes using ecological distance, and to map the geographic distribution of particular sequences. By extending the popular phyloseq package and using the same data structures and command formats, phylogeo allows users to easily map and explore the geographic dimensions of their data from the R programming language. Availability and Implementation: phylogeo is documented and freely available http://zachcp.github.io/phylogeo Contact : zcharlop@rockefeller.edu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 31
    Publication Date: 2015-08-25
    Description: : Gener is a development module for programming chemical controllers based on DNA strand displacement. Gener is developed with the aim of providing a simple interface that minimizes the opportunities for programming errors: Gener allows the user to test the computations of the DNA programs based on a simple two-domain strand displacement algebra, the minimal available so far. The tool allows the user to perform stepwise computations with respect to the rules of the algebra as well as exhaustive search of the computation space with different options for exploration and visualization. Gener can be used in combination with existing tools, and in particular, its programs can be exported to Microsoft Research’s DSD tool as well as to LaTeX. Availability and implementation : Gener is available for download at the Cosbi website at http://www.cosbi.eu/research/prototypes/gener as a windows executable that can be run on Mac OS X and Linux by using Mono. Contact : ozan@cosbi.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 32
    Publication Date: 2015-08-25
    Description: Motivation: Molecular dynamics simulations provide atomic insight into the physicochemical characteristics of lipid membranes and hence, a wide range of force field families capable of modelling various lipid types have been developed in recent years. To model membranes in a biologically realistic lipid composition, simulation systems containing multiple different lipids must be assembled. Results: We present a new web service called MemGen that is capable of setting up simulation systems of heterogenous lipid membranes. MemGen is not restricted to certain lipid force fields or lipid types, but instead builds membranes from uploaded structure files which may contain any kind of amphiphilic molecule. MemGen works with any all-atom or united-atom lipid representation. Availability and implementation: MemGen is freely available without registration at http://memgen.uni-goettingen.de . Contact: jhub@gwdg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 33
    Publication Date: 2015-08-21
    Description: A three-step iterative method with fifth-order convergence as a new modification of Newton’s method was presented. This method is for finding multiple roots of nonlinear equation with unknown multiplicity m whose multiplicity m is the highest multiplicity. Its order of convergence is analyzed and proved. Results for some numerical examples show the efficiency of the new method.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 34
    Publication Date: 2015-08-23
    Description: To reduce the energy cost of underwater acoustic sensor networks (UWSNs), the duty cycle (i.e., periodic wake-up and sleep) concept has been used in several medium access control (MAC) protocols. Although these protocols are energy efficient, they sacrifice bandwidth utilization, which leads to lower transmission rate. In order to solve this problem, asynchronous duty cycle with network-coding Asynchronous Duty Cycle with Network-Coding MAC (ADCNC-MAC) is proposed. It contains initialization of the MAC protocol phase and data transmission phase. In the first phase, we use an asynchronous duty cycle to find a rendezvous time for exchanging data. A strategy to select network coder nodes is presented to confirm the number of network coder nodes and distribution in the network coder layer. In the data transmission phase, the network coder nodes transmit using the proposed network-coding-based algorithm and a higher volume of packet will be transmitted to the Sink with the same number of transmissions. Simulation results show that ADCNC-MAC achieves higher power efficiency, improves packet delivery ratio (PDR), and network throughput.
    Print ISSN: 1687-1472
    Electronic ISSN: 1687-1499
    Topics: Electrical Engineering, Measurement and Control Technology , Computer Science
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 35
    Publication Date: 2015-08-25
    Description: Motivation: In practice, identifying and interpreting the functional impacts of the regulatory relationships between micro-RNA and messenger-RNA is non-trivial. The sheer scale of possible micro-RNA and messenger-RNA interactions can make the interpretation of results difficult. Results: We propose a supervised framework, pMim, built upon concepts of significance combination, for jointly ranking regulatory micro-RNA and their potential functional impacts with respect to a condition of interest. Here, pMim directly tests if a micro-RNA is differentially expressed and if its predicted targets, which lie in a common biological pathway, have changed in the opposite direction. We leverage the information within existing micro-RNA target and pathway databases to stabilize the estimation and annotation of micro-RNA regulation making our approach suitable for datasets with small sample sizes. In addition to outputting meaningful and interpretable results, we demonstrate in a variety of datasets that the micro-RNA identified by pMim, in comparison to simpler existing approaches, are also more concordant with what is described in the literature. Availability and implementation: This framework is implemented as an R function, pMim , in the package sydSeq available from http://www.ellispatrick.com/r-packages . Contact: jean.yang@sydney.edu.au Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 36
    Publication Date: 2015-08-25
    Description: Motivation: Cellular mRNA levels originate from the combined action of multiple regulatory processes, which can be recapitulated by the rates of pre-mRNA synthesis, pre-mRNA processing and mRNA degradation. Recent experimental and computational advances set the basis to study these intertwined levels of regulation. Nevertheless, software for the comprehensive quantification of RNA dynamics is still lacking. Results: INSPEcT is an R package for the integrative analysis of RNA- and 4sU-seq data to study the dynamics of transcriptional regulation. INSPEcT provides gene-level quantification of these rates, and a modeling framework to identify which of these regulatory processes are most likely to explain the observed mRNA and pre-mRNA concentrations. Software performance is tested on a synthetic dataset, instrumental to guide the choice of the modeling parameters and the experimental design. Availability and implementation: INSPEcT is submitted to Bioconductor and is currently available as Supplementary Additional File S1 . Contact: mattia.pelizzola@iit.it Supplementary Information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 37
    Publication Date: 2015-08-25
    Description: Motivation: Experimentally determined gene regulatory networks can be enriched by computational inference from high-throughput expression profiles. However, the prediction of regulatory interactions is severely impaired by indirect and spurious effects, particularly for eukaryotes. Recently, published methods report improved predictions by exploiting the a priori known targets of a regulator (its local topology) in addition to expression profiles. Results: We find that methods exploiting known targets show an unexpectedly high rate of false discoveries. This leads to inflated performance estimates and the prediction of an excessive number of new interactions for regulators with many known targets. These issues are hidden from common evaluation and cross-validation setups, which is due to Simpson’s paradox. We suggest a confidence score recalibration method (CoRe) that reduces the false discovery rate and enables a reliable performance estimation. Conclusions: CoRe considerably improves the results of network inference methods that exploit known targets. Predictions then display the biological process specificity of regulators more correctly and enable the inference of accurate genome-wide regulatory networks in eukaryotes. For yeast, we propose a network with more than 22 000 confident interactions. We point out that machine learning approaches outside of the area of network inference may be affected as well. Availability and implementation: Results, executable code and networks are available via our website http://www.bio.ifi.lmu.de/forschung/CoRe . Contact: robert.kueffner@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 38
    Publication Date: 2015-08-25
    Description: Motivation: Stoichiometric and constraint-based methods of computational strain design have become an important tool for rational metabolic engineering. One of those relies on the concept of constrained minimal cut sets (cMCSs). However, as most other techniques, cMCSs may consider only reaction (or gene) knockouts to achieve a desired phenotype. Results : We generalize the cMCSs approach to constrained regulatory MCSs (cRegMCSs), where up/downregulation of reaction rates can be combined along with reaction deletions. We show that flux up/downregulations can virtually be treated as cuts allowing their direct integration into the algorithmic framework of cMCSs. Because of vastly enlarged search spaces in genome-scale networks, we developed strategies to (optionally) preselect suitable candidates for flux regulation and novel algorithmic techniques to further enhance efficiency and speed of cMCSs calculation. We illustrate the cRegMCSs approach by a simple example network and apply it then by identifying strain designs for ethanol production in a genome-scale metabolic model of Escherichia coli. The results clearly show that cRegMCSs combining reaction deletions and flux regulations provide a much larger number of suitable strain designs, many of which are significantly smaller relative to cMCSs involving only knockouts. Furthermore, with cRegMCSs, one may also enable the fine tuning of desired behaviours in a narrower range. The new cRegMCSs approach may thus accelerate the implementation of model-based strain designs for the bio-based production of fuels and chemicals. Availability and implementation: MATLAB code and the examples can be downloaded at http://www.mpi-magdeburg.mpg.de/projects/cna/etcdownloads.html . Contact : krishna.mahadevan@utoronto.ca or klamt@mpi-magdeburg.mpg.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 39
    Publication Date: 2015-08-25
    Description: Motivation: Lipids are a large and diverse group of biological molecules with roles in membrane formation, energy storage and signaling. Cellular lipidomes may contain tens of thousands of structures, a staggering degree of complexity whose significance is not yet fully understood. High-throughput mass spectrometry-based platforms provide a means to study this complexity, but the interpretation of lipidomic data and its integration with prior knowledge of lipid biology suffers from a lack of appropriate tools to manage the data and extract knowledge from it. Results: To facilitate the description and exploration of lipidomic data and its integration with prior biological knowledge, we have developed a knowledge resource for lipids and their biology—SwissLipids. SwissLipids provides curated knowledge of lipid structures and metabolism which is used to generate an in silico library of feasible lipid structures. These are arranged in a hierarchical classification that links mass spectrometry analytical outputs to all possible lipid structures, metabolic reactions and enzymes. SwissLipids provides a reference namespace for lipidomic data publication, data exploration and hypothesis generation. The current version of SwissLipids includes over 244 000 known and theoretically possible lipid structures, over 800 proteins, and curated links to published knowledge from over 620 peer-reviewed publications. We are continually updating the SwissLipids hierarchy with new lipid categories and new expert curated knowledge. Availability: SwissLipids is freely available at http://www.swisslipids.org/ . Contact: alan.bridge@isb-sib.ch Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 40
    Publication Date: 2015-08-25
    Description: Motivation: Both the quantitative real-time polymerase chain reaction (qPCR) and quantitative isothermal amplification (qIA) are standard methods for nucleic acid quantification. Numerous real-time read-out technologies have been developed. Despite the continuous interest in amplification-based techniques, there are only few tools for pre-processing of amplification data. However, a transparent tool for precise control of raw data is indispensable in several scenarios, for example, during the development of new instruments. Results: chipPCR is an R package for the pre-processing and quality analysis of raw data of amplification curves. The package takes advantage of R ’s S 4 object model and offers an extensible environment. chipPCR contains tools for raw data exploration: normalization, baselining, imputation of missing values, a powerful wrapper for amplification curve smoothing and a function to detect the start and end of an amplification curve. The capabilities of the software are enhanced by the implementation of algorithms unavailable in R , such as a 5-point stencil for derivative interpolation. Simulation tools, statistical tests, plots for data quality management, amplification efficiency/quantification cycle calculation, and datasets from qPCR and qIA experiments are part of the package. Core functionalities are integrated in GUIs (web-based and standalone shiny applications), thus streamlining analysis and report generation. Availability and implementation: http://cran.r-project.org/web/packages/chipPCR . Source code: https://github.com/michbur/chipPCR . Contact : stefan.roediger@b-tu.de Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 41
    Publication Date: 2015-08-25
    Description: : A key to understanding RNA function is to uncover its complex 3D structure. Experimental methods used for determining RNA 3D structures are technologically challenging and laborious, which makes the development of computational prediction methods of substantial interest. Previously, we developed the iFoldRNA server that allows accurate prediction of short (〈50 nt) tertiary RNA structures starting from primary sequences. Here, we present a new version of the iFoldRNA server that permits the prediction of tertiary structure of RNAs as long as a few hundred nucleotides. This substantial increase in the server capacity is achieved by utilization of experimental information such as base-pairing and hydroxyl-radical probing. We demonstrate a significant benefit provided by integration of experimental data and computational methods. Availability and implementation: http://ifoldrna.dokhlab.org Contact: dokh@unc.eu
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 42
    Publication Date: 2015-08-25
    Description: : The ms-data-core-api is a free, open-source library for developing computational proteomics tools and pipelines. The Application Programming Interface, written in Java, enables rapid tool creation by providing a robust, pluggable programming interface and common data model. The data model is based on controlled vocabularies/ontologies and captures the whole range of data types included in common proteomics experimental workflows, going from spectra to peptide/protein identifications to quantitative results. The library contains readers for three of the most used Proteomics Standards Initiative standard file formats: mzML, mzIdentML, and mzTab. In addition to mzML, it also supports other common mass spectra data formats: dta, ms2, mgf, pkl, apl (text-based), mzXML and mzData (XML-based). Also, it can be used to read PRIDE XML, the original format used by the PRIDE database, one of the world-leading proteomics resources. Finally, we present a set of algorithms and tools whose implementation illustrates the simplicity of developing applications using the library. Availability and implementation: The software is freely available at https://github.com/PRIDE-Utilities/ms-data-core-api . Supplementary information: Supplementary data are available at Bioinformatics online Contact: juan@ebi.ac.uk
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 43
    Publication Date: 2015-08-25
    Description: : Scanning probe microscopy (SPM) is already a relevant tool in biological research at the nanoscale. We present ‘Flatten plus’, a recent and helpful implementation in the well-known WSxM free software package. ‘Flatten plus’ allows reducing low-frequency noise in SPM images in a semi-automated way preventing the appearance of typical artifacts associated with such filters. Availability and implementation: WSxM is a free software implemented in C++ supported on MS Windows, but it can also be run under Mac or Linux using emulators such as Wine or Parallels. WSxM can be downloaded from http://www.wsxmsolutions.com/ . Contact: ignacio.horcas@wsxmsolutions.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 44
    Publication Date: 2015-08-25
    Description: : Despite the plethora of methods available for the functional analysis of omics data, obtaining comprehensive-yet detailed understanding of the results remains challenging. This is mainly due to the lack of publicly available tools for the visualization of this type of information. Here we present an R package called GOplot, based on ggplot2, for enhanced graphical representation. Our package takes the output of any general enrichment analysis and generates plots at different levels of detail: from a general overview to identify the most enriched categories (bar plot, bubble plot) to a more detailed view displaying different types of information for molecules in a given set of categories (circle plot, chord plot, cluster plot). The package provides a deeper insight into omics data and allows scientists to generate insightful plots with only a few lines of code to easily communicate the findings. Availability and Implementation: The R package GOplot is available via CRAN-The Comprehensive R Archive Network: http://cran.r-project.org/web/packages/GOplot . The shiny web application of the Venn diagram can be found at: https://wwalter.shinyapps.io/Venn/ . A detailed manual of the package with sample figures can be found at https://wencke.github.io/ Contact: fscabo@cnic.es or mricote@cnic.es
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 45
    Publication Date: 2015-08-25
    Description: Motivation: Horizontal transfer of transposable (HTT) elements among eukaryotes was discovered in the mid-1980s. As then, 〉300 new cases have been described. New findings about HTT are revealing the evolutionary impact of this phenomenon on host genomes. In order to provide an up to date, interactive and expandable database for such events, we developed the HTT-DB database. Results: HTT-DB allows easy access to most of HTT cases reported along with rich information about each case. Moreover, it allows the user to generate tables and graphs based on searches using Transposable elements and/or host species classification and export them in several formats. Availability and implementation: This database is freely available on the web at http://lpa.saogabriel.unipampa.edu.br:8080/httdatabase . HTT-DB was developed based on Java and MySQL with all major browsers supported. Tools and software packages used are free for personal or non-profit projects. Contact: bdotto82@gmail.com or gabriel.wallau@gmail.com
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 46
    Publication Date: 2015-08-04
    Description: A novel approach for event summarization and rare event detection is proposed. Unlike conventional methods that deal with event summarization and rare event detection independently, our method solves them in a single framework by transforming them into a graph editing problem. In our approach, a video is represented by a graph, each node of which indicates an event obtained by segmenting the video spatially and temporally. The edges between nodes describe the relationship between events. Based on the degree of relations, edges have different weights. After learning the graph structure, our method finds subgraphs that represent event summarization and rare events in the video by editing the graph, that is, merging its subgraphs or pruning its edges. The graph is edited to minimize a predefined energy model with the Markov Chain Monte Carlo (MCMC) method. The energy model consists of several parameters that represent the causality, frequency, and significance of events. We design a specific energy model that uses these parameters to satisfy each objective of event summarization and rare event detection. The proposed method is extended to obtain event summarization and rare event detection results across multiple videos captured from multiple views. For this purpose, the proposed method independently learns and edits each graph of individual videos for event summarization or rare event detection. Then, the method matches the extracted multiple graphs to each other, and constructs a single composite graph that represents event summarization or rare events from multiple views. Experimental results show that the proposed approach accurately summarizes multiple videos in a fully unsupervised manner . Moreover, the experiments demonstrate that the approach is advantageous in detecting rare transition of events .
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 47
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Object tracking has been one of the most important and active research areas in the field of computer vision. A large number of tracking algorithms have been proposed in recent years with demonstrated success. However, the set of sequences used for evaluation is often not sufficient or is sometimes biased for certain types of algorithms. Many datasets do not have common ground-truth object positions or extents, and this makes comparisons among the reported quantitative results difficult. In addition, the initial conditions or parameters of the evaluated tracking algorithms are not the same, and thus, the quantitative results reported in literature are incomparable or sometimes contradictory. To address these issues, we carry out an extensive evaluation of the state-of-the-art online object-tracking algorithms with various evaluation criteria to understand how these methods perform within the same framework. In this work, we first construct a large dataset with ground-truth object positions and extents for tracking and introduce the sequence attributes for the performance analysis. Second, we integrate most of the publicly available trackers into one code library with uniform input and output formats to facilitate large-scale performance evaluation. Third, we extensively evaluate the performance of 31 algorithms on 100 sequences with different initialization settings. By analyzing the quantitative results, we identify effective approaches for robust tracking and provide potential future research directions in this field.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 48
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Fused Lasso is a popular regression technique that encodes the smoothness of the data. It has been applied successfully to many applications with a smooth feature structure. However, the computational cost of the existing solvers for fused Lasso is prohibitive when the feature dimension is extremely large. In this paper, we propose novel screening rules that are able to quickly identity the adjacent features with the same coefficients. As a result, the number of variables to be estimated can be significantly reduced, leading to substantial savings in computational cost and memory usage. To the best of our knowledge, the proposed approach is the first attempt to develop screening methods for the fused Lasso problem with general data matrix. Our major contributions are: 1) we derive a new dual formulation of fused Lasso that comes with several desirable properties; 2) we show that the new dual formulation of fused Lasso is equivalent to that of the standard Lasso by two affine transformations; 3) we propose a novel framework for developing effective and efficient screening rules for f used La sso via the m onotonicity of the s ubdifferentials (FLAMS). Some appealing features of FLAMS are: 1) our methods are safe in the sense that the detected adjacent features are guaranteed to have the same coefficients; 2) the dataset needs to be scanned only once to run the screening, whose computational cost is negligible compared to that of solving the fused Lasso; (3) FLAMS is independent of the solvers and can be integrated with any existing solvers. We have evaluated the proposed FLAMS rules on both synthetic and real datasets. The experiments indicate that FLAMS is very effective in identifying the adjacent features with the same coefficients. The speedup gained by FLAMS can be orders of magnitude.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 49
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Hidden conditional random fields (HCRFs) are discriminative latent variable models which have been shown to successfully learn the hidden structure of a given classification problem. An Infinite hidden conditional random field is a hidden conditional random field with a countably infinite number of hidden states, which rids us not only of the necessity to specify a priori a fixed number of hidden states available but also of the problem of overfitting. Markov chain Monte Carlo (MCMC) sampling algorithms are often employed for inference in such models. However, convergence of such algorithms is rather difficult to verify, and as the complexity of the task at hand increases the computational cost of such algorithms often becomes prohibitive. These limitations can be overcome by variational techniques. In this paper, we present a generalized framework for infinite HCRF models, and a novel variational inference approach on a model based on coupled Dirichlet Process Mixtures, the HCRF-DPM. We show that the variational HCRF-DPM is able to converge to a correct number of represented hidden states, and performs as well as the best parametric HCRFs—chosen via cross-validation—for the difficult tasks of recognizing instances of agreement, disagreement, and pain in audiovisual sequences.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 50
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: In this paper, we address the challenging problem of detecting pedestrians who appear in groups. A new approach is proposed for single-pedestrian detection aided by two-pedestrian detection. A mixture model of two-pedestrian detectors is designed to capture the unique visual cues which are formed by nearby pedestrians but cannot be captured by single-pedestrian detectors. A probabilistic framework is proposed to model the relationship between the configurations estimated by single- and two-pedestrian detectors, and to refine the single-pedestrian detection result using two-pedestrian detection. The two-pedestrian detector can integrate with any single-pedestrian detector. Twenty-five state-of-the-art single-pedestrian detection approaches are combined with the two-pedestrian detector on three widely used public datasets: Caltech, TUD-Brussels, and ETH. Experimental results show that our framework improves all these approaches. The average improvement is $9$ percent on the Caltech-Test dataset, $11$ percent on the TUD-Brussels dataset and $17$ percent on the ETH dataset in terms of average miss rate. The lowest average miss rate is reduced from $37$ to percent on the Caltech-Test dataset, from $55$ to $50$ percent on the TUD-Brussels dataset and from $43$ to $38$ percent on the ETH dataset.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 51
    Publication Date: 2015-08-07
    Description: The possibility of having information access anytime and anywhere has caused a huge increase of the popularity of wireless networks. Requirements of users and owners have been ever-increasing. However, concerns about the potential health impact of exposure to radio frequency (RF) sources have arisen and are getting accounted for in wireless network planning. In addition to adequate coverage and reduced human exposure, the installation cost of the wireless network is also an important criterion in the planning process. In this paper, a hybrid algorithm is used to optimize indoor wireless network planning while satisfying three demands: maximum coverage, minimal full installation cost (cabling, cable gutters, drilling holes, labor, etc.), and minimal human exposure. For the first time, wireless indoor networks are being optimized based on these advanced and realistic conditions. The algorithm is investigated for three scenarios and for different configurations. The impact of different exposure requirements and cost scenarios is assessed.
    Print ISSN: 1687-1472
    Electronic ISSN: 1687-1499
    Topics: Electrical Engineering, Measurement and Control Technology , Computer Science
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 52
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: The Regression Network plugin for Cytoscape ( RegNetC ) implements the RegNet algorithm for the inference of transcriptional association network from gene expression profiles. This algorithm is a model tree-based method to detect the relationship between each gene and the remaining genes simultaneously instead of analyzing individually each pair of genes as correlation-based methods do. Model trees are a very useful technique to estimate the gene expression value by regression models and favours localized similarities over more global similarity, which is one of the major drawbacks of correlation-based methods. Here, we present an integrated software suite, named RegNetC , as a Cytoscape plugin that can operate on its own as well. RegNetC facilitates, according to user-defined parameters, the resulted transcriptional gene association network in .sif format for visualization, analysis and interoperates with other Cytoscape plugins, which can be exported for publication figures. In addition to the network, the RegNetC plugin also provides the quantitative relationships between genes expression values of those genes involved in the inferred network, i.e., those defined by the regression models.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 53
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: We introduce a new method for normalization of data acquired by liquid chromatography coupled with mass spectrometry (LC-MS) in label-free differential expression analysis. Normalization of LC-MS data is desired prior to subsequent statistical analysis to adjust variabilities in ion intensities that are not caused by biological differences but experimental bias. There are different sources of bias including variabilities during sample collection and sample storage, poor experimental design, noise, etc. In addition, instrument variability in experiments involving a large number of LC-MS runs leads to a significant drift in intensity measurements. Although various methods have been proposed for normalization of LC-MS data, there is no universally applicable approach. In this paper, we propose a Bayesian normalization model (BNM) that utilizes scan-level information from LC-MS data. Specifically, the proposed method uses peak shapes to model the scan-level data acquired from extracted ion chromatograms (EIC) with parameters considered as a linear mixed effects model. We extended the model into BNM with drift (BNMD) to compensate for the variability in intensity measurements due to long LC-MS runs. We evaluated the performance of our method using synthetic and experimental data. In comparison with several existing methods, the proposed BNM and BNMD yielded significant improvement.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 54
    Publication Date: 2015-08-07
    Description: Performing clustering analysis is one of the important research topics in cancer discovery using gene expression profiles, which is crucial in facilitating the successful diagnosis and treatment of cancer. While there are quite a number of research works which perform tumor clustering, few of them considers how to incorporate fuzzy theory together with an optimization process into a consensus clustering framework to improve the performance of clustering analysis. In this paper, we first propose a random double clustering based cluster ensemble framework (RDCCE) to perform tumor clustering based on gene expression data. Specifically, RDCCE generates a set of representative features using a randomly selected clustering algorithm in the ensemble, and then assigns samples to their corresponding clusters based on the grouping results. In addition, we also introduce the random double clustering based fuzzy cluster ensemble framework (RDCFCE), which is designed to improve the performance of RDCCE by integrating the newly proposed fuzzy extension model into the ensemble framework. RDCFCE adopts the normalized cut algorithm as the consensus function to summarize the fuzzy matrices generated by the fuzzy extension models, partition the consensus matrix, and obtain the final result. Finally, adaptive RDCFCE (A-RDCFCE) is proposed to optimize RDCFCE and improve the performance of RDCFCE further by adopting a self-evolutionary process (SEPP) for the parameter set. Experiments on real cancer gene expression profiles indicate that RDCFCE and A-RDCFCE works well on these data sets, and outperform most of the state-of-the-art tumor clustering algorithms.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 55
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: Named-entity recognition (NER) plays an important role in the development of biomedical databases. However, the existing NER tools produce multifarious named-entities which may result in both curatable and non-curatable markers. To facilitate biocuration with a straightforward approach, classifying curatable named-entities is helpful with regard to accelerating the biocuration workflow. Co-occurrence Interaction Nexus with Named-entity Recognition (CoINNER) is a web-based tool that allows users to identify genes, chemicals, diseases, and action term mentions in the Comparative Toxicogenomic Database (CTD). To further discover interactions, CoINNER uses multiple advanced algorithms to recognize the mentions in the BioCreative IV CTD Track. CoINNER is developed based on a prototype system that annotated gene, chemical, and disease mentions in PubMed abstracts at BioCreative 2012 Track I (literature triage). We extended our previous system in developing CoINNER. The pre-tagging results of CoINNER were developed based on the state-of-the-art named entity recognition tools in BioCreative III. Next, a method based on conditional random fields (CRFs) is proposed to predict chemical and disease mentions in the articles. Finally, action term mentions were collected by latent Dirichlet allocation (LDA). At the BioCreative IV CTD Track, the best F-measures reached for gene/protein, chemical/drug and disease NER were 54 percent while CoINNER achieved a 61.5 percent F-measure. System URL: http://ikmbio.csie.ncku.edu.tw/coinner/introduction.htm.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 56
    Publication Date: 2015-08-07
    Description: Next-generation short-read sequencing is widely utilized in genomic studies. Biological applications require an alignment step to map sequencing reads to the reference genome, before acquiring expected genomic information. This requirement makes alignment accuracy a key factor for effective biological interpretation. Normally, when accounting for measurement errors and single nucleotide polymorphisms, short read mappings with a few mismatches are generally considered acceptable. However, to further improve the efficiency of short-read sequencing alignment, we propose a method to retrieve additional reliably aligned reads (reads with more than a pre-defined number of mismatches), using a Bayesian-based approach. In this method, we first retrieve the sequence context around the mismatched nucleotides within the already aligned reads; these loci contain the genomic features where sequencing errors occur. Then, using the derived pattern, we evaluate the remaining (typically discarded) reads with more than the allowed number of mismatches, and calculate a score that represents the probability that a specific alignment is correct. This strategy allows the extraction of more reliably aligned reads, therefore improving alignment sensitivity. Implementation: The source code of our tool, ResSeq, can be downloaded from: https://github.com/hrbeubiocenter/Resseq.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 57
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: In genome assembly graphs, motifs such as tips, bubbles, and cross links are studied in order to find sequencing errors and to understand the nature of the genome. Superbubble, a complex generalization of bubbles, was recently proposed as an important subgraph class for analyzing assembly graphs. At present, a quadratic time algorithm is known. This paper gives an -time algorithm to solve this problem for a graph with $m$ edges.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 58
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: The papers in this special section focus on software and databases that are central in bioinformatics and computational biology.. These programs are playing more and more important roles in biology and medical research. These papers cover a broad range of topics, including computational genomics and transcriptomics, analysis of biological networks and interactions, drug design, biomedical signal/image analysis, biomedical text mining and ontologies, biological data mining, visualization and integration, and high performance computing application in bioinformatics.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 59
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-07
    Description: In the computational biology community, machine learning algorithms are key instruments for many applications, including the prediction of gene-functions based upon the available biomolecular annotations. Additionally, they may also be employed to compute similarity between genes or proteins. Here, we describe and discuss a software suite we developed to implement and make publicly available some of such prediction methods and a computational technique based upon Latent Semantic Indexing (LSI), which leverages both inferred and available annotations to search for semantically similar genes. The suite consists of three components. BioAnnotationPredictor is a computational software module to predict new gene-functions based upon Singular Value Decomposition of available annotations. SimilBio is a Web module that leverages annotations available or predicted by BioAnnotationPredictor to discover similarities between genes via LSI. The suite includes also SemSim , a new Web service built upon these modules to allow accessing them programmatically. We integrated SemSim in the Bio Search Computing framework (http://www.bioinformatics.deib.polimi.it/bio-seco/seco/), where users can exploit the Search Computing technology to run multi-topic complex queries on multiple integrated Web services. Accordingly, researchers may obtain ranked answers involving the computation of the functional similarity between genes in support of biomedical knowledge discovery.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 60
    Publication Date: 2015-08-07
    Description: Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for personali- ed cancer therapy.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 61
    Publication Date: 2015-08-08
    Description: Physical transceiver implementations for wireless communication systems usually suffer from transmit-radio frequency (Tx-RF) and receiver-RF (Rx-RF) impairments. In this paper, we aim to design efficient coordinated beamforming for multicell multiuser multi-antenna systems by fully taking into account the residual transceiver impairments. Our design objectives include both spectral efficiency and energy efficiency. In particular, we first derive the closed-form expression of the mean square error (MSE) which includes the impact of transceiver impairments. Based on that, we propose an alternating optimization algorithm to solve the coordinated multicell beamforming problems with the goal of minimizing the worst user MSE, and the sum MSE. Then, by exploiting the relationship between the minimum mean square error (MMSE) and the achievable rate, we develop a new algorithm to address the sum rate maximization problem. This approach is further generalized to solve the more intractable energy efficiency optimization problem. We prove that all the proposed iterative algorithms guarantee to converge to a stationary point. Numerical results show that our proposed schemes achieve a better performance than conventional coordinated beamforming algorithms that were designed ignoring the transceiver impairments.
    Print ISSN: 1687-1472
    Electronic ISSN: 1687-1499
    Topics: Electrical Engineering, Measurement and Control Technology , Computer Science
    Published by Springer
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 62
    Publication Date: 2015-08-08
    Description: Motivation : The majority of variation identified by genome wide association studies falls in non-coding genomic regions and is hypothesized to impact regulatory elements that modulate gene expression. Here we present a statistically rigorous software tool GREGOR (Genomic Regulatory Elements and Gwas Overlap algoRithm) for evaluating enrichment of any set of genetic variants with any set of regulatory features. Using variants from five phenotypes, we describe a data-driven approach to determine the tissue and cell types most relevant to a trait of interest and to identify the subset of regulatory features likely impacted by these variants. Last, we experimentally evaluate six predicted functional variants at six lipid-associated loci and demonstrate significant evidence for allele-specific impact on expression levels. GREGOR systematically evaluates enrichment of genetic variation with the vast collection of regulatory data available to explore novel biological mechanisms of disease and guide us toward the functional variant at trait-associated loci. Availability and implementation : GREGOR, including source code, documentation, examples, and executables, is available at http://genome.sph.umich.edu/wiki/GREGOR . Contact : cristen@umich.edu Supplementary information : Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 63
    Publication Date: 2015-08-08
    Description: Motivation: Genome and transcriptome analyses can be used to explore cancers comprehensively, and it is increasingly common to have multiple omics data measured from each individual. Furthermore, there are rich functional data such as predicted impact of mutations on protein coding and gene/protein networks. However, integration of the complex information across the different omics and functional data is still challenging. Clinical validation, particularly based on patient outcomes such as survival, is important for assessing the relevance of the integrated information and for comparing different procedures. Results: An analysis pipeline is built for integrating genomic and transcriptomic alterations from whole-exome and RNA sequence data and functional data from protein function prediction and gene interaction networks. The method accumulates evidence for the functional implications of mutated potential driver genes found within and across patients. A driver-gene score (DGscore) is developed to capture the cumulative effect of such genes. To contribute to the score, a gene has to be frequently mutated, with high or moderate mutational impact at protein level, exhibiting an extreme expression and functionally linked to many differentially expressed neighbors in the functional gene network. The pipeline is applied to 60 matched tumor and normal samples of the same patient from The Cancer Genome Atlas breast-cancer project. In clinical validation, patients with high DGscores have worse survival than those with low scores ( P = 0.001). Furthermore, the DGscore outperforms the established expression-based signatures MammaPrint and PAM50 in predicting patient survival. In conclusion, integration of mutation, expression and functional data allows identification of clinically relevant potential driver genes in cancer. Availability and implementation: The documented pipeline including annotated sample scripts can be found in http://fafner.meb.ki.se/biostatwiki/driver-genes/ . Contact: yudi.pawitan@ki.se Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 64
    Publication Date: 2015-08-08
    Description: Motivation: With improvements in next-generation sequencing technologies and reductions in price, ordered RNA-seq experiments are becoming common. Of primary interest in these experiments is identifying genes that are changing over time or space, for example, and then characterizing the specific expression changes. A number of robust statistical methods are available to identify genes showing differential expression among multiple conditions, but most assume conditions are exchangeable and thereby sacrifice power and precision when applied to ordered data. Results: We propose an empirical Bayes mixture modeling approach called EBSeq-HMM. In EBSeq-HMM, an auto-regressive hidden Markov model is implemented to accommodate dependence in gene expression across ordered conditions. As demonstrated in simulation and case studies, the output proves useful in identifying differentially expressed genes and in specifying gene-specific expression paths. EBSeq-HMM may also be used for inference regarding isoform expression. Availability and implementation: An R package containing examples and sample datasets is available at Bioconductor. Contact: kendzior@biostat.wisc.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 65
    facet.materialart.
    Unknown
    Oxford University Press
    Publication Date: 2015-08-08
    Description: Motivation: Sequence discovery tools play a central role in several fields of computational biology. In the framework of Transcription Factor binding studies, most of the existing motif finding algorithms are computationally demanding, and they may not be able to support the increasingly large datasets produced by modern high-throughput sequencing technologies. Results: We present FastMotif, a new motif discovery algorithm that is built on a recent machine learning technique referred to as Method of Moments. Based on spectral decompositions, our method is robust to model misspecifications and is not prone to locally optimal solutions. We obtain an algorithm that is extremely fast and designed for the analysis of big sequencing data. On HT-Selex data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm’s robustness and discuss its sensitivity with respect to the free parameters. Availability and implementation: The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics . Contact: vlassis@adobe.com Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 66
    Publication Date: 2015-08-08
    Description: Motivation: Next-generation high-throughput sequencing has become a state-of-the-art technique in genome assembly. Scaffolding is one of the main stages of the assembly pipeline. During this stage, contigs assembled from the paired-end reads are merged into bigger chains called scaffolds. Because of a high level of statistical noise, chimeric reads, and genome repeats the problem of scaffolding is a challenging task. Current scaffolding software packages widely vary in their quality and are highly dependent on the read data quality and genome complexity. There are no clear winners and multiple opportunities for further improvements of the tools still exist. Results: This article presents an efficient scaffolding algorithm ScaffMatch that is able to handle reads with both short (〈600 bp) and long (〉35 000 bp) insert sizes producing high-quality scaffolds. We evaluate our scaffolding tool with the F score and other metrics (N50, corrected N50) on eight datasets comparing it with the most available packages. Our experiments show that ScaffMatch is the tool of preference for the most datasets. Availability and implementation: The source code is available at http://alan.cs.gsu.edu/NGS/?q=content/scaffmatch . Contact: mandric@cs.gsu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 67
    Publication Date: 2015-08-08
    Description: Motivation: Identifying protein subchloroplast localization in chloroplast organelle is very helpful for understanding the function of chloroplast proteins. There have existed a few computational prediction methods for protein subchloroplast localization. However, these existing works have ignored proteins with multiple subchloroplast locations when constructing prediction models, so that they can predict only one of all subchloroplast locations of this kind of multilabel proteins. Results: To address this problem, through utilizing label-specific features and label correlations simultaneously, a novel multilabel classifier was developed for predicting protein subchloroplast location(s) with both single and multiple location sites. As an initial study, the overall accuracy of our proposed algorithm reaches 55.52%, which is quite high to be able to become a promising tool for further studies. Availability and implementation: An online web server for our proposed algorithm named MultiP-SChlo was developed, which are freely accessible at http://biomed.zzuli.edu.cn/bioinfo/multip-schlo/ . Contact: pandaxiaoxi@gmail.com or gzli@tongji.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 68
    Publication Date: 2015-08-08
    Description: Motivation: Loops in proteins are often involved in biochemical functions. Their irregularity and flexibility make experimental structure determination and computational modeling challenging. Most current loop modeling methods focus on modeling single loops. In protein structure prediction, multiple loops often need to be modeled simultaneously. As interactions among loops in spatial proximity can be rather complex, sampling the conformations of multiple interacting loops is a challenging task. Results: In this study, we report a new method called m ulti-loop Di stance-guided S equential chain- Gro wth Monte Carlo ( M -D i SG ro ) for prediction of the conformations of multiple interacting loops in proteins. Our method achieves an average RMSD of 1.93 Å for lowest energy conformations of 36 pairs of interacting protein loops with the total length ranging from 12 to 24 residues. We further constructed a data set containing proteins with 2, 3 and 4 interacting loops. For the most challenging target proteins with four loops, the average RMSD of the lowest energy conformations is 2.35 Å. Our method is also tested for predicting multiple loops in β-barrel membrane proteins. For outer-membrane protein G, the lowest energy conformation has a RMSD of 2.62 Å for the three extracellular interacting loops with a total length of 34 residues (12, 12 and 10 residues in each loop). Availability and implementation : The software is freely available at: tanto.bioe.uic.edu/m-DiSGro. Contact: jinfeng@stat.fsu.edu or jliang@uic.edu Supplementary information : Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 69
    Publication Date: 2015-08-08
    Description: Motivation: Network comparison is a computationally intractable problem with important applications in systems biology and other domains. A key challenge is to properly quantify similarity between wiring patterns of two networks in an alignment-free fashion. Also, alignment-based methods exist that aim to identify an actual node mapping between networks and as such serve a different purpose. Various alignment-free methods that use different global network properties (e.g. degree distribution) have been proposed. Methods based on small local subgraphs called graphlets perform the best in the alignment-free network comparison task, due to high level of topological detail that graphlets can capture. Among different graphlet-based methods, Graphlet Correlation Distance (GCD) was shown to be the most accurate for comparing networks. Recently, a new graphlet-based method called NetDis was proposed, which was claimed to be superior. We argue against this, as the performance of NetDis was not properly evaluated to position it correctly among the other alignment-free methods. Results : We evaluate the performance of available alignment-free network comparison methods, including GCD and NetDis. We do this by measuring accuracy of each method (in a systematic precision-recall framework) in terms of how well the method can group (cluster) topologically similar networks. By testing this on both synthetic and real-world networks from different domains, we show that GCD remains the most accurate, noise-tolerant and computationally efficient alignment-free method. That is, we show that NetDis does not outperform the other methods, as originally claimed, while it is also computationally more expensive. Furthermore, since NetDis is dependent on the choice of a network null model (unlike the other graphlet-based methods), we show that its performance is highly sensitive to the choice of this parameter. Finally, we find that its performance is not independent on network sizes and densities, as originally claimed. Contact : natasha@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 70
    Publication Date: 2015-08-08
    Description: Motivation: The functional impact of small molecules is increasingly being assessed in different eukaryotic species through large-scale phenotypic screening initiatives. Identifying the targets of these molecules is crucial to mechanistically understand their function and uncover new therapeutically relevant modes of action. However, despite extensive work carried out in model organisms and human, it is still unclear to what extent one can use information obtained in one species to make predictions in other species. Results: Here, for the first time, we explore and validate at a large scale the use of protein homology relationships to predict the targets of small molecules across different species. Our results show that exploiting target homology can significantly improve the predictions, especially for molecules experimentally tested in other species. Interestingly, when considering separately orthology and paralogy relationships, we observe that mapping small molecule interactions among orthologs improves prediction accuracy, while including paralogs does not improve and even sometimes worsens the prediction accuracy. Overall, our results provide a novel approach to integrate chemical screening results across multiple species and highlight the promises and remaining challenges of using protein homology for small molecule target identification. Availability and implementation: Homology-based predictions can be tested on our website http://www.swisstargetprediction.ch . Contact: david.gfeller@unil.ch or vincent.zoete@isb-sib.ch . Supplementary information: Supplementary data are available at Bioinformatics online.
    Print ISSN: 1367-4803
    Electronic ISSN: 1460-2059
    Topics: Biology , Computer Science , Medicine
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 71
    Publication Date: 2015-06-06
    Description: Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both computer visual object and action recognition tasks. While the sparse, interest-point based approach to recognition is not inconsistent with visual processing in biological systems that operate in ‘saccade and fixate’ regimes, the methodology and emphasis in the human and the computer vision communities remains sharply distinct. Here, we make three contributions aiming to bridge this gap. First, we complement existing state-of-the art large scale dynamic computer vision annotated datasets like Hollywood-2  [1] and UCF Sports  [2] with human eye movements collected under the ecological constraints of visual action and scene context recognition tasks. To our knowledge these are the first large human eye tracking datasets to be collected and made publicly available for video, vision.imar.ro/eyetracking (497,107 frames, each viewed by 19 subjects), unique in terms of their (a) large scale and computer vision relevance, (b) dynamic, video stimuli, (c) task control, as well as free-viewing . Second, we introduce novel dynamic consistency and alignment measures , which underline the remarkable stability of patterns of visual search among subjects. Third, we leverage the significant amount of collected data in order to pursue studies and build automatic, end-to-end trainable computer vision systems based on human eye movements. Our studies not only shed light on the differences between computer vision spatio-temporal interest point image sampling strategies and the human fixations, as well as their impact for visual recognition performance, but also demonstrate that human fixations can be accurately predicted, and when used in an end-to-end automatic system, leveraging some of the advanced computer vision practice, can lead to state of the art results.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 72
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: We consider the problem of parameter estimation and energy minimization for a region-based semantic segmentation model. The model divides the pixels of an image into non-overlapping connected regions, each of which is to a semantic class. In the context of energy minimization, the main problem we face is the large number of putative pixel-to-region assignments. We address this problem by designing an accurate linear programming based approach for selecting the best set of regions from a large dictionary. The dictionary is constructed by merging and intersecting segments obtained from multiple bottom-up over-segmentations. The linear program is solved efficiently using dual decomposition. In the context of parameter estimation, the main problem we face is the lack of fully supervised data. We address this issue by developing a principled framework for parameter estimation using diverse data. More precisely, we propose a latent structural support vector machine formulation, where the latent variables model any missing information in the human annotation. Of particular interest to us are three types of annotations: (i) images segmented using generic foreground or background classes; (ii) images with bounding boxes specified for objects; and (iii) images labeled to indicate the presence of a class. Using large, publicly available datasets we show that our methods are able to significantly improve the accuracy of the region-based model.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 73
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: In this paper we address the problem of finding the most probable state of a discrete Markov random field (MRF), also known as the MRF energy minimization problem. The task is known to be NP-hard in general and its practical importance motivates numerous approximate algorithms. We propose a submodular relaxation approach (SMR) based on a Lagrangian relaxation of the initial problem. Unlike the dual decomposition approach of Komodakis et al. [29] SMR does not decompose the graph structure of the initial problem but constructs a submodular energy that is minimized within the Lagrangian relaxation. Our approach is applicable to both pairwise and high-order MRFs and allows to take into account global potentials of certain types. We study theoretical properties of the proposed approach and evaluate it experimentally.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 74
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: Higher-order Markov Random Fields, which can capture important properties of natural images, have become increasingly important in computer vision. While graph cuts work well for first-order MRF’s, until recently they have rarely been effective for higher-order MRF’s. Ishikawa’s graph cut technique [1] , [2] shows great promise for many higher-order MRF’s. His method transforms an arbitrary higher-order MRF with binary labels into a first-order one with the same minima. If all the terms are submodular the exact solution can be easily found; otherwise, pseudoboolean optimization techniques can produce an optimal labeling for a subset of the variables. We present a new transformation with better performance than [1] , [2] , both theoretically and experimentally. While [1] , [2] transforms each higher-order term independently, we use the underlying hypergraph structure of the MRF to transform a group of terms at once. For $n$ binary variables, each of which appears in terms with $k$ other variables, at worst we produce $n$ non-submodular terms, while [1] , [2] produces $O(- k)$ . We identify a local completeness property under which our method perform even better, and show that under certain assumptions several important vision problems (including common variants of fusion moves) have this property. We show experimentally that our method produces smaller weight of non-submodular edges, and that this metric is directly related to the effectiveness of QPBO [3] . Running on the same field of experts dataset used in [1] , [2] we optimally label significantly more variables (96 versus 80 percent) and converge more rapidly to a lower energy. Preliminary experiments suggest that some other higher-order MRF’s used in stereo [4] and segmentation [5] are also locally complete and would thus benefit from our work.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 75
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: We introduce a conceptually novel structured prediction model, GPstruct , which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M $^3$ N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 76
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: The papers in this special issue contain extended versions of works that were originally presented at the Brazilian Symposium on Bioinformatics 2013 (BSB 2013), held in Recife, Brazil, November 3-6, 2013.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 77
    Publication Date: 2015-06-06
    Description: Computational methods for predicting protein-protein interactions are important tools that can complement high-throughput technologies and guide biologists in designing new laboratory experiments. The proteins and the interactions between them can be described by a network which is characterized by several topological properties. Information about proteins and interactions between them, in combination with knowledge about topological properties of the network, can be used for developing computational methods that can accurately predict unknown protein-protein interactions. This paper presents a supervised learning framework based on Bayesian inference for combining two types of information: i) network topology information, and ii) information related to proteins and the interactions between them. The motivation of our model is that by combining these two types of information one can achieve a better accuracy in predicting protein-protein interactions, than by using models constructed from these two types of information independently.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 78
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: We develop a theory of algebraic operations over linear and context-free grammars that makes it possible to combine simple “atomic” grammars operating on single sequences into complex, multi-dimensional grammars. We demonstrate the utility of this framework by constructing the search spaces of complex alignment problems on multiple input sequences explicitly as algebraic expressions of very simple one-dimensional grammars. In particular, we provide a fully worked frameshift-aware, semiglobal DNA-protein alignment algorithm whose grammar is composed of products of small, atomic grammars. The compiler accompanying our theory makes it easy to experiment with the combination of multiple grammars and different operations. Composite grammars can be written out in $ {rm L}^AT_{E}X$ for documentation and as a guide to implementation of dynamic programming algorithms. An embedding in Haskell as a domain-specific language makes the theory directly accessible to writing and using grammar products without the detour of an external compiler. Software and supplemental files available here: http://www.bioinf.uni-leipzig.de/Software/gramprod/
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 79
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: Recent advancements in genomics and proteomics provide a solid foundation for understanding the pathogenesis of diabetes. Proteomics of diabetes associated pathways help to identify the most potent target for the management of diabetes. The relevant datasets are scattered in various prominent sources which takes much time to select the therapeutic target for the clinical management of diabetes. However, additional information about target proteins is needed for validation. This lacuna may be resolved by linking diabetes associated genes, pathways and proteins and it will provide a strong base for the treatment and planning management strategies of diabetes. Thus, a web source “Diabetes Associated Proteins Database (DAPD)” has been developed to link the diabetes associated genes, pathways and proteins using PHP, MySQL. The current version of DAPD has been built with proteins associated with different types of diabetes. In addition, DAPD has been linked to external sources to gain the access to more participatory proteins and their pathway network. DAPD will reduce the time and it is expected to pave the way for the discovery of novel anti-diabetic leads using computational drug designing for diabetes management. DAPD is open accessed via following url www.mkarthikeyan.bioinfoau.org/dapd.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 80
    Publication Date: 2015-06-06
    Description: Determining the glycan topology automatically from mass spectra represents a great challenge. Existing methods fall into approximate and exact ones. The former including greedy and heuristic ones can reduce the computational complexity, but suffer from information lost in the procedure of glycan interpretation. The latter including dynamic programming and exhaustive enumeration are much slower than the former. In the past years, nearly all emerging methods adopted a tree structure to represent a glycan. They share such problems as repetitive peak counting in reconstructing a candidate structure. Besides, tree-based glycan representation methods often have to give different computational formulas for binary and ternary glycans. We propose a new directed acyclic graph structure for glycan representation. Based on it, this work develops a de novo algorithm to accurately reconstruct the tree structure iteratively from mass spectra with logical constraints and some known biosynthesis rules, by a single computational formula. The experiments on multiple complex glycans extracted from human serum show that the proposed algorithm can achieve higher accuracy to determine a glycan topology than prior methods without increasing computational burden.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 81
    Publication Date: 2015-06-06
    Description: A crucial step in understanding the architecture of cells and tissues from microscopy images, and consequently explain important biological events such as wound healing and cancer metastases, is the complete extraction and enumeration of individual filaments from the cellular cytoskeletal network. Current efforts at quantitative estimation of filament length distribution, architecture and orientation from microscopy images are predominantly limited to visual estimation and indirect experimental inference. Here we demonstrate the application of a new algorithm to reliably estimate centerlines of biological filament bundles and extract individual filaments from the centerlines by systematically disambiguating filament intersections. We utilize a filament enhancement step followed by reverse diffusion based filament localization and an integer programming based set combination to systematically extract accurate filaments automatically from microscopy images. Experiments on simulated and real confocal microscope images of flat cells (2D images) show efficacy of the new method.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 82
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: The Local/Global Alignment (Zemla, 2003), or LGA, is a popular method for the comparison of protein structures. One of the two components of LGA requires us to compute the longest common contiguous segments between two protein structures. That is, given two structures $A=(a_1, ldots , a_n)$ and $B=(b_1, ldots , b_n)$ where $a_k$ , $b_kin mathbb {R}^3$ , we are to find, among all the segments $f=(a_i,ldots ,a_j)$ and $g=(b_i,ldots ,b_j)$ that fulfill a certain criterion regarding their similarity, those of the maximum length. We consider the following criteria: (1) the root mean squared deviation (RMSD) between $f$ and $g$ is to be within a given $tin mathbb {R}$ ; (2) $f$ and $g$ can be superposed such that for each $k$ , $ile kle j$ , $Vert a_k-b_kVert le t$ for a given $tin mathbb {R}$ . We give an algorithm of $O(n;log; n+n{{boldsymbol l}})$ time complexity when the first requirement applies, where ${{boldsymbol l}}$ is the maximum length of the segments fulfilling the criterion. We show an FPTAS which, for any
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 83
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-06-06
    Description: Noise can induce various dynamical behaviors in nonlinear systems. White noise perturbed systems have been extensively investigated during the last decades. In gene networks, experimentally observed extrinsic noise is colored. As an attempt, we investigate the genetic toggle switch systems perturbed by colored extrinsic noise and with kinetic parameters. Compared with white noise perturbed systems, we show there also exists optimal colored noise strength to induce the best stochastic switch behaviors in the single toggle switch, and the best synchronized switching in the networked systems, which demonstrate that noise-induced optimal switch behaviors are widely in existence. Moreover, under a wide range of system parameter regions, we find there exist wider ranges of white and colored noises strengths to induce good switch and synchronization behaviors, respectively; therefore, white noise is beneficial for switch and colored noise is beneficial for population synchronization. Our observations are very robust to extrinsic stimulus strength, cell density, and diffusion rate. Finally, based on the Waddington’s epigenetic landscape and the Wiener-Khintchine theorem, physical mechanisms underlying the observations are interpreted. Our investigations can provide guidelines for experimental design, and have potential clinical implications in gene therapy and synthetic biology.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 84
    Publication Date: 2015-06-06
    Description: Revealing the underlying evolutionary mechanism plays an important role in understanding protein interaction networks in the cell. While many evolutionary models have been proposed, the problem about applying these models to real network data, especially for differentiating which model can better describe evolutionary process for the observed network remains a challenge. The traditional way is to use a model with presumed parameters to generate a network, and then evaluate the fitness by summary statistics, which however cannot capture the complete network structures information and estimate parameter distribution. In this work, we developed a novel method based on Approximate Bayesian Computation and modified Differential Evolution algorithm (ABC-DEP) that is capable of conducting model selection and parameter estimation simultaneously and detecting the underlying evolutionary mechanisms for PPI networks more accurately. We tested our method for its power in differentiating models and estimating parameters on simulated data and found significant improvement in performance benchmark, as compared with a previous method. We further applied our method to real data of protein interaction networks in human and yeast. Our results show duplication attachment model as the predominant evolutionary mechanism for human PPI networks and Scale-Free model as the predominant mechanism for yeast PPI networks.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 85
    Publication Date: 2015-06-06
    Description: Single nucleotide polymorphisms, a dominant type of genetic variants, have been used successfully to identify defective genes causing human single gene diseases. However, most common human diseases are complex diseases and caused by gene-gene and gene-environment interactions. Many SNP-SNP interaction analysis methods have been introduced but they are not powerful enough to discover interactions more than three SNPs. The paper proposes a novel method that analyzes all SNPs simultaneously. Different from existing methods, the method regards an individual’s genotype data on a list of SNPs as a point with a unit of energy in a multi-dimensional space, and tries to find a new coordinate system where the energy distribution difference between cases and controls reaches the maximum. The method will find different multiple SNPs combinatorial patterns between cases and controls based on the new coordinate system. The experiment on simulated data shows that the method is efficient. The tests on the real data of age-related macular degeneration (AMD) disease show that it can find out more significant multi-SNP combinatorial patterns than existing methods.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 86
    Publication Date: 2015-06-06
    Description: Disulfide connectivity is an important protein structural characteristic. Accurately predicting disulfide connectivity solely from protein sequence helps to improve the intrinsic understanding of protein structure and function, especially in the post-genome era where large volume of sequenced proteins without being functional annotated is quickly accumulated. In this study, a new feature extracted from the predicted protein 3D structural information is proposed and integrated with traditional features to form discriminative features. Based on the extracted features, a random forest regression model is performed to predict protein disulfide connectivity. We compare the proposed method with popular existing predictors by performing both cross-validation and independent validation tests on benchmark datasets. The experimental results demonstrate the superiority of the proposed method over existing predictors. We believe the superiority of the proposed method benefits from both the good discriminative capability of the newly developed features and the powerful modelling capability of the random forest. The web server implementation, called TargetDisulfide, and the benchmark datasets are freely available at: http://csbio.njust.edu.cn/bioinf/TargetDisulfide for academic use.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 87
    Publication Date: 2015-06-06
    Description: Proteins are molecules that form the mass of living beings. These proteins exist in dissociated forms like amino-acids and carry out various biological functions, in fact, almost all body reactions occur with the participation of proteins. This is one of the reasons why the analysis of proteins has become a major issue in biology. In a more concrete way, the identification of conserved patterns in a set of related protein sequences can provide relevant biological information about these protein functions. In this paper, we present a novel algorithm based on teaching learning based optimization (TLBO) combined with a local search function specialized to predict common patterns in sets of protein sequences. This population-based evolutionary algorithm defines a group of individuals (solutions) that enhance their knowledge (quality) by means of different learning stages. Thus, if we correctly adapt it to the biological context of the mentioned problem, we can get an acceptable set of quality solutions. To evaluate the performance of the proposed technique, we have used six instances composed of different related protein sequences obtained from the PROSITE database. As we will see, the designed approach makes good predictions and improves the quality of the solutions found by other well-known biological tools.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 88
    Publication Date: 2015-06-06
    Description: We introduce Supervised Variational Relevance Learning (Suvrel), a variational method to determine metric tensors to define distance based similarity in pattern classification, inspired in relevance learning. The variational method is applied to a cost function that penalizes large intraclass distances and favors small interclass distances. We find analytically the metric tensor that minimizes the cost function. Preprocessing the patterns by doing linear transformations using the metric tensor yields a dataset which can be more efficiently classified. We test our methods using publicly available datasets, for some standard classifiers. Among these datasets, two were tested by the MAQC-II project and, even without the use of further preprocessing, our results improve on their performance.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 89
    Publication Date: 2015-07-30
    Description: In this paper, we present three improvements to a three-point third order variant of Newton’s method derived from the Simpson rule. The first one is a fifth order method using the same number of functional evaluations as the third order method, the second one is a four-point 10th order method and the last one is a five-point 20th order method. In terms of computational point of view, our methods require four evaluations (one function and three first derivatives) to get fifth order, five evaluations (two functions and three derivatives) to get 10th order and six evaluations (three functions and three derivatives) to get 20th order. Hence, these methods have efficiency indexes of 1.495, 1.585 and 1.648, respectively which are better than the efficiency index of 1.316 of the third order method. We test the methods through some numerical experiments which show that the 20th order method is very efficient.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 90
    Publication Date: 2015-08-04
    Description: Improving the quality of degraded images is a key problem in image processing, but the breadth of the problem leads to domain-specific approaches for tasks such as super-resolution and compression artifact removal. Recent approaches have shown that a general approach is possible by learning application-specific models from examples; however, learning models sophisticated enough to generate high-quality images is computationally expensive, and so specific per-application or per-dataset models are impractical. To solve this problem, we present an efficient semi-local approximation scheme to large-scale Gaussian processes. This allows efficient learning of task-specific image enhancements from example images without reducing quality. As such, our algorithm can be easily customized to specific applications and datasets, and we show the efficiency and effectiveness of our approach across five domains: single-image super-resolution for scene, human face, and text images, and artifact removal in JPEG- and JPEG 2000-encoded images.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 91
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Parsimony, including sparsity and low rank, has been shown to successfully model data in numerous machine learning and signal processing tasks. Traditionally, such modeling approaches rely on an iterative algorithm that minimizes an objective function with parsimony-promoting terms. The inherently sequential structure and data-dependent complexity and latency of iterative optimization constitute a major limitation in many applications requiring real-time performance or involving large-scale data. Another limitation encountered by these modeling techniques is the difficulty of their inclusion in discriminative learning scenarios. In this work, we propose to move the emphasis from the model to the pursuit algorithm, and develop a process-centric view of parsimonious modeling, in which a learned deterministic fixed-complexity pursuit process is used in lieu of iterative optimization. We show a principled way to construct learnable pursuit process architectures for structured sparse and robust low rank models, derived from the iteration of proximal descent algorithms. These architectures learn to approximate the exact parsimonious representation at a fraction of the complexity of the standard optimization methods. We also show that appropriate training regimes allow to naturally extend parsimonious models to discriminative settings. State-of-the-art results are demonstrated on several challenging problems in image and audio processing with several orders of magnitude speed-up compared to the exact optimization algorithms.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 92
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified, however, the determination of tensor rank remains a challenging problem especially for CP rank . In addition, existing approaches do not take into account uncertainty information of latent factors, as well as missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large amount of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 93
    facet.materialart.
    Unknown
    Institute of Electrical and Electronics Engineers (IEEE)
    Publication Date: 2015-08-04
    Description: Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 $times$ 224) input image. This requirement is “artificial” and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, “spatial pyramid pooling”, to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102 $times$ faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this - ompetition.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 94
    Publication Date: 2015-08-04
    Description: We present efficient graph cut algorithms for three problems: (1) finding a region in an image, so that the histogram (or distribution) of an image feature within the region most closely matches a given model; (2) co-segmentation of image pairs and (3) interactive image segmentation with a user-provided bounding box. Each algorithm seeks the optimum of a global cost function based on the Bhattacharyya measure, a convenient alternative to other matching measures such as the Kullback–Leibler divergence. Our functionals are not directly amenable to graph cut optimization as they contain non-linear functions of fractional terms, which make the ensuing optimization problems challenging. We first derive a family of parametric bounds of the Bhattacharyya measure by introducing an auxiliary labeling. Then, we show that these bounds are auxiliary functions of the Bhattacharyya measure, a result which allows us to solve each problem efficiently via graph cuts. We show that the proposed optimization procedures converge within very few graph cut iterations. Comprehensive and various experiments, including quantitative and comparative evaluations over two databases, demonstrate the advantages of the proposed algorithms over related works in regard to optimality, computational load, accuracy and flexibility.
    Print ISSN: 0162-8828
    Electronic ISSN: 1939-3539
    Topics: Computer Science
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 95
    Publication Date: 2015-07-30
    Description: Robust small target detection of low signal-to-noise ratio (SNR) is very important in infrared search and track applications for self-defense or attacks. Due to the complex background, current algorithms have some unsolved issues with false alarm rate. In order to reduce the false alarm rate, an infrared small target detection algorithm based on saliency detection and support vector machine was proposed. Firstly, we detect salient regions that may contain targets with phase spectrum Fourier transform (PFT) approach. Then, target recognition was performed in the salient regions. Experimental results show the proposed algorithm has ideal robustness and efficiency for real infrared small target detection applications.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 96
    Publication Date: 2015-08-06
    Description: In dynamic propagation environments, beamforming algorithms may suffer from strong interference, steering vector mismatches, a low convergence speed and a high computational complexity. Reduced-rank signal processing techniques provide a way to address the problems mentioned above. This paper presents a low-complexity robust data-dependent dimensionality reduction based on an iterative optimization with steering vector perturbation (IOVP) algorithm for reduced-rank beamforming and steering vector estimation. The proposed robust optimization procedure jointly adjusts the parameters of a rank reduction matrix and an adaptive beamformer. The optimized rank reduction matrix projects the received signal vector onto a subspace with lower dimension. The beamformer/steering vector optimization is then performed in a reduced dimension subspace. We devise efficient stochastic gradient and recursive least-squares algorithms for implementing the proposed robust IOVP design. The proposed robust IOVP beamforming algorithms result in a faster convergence speed and an improved performance. Simulation results show that the proposed IOVP algorithms outperform some existing full-rank and reduced-rank algorithms with a comparable complexity.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 97
    Publication Date: 2015-08-07
    Description: Recently, wireless sensor networks (WSNs) have drawn great interest due to their outstanding monitoring and management potential in medical, environmental and industrial applications. Most of the applications that employ WSNs demand all of the sensor nodes to run on a common time scale, a requirement that highlights the importance of clock synchronization. The clock synchronization problem in WSNs is inherently related to parameter estimation. The accuracy of clock synchronization algorithms depends essentially on the statistical properties of the parameter estimation algorithms. Recently, studies dedicated to the estimation of synchronization parameters, such as clock offset and skew, have begun to emerge in the literature. The aim of this article is to provide an overview of the state-of-the-art clock synchronization algorithms for WSNs from a statistical signal processing point of view. This article focuses on describing the key features of the class of clock synchronization algorithms that exploit the traditional two-way message (signal) exchange mechanism. Upon introducing the two-way message exchange mechanism, the main clock offset estimation algorithms for pairwise synchronization of sensor nodes are first reviewed, and their performance is compared. The class of fully-distributed clock offset estimation algorithms for network-wide synchronization is then surveyed. The paper concludes with a list of open research problems pertaining to clock synchronization of WSNs.
    Electronic ISSN: 1999-4893
    Topics: Computer Science
    Published by MDPI Publishing
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 98
    Publication Date: 2015-08-07
    Description: Genes can participate in multiple biological processes at a time and thus their expression can be seen as a composition of the contributions from the active processes. Biclustering under a plaid assumption allows the modeling of interactions between transcriptional modules or biclusters (subsets of genes with coherence across subsets of conditions) by assuming an additive composition of contributions in their overlapping areas. Despite the biological interest of plaid models, few biclustering algorithms consider plaid effects and, when they do, they place restrictions on the allowed types and structures of biclusters, and suffer from robustness problems by seizing exact additive matchings. We propose BiP (Biclustering using Plaid models), a biclustering algorithm with relaxations to allow expression levels to change in overlapping areas according to biologically meaningful assumptions (weighted and noise-tolerant composition of contributions). BiP can be used over existing biclustering solutions (seizing their benefits) as it is able to recover excluded areas due to unaccounted plaid effects and detect noisy areas non-explained by a plaid assumption, thus producing an explanatory model of overlapping transcriptional activity. Experiments on synthetic data support BiP’s efficiency and effectiveness. The learned models from expression data unravel meaningful and non-trivial functional interactions between biological processes associated with putative regulatory modules.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 99
    Publication Date: 2015-08-07
    Description: We propose a classifier system called iPFPi that predicts the functions of un-annotated proteins. iPFPi assigns an un-annotated protein $P$ the functions of GO annotation terms that are semantically similar to $P$ . An un-annotated protein $P$ and a GO annotation term $T$ are represented by their characteristics. The characteristics of $P$ are GO terms found within the abstracts of biomedical literature associated with $P$ . The characteristics of $T$ are GO terms found within the abstracts of biomedical literature associated with the proteins annotated with the function of $T$ . Let - F$ and $Fprime $ be the important (dominant) sets of characteristic terms representing $T$ and $P$ , respectively. iPFPi would annotate $P$ with the function of $T$ , if $F$ and $Fprime $ are semantically similar. We constructed a novel semantic similarity measure that takes into consideration several factors, such as the dominance degree of each characteristic term $t$ in set $F$
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
  • 100
    Publication Date: 2015-08-07
    Description: Adverse drug reaction (ADR) is a common clinical problem, sometimes accompanying with high risk of mortality and morbidity. It is also one of the major factors that lead to failure in new drug development. Unfortunately, most of current experimental and computational methods are unable to evaluate clinical safety of drug candidates in early drug discovery stage due to the very limited knowledge of molecular mechanisms underlying ADRs. Therefore, in this study, we proposed a novel naïve Bayesian model for rapid assessment of clinical ADRs with frequency estimation. This model was constructed on a gene-ADR association network, which covered 611 US FDA approved drugs, 14,251 genes, and 1,254 distinct ADR terms. An average detection rate of 99.86 and 99.73 percent were achieved eventually in identification of known ADRs in internal test data set and external case analyses respectively. Moreover, a comparative analysis between the estimated frequencies of ADRs and their observed frequencies was undertaken. It is observed that these two frequencies have the similar distribution trend. These results suggest that the naïve Bayesian model based on gene-ADR association network can serve as an efficient and economic tool in rapid ADRs assessment.
    Print ISSN: 1545-5963
    Electronic ISSN: 1557-9964
    Topics: Biology , Computer Science
    Published by Institute of Electrical and Electronics Engineers (IEEE) on behalf of The IEEE Computational Intelligence Society ; The IEEE Computer Society ; The IEEE Control Systems Society ; The IEEE Engineering in Medicine and Biology Society ; The Association for Computing Machinery.
    Location Call Number Expected Availability
    BibTip Others were also interested in ...
Close ⊗
This website uses cookies and the analysis tool Matomo. More information can be found here...