ALBERT

All Library Books, journals and Electronic Records Telegrafenberg


Filter: Articles · PeerJ · Computer Science · 2020–2022 (281 results)
  • 1
    Publication Date: 2021-08-20
    Description: Background: Rumor detection is a popular research topic in natural language processing and data mining. Since the outbreak of COVID-19, related rumors have been widely posted and spread on online social media, seriously affecting people's daily lives, the national economy, and social stability. It is both theoretically and practically essential to detect and refute COVID-19 rumors fast and effectively. Because COVID-19 was an emergent event that broke out drastically, related rumor instances were very scarce and distinct at its early stage, which makes the detection task a typical few-shot learning problem. Traditional rumor detection techniques, however, focus on detecting existing events with enough training instances, so they fail on emergent events such as COVID-19. Developing a new few-shot rumor detection framework has therefore become critical and urgent for stopping rumors at an early stage. Methods: This article focuses on few-shot rumor detection, especially detecting COVID-19 rumors from Sina Weibo with only a minimal number of labeled instances. We contribute a Sina Weibo COVID-19 rumor dataset for few-shot rumor detection and propose a few-shot learning-based multi-modality fusion model. A full microblog consists of the source post and its corresponding comments, which are treated as two modalities and fused with meta-learning methods. Results: Experiments on the collected Weibo dataset and the public PHEME dataset show significant improvement and generality of the proposed model.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
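The model above classifies episode by episode against a handful of labeled examples. A minimal sketch of that episodic step, in the style of prototypical networks, is below; the text encoders and the post/comment fusion that would produce these embeddings are out of scope, and the function name is invented here:

```python
import numpy as np

def prototypical_episode(support, support_labels, query):
    """Classify query embeddings against class prototypes.

    support: (n_support, d) embeddings of labeled examples
    support_labels: (n_support,) integer class ids
    query: (n_query, d) embeddings to classify
    """
    classes = np.unique(support_labels)
    # One prototype per class: the mean of its support embeddings.
    prototypes = np.stack([support[support_labels == c].mean(axis=0)
                           for c in classes])
    # Squared Euclidean distance from every query to every prototype.
    d2 = ((query[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[d2.argmin(axis=1)]

# Toy usage: 2 classes (rumor / non-rumor), 5 "shots" each, 8-dim embeddings.
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0, 1, (5, 8)), rng.normal(3, 1, (5, 8))])
labels = np.array([0] * 5 + [1] * 5)
query = rng.normal(3, 1, (4, 8))
print(prototypical_episode(support, labels, query))  # mostly class 1
```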
  • 2
    Publication Date: 2021-08-20
    Description: Spectral clustering (SC) is one of the most popular clustering methods and often outperforms traditional clustering methods. SC uses the eigenvectors of a Laplacian matrix calculated from a similarity matrix of a dataset. However, SC has serious drawbacks: a significant increase in time complexity from computing the eigenvectors, and in memory space complexity from storing the similarity matrix. To address these issues, I develop a new approximate spectral clustering method using the network generated by growing neural gas (GNG), called ASC with GNG in this study. ASC with GNG uses not only reference vectors for vector quantization but also the topology of the network to extract the topological relationships between data points in a dataset. ASC with GNG calculates the similarity matrix from both the reference vectors and the topology of the network generated by GNG. Using the network generated from a dataset by GNG, ASC with GNG reduces the computational and space complexities and improves clustering quality. In this study, I demonstrate that ASC with GNG effectively reduces the computational time. Moreover, this study shows that ASC with GNG provides clustering performance equal to or better than SC.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
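The saving described above comes from running the spectral step on the m prototypes of the GNG network instead of all n data points. A minimal sketch of that step, assuming the GNG stage has already produced reference vectors and a symmetric adjacency matrix (the GNG training itself and all names here are assumptions):

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_on_prototype_graph(ref_vectors, adjacency, k, sigma=1.0):
    """Spectral clustering on a GNG-like prototype network.

    ref_vectors: (m, d) prototype (reference) vectors
    adjacency:   (m, m) symmetric 0/1 topology from the growing network
    k: number of clusters
    """
    # Similarity only along network edges, weighted by prototype distance.
    d2 = ((ref_vectors[:, None] - ref_vectors[None]) ** 2).sum(-1)
    W = adjacency * np.exp(-d2 / (2 * sigma ** 2))
    D = np.diag(W.sum(axis=1))
    L = D - W                    # unnormalized graph Laplacian
    vals, vecs = eigh(L)         # m x m eigenproblem instead of n x n
    U = vecs[:, :k]              # k smallest eigenvectors
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```

Each original data point would then inherit the cluster label of its nearest reference vector.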
  • 3
    Publication Date: 2020-01-06
    Description: Human behavior refers to the way humans act and interact. Understanding human behavior is a cornerstone of observational practice, especially in psychotherapy. An important cue for behavior analysis is the dynamical change of emotions during a conversation. Domain experts integrate emotional information in a highly nonlinear manner; thus, it is challenging to explicitly quantify the relationship between emotions and behaviors. In this work, we employ deep transfer learning to analyze their inferential capacity and contextual importance. We first train a network to quantify emotions from acoustic signals and then use information from the emotion recognition network as features for behavior recognition. We treat this emotion-related information as behavioral primitives and further train higher-level layers towards behavior quantification. Through our analysis, we find that emotion-related information is an important cue for behavior recognition. Further, we investigate the importance of emotional context in the expression of behavior by constraining (or not) the neural networks' contextual view of the data. This demonstrates that the sequence of emotions is critical in behavior expression. To build these frameworks, we employ hybrid architectures of convolutional and recurrent networks to extract emotion-related behavior primitives and facilitate automatic behavior recognition from speech.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 4
    Publication Date: 2020-09-28
    Description: Computer Science researchers rely on peer-reviewed conferences to publish their work and to receive feedback. The impact of these peer-reviewed papers on researchers’ careers can hardly be overstated. Yet conference organizers can make inconsistent choices for their review process, even in the same subfield. These choices are rarely reviewed critically, and when they are, the emphasis centers on the effects on the technical program, not the authors. In particular, the effects of conference policies on author experience and diversity are still not well understood. To help address this knowledge gap, this paper presents a cross-sectional study of 56 conferences from one large subfield of computer science, namely computer systems. We introduce a large author survey (n = 918), representing 809 unique papers. The goal of this paper is to expose this data and present an initial analysis of its findings. We primarily focus on quantitative comparisons between different survey questions and comparisons to external information we collected on author demographics, conference policies, and paper statistics. Another focal point of this study is author diversity. We found poor balance in the gender and geographical distributions of authors, but a more balanced spread across sector, experience, and English proficiency. For the most part, women and nonnative English speakers exhibit no differences in their experience of the peer-review process, suggesting no specific evidence of bias against these accepted authors. We also found strong support for author rebuttal to reviewers’ comments, especially among students and less experienced researchers.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 5
    Publication Date: 2020-10-19
    Description: We are concerned with the challenge of coronavirus disease (COVID-19) detection in chest X-ray and Computed Tomography (CT) scans, and the classification and segmentation of related infection manifestations. Even though machine learning-based analysis of COVID-19 medical scans is arguably not an established diagnostic tool, it has shown the potential to provide a preliminary digital second opinion. This can help in managing the current pandemic, and thus has been attracting significant research attention. In this research, we propose a multi-task pipeline that takes advantage of the growing advances in deep neural network models. In the first stage, we fine-tuned an Inception-v3 deep model for COVID-19 recognition using multi-modal learning, that is, using X-ray and CT scans. In addition to outperforming other deep models on the same task in the recent literature, with an attained accuracy of 99.4%, we also present a comparative analysis of multi-modal learning against learning from X-ray scans alone. The second and third stages of the proposed pipeline complement one another in dealing with different types of infection manifestations. The former features a convolutional neural network architecture for recognizing three types of manifestations, while the latter transfers learning from another knowledge domain, namely pulmonary nodule segmentation in CT scans, to produce binary masks for segmenting the regions corresponding to these manifestations. Our proposed pipeline also features specialized streams in which multiple deep models are trained separately to segment specific types of infection manifestations, and we show the significant impact that this framework has on various performance metrics. We evaluate the proposed models on widely adopted datasets and demonstrate an increase of approximately 2.5% and 4.5% for the dice coefficient and mean intersection-over-union (mIoU), respectively, while achieving a 60% reduction in computational time, compared to the recent literature.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
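The first stage above is a standard fine-tuning job. A minimal torchvision sketch of replacing the Inception-v3 head for a two-class COVID/non-COVID problem (data loading and the X-ray/CT multi-modal aspect are out of scope; the 0.4 auxiliary-loss weight is the conventional choice, assumed here rather than taken from the paper):

```python
import torch
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained Inception-v3 and replace both classifier heads.
model = models.inception_v3(weights=models.Inception_V3_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)               # COVID / non-COVID
model.AuxLogits.fc = nn.Linear(model.AuxLogits.fc.in_features, 2)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    # Inception-v3 expects 299x299 inputs and, in training mode, returns
    # (main_logits, aux_logits).
    model.train()
    main_out, aux_out = model(images)
    loss = criterion(main_out, labels) + 0.4 * criterion(aux_out, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```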
  • 6
    Publication Date: 2020-07-27
    Description: Food consumption patterns have undergone changes that in recent years have resulted in serious health problems. Studies based on the evaluation of nutritional status have determined that adopting a food pattern based primarily on the Mediterranean diet (MD) has a preventive role, as well as the ability to mitigate the negative effects of certain pathologies. A group of more than 500 adults aged over 40 years from our cohort in Northwestern Spain was surveyed. Under our experimental design, 10 experiments were run with four different machine-learning algorithms, and the predictive factors most relevant to adherence to the MD were identified. A feature selection approach was explored, and under a null hypothesis test it was concluded that only 16 measures were of relevance, suggesting the strength of this observational study. Our findings indicate that the following factors have the highest predictive value for the degree of adherence to the MD: basal metabolic rate, mini nutritional assessment questionnaire total score, weight, height, bone density, waist-hip ratio, smoking habits, age, EDI-OD, circumference of the arm, activity metabolism, subscapular skinfold, subscapular circumference in cm, circumference of the waist, circumference of the calf and brachial area.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 7
    Publication Date: 2020-10-12
    Description: The precise and rapid diagnosis of coronavirus (COVID-19) at the very primary stage helps doctors to manage patients in high-workload conditions, and it also prevents the spread of this pandemic virus. Computer-aided diagnosis (CAD) based on artificial intelligence (AI) techniques can be used to distinguish COVID-19 from non-COVID-19 cases in computed tomography (CT) imaging. Furthermore, CAD systems are capable of delivering an accurate and faster COVID-19 diagnosis, which saves time for disease control and provides an efficient diagnosis compared to laboratory tests. In this study, a novel CAD system called FUSI-CAD based on AI techniques is proposed. Almost all the methods in the literature are based on individual convolutional neural networks (CNN). The FUSI-CAD system, by contrast, is based on the fusion of multiple different CNN architectures with three handcrafted features, including statistical features and textural analysis features such as the discrete wavelet transform (DWT) and the grey-level co-occurrence matrix (GLCM), which were not previously utilized in coronavirus diagnosis. The SARS-CoV-2 CT-scan dataset is used to test the performance of the proposed FUSI-CAD. The results show that the proposed system can accurately differentiate between COVID-19 and non-COVID-19 images, with an achieved accuracy of 99%. The system also proved reliable, as the sensitivity, specificity, and precision all reached 99%, and the diagnostic odds ratio (DOR) is ≥ 100. Furthermore, the results are compared with recent related studies based on the same dataset, and the comparison verifies the competence of the proposed FUSI-CAD over the other related CAD systems. Thus, the novel FUSI-CAD system can be employed in real diagnostic scenarios to achieve accurate testing for COVID-19 and to avoid misdiagnosis caused by human fatigue. It can also reduce the time and effort expended by radiologists during the examination process.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
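The handcrafted side of the fusion above (GLCM texture properties plus wavelet sub-band statistics) is straightforward to sketch. A hedged Python version for a single 8-bit grayscale CT slice; the specific distances, angles, wavelet and statistics are assumptions, not the paper's exact configuration:

```python
import numpy as np
import pywt
from skimage.feature import graycomatrix, graycoprops

def handcrafted_features(ct_slice):
    """GLCM texture + DWT sub-band statistics for one uint8 grayscale slice."""
    # Grey-level co-occurrence matrix at one distance/angle (kept minimal).
    glcm = graycomatrix(ct_slice, distances=[1], angles=[0],
                        levels=256, symmetric=True, normed=True)
    texture = [graycoprops(glcm, p)[0, 0]
               for p in ("contrast", "homogeneity", "energy", "correlation")]
    # Single-level 2D discrete wavelet transform; simple stats per sub-band.
    cA, (cH, cV, cD) = pywt.dwt2(ct_slice.astype(float), "haar")
    wavelet = [s for band in (cA, cH, cV, cD) for s in (band.mean(), band.std())]
    return np.array(texture + wavelet)
```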
  • 8
    Publication Date: 2020-09-21
    Description: It is essential for the advancement of science that researchers share, reuse and reproduce each other's workflows and protocols. The FAIR principles are a set of guidelines that aim to maximize the value and usefulness of research data, and emphasize the importance of making digital objects findable and reusable by others. The question of how to apply these principles not just to data but also to the workflows and protocols that consume and produce them is still under debate and poses a number of challenges. In this paper we describe a two-fold approach of simultaneously applying the FAIR principles to scientific workflows as well as the involved data. We apply and evaluate our approach on the case of the PREDICT workflow, a highly cited drug repurposing workflow. This includes FAIRification of the involved datasets, as well as applying semantic technologies to represent and store data about the detailed versions of the general protocol, of the concrete workflow instructions, and of their execution traces. We propose a semantic model to address these specific requirements and evaluate it by answering competency questions. This semantic model consists of classes and relations from a number of existing ontologies, including Workflow4ever, PROV, EDAM, and BPMN. This then allowed us to formulate and answer new kinds of competency questions. Our evaluation shows the high degree to which our FAIRified OpenPREDICT workflow now adheres to the FAIR principles, and the practicality and usefulness of being able to answer our new competency questions.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 9
    Publication Date: 2020-09-21
    Description: Despite the benefits of standardization, the customization of Software as a Service (SaaS) applications is also essential because of the many unique requirements of customers. This study, therefore, focuses on the development of a valid and reliable software customization model for SaaS quality that consists of (1) generic software customization types and a list of common practices for each customization type in the SaaS multi-tenant context, and (2) key quality attributes of SaaS applications associated with customization. The study was divided into three phases: the conceptualization of the model, analysis of its validity using SaaS academic-derived expertise, and evaluation of its reliability by submitting it to an internal consistency reliability test conducted by software-engineer researchers. The model was initially devised based on six customization approaches, 46 customization practices, and 13 quality attributes in the SaaS multi-tenant context. Subsequently, its content was validated over two rounds of testing, after which one approach and 14 practices were removed and 20 practices were reformulated. The internal consistency reliability study was thereafter conducted by 34 software-engineer researchers. All constructs of the content-validated model were found to be reliable in this study. The final version of the model consists of 6 constructs and 44 items. These six constructs and their associated items are as follows: (1) Configuration (eight items), (2) Composition (four items), (3) Extension (six items), (4) Integration (eight items), (5) Modification (five items), and (6) SaaS quality (13 items). The results of the study may contribute to enhancing the capability of empirically analyzing the impact of software customization on SaaS quality by benefiting from all resultant constructs and items.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 10
    Publication Date: 2020-01-27
    Description: The article presents a discriminative approach to complement the unsupervised probabilistic nature of topic modelling. The framework transforms the probabilities of the topics per document into class-dependent deep learning models that extract highly discriminatory features suitable for classification. The framework is then used for sentiment analysis with minimal feature engineering. The approach transforms the sentiment analysis problem from the word/document domain to the topics domain, making it more robust to noise and incorporating complex contextual information that is not represented otherwise. A stacked denoising autoencoder (SDA) is then used to model the complex relationship among the topics per sentiment with minimal assumptions. To achieve this, a distinct topic model and SDA are built per sentiment polarity, with an additional decision layer for classification. The framework is tested on a comprehensive collection of benchmark datasets that vary in sample size, class bias and classification task. A significant improvement over the state of the art is achieved without the need for sentiment lexica or over-engineered features. A further analysis is carried out to explain the observed improvement in accuracy.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 11
    Publication Date: 2020-09-14
    Description: Shipborne radars not only enable navigation and collision avoidance but also play an important role in the fields of hydrographic data inspection and disaster monitoring. In this paper, target extraction methods for oil films, ships and coastlines from original shipborne radar images are proposed. First, the shipborne radar video images are acquired by a signal acquisition card. Second, based on remote sensing image processing technology, the radar images are preprocessed and the contours of the targets are extracted. Then, the targets identified in the radar images are integrated into an electronic navigation chart (ENC) by a geographic information system. The experiments show that the proposed target segmentation methods for shipborne radar images are effective. Using the geometric feature information of the targets identified in the shipborne radar images, information matching between radar images and the ENC can be realized for hydrographic data inspection and disaster monitoring.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
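The preprocess-then-extract-contours step described above maps naturally onto standard image-processing primitives. A hedged OpenCV sketch of one plausible version of it; the filters, thresholds and area cutoff are assumptions, not the paper's actual pipeline:

```python
import cv2

def extract_radar_targets(frame, min_area=50.0):
    """Segment bright echoes in one 8-bit grayscale radar frame."""
    # Suppress speckle noise, then separate echoes from the background.
    blurred = cv2.medianBlur(frame, 5)
    _, binary = cv2.threshold(blurred, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Close small gaps so each target becomes one connected blob.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Keep blobs large enough to be ships/films rather than clutter.
    return [c for c in contours if cv2.contourArea(c) >= min_area]
```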
  • 12
    Publication Date: 2020-09-10
    Description: Background: As the COVID-19 crisis endures and the virus continues to spread globally, the need for collecting epidemiological data and patient information also grows exponentially. The race against the clock to find a cure and a vaccine for the disease means researchers require storage of increasingly large and diverse types of information; for doctors following patients, recording symptoms and reactions to treatments, the need for storage flexibility is surpassed only by the necessity of storage security. The volume, variety, and variability of COVID-19 patient data call for storage in NoSQL database management systems (DBMSs). But with a multitude of existing NoSQL DBMSs, there is no straightforward way for institutions to select the most appropriate one. More importantly, these systems suffer from security flaws that render them inappropriate for the storage of confidential patient data. Motivation: This paper develops an innovative solution to remedy the aforementioned shortcomings. COVID-19 patients, as well as medical professionals, can be subjected to privacy-related risks, from abuse of their data to community bullying over their medical condition. Thus, in addition to being appropriately stored and analyzed, their data must imperatively be highly protected against misuse. Methods: This paper begins by explaining the five most popular categories of NoSQL databases and introducing the most popular NoSQL DBMSs of each category. It then presents a comparative study of the different types of NoSQL DBMS according to their strengths and weaknesses, and introduces an algorithm to assist hospitals and medical and scientific authorities in choosing the most appropriate type for storing patients' information. It subsequently presents a set of functions, based on web services, offering endpoints for authentication, authorization, auditing, and encryption of information. These functions are powerful and effective, making them appropriate for storing all the sensitive data related to patients. Results and Contributions: This paper presents an algorithm to select the most convenient NoSQL DBMS for the data of COVID-19 patients, medical staff, and organizations, and proposes security solutions that eliminate the barriers to utilizing NoSQL DBMSs to store patients' data. The proposed solutions resolve several security problems, including authentication, authorization, auditing, and encryption. With these security solutions in place, NoSQL DBMSs become a much more appropriate, safer, and affordable option for storing and analyzing patients' data, which would contribute greatly to the medical and research effort against COVID-19. The solution can be implemented for all types of NoSQL DBMSs; implementing it would strongly secure patients' data and protect them from the consequences of data leakage.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
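The paper's selection algorithm is not reproduced in this abstract, so the sketch below only illustrates the general shape such a recommender can take: score each NoSQL category against weighted requirements and pick the best. Every criterion, weight, and score here is invented for illustration:

```python
# Illustrative requirement weights for COVID-19 patient data (0-5 scale).
requirements = {"flexible_schema": 5, "horizontal_scaling": 4,
                "complex_relationships": 2, "aggregation_queries": 3}

# How strongly each NoSQL category supports each criterion (assumed scores).
categories = {
    "document":  {"flexible_schema": 5, "horizontal_scaling": 4,
                  "complex_relationships": 2, "aggregation_queries": 4},
    "key-value": {"flexible_schema": 4, "horizontal_scaling": 5,
                  "complex_relationships": 1, "aggregation_queries": 1},
    "column":    {"flexible_schema": 3, "horizontal_scaling": 5,
                  "complex_relationships": 2, "aggregation_queries": 3},
    "graph":     {"flexible_schema": 3, "horizontal_scaling": 2,
                  "complex_relationships": 5, "aggregation_queries": 2},
}

def pick_category(reqs, cats):
    # Weighted sum: requirement importance times category support.
    score = lambda c: sum(w * cats[c][k] for k, w in reqs.items())
    return max(cats, key=score)

print(pick_category(requirements, categories))  # -> "document" for these weights
```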
  • 13
    Publication Date: 2020-09-07
    Description: Non-experts have long made important contributions to machine learning (ML) by contributing training data, and recent work has shown that non-experts can also help with feature engineering by suggesting novel predictive features. However, non-experts have only contributed features to prediction tasks already posed by experienced ML practitioners. Here we study how non-experts can design prediction tasks themselves, what types of tasks non-experts will design, and whether predictive models can be automatically trained on data sourced for their tasks. We use a crowdsourcing platform where non-experts design predictive tasks that are then categorized and ranked by the crowd. Crowdsourced data are collected for top-ranked tasks, and predictive models are then trained and evaluated automatically using those data. We show that individuals without ML experience can collectively construct useful datasets and that predictive models can be learned on these datasets, but challenges remain. The prediction tasks designed by non-experts covered a broad range of domains, from politics and current events to health behavior, demographics, and more. Proper instructions are crucial for non-experts, so we also conducted a randomized trial to understand how different instructions may influence the types of prediction tasks being proposed. In general, a better understanding of how non-experts can contribute to ML can further leverage advances in automated machine learning and has important implications as ML continues to drive workplace automation.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 14
    Publication Date: 2020-01-06
    Description: Assessing levels of standing genetic variation within species requires robust sampling for the purpose of accurate specimen identification using molecular techniques such as DNA barcoding; however, statistical estimators for what constitutes a robust sample are currently lacking. Such estimates are needed because most species are currently represented by only one or a few sequences in existing databases, which can safely be assumed to be undersampled. Unfortunately, sample sizes of 5–10 specimens per species, as typically seen in DNA barcoding studies, are often insufficient to adequately capture within-species genetic diversity. Here, we introduce a novel iterative extrapolation simulation algorithm for haplotype accumulation curves, called HACSim (Haplotype Accumulation Curve Simulator), that can be employed to calculate likely sample sizes needed to observe the full range of DNA barcode haplotype variation that exists for a species. Using uniform and non-uniform haplotype frequency distributions, the notion of sampling sufficiency (the sample size at which sampling accuracy is maximized and above which no new sampling information is likely to be gained) can be gleaned. HACSim can be employed in two primary ways to estimate specimen sample sizes: (1) to simulate haplotype sampling in hypothetical species, and (2) to simulate haplotype sampling in real species mined from public reference sequence databases like the Barcode of Life Data Systems (BOLD) or GenBank for any genomic marker of interest. While our algorithm is globally convergent, runtime is heavily dependent on initial sample sizes and the skewness of the corresponding haplotype frequency distribution.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
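The idea of a haplotype accumulation curve is easy to reproduce with a small Monte Carlo simulation. The sketch below is not the published HACSim algorithm (which iterates and extrapolates); it simply estimates how many distinct haplotypes are seen, on average, as specimens accumulate under an assumed frequency distribution:

```python
import numpy as np

def haplotype_accumulation(freqs, max_n=500, reps=200, seed=1):
    """Mean number of distinct haplotypes observed vs. specimens sampled."""
    rng = np.random.default_rng(seed)
    h = len(freqs)
    curve = np.zeros(max_n)
    for _ in range(reps):
        draws = rng.choice(h, size=max_n, p=freqs)  # sample specimens
        seen = np.zeros(h, dtype=bool)
        for n, hap in enumerate(draws):
            seen[hap] = True
            curve[n] += seen.sum()
    return curve / reps

# Skewed distribution: one dominant haplotype plus four rare ones.
freqs = np.array([0.80, 0.05, 0.05, 0.05, 0.05])
curve = haplotype_accumulation(freqs)
# Smallest n at which, on average, 95% of the 5 haplotypes have been seen.
print(int(np.argmax(curve >= 0.95 * len(freqs))) + 1)
```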
  • 15
    Publication Date: 2020-08-10
    Description: Stress pervades our everyday life to the point of being considered the scourge of the modern industrial world. In the short term, the effects of stress on knowledge workers cause performance fluctuations, declining concentration, poor sensorimotor coordination, and an increased error rate, while long-term exposure to stress leads to issues such as dissatisfaction, resignation, depression, and general psychosomatic ailment and disease. Software developers are known to be stressed workers. Stress has been suggested to have detrimental effects on team morale and motivation, communication- and cooperation-dependent work, software quality, maintainability, and requirements management. There is a need to effectively assess, monitor, and reduce stress for software developers. While there is substantial psycho-social and medical research on stress and its measurement, we notice that the transfer of these methods and practices to software engineering has not been fully made. For this reason, we engage in an interdisciplinary endeavor between researchers in software engineering and in the medical and social sciences towards a better understanding of stress effects while developing software. This article offers two main contributions. First, we provide an overview of supported theories of stress and the many ways to assess stress in individuals. Second, we propose a robust methodology to detect and measure stress in controlled experiments that is tailored to software engineering research. We also evaluate the methodology by implementing it in an experiment, which we first pilot and then replicate in its enhanced form, and report on the results with lessons learned. With this work, we hope to stimulate research on stress in software engineering and inspire future research that is backed up by supported theories and employs psychometrically validated measures.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 16
    Publication Date: 2020-08-17
    Description: In the era of the Internet of Things and 5G networks, handling real-time network traffic with the required Quality of Service (QoS) and optimal utilization of network resources is a challenging task. Traffic Engineering provides mechanisms to guide network traffic so as to improve the utilization of network resources and meet the QoS requirements of the network. Traditional networks use IP-based and Multi-Protocol Label Switching (MPLS)-based Traffic Engineering mechanisms. Software Defined Networking (SDN) has characteristics useful for solving traffic scheduling and management. Since traditional networks are not going to be replaced fully by SDN-enabled resources any time soon, traffic engineering solutions for hybrid IP/SDN setups have to be explored. In this paper we propose a new termite-inspired optimization algorithm for dynamic path allocation and better utilization of network links in a hybrid SDN setup. The proposed bio-inspired algorithm, based on termite behaviour and implemented in the SDN Controller, supports elastic bandwidth demands from applications by avoiding congestion and handling traffic priority and link availability. Testing in both simulated and physical test beds demonstrates the performance of the algorithm with the support of SDN. In cases of link failure, the algorithm in the SDN Controller performs failure recovery gracefully. The algorithm also performs very well in congestion avoidance. The SDN-based algorithm can be implemented in an existing traditional WAN as a hybrid setup and is a less complex, better alternative to the traditional MPLS Traffic Engineering setup.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 17
    Publication Date: 2020-08-10
    Description: Background: Many management tools, such as Discrete Event Simulation (DES) and Lean Healthcare, are effective in supporting and assisting health care quality. In this sense, the study aims at using Lean Thinking (LT) principles combined with DES to plan a Canadian emergency department (ED) expansion and to meet the demand arising from the closure of small care centers. The project's purpose is to reduce patients' Length of Stay (LOS) in the ED; additionally, patients must be assisted as soon as possible after the triage process. Furthermore, the study aims at determining the ideal number of beds in the Short Stay Unit (SSU), such that patients do not wait more than 180 min to be transferred. Methods: For this purpose, the hospital decision-makers suggested planning the expansion, which was carried out by the simulation and modeling method. The emergency department was simulated with the software FlexSim Healthcare®, and, with Design of Experiments (DoE), the optimal number of beds, seats, and resources for each shift was determined. Data collection and modeling were based on historical data (patients' arrivals) and on databases in use by the hospital, covering April 1st, 2017 to March 31st, 2018. The experiments were carried out by running 30 replicates for each scenario. Results: The results show that the emergency department could not meet expected demand in the initial planning scenario: only 17.2% of the patients were completely treated, and the average LOS was 2,213.7 min, with a confidence interval of (2,131.8–2,295.6) min. However, after changing the decision variables and applying LT techniques, the share of fully treated patients increased to 95.7% (an improvement of approximately 600%). Average LOS decreased by about 79.0% to 461.2 min, with a confidence interval of (453.7–468.7) min. The time to be attended after triage decreased from 404.3 min to 20.8 (19.8–21.8) min, a reduction of around 95.0%, while the time to be transferred from a bed to the SSU decreased by 60.0%. Moreover, the ED reduced human resources downtime, in line with Lean Thinking principles.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
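FlexSim Healthcare is a commercial tool, but the structure of such a model (arrivals, triage, bed queue, treatment) is easy to see in an open-source discrete-event sketch. Below is a minimal simpy analogue with invented arrival and service rates, tracking the post-triage wait that the study targets:

```python
import random
import simpy

WAIT_AFTER_TRIAGE = []  # minutes each patient waited for a bed

def patient(env, triage, beds):
    with triage.request() as req:                       # triage stage
        yield req
        yield env.timeout(random.expovariate(1 / 10))   # ~10 min triage
    t0 = env.now
    with beds.request() as req:                         # wait for an ED bed
        yield req
        WAIT_AFTER_TRIAGE.append(env.now - t0)
        yield env.timeout(random.expovariate(1 / 120))  # ~2 h treatment

def arrivals(env, triage, beds):
    while True:
        yield env.timeout(random.expovariate(1 / 15))   # one arrival / ~15 min
        env.process(patient(env, triage, beds))

env = simpy.Environment()
triage = simpy.Resource(env, capacity=2)
beds = simpy.Resource(env, capacity=10)
env.process(arrivals(env, triage, beds))
env.run(until=8 * 60)  # one 8-hour shift
print(f"mean wait for bed: "
      f"{sum(WAIT_AFTER_TRIAGE) / len(WAIT_AFTER_TRIAGE):.1f} min")
```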
  • 18
    Publication Date: 2020-08-10
    Description: This article addresses the problem of monitoring telecommunication networks, which we consider as multilevel dynamic objects. We show that reconfigurable systems are necessary for monitoring such networks in real life. We implement the reconfiguration abilities of these systems through the synthesis of monitoring programs and their execution in the monitoring systems and on end-user devices. This article presents a new method for the synthesis of monitoring programs and develops a new language to describe them. The programs are translated into a binary format and executed by virtual machines installed on the elements of the networks. Finally, we present an example of program synthesis for monitoring real distributed networks.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 19
    Publication Date: 2020-08-17
    Description: We investigate the automatic design of communication in swarm robotics through two studies. We first introduce Gianduja, an automatic design method that generates collective behaviors for robot swarms in which individuals can locally exchange a message whose semantics is not fixed a priori. It is the automatic design process that, on a per-mission basis, defines the conditions under which the message is sent and the effect that it has on the receiving peers. We then extend Gianduja to Gianduja2 and Gianduja3, which target robots that can exchange multiple distinct messages. Also in this case, the semantics of the messages is automatically defined on a per-mission basis by the design process. Gianduja and its variants are based on Chocolate, which does not provide any support for local communication. In the article, we compare Gianduja and its variants with a standard neuro-evolutionary approach. We consider a total of six different swarm robotics missions. We present results based on simulation and on tests performed with 20 e-puck robots. Results show that, typically, Gianduja and its variants are able to associate a meaningful semantics to messages.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 20
    Publication Date: 2020-08-17
    Description: Background: In the last twenty years, new methodologies have made possible the gathering of large amounts of data concerning the genetic information and metabolic functions associated with the human gut microbiome. In spite of that, processing all the available data is not the simplest of tasks, which can result in an excess of information awaiting proper annotation. This assessment was intended to evaluate how well some respected databases can describe a mock human gut microbiome. Methods: In this work, we critically evaluate the output of the cross-reference between the UniProt Knowledge Base (UniProtKB) and the Kyoto Encyclopedia of Genes and Genomes Orthologs (KEGG Orthologs) or the evolutionary genealogy of genes: Non-supervised Orthologous groups (EggNOG) databases, for a list of species previously found in the human gut microbiome. Results: From a list comprising 131 species and 52 genera, 53 species and 40 genera had corresponding entries in the KEGG database, and 82 species and 47 genera had corresponding entries in the EggNOG database. Moreover, we present the KEGG Orthologs (KOs) and EggNOG Orthologs (NOGs) entries associated with the search, their distribution over species and genera, and lists of functions that appeared in many species or genera, the "core" functions of the human gut microbiome. We also present the relative abundance of KOs and NOGs throughout phyla and genera. Lastly, we expose a variance found between searches with different arguments on the database entries. Inferring functionality by cross-referencing UniProt with KEGG or EggNOG can be lackluster due to the low number of annotated species in UniProt and the low number of functions affiliated with the majority of these species. Additionally, the EggNOG database showed greater performance in a cross-search with UniProt for a mock human gut microbiome. Notwithstanding, efforts targeting cultivation, single-cell sequencing or the reconstruction of high-quality metagenome-assembled genomes (MAGs) and their annotation are needed to allow the use of these databases for inferring functionality in human gut microbiome studies.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 21
    Publication Date: 2020-06-15
    Description: Application of deep neural networks is a rapidly expanding field now reaching many disciplines, including genomics. In particular, convolutional neural networks have been exploited for identifying the functional role of short genomic sequences. These approaches rely on gathering large sets of sequences with a known functional role, extracting those sequences from whole-genome annotations. These sets are then split into training, test and validation sets in order to train the networks. While the obtained networks perform well on validation sets, they often perform poorly when applied to whole genomes, in which the ratio of positive to negative examples can be very different from that in the training set. We here address this issue by assessing the genome-wide performance of networks trained with sets exhibiting different ratios of positive to negative examples. As a case study, we use sequences encompassing gene starts from the RefGene database as positive examples and random genomic sequences as negative examples. We then demonstrate that models trained using data from one organism can be used to predict gene-start sites in a related species when the training sets provide good genome-wide performance. This cross-species application of convolutional neural networks provides a new way to annotate any genome from existing high-quality annotations in a related reference species. It also provides a way to determine whether the sequence motifs recognised by chromatin-associated proteins in different species are conserved or not.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 22
    Publication Date: 2020-03-02
    Description: Data classification is a fundamental task in data mining. Within this field, the classification of multi-labeled data has received serious attention in recent years. In such problems, each data entity can simultaneously belong to several categories. Multi-label classification is important because of many recent real-world applications in which each entity has more than one label. To improve the performance of multi-label classification, feature selection plays an important role. It involves identifying and removing irrelevant and redundant features that unnecessarily increase the dimensions of the search space for the classification problems. However, classification may fail with an extreme decrease in the number of relevant features. Thus, minimizing the number of features and maximizing the classification accuracy are two desirable but conflicting objectives in multi-label feature selection. In this article, we introduce a multi-objective optimization algorithm customized for selecting the features of multi-label data. The proposed algorithm is an enhanced variant of a decomposition-based multi-objective optimization approach, in which the multi-label feature selection problem is divided into single-objective subproblems that can be solved simultaneously using an evolutionary algorithm. This approach accelerates the optimization process and finds more diverse feature subsets. The proposed method benefits from a local search operator to find better solutions for each subproblem. We also define a pool of genetic operators to generate new feature subsets from the previous generation. To evaluate the performance of the proposed algorithm, we compare it with two other multi-objective feature selection approaches on eight real-world benchmark datasets that are commonly used for multi-label classification. The reported results for multi-objective evaluation measures, such as the hypervolume indicator and set coverage, illustrate an improvement in the results obtained by the proposed method. Moreover, the proposed method achieved better results in terms of classification accuracy with fewer features compared with state-of-the-art methods.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 23
    Publication Date: 2020-02-17
    Description: Hadoop has become a promising platform to reliably process and store big data. It provides flexible, low-cost services for huge volumes of data through Hadoop Distributed File System (HDFS) storage. Unfortunately, the absence of any inherent security mechanism in Hadoop increases the possibility of malicious attacks on the data processed or stored through Hadoop. In this scenario, securing the data stored in HDFS becomes a challenging task. Hence, researchers and practitioners have intensified their efforts to develop mechanisms that protect users' information collated in HDFS. This has led to the development of numerous encryption-decryption algorithms, but their performance decreases as file size increases. In the present study, the authors describe a methodology to solve the issue of data security in Hadoop storage. The authors have integrated Attribute-Based Encryption with honey encryption on Hadoop, i.e., Attribute-Based Honey Encryption (ABHE). This approach works on files that are encrypted inside HDFS and decrypted inside the Mapper. In addition, the authors have evaluated the proposed ABHE algorithm by performing encryption-decryption on files of different sizes and have compared it with existing schemes, including AES and AES with OTP. The ABHE algorithm shows considerable improvement in performance during the encryption-decryption of files.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
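ABHE itself combines attribute-based and honey encryption, which is beyond a short sketch. As a hedged stand-in, here is what the AES baseline the authors compare against can look like in Python with the cryptography package (Fernet is AES-128-CBC plus an HMAC), encrypting a file before it lands in HDFS; file names are illustrative:

```python
# Baseline AES comparison step (not ABHE itself): encrypt a file before it
# is uploaded to HDFS and decrypt it inside the consuming task.
from cryptography.fernet import Fernet  # AES-128-CBC + HMAC under the hood

key = Fernet.generate_key()             # in practice: held by a key service
f = Fernet(key)

with open("patients.csv", "rb") as fh:
    ciphertext = f.encrypt(fh.read())
with open("patients.csv.enc", "wb") as fh:
    fh.write(ciphertext)

# ... upload patients.csv.enc to HDFS; later, inside the mapper:
with open("patients.csv.enc", "rb") as fh:
    plaintext = f.decrypt(fh.read())
```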
  • 24
    Publication Date: 2020-03-02
    Description: The theory of the continuous two-dimensional (2D) Fourier Transform in polar coordinates has been recently developed but no discrete counterpart exists to date. In the first part of this two-paper series, we proposed and evaluated the theory of the 2D Discrete Fourier Transform (DFT) in polar coordinates. The theory of the actual manipulated quantities was shown, including the standard set of shift, modulation, multiplication, and convolution rules. In this second part of the series, we address the computational aspects of the 2D DFT in polar coordinates. Specifically, we demonstrate how the decomposition of the 2D DFT as a DFT, Discrete Hankel Transform and inverse DFT sequence can be exploited for coding. We also demonstrate how the proposed 2D DFT can be used to approximate the continuous forward and inverse Fourier transform in polar coordinates in the same manner that the 1D DFT can be used to approximate its continuous counterpart.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
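For reference, the continuous decomposition that the discrete transform mirrors (written here up to normalization conventions, which vary by author) expands the function in angular harmonics and applies an nth-order Hankel transform to each one; the 2D DFT in polar coordinates follows the same DFT → Hankel → inverse-DFT pattern:

```latex
% Angular Fourier series of f(r, theta):
f(r,\theta) = \sum_{n=-\infty}^{\infty} f_n(r)\, e^{i n \theta},
\qquad
f_n(r) = \frac{1}{2\pi} \int_0^{2\pi} f(r,\theta)\, e^{-i n \theta}\, d\theta .

% 2D Fourier transform in polar coordinates, with J_n the Bessel function
% and F_n the nth-order Hankel transform of f_n:
F(\rho,\psi) = \sum_{n=-\infty}^{\infty} 2\pi\, i^{-n}\, F_n(\rho)\, e^{i n \psi},
\qquad
F_n(\rho) = \int_0^{\infty} f_n(r)\, J_n(\rho r)\, r\, dr .
```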
  • 25
    Publication Date: 2020-02-17
    Description: For years, immersive interfaces using virtual and augmented reality (AR) for molecular visualization and modeling have promised a revolution in the way we teach, learn, communicate and work in chemistry, structural biology and related areas. However, most tools available today for immersive modeling require specialized hardware and software, and are costly and cumbersome to set up. These limitations prevent wide use of immersive technologies in education and research centers in a standardized form, which in turn prevents large-scale testing of the actual effects of such technologies on learning and thinking processes. Here, I discuss building blocks for creating marker-based AR applications that run as web pages on regular computers, and explore how they can be exploited to develop web content for handling virtual molecular systems in commodity AR with no more than a webcam- and internet-enabled computer. Examples span from displaying molecules, electron microscopy maps and molecular orbitals with minimal amounts of HTML code, to incorporating molecular mechanics, real-time estimation of experimental observables and other interactive resources using JavaScript. These web apps provide virtual alternatives to physical, plastic-made molecular modeling kits, where the computer augments the experience with information about spatial interactions, reactivity, energetics, etc. The ideas and prototypes introduced here should serve as starting points for building active content that everybody can utilize online at minimal cost, providing novel interactive pedagogic material in such an open way that it could enable mass testing of the effect of immersive technologies on chemistry education.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 26
    Publication Date: 2020-02-24
    Description: Record linkage aims to identify records from multiple data sources that refer to the same real-world entity. It is a well-known data quality process, studied since the second half of the last century, with an established pipeline and a rich literature of case studies mainly covering census, administrative or health domains. In this paper, a method to recognize matching records from real municipalities and banks through multiple similarity criteria and a Neural Network classifier is proposed: starting from a labeled subset of the available data, first several similarity measures are combined and weighted to build a feature vector, then a Multi-Layer Perceptron (MLP) network is trained and tested to find matching pairs. For validation, seven real datasets have been used (three from banks and four from municipalities), purposely chosen in the same geographical area to increase the probability of matches. The training only involved two municipalities, while testing involved all sources (municipalities vs. municipalities, banks vs. banks, and municipalities vs. banks). The proposed method scored remarkable results in terms of both precision and recall, clearly outperforming threshold-based competitors.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
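A hedged sketch of the feature-vector-plus-MLP idea described above, with difflib's sequence ratio standing in for the paper's similarity measures; the field names and toy records are invented:

```python
from difflib import SequenceMatcher
import numpy as np
from sklearn.neural_network import MLPClassifier

FIELDS = ("name", "surname", "birth_date", "address")

def sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def feature_vector(rec_a, rec_b):
    # One similarity score per compared field.
    return [sim(rec_a[f], rec_b[f]) for f in FIELDS]

# Toy labeled pairs: (record_a, record_b, is_match)
pairs = [
    ({"name": "Mario", "surname": "Rossi", "birth_date": "1970-01-02",
      "address": "Via Roma 1"},
     {"name": "Mario", "surname": "Rosi", "birth_date": "1970-01-02",
      "address": "via roma 1"}, 1),
    ({"name": "Mario", "surname": "Rossi", "birth_date": "1970-01-02",
      "address": "Via Roma 1"},
     {"name": "Anna", "surname": "Bianchi", "birth_date": "1985-06-30",
      "address": "Corso Italia 9"}, 0),
] * 20  # repeat so the tiny example has something to fit

X = np.array([feature_vector(a, b) for a, b, _ in pairs])
y = np.array([m for _, _, m in pairs])
clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000).fit(X, y)
print(clf.predict(X[:2]))  # -> [1 0]
```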
  • 27
    Publication Date: 2020-03-02
    Description: Integrated circuits may be vulnerable to hardware Trojan attacks during their design or fabrication phases. This article is a case study of the design of a Viterbi decoder and of the effect of hardware Trojans on a coded communication system employing that decoder. A design for a Viterbi decoder and possible hardware Trojan models for it are proposed. An FPGA-based implementation of the decoder and the associated Trojan circuits is discussed. The noise-added encoded input data stream is stored in the block RAM of the FPGA, and the decoded data stream is monitored on a PC through a universal asynchronous receiver-transmitter (UART) interface. The implementation results show that there is barely any change in the LUTs used (0.5%) or power dissipation (3%) due to the insertion of the proposed Trojan circuits, establishing the surreptitious nature of the Trojans. Although the Trojans cause negligible changes in circuit parameters, they cause significant changes in the bit error rate (BER). In the absence of Trojans, the BER drops to zero for signal-to-noise ratios (SNRs) higher than 6 dB, but in the presence of Trojans, the BER does not reduce to zero even at very high SNRs. This holds even when the Trojan is activated only once during the entire transmission.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 28
    Publication Date: 2020-05-04
    Description: We introduce SANgo (Storage Area Network in the Go language), a Go-based package for simulating the behavior of modern storage infrastructure. The software is based on the discrete-event modeling paradigm and captures the structure and dynamics of high-level storage system building blocks. The flexible structure of the package allows us to create a model of a real storage system with a configurable number of components. The granularity of the simulated system can be defined depending on the replicated patterns of actual system behavior. Accurate replication enables us to reach the primary goal of our simulator: to explore the stability boundaries of real storage systems. To meet this goal, SANgo offers a variety of interfaces for easy monitoring and tuning of the simulated model. These interfaces allow us to track a number of metrics of components such as storage controllers, network connections, and hard drives. Other interfaces allow altering the parameter values of the simulated system effectively in real time, thus providing the possibility of training a realistic digital twin using, for example, the reinforcement learning (RL) approach. One can train an RL model to reduce discrepancies between simulated and real SAN data: the external control algorithm can adjust the simulator parameters to make the difference as small as possible. SANgo supports the standard OpenAI gym interface; thus, the software can serve as a benchmark for comparing different learning algorithms.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
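Because SANgo exposes the standard OpenAI gym contract, the simulator behaves like any other environment from the RL side. A hedged Python sketch of that contract with placeholder dynamics; the spaces, metrics and reward here are invented, and this uses the classic gym API in which step returns a single done flag:

```python
import gym
import numpy as np

class SANSimEnv(gym.Env):
    """Hypothetical wrapper around a storage simulator with a gym interface."""

    def __init__(self):
        # Observation: a few simulator metrics (latency, queue depth, ...).
        self.observation_space = gym.spaces.Box(
            low=0.0, high=np.inf, shape=(4,), dtype=np.float32)
        # Action: tuning knobs, e.g. cache-size / rebuild-rate scalars.
        self.action_space = gym.spaces.Box(
            low=0.0, high=1.0, shape=(2,), dtype=np.float32)

    def reset(self):
        self.state = np.zeros(4, dtype=np.float32)
        return self.state

    def step(self, action):
        # In SANgo this would advance the discrete-event simulation; here
        # the dynamics are a placeholder. Reward: negative gap between the
        # simulated metrics and a (here random) reference trace.
        self.state = np.abs(self.state + np.random.randn(4).astype(np.float32))
        reward = -float(np.linalg.norm(self.state[:2] - action))
        return self.state, reward, False, {}
```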
  • 29
    Publication Date: 2020-05-04
    Description: When real-time systems are modeled as timed automata, different time scales may lead to substantial fragmentation of the symbolic state space. Exact acceleration solves the fragmentation problem without changing system reachability. The relatively mature technology of exact acceleration has been used with an appended cycle or a parking cycle, which can be applied to the calculation of a single-acceleratable-cycle model. Using these two techniques to develop a complex real-time model requires additional states and incurs a large time cost, thereby reducing acceleration efficiency. In this paper, a complex real-time exact acceleration method based on an overlapping cycle is proposed, as an extension of the parking-cycle technique to new application scenarios. By comprehensively analyzing the accelerating impacts of multiple acceleratable cycles, it is only necessary to add a single overlapping period of fixed length, without relying on the windows of the acceleratable cycles. Experimental results show that the proposed timed automaton model is simple and effectively decreases the time cost of exact acceleration. For complex real-time system models, the overlapping-cycle method can accelerate large-scale and concurrent states that cannot be handled by the original exact acceleration theory.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 30
    Publication Date: 2020-01-16
    Description: Over the years, neuroscientists and psychophysicists have been asking whether data acquisition for facial analysis should be performed holistically or with local feature analysis. This has led to various advanced methods of face recognition being proposed, especially techniques using facial landmarks. Current 3D facial landmark methods rely on a mathematically complex and time-consuming workflow involving semi-landmark sliding tasks. This paper proposes a homologous multi-point warping for 3D facial landmarking, which is verified experimentally on each of the target objects in a given dataset using 500 landmarks (16 anatomical fixed points and 484 sliding semi-landmarks). This is achieved by building a template mesh as a reference object and applying this template to each of the targets in three datasets using an artificial deformation approach. The semi-landmarks are slid along tangents to the curves or surfaces until the bending energy between the template and a target form is minimal. The results indicate that our method can be used to investigate shape variation for multiple datasets when implemented on three databases (Stirling, FRGC and Bosphorus).
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 31
    Publication Date: 2020-06-01
    Description: Background: A conformational B-cell epitope is one of the main components of vaccine design. It contains separate segments in its sequence that are spatially close in the antigen chain. The availability of Ag-Ab complex data in the Protein Data Bank allows for the development of predictive methods. Several epitope prediction models have been developed, including learning-based methods, but model performance is still not optimal. The main problem in learning-based prediction models is class imbalance. Methods: This study proposes CluSMOTE, a combination of a cluster-based undersampling method and the Synthetic Minority Oversampling Technique (SMOTE). The approach is used to generate additional sample data so that the conformational epitope dataset is balanced. The Hierarchical DBSCAN algorithm is run to identify clusters in the majority class; some randomly selected data are taken from each cluster, considering the oversampling degree, and combined with the minority-class data. The balanced data are used as the training dataset for developing a conformational epitope predictor. Furthermore, two binary classification methods, Support Vector Machine and Decision Tree, are separately used to develop prediction models and to evaluate the performance of CluSMOTE in predicting conformational B-cell epitopes. The experiment focuses on determining the best parameters for an optimal CluSMOTE. Two independent datasets, representing general protein antigens and glycoprotein antigens respectively, are used to compare the proposed prediction model with state-of-the-art methods. Results: The experimental results show that CluSMOTE Decision Tree outperforms the Support Vector Machine in terms of AUC and G-mean as performance measures. The mean AUCs of CluSMOTE Decision Tree on the Kringelum and SEPPA 3 test sets are 0.83 and 0.766, respectively. This shows that CluSMOTE Decision Tree is better than other methods on general protein antigens, though comparable with SEPPA 3 on glycoprotein antigens.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 32
    Publication Date: 2020-02-03
    Description: Proteins are the building blocks of all cells in humans and all other living creatures. Most of the work in a living organism is performed by proteins. Proteins are polymers of amino acid monomers and belong to the biological macromolecules. The tertiary structure of a protein represents its three-dimensional shape. The functions, classification and binding sites are governed by the protein’s tertiary structure. If two protein structures are alike, then the two proteins are likely of the same kind, implying a similar structural class and similar ligand binding properties. In this paper, we have used the protein tertiary structure to generate effective features for applications in structural similarity, to detect structural class and ligand binding. Firstly, we have analyzed the effectiveness of a group of image-based features to predict the structural class of a protein. These features are derived from the image generated by the distance matrix of the tertiary structure of a given protein. They include the local binary pattern (LBP) histogram, Gabor-filtered LBP histogram, separate row multiplication matrix with uniform LBP histogram, neighbor block subtraction matrix with uniform LBP histogram, and atom bond. The separate row multiplication matrix and neighbor block subtraction matrix filters, as well as the atom bond feature, are our novel contributions. The experiments were done on a standard benchmark dataset. We have demonstrated the effectiveness of these features over a large variety of supervised machine learning algorithms. Experiments suggest that the support vector machine is the best-performing classifier on the selected dataset using this set of features. We believe the excellent performance of Hybrid LBP in terms of accuracy would motivate researchers and practitioners to use it to identify protein structural classes. To facilitate that, a classification model using Hybrid LBP is readily available for use at http://brl.uiu.ac.bd/PL/. Protein-ligand binding governs the activity of biological receptors and is therefore relevant to curing diseases. Therefore, binding prediction between protein and ligand is important for understanding a protein’s activity and for accelerating docking computations in virtual screening-based drug design. Protein-ligand binding prediction requires the three-dimensional tertiary structure of the target protein, which is searched for ligand binding. In this paper, we have proposed a supervised learning algorithm for predicting protein-ligand binding, which is a similarity-based clustering approach using the same set of features. Our algorithm works better than the most popular and widely used machine learning algorithms.
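    A small sketch of the core feature idea: render a protein's residue-residue distance matrix as a grayscale image and describe it with a uniform-LBP histogram. The use of scikit-image and the parameter values are illustrative assumptions.

    ```python
    # Sketch: uniform LBP histogram of a tertiary-structure distance matrix.
    import numpy as np
    from scipy.spatial.distance import cdist
    from skimage.feature import local_binary_pattern

    def lbp_histogram(ca_coords, P=8, R=1.0):
        dist = cdist(ca_coords, ca_coords)       # pairwise C-alpha distances
        img = (255 * dist / dist.max()).astype(np.uint8)
        lbp = local_binary_pattern(img, P, R, method="uniform")
        # "uniform" LBP with P sampling points yields P + 2 distinct codes.
        hist, _ = np.histogram(lbp, bins=np.arange(P + 3), density=True)
        return hist
    ```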
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 33
    Publication Date: 2020-02-10
    Description: This work introduces a method to estimate reflectance, shading, and specularity from a single image. Reflectance, shading, and specularity are intrinsic images derived from the dichromatic model. Estimation of these intrinsic images has many applications in computer vision such as shape recovery, specularity removal, segmentation, and classification. The proposed method allows for recovering the dichromatic model parameters thanks to two independent quadratic programming steps. Compared to the state of the art in this domain, our approach has the advantage of decomposing a complex inverse problem into two parallelizable optimization steps that are easy to solve and do not require learning. The proposed method extends a previous algorithm: it is rewritten to be numerically more stable, achieves better quantitative and qualitative results, and applies to multispectral images. The proposed method is assessed qualitatively and quantitatively on standard RGB and multispectral datasets.
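    To make the dichromatic model concrete, here is a per-pixel toy decomposition written as a nonnegative least-squares fit (I = m_d·R + m_s·L, with shading m_d and specularity m_s ≥ 0). The paper solves the full problem with two quadratic programming steps over the whole image; NNLS with known reflectance and illuminant colors is only a simplified stand-in.

    ```python
    # Toy per-pixel dichromatic fit, assuming R and L are known.
    import numpy as np
    from scipy.optimize import nnls

    def decompose_pixel(rgb, reflectance, illuminant):
        A = np.column_stack([reflectance, illuminant])  # 3x2 design matrix
        (m_d, m_s), _ = nnls(A, rgb)                    # nonnegative coefficients
        return m_d, m_s                                 # shading, specularity

    # Example: a reddish surface under white light with a highlight.
    m_d, m_s = decompose_pixel(np.array([0.9, 0.5, 0.4]),
                               reflectance=np.array([0.8, 0.3, 0.2]),
                               illuminant=np.array([1.0, 1.0, 1.0]))
    ```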
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 34
    Publication Date: 2020-05-18
    Description: In recent years, a large body of literature has accumulated around the topic of research paper recommender systems. However, since most studies have focused on the variable of accuracy, they have overlooked the serendipity of recommendations, which is an important determinant of user satisfaction. Serendipity is concerned with the relevance and unexpectedness of recommendations, and so serendipitous items are considered those which positively surprise users. The purpose of this article was to examine two key research questions: firstly, whether a user’s Tweets can assist in generating more serendipitous recommendations; and secondly, whether the diversification of a list of recommended items further improves serendipity. To investigate these issues, an online experiment was conducted in the domain of computer science with 22 subjects. As an evaluation metric, we use the serendipity score (SRDP), in which the unexpectedness of recommendations is inferred by using a primitive recommendation strategy. The results indicate that a user’s Tweets do not improve serendipity, but they can reflect recent research interests and are typically heterogeneous. Contrastingly, diversification was found to lead to a greater number of serendipitous research paper recommendations.
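    A minimal sketch of how such a serendipity score can be computed, assuming (as the abstract describes) that unexpectedness is judged against a primitive baseline recommender; the paper's exact SRDP weighting may differ.

    ```python
    # Sketch: an item counts towards serendipity when it is relevant to the
    # user and absent from the primitive baseline recommender's list.
    def srdp(recommended, baseline, relevant):
        baseline, relevant = set(baseline), set(relevant)
        hits = [item for item in recommended
                if item in relevant and item not in baseline]
        return len(hits) / len(recommended) if recommended else 0.0

    # Usage: papers A..E recommended; the baseline would have suggested A, B.
    score = srdp(["A", "B", "C", "D", "E"],
                 baseline=["A", "B"], relevant=["C", "E"])
    # -> 0.4: two of five recommendations are relevant and unexpected.
    ```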
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 35
    Publication Date: 2020-05-18
    Description: Adversarial Optimization provides a reliable, practical way to match two implicitly defined distributions, one of which is typically represented by a sample of real data, and the other by a parameterized generator. Matching of the distributions is achieved by minimizing a divergence between these distributions, and estimation of the divergence involves a secondary optimization task, which typically requires training a model to discriminate between these distributions. The choice of model involves a trade-off: high-capacity models provide good estimations of the divergence but generally require large sample sizes to be properly trained, whereas low-capacity models tend to require fewer samples for training but might provide biased estimations. The computational costs of Adversarial Optimization become significant when sampling from the generator is expensive. One practical example of such a setting is fine-tuning the parameters of complex computer simulations. In this work, we introduce a novel family of divergences that enables faster optimization convergence, measured by the number of samples drawn from the generator. Varying the capacity of the underlying discriminator model during optimization leads to a significant speed-up. The proposed divergence family suggests using low-capacity models to compare distant distributions (typically at early optimization steps), with the capacity gradually growing as the distributions become closer to each other. This allows for a significant acceleration of the initial stages of optimization. The acceleration was demonstrated on two fine-tuning problems involving the Pythia event generator and two of the most popular black-box optimization algorithms: Bayesian Optimization and Variational Optimization. Experiments show that, given the same budget, adaptive divergences yield results up to an order of magnitude closer to the optimum than the Jensen-Shannon divergence. While we consider physics-related simulations, adaptive divergences can be applied to any stochastic simulation.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 36
    Publication Date: 2020-07-27
    Description: Malware development has diversified in terms of architecture and features. This advancement in the competencies of malware poses a severe threat and opens new research dimensions in malware detection. This study is focused on metamorphic malware, the most advanced member of the malware family. Anti-virus applications using traditional signature-based methods are practically unable to detect metamorphic malware, which makes it difficult to classify this type of malware accordingly. Recent research literature about malware detection and classification discusses this issue in relation to malware behavior. The main goal of this paper is to develop a classification method according to malware types by taking into consideration the behavior of malware. We started this research by developing a new dataset containing API calls made on the Windows operating system, which represent the behavior of malicious software. The malware types included in the dataset are Adware, Backdoor, Downloader, Dropper, Spyware, Trojan, Virus, and Worm. The classification method used in this study is LSTM (Long Short-Term Memory), a widely used classification method for sequential data. The results obtained by the classifier demonstrate accuracy up to 95% with an $F_1$-score of 0.83, which is quite satisfactory. We also ran our experiments with binary and multi-class malware datasets to show the classification performance of the LSTM model. Another significant contribution of this research paper is the development of a new dataset for Windows operating systems based on API calls. To the best of our knowledge, no such dataset was available before our research. The availability of our dataset on GitHub allows the malware detection research community to benefit from it and to make further contributions to this domain.
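    A hedged Keras sketch of the kind of model the abstract describes: API-call sequences, encoded as integer token ids, classified into the eight malware families with an LSTM. Vocabulary size, sequence length and layer widths are illustrative assumptions.

    ```python
    # Sketch: LSTM classifier over API-call token sequences.
    import tensorflow as tf

    NUM_API_CALLS = 300   # size of the API-call vocabulary (assumed)
    NUM_CLASSES = 8       # Adware ... Worm

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(NUM_API_CALLS, 64),   # token id -> vector
        tf.keras.layers.LSTM(128),                      # sequence summary
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    # model.fit(X_train, y_train, validation_split=0.2, epochs=10)
    ```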
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 37
    Publication Date: 2020-01-13
    Description: The task-based approach is a parallelization paradigm in which an algorithm is transformed into a direct acyclic graph of tasks: the vertices are computational elements extracted from the original algorithm and the edges are the dependencies between them. During execution, the management of the dependencies adds an overhead that can become significant when the computational cost of the tasks is low. One way to reduce the makespan is to aggregate the tasks to make them heavier while having fewer of them, with the objective of mitigating the overhead. In this paper, we study an existing clustering/partitioning strategy to speed up the parallel execution of a task-based application. We provide two additional heuristics for this algorithm and perform an in-depth study on a large graph set. In addition, we propose a new model to estimate the execution duration and use it to choose the proper granularity. We show that this strategy speeds up a real numerical application by a factor of 7 on a multi-core system.
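    A toy version of the granularity trade-off: with n tasks of mean cost t, per-task overhead o and p workers, aggregating g tasks per cluster gives roughly n/g clusters of cost g·t + o, executed in waves of p. The paper's duration model is richer (it accounts for the dependency structure); this sketch ignores dependencies entirely.

    ```python
    # Toy makespan model used to pick a task-aggregation granularity.
    import math

    def estimated_makespan(n, t, o, p, g):
        n_clusters = max(1, n // g)
        waves = math.ceil(n_clusters / p)      # rounds of p parallel clusters
        return waves * (g * t + o)             # each cluster costs g*t + o

    def best_granularity(n, t, o, p, max_g=64):
        return min(range(1, max_g + 1),
                   key=lambda g: estimated_makespan(n, t, o, p, g))

    # 10,000 tasks of 1 microsecond, 5 microseconds overhead, 8 workers:
    g = best_granularity(n=10_000, t=1e-6, o=5e-6, p=8)  # aggregation pays off
    ```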
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 38
    Publication Date: 2020-07-13
    Description: Visual attention is one of the many abilities of the human visual system (HVS). Despite the many advancements being made in visual saliency prediction, there continues to be room for improvement. Deep learning has recently been used to deal with this task. This study proposes a novel deep learning model based on a Fully Convolutional Network (FCN) architecture. The proposed model is trained end-to-end and designed to predict visual saliency. The entire model is trained from scratch to extract distinguishing features. The proposed model is evaluated using several benchmark datasets, such as MIT300, MIT1003, TORONTO, and DUT-OMRON. The quantitative and qualitative analyses demonstrate that the proposed model achieves superior performance in predicting visual saliency.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 39
    Publication Date: 2020-07-13
    Description: Interpolation techniques provide a method to convert point data of a geographic phenomenon into a continuous field estimate of that phenomenon, and have become a fundamental geocomputational technique for spatial and geographical analysts. Natural neighbour interpolation is one method of interpolation that has several useful properties: it is an exact interpolator, it creates a smooth surface free of any discontinuities, it is a local method, is spatially adaptive, requires no statistical assumptions, can be applied to small datasets, and is parameter free. However, as with any interpolation method, there will be uncertainty in how well the interpolated field values reflect actual phenomenon values. Using natural-neighbour, distance-based error rates calculated for data points via cross-validation, a cross-validation error-distance field can be produced to associate uncertainty with the interpolation. Virtual geography experiments demonstrate that, given an appropriate number of data points and spatial autocorrelation of the phenomenon being interpolated, the natural neighbour interpolation and cross-validation error-distance fields provide reliable estimates of value and error within the convex hull of the data points. While this method does not replace the need for analysts to use sound judgement in their interpolations, it provides researchers for whom natural neighbour interpolation is the best option with a way to assess the uncertainty associated with natural neighbour interpolations.
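    A sketch of the cross-validation error idea: leave each data point out, interpolate its value from the rest, and record the error. SciPy ships no natural-neighbour interpolator, so linear barycentric interpolation is used here purely as a stand-in for illustration.

    ```python
    # Sketch: leave-one-out errors that feed an error-distance field.
    import numpy as np
    from scipy.interpolate import griddata

    def loo_errors(points, values):
        errors = np.full(len(points), np.nan)
        for i in range(len(points)):
            mask = np.arange(len(points)) != i
            est = griddata(points[mask], values[mask], points[i][None, :],
                           method="linear")      # NaN outside the convex hull
            errors[i] = abs(est[0] - values[i])
        return errors
    ```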
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 40
    Publication Date: 2020-01-20
    Description: Background Owing to the rapid advances in DNA sequencing technologies, whole genomes of more and more species are becoming available at an increasing pace. For whole-genome analysis, idiograms provide a very popular, intuitive and effective way to map and visualize genome-wide information, such as GC content, gene and repeat density, DNA methylation distribution, genomic synteny, etc. However, most available software programs and web servers support only a few model species, such as human, mouse and fly, or have limited application scenarios. As more and more non-model species are sequenced with chromosome-level assemblies available, tools that can generate idiograms for a broad range of species and visualize more data types are needed to help researchers better understand fundamental genome characteristics. Results The R package RIdeogram allows users to build high-quality idiograms of any species of interest. It can map continuous and discrete genome-wide data on the idiograms and visualize them as a heat map and as track labels, respectively. Conclusion The visualization of genome-wide data mapping and comparison allows users to quickly establish a clear impression of the chromosomal distribution pattern, making RIdeogram a useful tool for any researcher working with omics data.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 41
    Publication Date: 2020-03-02
    Description: The building of large-scale Digital Elevation Models (DEMs) using various interpolation algorithms is one of the key issues in geographic information science. Different choices of interpolation algorithms may trigger significant differences in interpolation accuracy and computational efficiency, and a proper interpolation algorithm needs to be carefully chosen based on the specific characteristics of the interpolation scene. In this paper, we comparatively investigate the performance of parallel Radial Basis Function (RBF)-based, Moving Least Square (MLS)-based, and Shepard’s interpolation algorithms for building DEMs by evaluating the influence of terrain type, raw data density, and distribution patterns on the interpolation accuracy and computational efficiency. The conclusions drawn may help in selecting a suitable interpolation algorithm for a specific scene when building large-scale DEMs.
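    For orientation, a minimal comparison of two of the interpolator families discussed: an RBF interpolant (SciPy) and a hand-rolled Shepard inverse-distance weighting. The kernel, power and sample data are illustrative assumptions.

    ```python
    # Sketch: RBF vs. Shepard (IDW) interpolation of scattered elevations.
    import numpy as np
    from scipy.interpolate import RBFInterpolator

    def shepard(points, values, query, power=2.0, eps=1e-12):
        d = np.linalg.norm(points - query, axis=1)
        w = 1.0 / (d + eps) ** power          # inverse-distance weights
        return np.sum(w * values) / np.sum(w)

    rng = np.random.default_rng(0)
    pts = rng.random((200, 2)) * 1000.0       # raw (x, y) samples
    z = np.sin(pts[:, 0] / 200.0) * 50.0 + pts[:, 1] * 0.01  # toy elevations

    rbf = RBFInterpolator(pts, z, kernel="thin_plate_spline")
    z_rbf = rbf(np.array([[500.0, 500.0]]))[0]
    z_idw = shepard(pts, z, np.array([500.0, 500.0]))
    ```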
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 42
    Publication Date: 2020-01-06
    Description: In recent years, the security of personal data has become increasingly important. In this regard, identification systems based on the fusion of multiple biometrics are recommended for significantly improving accuracy and achieving high performance. The main purpose of this paper is to propose a hybrid system combining three efficient models: a Convolutional Neural Network (CNN), Softmax and a Random Forest (RF) classifier, based on a multi-biometric fingerprint, finger-vein and face identification system. In the fingerprint subsystem, the image is pre-processed to separate the foreground and background regions based on the K-means and DBSCAN algorithms. Furthermore, the features are extracted using CNNs with a dropout approach, after which Softmax performs as the recognizer. In the finger-vein subsystem, the region-of-interest image, contrast-enhanced using an exposure fusion framework, is input into the CNN model, and the RF classifier is proposed for classification. In the face subsystem, the CNN architecture and Softmax are used to generate face feature vectors and classify personal identity. The scores provided by these subsystems are combined to improve human identification. The proposed algorithm is evaluated on the publicly available SDUMLA-HMT real multimodal biometric database using a GPU-based implementation. Experimental results on the datasets have shown significant capability for biometric identification. The proposed work can offer accurate and efficient matching compared with other systems based on unimodal, bimodal or multimodal characteristics.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 43
    Publication Date: 2020-09-28
    Description: The R language is widely used for data analysis. However, it does not allow for complex object-oriented implementation and it tends to be slower than other languages such as Java, C and C++. Consequently, it can be more computationally efficient to run native Java code in R. To do this, there exist at least two approaches. One is based on the Java Native Interface (JNI) and it has been successfully implemented in the rJava package. An alternative approach consists of running a local server in Java and linking it to an R environment through a socket connection. This alternative approach has been implemented in an R package called J4R. This article shows how this approach makes it possible to simplify the calls to Java methods and to integrate the R vectorization. The downside is a loss of performance. However, if the vectorization is used in conjunction with multithreading, this loss of performance can be compensated for.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 44
    Publication Date: 2020-09-14
    Description: Background In the modern world, millions of people suffer from fake and poor-quality medical products entering the market. Violations of drug transportation rules make drugs ineffective and even dangerous. The relationships between the various parts of the drug supply chain, production and regulation are complex and problematic. Distributed ledger technology is a distributed database whose properties allow us to track the entire path of medical products from manufacturer to consumer, to improve the current supply chain model, to transform the pharmaceutical industry and to prevent falsified drugs from reaching the market. Objective The aim of the article is to analyze distributed ledger technology as an innovative means of preventing poor-quality pharmaceuticals from reaching the market and of detecting them early. Methods A content analysis of the websites of companies developing distributed ledger technology solutions was performed. Five examples found with the Google search engine using the keywords “distributed ledger technology”, “blockchain”, “pharmaceuticals” and “supply chain” were examined. Related scientific publications were also analyzed. With the help of generalization and systematization methods, the services provided by these companies were analyzed. A visual model of the supply chain was created with Microsoft Visio software. Results The analysis results contain a principal scheme of distributed ledger technology implementation to achieve the objectives. An analysis of the present-day pharmaceutical supply chain structure and the capacity of distributed ledger technology to improve pharmaceutical companies has been carried out and presented. Furthermore, the article introduces today’s projects released to the market as well as a prognosis for distributed ledger technology in future pharmaceutical industry enhancement.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 45
    Publication Date: 2020-09-14
    Description: This paper proposes a management method for slow-moving items based on intermittent demand per unit time and lead time demand in service enterprise inventory models. Our method uses the zero-inflated truncated normal statistical distribution, which makes it possible to model intermittent demand per unit time using a mixed statistical distribution. We conducted numerical experiments based on an algorithm used to forecast intermittent demand over a fixed lead time to show that our proposed distributions improved the performance of the continuous review inventory model with shortages. We evaluated multi-criteria elements (total cost, fill rate, shortage quantity per cycle, and the adequacy of the statistical distribution of the lead time demand) for decision analysis using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS). We confirmed that our method improved the performance of the inventory model in comparison to other commonly used approaches such as simple exponential smoothing and Croston’s method. We found an interesting association between the intermittency of demand per unit of time, the square root of this same parameter and reorder point decisions, which could be explained using a classical multiple linear regression model. We confirmed that the variability parameter of the zero-inflated truncated normal statistical distribution used to model intermittent demand was positively related to the decision on reorder points. Our study examined a decision analysis using an illustrative example. Our suggested approach is original, valuable and, in the case of slow-moving item management for service companies, allows for the verification of decision-making using multiple criteria.
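    A sketch of the zero-inflated truncated normal demand model: with some probability the per-period demand is zero, otherwise it is drawn from a normal distribution truncated at zero. Parameter values below are illustrative, not the paper's.

    ```python
    # Sketch: simulate intermittent per-period demand.
    import numpy as np
    from scipy.stats import truncnorm

    def simulate_demand(n_periods, p_zero, mu, sigma, seed=0):
        rng = np.random.default_rng(seed)
        a = (0.0 - mu) / sigma                  # lower bound 0, standardised
        demand = truncnorm.rvs(a, np.inf, loc=mu, scale=sigma,
                               size=n_periods, random_state=seed)
        demand[rng.random(n_periods) < p_zero] = 0.0   # zero-inflation
        return demand

    # Lead time demand over 30 periods with 70% zero-demand periods:
    lead_time_demand = simulate_demand(30, p_zero=0.7, mu=12.0, sigma=4.0).sum()
    ```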
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 46
    Publication Date: 2020-09-14
    Description: Mutation testing is a method widely used to evaluate the effectiveness of test suites in hardware and software testing or to design new software tests. In mutation testing, the original model is systematically mutated using certain error assumptions. Mutation testing is based on well-defined mutation operators that imitate typical programming errors or that lead to highly effective test suites. The success of test suites is determined by the rate at which they kill the mutants created through mutation operators. Because of the high number of mutants in mutation testing, the computational cost of testing finite state machines (FSMs) increases. Under the assumption that each mutant is of equal value, random selection can be a practical method of mutant reduction. However, in this study, it was assumed that mutants do not have equal value. Starting from this point of view, a new mutant reduction method is proposed using centrality criteria from social network analysis. It was assumed that the central regions selected within this frame are the regions through which test cases pass most often. To evaluate the proposed method, the widely used W method was chosen for its ability to detect all failures related to the model. The random and proposed mutant reduction methods were compared with respect to their success using test suites. As a result of the evaluations, it was found that mutants selected via the proposed reduction technique showed higher performance. Furthermore, it was observed that the proposed method reduced the cost of mutation testing.
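    A hedged sketch of the selection idea: rank FSM states by a centrality measure (closeness, via networkx) and keep only mutants that touch the most central states, on the assumption that test paths cross those regions most often. The toy FSM and the choice of closeness centrality are illustrative.

    ```python
    # Sketch: centrality-guided mutant reduction for an FSM.
    import networkx as nx

    def central_states(fsm_edges, top_k=3):
        g = nx.DiGraph(fsm_edges)
        centrality = nx.closeness_centrality(g)
        return sorted(centrality, key=centrality.get, reverse=True)[:top_k]

    # Toy FSM: transitions between states s0..s4.
    edges = [("s0", "s1"), ("s1", "s2"), ("s2", "s0"),
             ("s2", "s3"), ("s3", "s4")]
    keep = central_states(edges)
    # Mutants whose mutated transition leaves one of `keep` are retained.
    ```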
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 47
    Publication Date: 2020-03-02
    Description: Reconstructing a “forma mentis”, a mindset, and its changes means capturing how individuals perceive topics, trends and experiences over time. To this aim we use forma mentis networks (FMNs), which enable direct, microscopic access to how individuals conceptually perceive knowledge and sentiment around a topic, providing richer contextual information than machine learning. FMNs build cognitive representations of stances through psycholinguistic tools like conceptual associations from semantic memory (free associations, i.e., one concept eliciting another) and affect norms (valence, i.e., how attractive a concept is). We test FMNs by investigating how Norwegian nursing and engineering students perceived innovation and health before and after a 2-month research project in e-health. We built and analysed the FMNs of six individuals, based on 75 cues about innovation and health, leading to 1,000 associations between 730 concepts. We repeated this procedure before and after the project. When investigating changes over time, individual FMNs highlighted drastic improvements in all students’ stances towards “teamwork”, “collaboration”, “engineering” and “future”, indicating the acquisition and strengthening of a positive belief about innovation. Nursing students improved their perception of “robots” and “technology” and related them to the future of nursing. A group-level analysis related these changes to the emergence, during the project, of conceptual associations about openness towards multidisciplinary collaboration, and a positive, leadership-oriented group dynamic. The whole group identified “mathematics” and “coding” as highly relevant concepts after the project. When investigating persistent associations characterising the core of students’ mindsets, network distance entropy and closeness identified concepts related to “personal well-being”, “professional growth” and “teamwork” as pivotal in the students’ mindsets. This result aligns with and extends previous studies reporting the relevance of teamwork and personal well-being for Norwegian healthcare professionals, also within the novel e-health sector. Our analysis indicates that forma mentis networks are powerful proxies for detecting individual- and group-level mindset changes due to professional growth. FMNs open new scenarios for data-informed, multidisciplinary interventions aimed at professional training in innovation.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 48
    Publication Date: 2020-09-14
    Description: Mindset reconstruction maps how individuals structure and perceive knowledge, a map unfolded here by investigating language and its cognitive reflection in the human mind, i.e., the mental lexicon. Textual forma mentis networks (TFMNs) are glass boxes introduced for extracting and understanding mindsets’ structure (in Latin, forma mentis) from textual data. Combining network science, psycholinguistics and Big Data, TFMNs successfully identified relevant concepts in benchmark texts without supervision. Once validated, TFMNs were applied to the case study of distorted mindsets about the gender gap in science. Focusing on social media, this work analysed 10,000 tweets, mostly representing individuals’ opinions at the beginning of posts. “Gender” and “gap” elicited a mostly positive, trustful and joyous perception, with semantic associates that celebrated successful female scientists, related the gender gap to wage differences, and hoped for a future resolution. The perception of “woman” highlighted jargon of sexual harassment and stereotype threat (a form of implicit cognitive bias) about women in science “sacrificing personal skills for success”. The semantic frame of “man” highlighted awareness of the myth of male superiority in science. No anger was detected around “person”, suggesting that tweets became less tense around genderless terms. No stereotypical perception of “scientist” was identified online, unlike in real-world surveys. This analysis thus identified that Twitter discourse, mostly starting conversations, promoted a largely stereotype-free, positive and trustful perception of gender disparity, aimed at closing the gap. Hence, future monitoring against discriminating language should focus on other parts of conversations, such as users’ replies. TFMNs enable new ways of monitoring collective online mindsets, offering a data-informed ground for policy making.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 49
    Publication Date: 2020-04-06
    Description: The uncertainty underlying real-world phenomena has attracted attention toward statistical analysis approaches. In this regard, many problems can be modeled as networks, and the statistical analysis of networked problems has received special attention from many researchers in recent years. Exponential Random Graph Models, known as ERGMs, are one of the popular statistical methods for analyzing the graphs of networked data. An ERGM is a generative statistical network model whose ultimate goal is to present a subset of networks with particular characteristics as a statistical distribution. In the context of ERGMs, these graph characteristics are called statistics or configurations. Most of the time, they are counts of repeated subgraphs across the graphs; examples include the number of triangles or the number of cycles of arbitrary length. Any other census of the graph, such as the edge density, can also be considered one of the graph’s statistics. In this review paper, after explaining the building blocks and classic methods of ERGMs, we review their newly presented approaches and research papers. Further, we have conducted a comprehensive study on the applications of ERGMs in many research areas, which, to the best of our knowledge, has not been done before. This review paper can be used as an introduction for scientists from various disciplines whose aim is to use ERGMs on networked data in their field of expertise.
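    The general ERGM form referred to here can be written compactly: g(y) collects the graph statistics (triangle counts, edge density, and so on), θ their weights, and κ(θ) the normalising constant over all admissible graphs.

    ```latex
    P_{\theta}(Y = y) = \frac{\exp\!\big(\theta^{\top} g(y)\big)}{\kappa(\theta)},
    \qquad
    \kappa(\theta) = \sum_{y' \in \mathcal{Y}} \exp\!\big(\theta^{\top} g(y')\big)
    ```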
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 50
    Publication Date: 2020-07-06
    Description: In this article we forecast daily closing price series of Bitcoin, Litecoin and Ethereum cryptocurrencies, using data on prices and volumes of prior days. Cryptocurrency price behaviour is still largely unexplored, presenting new opportunities for researchers and economists to highlight similarities and differences with standard financial prices. We compared our results with various benchmarks: one recent work on Bitcoin price forecasting that follows different approaches, a well-known paper that uses the daily NASDAQ closing prices of Intel, National Bank and Microsoft shares spanning a 3-year interval, and another, more recent paper which gives quantitative results on stock market index predictions. We followed different approaches in parallel, implementing both statistical techniques and machine learning algorithms: the Simple Linear Regression (SLR) model for univariate series forecasting using only closing prices, and the Multiple Linear Regression (MLR) model for multivariate series using both price and volume data. We also used two artificial neural networks: the Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM). While the entire time series proved to be indistinguishable from a random walk, partitioning the datasets into shorter sequences representing different price “regimes” allows precise forecasts to be obtained, as evaluated in terms of Mean Absolute Percentage Error (MAPE) and relative Root Mean Square Error (relative RMSE). In this case the best results are obtained using more than one previous price, thus confirming the existence of time regimes different from random walks. Our models also perform well in terms of time complexity, and provide overall results better than those obtained in the benchmark studies, improving the state of the art.
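    The two error measures named above, written out for reference. MAPE is standard; "relative RMSE" is taken here as RMSE normalised by the mean observed price, which is an assumption about the exact normalisation used.

    ```python
    # The evaluation metrics: MAPE and a relative RMSE variant.
    import numpy as np

    def mape(y_true, y_pred):
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

    def relative_rmse(y_true, y_pred):
        y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
        return np.sqrt(np.mean((y_true - y_pred) ** 2)) / np.mean(y_true)
    ```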
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 51
    Publication Date: 2020-06-29
    Description: Comparison of hierarchies aims at identifying differences and similarities between two or more hierarchical structures. In the biological taxonomy domain, comparison is indispensable for the reconciliation of alternative versions of a taxonomic classification. Biological taxonomies are knowledge structures that may include large amounts of nodes (taxa), which are typically maintained manually. We present the results of a user study with taxonomy experts that evaluates four well-known methods for the comparison of two hierarchies, namely, edge drawing, matrix representation, animation and agglomeration. Each of these methods is evaluated with respect to seven typical biological taxonomy curation tasks. To this end, we designed an interactive software environment through which expert taxonomists performed exercises representative of the considered tasks. We evaluated participants’ effectiveness and level of satisfaction from both quantitative and qualitative perspectives. Overall quantitative results evidence that participants were less effective with agglomeration whereas they were more satisfied with edge drawing. Qualitative findings reveal a greater preference among participants for the edge drawing method. In addition, from the qualitative analysis, we obtained insights that contribute to explain the differences between the methods and provide directions for future research.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 52
    Publication Date: 2020-06-15
    Description: When, where and how people move is a fundamental part of how human societies organize around every-day needs as well as how people adapt to risks, such as economic scarcity or instability, and natural disasters. Our ability to characterize and predict the diversity of human mobility patterns has been greatly expanded by the availability of Call Detail Records (CDR) from mobile phone cellular networks. The size and richness of these datasets are at the same time a blessing and a curse: while there is great opportunity to extract useful information from them, it remains a challenge to do so in a meaningful way. In particular, human mobility is multiscale, meaning that a diversity of mobility patterns occur simultaneously, varying according to timing, magnitude and spatial extent. To identify and characterize the main spatio-temporal scales and patterns of human mobility, we examined CDR data from the Orange mobile network in Senegal using a new form of spectral graph wavelets, an approach from manifold learning. This unsupervised analysis reduces the dimensionality of the data to reveal seasonal changes in human mobility, as well as mobility patterns associated with large-scale but short-term religious events. The novel insight into human mobility patterns afforded by manifold learning methods like spectral graph wavelets has clear applications for urban planning and infrastructure design as well as hazard risk management, especially as climate change alters the biophysical landscape on which people work and live, leading to new patterns of human migration around the world.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 53
    Publication Date: 2020-04-13
    Description: Cancer classification is a topic of major interest in medicine since it allows accurate and efficient diagnosis and facilitates a successful outcome in medical treatments. Previous studies have classified human tumors using large-scale RNA profiling and supervised Machine Learning (ML) algorithms to construct a molecular-based classification of carcinoma cells from breast, bladder, adenocarcinoma, colorectal, gastro-esophageal, kidney, liver, lung, ovarian, pancreas, and prostate tumors. These datasets are collectively known as the 11_tumor database. Although this database has been used in several works in the ML field, no comparative studies of different algorithms can be found in the literature. On the other hand, advances in both hardware and software technologies have fostered considerable improvements in the precision of solutions that use ML, such as Deep Learning (DL). In this study, we compare the most widely used algorithms in classical ML and DL to classify the tumors described in the 11_tumor database. We obtained tumor identification accuracies between 90.6% (Logistic Regression) and 94.43% (Convolutional Neural Networks) using k-fold cross-validation. We also show how a tuning process may or may not significantly improve an algorithm's accuracy. Our results demonstrate an efficient and accurate classification method based on gene expression (microarray data) and ML/DL algorithms, which facilitates tumor type prediction in a multi-cancer-type scenario.
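    A minimal sketch of the evaluation protocol: compare a classical ML model with k-fold cross-validation on gene-expression features. Loading of the 11_tumor data is assumed; it is not bundled with scikit-learn.

    ```python
    # Sketch: k-fold cross-validated accuracy for one classical ML baseline.
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    def evaluate(X, y, k=5):
        clf = make_pipeline(StandardScaler(),
                            LogisticRegression(max_iter=5000))
        scores = cross_val_score(clf, X, y, cv=k, scoring="accuracy")
        return scores.mean(), scores.std()
    ```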
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 54
    Publication Date: 2020-03-30
    Description: This paper presents a new simplex-type algorithm for Linear Programming with the following two main characteristics: (i) the algorithm computes basic solutions which are neither primal nor dual feasible, nor monotonically improving, and (ii) the sequence of these basic solutions is connected with a sequence of monotonically improving interior points to construct a feasible direction at each iteration. We compare the proposed algorithm with the state-of-the-art commercial CPLEX and Gurobi primal-simplex optimizers on a collection of 93 well-known benchmarks. The results are promising, showing that the new algorithm is competitive with the state-of-the-art solvers in the total number of iterations required to converge.
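    For context, this is the kind of problem such solvers compete on, here in a toy instance solved with SciPy's HiGHS backend (CPLEX and Gurobi themselves require commercial licences); the data are illustrative.

    ```python
    # Toy LP: minimise c^T x subject to A_ub x <= b_ub, x >= 0.
    from scipy.optimize import linprog

    res = linprog(c=[-1.0, -2.0],                     # maximise x + 2y
                  A_ub=[[1.0, 1.0], [1.0, 3.0]],
                  b_ub=[4.0, 6.0],
                  bounds=[(0, None), (0, None)],
                  method="highs")
    print(res.x, res.fun)   # optimal vertex (3, 1) with objective -5
    ```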
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 55
    Publication Date: 2020-03-16
    Description: Estimating free energy differences by computer simulation is useful for a wide variety of applications such as virtual screening for drug design and for understanding how amino acid mutations modify protein interactions. However, calculating free energy differences remains challenging and often requires extensive trial and error and very long simulation times in order to achieve converged results. Here, we present an implementation of the adaptive integration method (AIM). We tested our implementation on two molecular systems and compared results from AIM to those from a suite of other methods. The model systems tested here include calculating the solvation free energy of methane, and the free energy of mutating the peptide GAG to GVG. We show that AIM is more efficient than other tested methods for these systems, that is, AIM results converge to a higher level of accuracy and precision for a given simulation time.
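    To anchor the idea, a simplified fixed-lambda baseline for a free energy difference: estimate the mean of dU/dλ at a grid of coupling values and integrate with the trapezoidal rule. AIM instead allocates samples adaptively across λ; the values below are toy numbers.

    ```python
    # Sketch: plain thermodynamic integration over a fixed lambda grid.
    import numpy as np

    def free_energy_difference(lambdas, mean_dudl):
        # lambdas: increasing grid in [0, 1]; mean_dudl: <dU/dlambda> estimates
        return np.trapz(mean_dudl, lambdas)

    dF = free_energy_difference(np.linspace(0.0, 1.0, 11),
                                mean_dudl=np.linspace(-40.0, -5.0, 11))
    ```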
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 56
    Publication Date: 2020-03-23
    Description: The vast volume of documents available in legal databases demands effective information retrieval approaches that take into consideration the intricacies of the legal domain. Relevant document retrieval is the backbone of the legal domain, and the concept of relevance in this domain is very complex and multi-faceted. In this work, we propose a novel approach to concept-based similarity estimation among court judgments. We use a graph-based method to identify prominent concepts present in a judgment and extract sentences representative of these concepts. The sentences and concepts so mined are used to express and visualize the likeness of concepts between a pair of documents from different perspectives. We also propose to aggregate the different levels of matching so obtained into one measure quantifying the level of similarity between a judgment pair. We employ the ordered weighted average (OWA) family of aggregation operators for obtaining the similarity value. The experimental results suggest that the proposed approach of concept-based similarity is effective in the extraction of relevant legal documents and performs better than other competing techniques. Additionally, the proposed two-level abstraction of similarity enables informative visualization for deeper insights into case relevance.
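    The OWA operator mentioned above is simple to state: the matching scores are sorted in descending order and combined with a fixed weight vector, so the weights attach to rank positions rather than to particular matching levels. The weights below are illustrative.

    ```python
    # Sketch: ordered weighted average (OWA) aggregation of matching scores.
    import numpy as np

    def owa(scores, weights):
        scores = np.sort(np.asarray(scores, dtype=float))[::-1]  # descending
        weights = np.asarray(weights, dtype=float)
        assert len(weights) == len(scores) and np.isclose(weights.sum(), 1.0)
        return float(np.dot(scores, weights))

    # Three matching levels between a judgment pair, optimistic weighting:
    similarity = owa([0.9, 0.4, 0.6], weights=[0.5, 0.3, 0.2])  # -> 0.71
    ```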
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 57
    Publication Date: 2020-03-02
    Description: Integrating data from multiple heterogeneous data sources entails dealing with data distributed among heterogeneous information sources, which can be structured, semi-structured or unstructured, and providing the user with a unified view of these data. Thus, in general, gathering information is challenging, and one of the main reasons is that data sources are designed to support specific applications; very often their structure is unknown to most users. Moreover, the stored data are often redundant, mixed with information only needed to support enterprise processes, and incomplete with respect to the business domain. Collecting, integrating, reconciling and efficiently extracting information from heterogeneous and autonomous data sources is regarded as a major challenge. In this paper, we present an approach for the semantic integration of heterogeneous data sources, DIF (Data Integration Framework), and a software prototype to support all aspects of a complex data integration process. The proposed approach is an ontology-based generalization of both the Global-as-View and Local-as-View approaches. In particular, to overcome problems due to semantic heterogeneity and to support interoperability with external systems, ontologies are used as a conceptual schema to represent both the data sources to be integrated and the global view.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 58
    Publication Date: 2020-05-25
    Description: The use of end-to-end data mining methodologies such as CRISP-DM, KDD process, and SEMMA has grown substantially over the past decade. However, little is known as to how these methodologies are used in practice. In particular, the question of whether data mining methodologies are used ‘as-is’ or adapted for specific purposes, has not been thoroughly investigated. This article addresses this gap via a systematic literature review focused on the context in which data mining methodologies are used and the adaptations they undergo. The literature review covers 207 peer-reviewed and ‘grey’ publications. We find that data mining methodologies are primarily applied ‘as-is’. At the same time, we also identify various adaptations of data mining methodologies and we note that their number is growing rapidly. The dominant adaptations pattern is related to methodology adjustments at a granular level (modifications) followed by extensions of existing methodologies with additional elements. Further, we identify two recurrent purposes for adaptation: (1) adaptations to handle Big Data technologies, tools and environments (technological adaptations); and (2) adaptations for context-awareness and for integrating data mining solutions into business processes and IT systems (organizational adaptations). The study suggests that standard data mining methodologies do not pay sufficient attention to deployment issues, which play a prominent role when turning data mining models into software products that are integrated into the IT architectures and business processes of organizations. We conclude that refinements of existing methodologies aimed at combining data, technological, and organizational aspects, could help to mitigate these gaps.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 59
    Publication Date: 2020-03-30
    Description: Anti-forgery information, transaction verification, and smart contracts are functionalities of blockchain technology that can change the traditional business processes of IT applications. These functionalities increase data transparency and users' trust in new application models, thus helping to resolve many different social problems today. In this work, we take advantage of this technology to build a blockchain-based authentication system (called the Vietnamese Educational Certification blockchain, VECefblock) to address fake certificate issues in Vietnam. In this direction, we first categorize and analyze blockchain research and application trends to position our contributions in this domain. Our motivating factor is to curb fake certificates in Vietnam by applying the suitability of blockchain technology to the problem domain. This study proposes some blockchain-based application development principles in order to build VECefblock step by step, with the following procedures: designing the overall architecture along with business processes and the data mapping structure, and implementing the decentralized application to meet specific Vietnamese requirements. To test system functionalities, we used Hyperledger Fabric as the blockchain platform, deployed on the Amazon EC2 cloud. Through performance evaluations, we demonstrated the operability of VECefblock in a practical deployment environment. This experiment also shows the feasibility of our proposal, thus promoting the application of blockchain technology to social problems in general as well as to certificate management in Vietnam.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 60
    Publication Date: 2020-11-09
    Description: Background Business process modelling is increasingly used not only by companies' management but also by scientists dealing with process models. Process modelling is seldom done without decision-making nodes, which is why operational research methods are increasingly included in process analyses. Objective This systematic literature review aimed to provide a detailed and comprehensive description of the relevant aspects of operational research techniques used with Business Process Model and Notation (BPMN) models. Methods The Web of Science database of Clarivate Analytics was searched for studies that used operational research techniques together with BPMN, published in English between 1 January 2004 and 18 May 2020; 128 studies were retrieved. The inclusion criteria were as follows: use of operational research methods in conjunction with BPMN, and availability in full-text format. Articles were not excluded based on methodological quality. The background information of the included studies, as well as specific information on the approaches used, was extracted. Results In this review, thirty-six studies were included and considered. A total of 11 specific methods falling within the field of operations research were identified, and their use in connection with process models was described. Conclusion Operational research methods are a useful complement to BPMN process analysis. They serve not only to analyze the probabilistic behaviour of a process and its economic and personnel demands, but also to support process reengineering.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 61
    Publication Date: 2020-11-09
    Description: Children activity recognition (CAR) is a subject on which numerous works have been developed in recent years, most of them focused on monitoring and safety. Commonly, these works use different types of sensors as the data source; such sensors can interfere with the natural behavior of children, since they are embedded in the children's clothes. This article proposes the use of environmental sound data for the creation of a children activity classification model, through the development of a deep artificial neural network (ANN). Initially, the ANN architecture is proposed, specifying its parameters and defining the values necessary for the creation of the classification model. The ANN is trained and tested in two ways: using a 70–30 approach (70% of the data for training and 30% for testing) and with a k-fold cross-validation approach. According to the results obtained in the two validation processes (70–30 splitting and k-fold cross-validation), the ANN with the proposed architecture achieves accuracies of 94.51% and 94.19%, respectively, which supports the conclusion that the developed model achieves significant accuracy in children activity classification by analyzing environmental sound.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 62
    Publication Date: 2020-11-09
    Description: Stochastic computing (SC) is an alternative to ubiquitous deterministic computing in which a single logic gate can perform an arithmetic operation by exploiting the mathematics of probability. SC was proposed in the 1960s, when binary computing was expensive. Recently, however, SC has regained interest following the spread of deep learning applications, specifically the convolutional neural network (CNN) algorithm, due to its practicality in hardware implementation. Although not all computing functions can be translated into the SC domain, several useful function blocks related to the CNN algorithm have been proposed and tested by researchers. An evolution of the CNN, namely the binarised neural network, has also gained attention in edge computing due to its compactness and computing efficiency. This study reviews various SC CNN hardware implementation methodologies. First, we review the fundamental concepts of SC and its circuit structure, and then compare the advantages and disadvantages of different SC methods. Finally, we conclude the overview of SC in CNNs and make suggestions for widespread implementation.
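    The canonical SC example behind this idea: encode two values in [0, 1] as random bitstreams and multiply them with a single AND gate; the product is the fraction of ones in the output stream. A short simulation of this in software:

    ```python
    # Sketch: stochastic-computing multiplication via a single AND gate.
    import numpy as np

    def to_bitstream(p, n_bits, rng):
        return rng.random(n_bits) < p       # Bernoulli stream with P(1) = p

    rng = np.random.default_rng(42)
    a = to_bitstream(0.5, 100_000, rng)
    b = to_bitstream(0.8, 100_000, rng)
    product = np.mean(a & b)                # AND gate: approx 0.5 * 0.8 = 0.4
    ```

    The longer the bitstream, the lower the variance of the estimate, which is exactly the accuracy/latency trade-off SC hardware designs negotiate.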
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 63
    Publication Date: 2020-11-09
    Description: We investigate the possibilities, challenges, and limitations that arise from the use of behavior trees in the context of the automatic modular design of collective behaviors in swarm robotics. To do so, we introduce Maple, an automatic design method that combines predefined modules—low-level behaviors and conditions—into a behavior tree that encodes the individual behavior of each robot of the swarm. We present three empirical studies based on two missions: aggregation and foraging. To explore the strengths and weaknesses of adopting behavior trees as a control architecture, we compare Maple with Chocolate, a previously proposed automatic design method that uses probabilistic finite state machines instead. In the first study, we assess Maple's ability to produce control software that crosses the reality gap satisfactorily. In the second study, we investigate Maple's performance as a function of the design budget, that is, the maximum number of simulation runs that the design process is allowed to perform. In the third study, we explore a number of possible variants of Maple that differ in the constraints imposed on the structure of the generated behavior trees. The results of the three studies indicate that, in the context of swarm robotics, behavior trees might be appealing but, in many settings, do not produce better solutions than finite state machines.
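    For readers unfamiliar with the control architecture, here is a minimal behavior-tree skeleton in the spirit of what the abstract describes: a selector tries children until one succeeds, a sequence runs children until one fails, and leaves are the predefined modules (conditions and low-level behaviors). The module names are illustrative, not Maple's actual modules.

    ```python
    # Sketch: minimal behavior-tree node types.
    SUCCESS, FAILURE = "success", "failure"

    class Selector:                      # tries children until one succeeds
        def __init__(self, *children): self.children = children
        def tick(self, robot):
            for child in self.children:
                if child.tick(robot) == SUCCESS:
                    return SUCCESS
            return FAILURE

    class Sequence:                      # runs children until one fails
        def __init__(self, *children): self.children = children
        def tick(self, robot):
            for child in self.children:
                if child.tick(robot) == FAILURE:
                    return FAILURE
            return SUCCESS

    class Leaf:                          # wraps a condition or behavior module
        def __init__(self, fn): self.fn = fn
        def tick(self, robot): return self.fn(robot)

    # e.g. Selector(Sequence(Leaf(on_black_spot), Leaf(stop)), Leaf(explore))
    ```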
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 64
    Publication Date: 2020-11-09
    Description: We propose a new visualization method for massive supercomputer simulations. The key idea is to scatter multiple omnidirectional cameras to record the simulation via in situ visualization. After the simulations are complete, researchers can interactively explore the data collection of the recorded videos by navigating along a path in four-dimensional spacetime. We demonstrate the feasibility of this method by applying it to three different fluid and magnetohydrodynamics simulations using up to 1,000 omnidirectional cameras.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 65
    Publication Date: 2020-11-23
    Description: Integration of heterogeneous data sources into a single representation is an active field with many different tools and techniques. In the case of text-based approaches—those that base the definition of the mappings and the integration on a DSL—there is a lack of usability studies. In this work we conducted a usability experiment (n = 17) on three different languages: ShExML (our own language), YARRRML and SPARQL-Generate. Results show that ShExML users tend to perform better than those of YARRRML and SPARQL-Generate. This study sheds light on usability aspects of these languages' design and highlights some areas for improvement.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 66
    Publication Date: 2020-11-23
    Description: As an effective method to alleviate traffic congestion, traffic signal coordination control has been applied in many cities to manage queues and to regulate traffic flow under oversaturated traffic conditions. However, previous methods are usually based on two hypotheses: one is that traffic demand is constant; the other assumes that the velocity of a vehicle is unchanged when entering the downstream section. In this paper, we develop a novel traffic coordination control method to control the traffic flow along oversaturated two-way arterials without either of these hypotheses. The method includes two modules: intersection coordination control and arterial coordination control. The green time plan for all intersections can be obtained by the intersection coordination control module. The arterial coordination control module can optimize the offset plan for all intersections along oversaturated two-way arterials. The experimental results verify that the proposed method can effectively control queue length under the oversaturated traffic state. In addition, the delay under this method can be decreased by 5.4% compared with the existing delay minimization method and by 13.6% compared with the traffic coordination control method without offset optimization. Finally, the proposed method can balance the delay levels of different links along an oversaturated arterial, which directly reflects the efficiency of the proposed method for traffic coordination control under oversaturated traffic conditions.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 67
    Publication Date: 2020-11-23
    Description: Background and Objective The COVID-19 pandemic has caused severe mortality across the globe, with the USA as the current epicenter of the COVID-19 epidemic even though the initial outbreak was in Wuhan, China. Many studies have successfully applied machine learning to fight the COVID-19 pandemic from different perspectives. To the best of the authors’ knowledge, no comprehensive survey with bibliometric analysis has yet been conducted on the adoption of machine learning to fight COVID-19. Therefore, the main goal of this study is to bridge this gap by carrying out an in-depth survey with bibliometric analysis on the adoption of machine learning-based technologies to fight the COVID-19 pandemic, including an extensive systematic literature review and bibliometric analysis. Methods We applied a literature survey methodology to retrieve data from academic databases and subsequently employed a bibliometric technique to analyze the accessed records. Besides, a concise summary, sources of COVID-19 datasets, a taxonomy, synthesis and analysis are presented in this study. It was found that the Convolutional Neural Network (CNN) is mainly utilized in developing COVID-19 diagnosis and prognosis tools, mostly from chest X-ray and chest CT scan images. Similarly, we performed a bibliometric analysis of machine learning-based COVID-19 related publications in the Scopus and Web of Science citation indexes. Finally, we propose new perspectives for solving the identified challenges as directions for future research. We believe the survey with bibliometric analysis can help researchers easily detect areas that require further development and identify potential collaborators. Results The findings of the analysis presented in this article reveal that machine learning-based COVID-19 diagnostic tools received the most considerable attention from researchers. Specifically, the analyses show that energy and resources are dispensed more towards automated COVID-19 diagnostic tools, while COVID-19 drug and vaccine development remains grossly underexplored. Besides, the machine learning algorithm predominantly utilized by researchers in developing diagnostic tools is CNN, mainly applied to X-ray and CT scan images. Conclusions The challenges hindering practical work on the application of machine learning-based technologies to fight COVID-19, and new perspectives to solve the identified problems, are presented in this article. Furthermore, we believe that the presented survey with bibliometric analysis could make it easier for researchers to identify areas that need further development and possibly identify potential collaborators at the author, country and institutional levels, with the overall aim of furthering research in the focused area of machine learning application to disease control.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 68
    Publication Date: 2020-11-23
    Description: Autonomous driving highly depends on depth information for safe driving. Recently, major improvements have been made in both supervised and self-supervised methods for depth reconstruction. However, most current approaches focus on single-frame depth estimation, where the quality limit is hard to beat due to the general limitations of supervised learning of deep neural networks. One way to improve the quality of existing methods is to utilize temporal information from frame sequences. In this paper, we study intelligent ways of integrating recurrent blocks into a common supervised depth estimation pipeline. We propose a novel method which takes advantage of the convolutional gated recurrent unit (convGRU) and convolutional long short-term memory (convLSTM). We compare the use of convGRU and convLSTM blocks and determine the best model for the real-time depth estimation task. We carefully study the training strategy and provide new deep neural network architectures for the task of depth estimation from monocular video using information from past frames based on an attention mechanism. We demonstrate the efficiency of exploiting temporal information by comparing our best recurrent method with existing image-based and video-based solutions for monocular depth reconstruction.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
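    The record above integrates recurrent blocks into a per-frame depth pipeline. Below is a minimal convolutional GRU cell sketch in PyTorch, assuming the standard convGRU gating equations; the channel counts and tensor shapes are illustrative, not the paper's architecture.
    ```python
    import torch
    import torch.nn as nn

    class ConvGRUCell(nn.Module):
        def __init__(self, in_ch, hid_ch, k=3):
            super().__init__()
            p = k // 2
            self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)
            self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)
            self.hid_ch = hid_ch

        def forward(self, x, h=None):
            if h is None:
                h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
            zr = torch.sigmoid(self.gates(torch.cat([x, h], dim=1)))
            z, r = zr.chunk(2, dim=1)                  # update / reset gates
            h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
            return (1 - z) * h + z * h_tilde           # new hidden state

    # Feed per-frame encoder features through the cell across time.
    cell = ConvGRUCell(in_ch=64, hid_ch=64)
    frames = torch.randn(8, 4, 64, 32, 32)             # (batch, time, C, H, W)
    h = None
    for t in range(frames.size(1)):
        h = cell(frames[:, t], h)                      # h carries temporal context
    print(h.shape)                                     # torch.Size([8, 64, 32, 32])
    ```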
  • 69
    Publication Date: 2020-11-30
    Description: Logistics is the aspect of the supply chain which is responsible for the efficient flow and delivery of goods or services from suppliers to customers. Because a logistic system involves specialized operations such as inventory control, facility location and distribution planning, the logistic professional requires mathematical, technological and managerial skills and tools to design, adapt and improve these operations. The main research is focused on modeling and solving logistic problems through specialized tools such as integer programming and meta-heuristic methods. In practice, the use of these tools for large and complex problems requires mathematical and computational proficiency. In this context, the present work contributes a coded suite of models to explore problems relevant to the logistic professional, undergraduate/postgraduate student and/or academic researcher. The functions of the coded suite address the following: (1) generation of test instances for routing and facility location problems with real geographical coordinates; (2) computation of Euclidean, Manhattan and geographical arc length distance metrics for routing and facility location problems; (3) simulation of non-deterministic inventory control models; (4) importing/exporting and plotting of input data and solutions for analysis and visualization by third-party platforms; and (5) design of a nearest-neighbor meta-heuristic to provide very suitable solutions for large vehicle routing and facility location problems. This work is completed by a discussion of a case study which integrates the functions of the coded suite.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
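    For the distance metrics in item (2) of the record above, here is a minimal sketch of the three computations, assuming planar coordinates for Euclidean/Manhattan and (latitude, longitude) in degrees with a spherical-Earth (haversine) approximation for the geographical arc length; function names are illustrative.
    ```python
    import math

    def euclidean(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def manhattan(p, q):
        return abs(p[0] - q[0]) + abs(p[1] - q[1])

    def geo_arc_length(p, q, r_earth_km=6371.0):
        """Great-circle (haversine) distance between (lat, lon) points in degrees."""
        lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * r_earth_km * math.asin(math.sqrt(a))

    print(euclidean((0, 0), (3, 4)))    # 5.0
    print(manhattan((0, 0), (3, 4)))    # 7
    # Roughly the Madrid-Barcelona arc length, ~500 km:
    print(round(geo_arc_length((40.4, -3.7), (41.4, 2.2)), 1))
    ```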
  • 70
    Publication Date: 2020-11-30
    Description: A “property” in the Mizar proof-assistant is a construction that can be used to register chosen features of predicates (e.g., “reflexivity”, “symmetry”), operations (e.g., “involutiveness”, “commutativity”) and types (e.g., “sethoodness”) declared at the definition stage. The current implementation of Mizar allows using properties for notions with a specific number of visible arguments (e.g., reflexivity for a predicate with two visible arguments and involutiveness for an operation with just one visible argument). In this paper we investigate a more general approach to overcome these limitations. We propose an extension of the Mizar language and a corresponding enhancement of the Mizar proof-checker which allow declaring properties of notions of arbitrary arity with respect to explicitly indicated arguments. Moreover, we introduce a new property—the “fixedpoint-free” property of unary operations—meaning that the result of applying the operation to its argument always differs from the argument. Results of tests conducted on the Mizar Mathematical Library are presented.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 71
    Publication Date: 2020-11-30
    Description: Background Heart arrhythmia, as one of the most important cardiovascular diseases (CVDs), has gained wide attention in the past two decades. This article proposes a hybrid method for heartbeat classification via convolutional neural networks, multilayer perceptrons and focal loss. Methods In the method, a convolutional neural network is used to extract morphological features. The reason behind this is that the morphological characteristics of patients have inter-patient variations, which makes them difficult to describe accurately using traditional hand-crafted features. The extracted morphological features are then combined with the RR interval features, which contain the dynamic information of the heartbeat, and input into a multilayer perceptron for heartbeat classification. Furthermore, considering that the heartbeat classes are imbalanced, which would lead to poor performance on minority classes, a focal loss is introduced to resolve the problem. Results Tested using the MIT-BIH arrhythmia database, our method achieves an overall positive predictive value of 64.68%, sensitivity of 68.55%, f1-score of 66.09%, and accuracy of 96.27%. Compared with existing works, our method significantly improves the performance of heartbeat classification. Conclusions Our method is simple yet effective, and could potentially be used for personal automatic heartbeat classification in remote medical monitoring. The source code is provided on https://github.com/JackAndCole/Deep-Neural-Network-For-Heartbeat-Classification.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
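    A minimal multi-class focal-loss sketch in PyTorch, showing the general technique named in the record above; the gamma and alpha values are illustrative defaults, not the paper's tuned settings. The (1 - pt)^gamma factor down-weights well-classified beats so training focuses on rare, hard classes.
    ```python
    import torch
    import torch.nn.functional as F

    def focal_loss(logits, targets, gamma=2.0, alpha=1.0):
        log_p = F.log_softmax(logits, dim=1)
        log_pt = log_p.gather(1, targets.unsqueeze(1)).squeeze(1)  # log-prob of true class
        pt = log_pt.exp()
        return (-alpha * (1 - pt) ** gamma * log_pt).mean()        # down-weight easy cases

    logits = torch.randn(16, 5)               # e.g., 5 heartbeat classes
    targets = torch.randint(0, 5, (16,))
    print(focal_loss(logits, targets))
    ```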
  • 72
    Publication Date: 2020-11-30
    Description: Background Application of Artificial Intelligence (AI) and the use of agent-based systems in the healthcare system have attracted various researchers seeking to improve the efficiency and utility of Electronic Health Records (EHR). Nowadays, one of the most important and creative developments is the integration of AI and Blockchain, that is, Distributed Ledger Technology (DLT), to enable better and decentralized governance. Privacy and security are critical pieces in EHR implementation and/or adoption. Health records are updated every time a patient visits a doctor, as they contain important information about the health and wellbeing of the patient and describe the history of care received in the past and to date. Therefore, such records are critical to research, hospitals, emergency rooms, healthcare laboratories, and even health insurance providers. Methods In this article, a platform employing AI and multi-agent based systems along with DLT technology for privacy preservation is proposed. The emphasis on security and privacy is highlighted during the process of collecting, managing and distributing EHR data. Results This article aims to ensure that the privacy, integrity and security metrics of electronic health records are met when copies are not only immutable but also distributed. The findings of this work will help guide the development of further techniques using the combination of AI and multi-agent based systems backed by DLT technology for secure and effective handling of EHR data. The proposed architecture uses various AI-based intelligent agents and blockchain for providing privacy and security in EHR. A future enhancement of this work could be the addition of biometric-based systems for improved security.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 73
    Publication Date: 2020-11-30
    Description: A popular unsupervised learning method known as clustering is extensively used in data mining, machine learning and pattern recognition. The procedure involves grouping distinct points in such a way that points within a group are similar to each other and dissimilar to points of other clusters. Traditional clustering methods are greatly challenged by the recent massive growth of data. Therefore, several research works have proposed novel designs for clustering methods that leverage the benefits of Big Data platforms such as Apache Spark, which is designed for fast, distributed processing of massive data. However, Spark-based clustering research is still in its early days. In this systematic survey, we investigate the existing Spark-based clustering methods in terms of their support for the characteristics of Big Data. Moreover, we propose a new taxonomy for Spark-based clustering methods. To the best of our knowledge, no survey has previously been conducted on Spark-based clustering of Big Data. Therefore, this survey aims to present a comprehensive summary of studies in the field of Big Data clustering using Apache Spark during the span of 2010–2020. It also highlights new research directions in the field of clustering massive data.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
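    As a concrete taste of the kind of method the survey above covers, here is a minimal PySpark clustering sketch using the built-in distributed k-means; the toy data stand in for a large, partitioned dataset.
    ```python
    from pyspark.sql import SparkSession
    from pyspark.ml.clustering import KMeans
    from pyspark.ml.linalg import Vectors

    spark = SparkSession.builder.appName("spark-clustering-sketch").getOrCreate()
    data = [(Vectors.dense([0.0, 0.0]),), (Vectors.dense([1.0, 1.0]),),
            (Vectors.dense([9.0, 8.0]),), (Vectors.dense([8.0, 9.0]),)]
    df = spark.createDataFrame(data, ["features"])

    kmeans = KMeans(k=2, seed=1)          # the fit runs over partitioned data
    model = kmeans.fit(df)
    print(model.clusterCenters())
    spark.stop()
    ```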
  • 74
    Publication Date: 2020-11-30
    Description: Technological advances have led to the creation of large epigenetic datasets, including information about DNA binding proteins and DNA spatial structure. Hi-C experiments have revealed that chromosomes are subdivided into sets of self-interacting domains called Topologically Associating Domains (TADs). TADs are involved in the regulation of gene expression activity, but the mechanisms of their formation are not yet fully understood. Here, we focus on machine learning methods to characterize DNA folding patterns in Drosophila based on chromatin marks across three cell lines. We present linear regression models with four types of regularization, gradient boosting, and recurrent neural networks (RNN) as tools to study chromatin folding characteristics associated with TADs given epigenetic chromatin immunoprecipitation data. The bidirectional long short-term memory RNN architecture produced the best prediction scores and identified biologically relevant features. The distribution of the protein Chriz (Chromator) and the histone modification H3K4me3 were selected as the most informative features for the prediction of TAD characteristics. This approach may be adapted to any similar biological dataset of chromatin features across various cell lines and species. The code for the implemented pipeline, Hi-ChIP-ML, is publicly available: https://github.com/MichalRozenwald/Hi-ChIP-ML
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
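    A minimal sketch of the general idea behind the best-performing model in the record above: a bidirectional LSTM mapping per-bin chromatin-mark signals to a TAD-related score. The number of marks, layer sizes, and regression head are illustrative assumptions, not the published architecture.
    ```python
    import torch
    import torch.nn as nn

    class BiLSTMRegressor(nn.Module):
        def __init__(self, n_marks=18, hidden=64):
            super().__init__()
            self.rnn = nn.LSTM(n_marks, hidden, batch_first=True, bidirectional=True)
            self.head = nn.Linear(2 * hidden, 1)   # one score per genomic bin

        def forward(self, x):                      # x: (batch, bins, marks)
            out, _ = self.rnn(x)
            return self.head(out).squeeze(-1)      # (batch, bins)

    model = BiLSTMRegressor()
    signal = torch.randn(4, 100, 18)               # 4 regions x 100 bins x 18 marks
    print(model(signal).shape)                     # torch.Size([4, 100])
    ```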
  • 75
    Publication Date: 2020-11-16
    Description: Massive Open Online Courses are a dominant force in remote learning yet suffer from persistent problems stemming from lack of commitment and low completion rates. In this initial study we investigate how the use of immersive virtual environments for PowerPoint-based informational learning may benefit learners and successfully mimic traditional lectures. We examine the role of embodied agent tutors, which are frequently implemented within virtual learning environments. We find similar performance on a bespoke knowledge test and on metrics for motivation, satisfaction, and engagement by learners in both real and virtual environments, regardless of embodied agent tutor presence. Our results raise questions regarding the viability of using virtual environments for remote-learning paradigms, and we emphasise the need for further investigation to inform the design of effective remote-learning applications.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 76
    Publication Date: 2020-11-16
    Description: Background Deep learning using convolutional neural networks (CNN) has achieved significant results in various fields that use images. Deep learning can automatically extract features from data, and CNN extracts image features by convolution processing. We assumed that increasing the image size using interpolation methods would result in effective feature extraction. To investigate how the effect of interpolation methods changes as the amount of data increases, we examined and compared the effectiveness of data augmentation by inversion or rotation with image augmentation by interpolation when the image data for training were limited. Further, we clarified whether image augmentation by interpolation was useful for CNN training. To examine the usefulness of interpolation methods for medical images, we used the Gender01 dataset, a sex classification dataset of chest radiographs. To compare image enlargement using an interpolation method with data augmentation by inversion and rotation, we examined the results of two- and four-fold enlargement using the Bilinear method. Results The average classification accuracy improved when the image size was expanded using the interpolation method. The biggest improvement was noted when the number of training images was 100: the average classification accuracy of the model trained with the original data was 0.563, but upon increasing the image size four-fold using the interpolation method, it significantly improved to 0.715. Compared with data augmentation by inversion and rotation, the model trained using the Bilinear method showed an improvement in average classification accuracy of 0.095 with 100 training images and 0.015 with 50,000 training images. Comparisons of the average classification accuracy on the chest X-ray images showed a stable and high average classification accuracy using the interpolation method. Conclusion Training a CNN on images enlarged using the interpolation method is a useful approach. In the future, we aim to conduct additional verification using various medical images to further clarify why image size is important.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
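    A minimal sketch of the enlargement step compared in the record above, using Pillow's bilinear interpolation for two- and four-fold upscaling; the file names are hypothetical.
    ```python
    from PIL import Image

    img = Image.open("chest_xray.png")                  # hypothetical input file
    w, h = img.size
    img2x = img.resize((2 * w, 2 * h), Image.BILINEAR)  # two-fold enlargement
    img4x = img.resize((4 * w, 4 * h), Image.BILINEAR)  # four-fold enlargement
    img4x.save("chest_xray_4x.png")
    ```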
  • 77
    Publication Date: 2021-03-31
    Description: The agricultural sector still lags behind all other sectors in adopting the newest technologies. For production, the latest machines are being introduced and adopted. However, pre-harvest and post-harvest processing are still done following traditional methodologies for tracing, storing, and publishing agricultural data. As a result, farmers do not receive fair payment, consumers do not get enough information before buying a product, and intermediaries and processors increase retail prices. Using blockchain, smart contracts, and IoT devices, we can fully automate the process while establishing absolute trust among all these parties. In this research, we explored different aspects of using blockchain and smart contracts with the integration of IoT devices in the pre-harvesting and post-harvesting segments of agriculture. We propose a system that uses blockchain as the backbone, while IoT devices collect data at the field level and smart contracts regulate the interaction among all contributing parties. The system implementation is shown in diagrams with accompanying explanations, and the gas cost of every operation is reported for a better understanding of the costs. We also analyzed the system in terms of challenges and advantages. The overall impact of this research is to show the immutable, available, transparent, and robustly secure characteristics of blockchain in the field of agriculture, while also emphasizing the vigorous mechanism that the collaboration of blockchain, smart contracts, and IoT presents.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 78
    Publication Date: 2021-03-23
    Description: Background The Internet of Medical Things (IoMT) is gradually replacing the traditional healthcare system. However, little attention has been paid to security requirements in the development of IoMT devices and systems. One of the main reasons is the difficulty of tuning conventional security solutions to IoMT systems. Machine Learning (ML) has been successfully employed in attack detection and mitigation, and advanced ML techniques can also be a promising approach to address existing and anticipated IoMT security and privacy issues. However, because of the existing challenges of IoMT systems, it is imperative to know how these techniques can be effectively utilized to meet security and privacy requirements without affecting IoMT system quality, services, and device lifespan. Methodology This article performs a Systematic Literature Review (SLR) on the security and privacy issues of IoMT and their solutions using ML techniques. Research papers disseminated between 2010 and 2020 were selected from multiple databases and a standardized SLR method was conducted. A total of 153 papers were reviewed and a critical analysis was conducted on the selected papers. Furthermore, this review attempts to highlight the limitations of current methods and aims to find possible solutions to them. Thus, a detailed analysis was carried out on the selected papers, focusing on their methods, advantages, limitations, utilized tools, and data. Results It was observed that ML techniques have been significantly deployed for device and network layer security. Most of the current studies improved traditional metrics while ignoring performance complexity metrics in their evaluations. The study environments and data used barely represent real IoMT systems. Therefore, conventional ML techniques may fail if metrics such as resource complexity and power usage are not considered.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 79
    Publication Date: 2021-03-23
    Description: The robot controller plays an important role in controlling the robot. The controller mainly aims to eliminate or suppress the influence of uncertain factors on the controlled robot. There are many types of controllers, and different types have different features. To explore the differences between controllers of the same category, this article studies a selection of basic and advanced controllers. The selected controllers are benchmarked through pre-set tests. The test task is pick-and-place, the most common manipulation task. Furthermore, to complete the robustness test, a task with external force interference is also set, to observe whether the controller can return the robot arm to a normal state. Subsequently, the accuracy, control efficiency, jitter and robustness of the robot arm under each controller are analyzed by comparing the Position and Effort data. Finally, future benchmarking work and reasonable improvement methods are discussed.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 80
    Publication Date: 2021-03-22
    Description: The security of patient information is important during the transfer of medical data. A hybrid spatial-domain watermarking algorithm that includes encryption, integrity protection, and steganography is proposed to strengthen information authenticity based on authentication. The proposed algorithm checks whether the patient’s information has been deliberately changed or not. The created code is distributed at every pixel of the medical image, not only in the non-interest regions, while the image details are still preserved. To enhance the security of the watermarking code, SHA-1 is used to derive the initial key for the symmetric encryption algorithm. The goal of this approach is to preserve the content of the image and the watermark simultaneously; this is achieved by synthesizing an encrypted watermark from one of the components of the original image rather than embedding a foreign watermark in the image. To evaluate the proposed code, the Least Significant Bit (LSB), Bit2SB, and Bit3SB embedding schemes were used. The evaluation showed that LSB gives better image quality, but overall Bit2SB is better in its resistance against active attacks up to a size of 2*2 pixels while still preserving high image quality.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
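    To illustrate the kind of spatial-domain embedding evaluated in the record above, here is a minimal LSB sketch with a SHA-1-derived key prefix. It demonstrates the general technique only and is not the paper's exact algorithm (which synthesizes the watermark from an image component rather than embedding a foreign payload); the payload contents are hypothetical.
    ```python
    import hashlib
    import numpy as np

    def embed_lsb(image, message: bytes):
        """Write the message bits into the least significant bit plane."""
        bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
        flat = image.flatten()
        flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite bit 0
        return flat.reshape(image.shape)

    patient_info = b"patient-id:12345"                 # hypothetical payload
    key = hashlib.sha1(patient_info).digest()          # SHA-1 derived key prefix
    img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)
    marked = embed_lsb(img.copy(), key + patient_info)

    # Recover: read the LSB plane back out of the marked image.
    n_bits = len(key + patient_info) * 8
    recovered = np.packbits(marked.flatten()[:n_bits] & 1)
    print(recovered.tobytes() == key + patient_info)   # True
    ```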
  • 81
    Publication Date: 2021-03-23
    Description: Human posture detection allows the capture of the kinematic parameters of the human body, which is important for many applications, such as assisted living, healthcare, physical exercising and rehabilitation. This task can greatly benefit from recent developments in deep learning and computer vision. In this paper, we propose a novel deep recurrent hierarchical network (DRHN) model based on MobileNetV2 that allows for greater flexibility by reducing or eliminating posture detection problems related to limited visibility of the human torso in the frame, i.e., the occlusion problem. The DRHN network accepts RGB-Depth frame sequences and produces a representation of semantically related posture states. We achieved 91.47% accuracy at 10 fps for sitting posture recognition.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 82
    Publication Date: 2021-02-01
    Description: It is important in software development to enforce proper restrictions on protected services and resources. Typically, software services can be accessed through REST API endpoints where restrictions can be applied using the Role-Based Access Control (RBAC) model. However, RBAC policies can be inconsistent across services, and they require proper assessment. Currently, developers use penetration testing, which is a costly and cumbersome process for a large number of APIs. In addition, modern applications are split into individual microservices and lack a unified view from which to carry out automated RBAC assessment. Often, the process of constructing a centralized perspective of an application is done using Systematic Architecture Reconstruction (SAR). This article presents a novel approach to automated SAR that constructs a centralized perspective for a microservice mesh based on its REST communication pattern. We utilize the generated views from SAR to propose an automated way to find RBAC inconsistencies.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
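    A minimal sketch of the final assessment step described in the record above: once SAR yields a centralized endpoint-to-roles view per microservice, inconsistencies are endpoints on which services disagree. The services, endpoints, and roles below are hypothetical.
    ```python
    from collections import defaultdict

    # Centralized SAR view: service -> {(method, path): allowed roles}
    views = {
        "orders":  {("GET", "/invoices/{id}"): {"admin", "billing"}},
        "billing": {("GET", "/invoices/{id}"): {"admin"}},
    }

    policies = defaultdict(list)
    for service, endpoints in views.items():
        for endpoint, roles in endpoints.items():
            policies[endpoint].append((service, frozenset(roles)))

    for endpoint, entries in policies.items():
        if len({roles for _, roles in entries}) > 1:   # services disagree
            print("RBAC inconsistency at", endpoint, "->", entries)
    ```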
  • 83
    Publication Date: 2021-03-25
    Description: A microarray is a revolutionary tool that generates vast volumes of data describing the expression profiles of the genes under investigation, data that qualify as Big Data. Hadoop and Spark are efficient frameworks developed to store and analyze Big Data. Analyzing microarray data helps researchers to identify correlated genes. Clustering has been successfully applied to analyze microarray data by grouping genes with similar expression profiles into clusters. The complex nature of microarray data obliges clustering methods to employ multiple evaluation functions to ensure obtaining solutions with high quality, which transforms the clustering problem into a Multi-Objective Problem (MOP). A new and efficient hybrid Multi-Objective Whale Optimization Algorithm with Tabu Search (MOWOATS) was previously proposed to solve MOPs. In this article, MOWOATS is employed to analyze massive microarray datasets. Three evaluation functions have been developed to ensure an effective assessment of solutions. MOWOATS has been adapted to run in parallel using Spark over Hadoop computing clusters. The quality of the generated solutions was evaluated based on different indices, such as the Silhouette and Davies–Bouldin indices. The obtained clusters were very similar to the original classes. Regarding scalability, the running time was inversely proportional to the number of computing nodes.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 84
    Publication Date: 2021-03-25
    Description: The interdisciplinary field of data science, which applies techniques from computer science and statistics to address questions across domains, has enjoyed recent considerable growth and interest. This emergence also extends to undergraduate education, whereby a growing number of institutions now offer degree programs in data science. However, there is considerable variation in what the field actually entails and, by extension, differences in how undergraduate programs prepare students for data-intensive careers. We used two seminal frameworks for data science education to evaluate undergraduate data science programs at a subset of 4-year institutions in the United States; developing and applying a rubric, we assessed how well each program met the guidelines of each of the frameworks. Most programs scored high in statistics and computer science and low in domain-specific education, ethics, and areas of communication. Moreover, the academic unit administering the degree program significantly influenced the course-load distribution of computer science and statistics/mathematics courses. We conclude that current data science undergraduate programs provide solid grounding in computational and statistical approaches, yet may not deliver sufficient context in terms of domain knowledge and ethical considerations necessary for appropriate data science applications. Additional refinement of the expectations for undergraduate data science education is warranted.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 85
    Publication Date: 2021-03-25
    Description: Urban expressways provide an effective solution to traffic congestion, and ramp signal optimization can ensure the efficiency of expressway traffic. Existing methods are mainly based on the static spatial distance between the mainline and ramps to achieve multi-ramp coordinated signal optimization; they lack consideration of dynamic traffic flow and lead to long time lags, thus reducing efficiency. This article develops a coordinated ramp signal optimization framework based on mainline traffic states. The main contributions are a traffic flow-series flux-correlation analysis based on cross-correlation, and the development of a novel multifactorial metric that combines flow correlation to assign the excess demand of mainline traffic. Besides, we used a GRU neural network for traffic flow prediction to ensure real-time optimization. To obtain a more accurate correlation between ramps and congested sections, we used gray correlation analysis to determine the percentage of each factor. We used the Simulation of Urban Mobility (SUMO) platform to evaluate the performance of the proposed method under different traffic demand conditions, and the experimental results show that the proposed method can reduce the density of mainline bottlenecks and improve the efficiency of mainline traffic.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
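    A minimal sketch of the flux-correlation idea described in the record above: normalized cross-correlation between a ramp's flow series and the mainline's, scanned over lags, with the dominant lag acting as an estimated travel-time offset. The series here are synthetic.
    ```python
    import numpy as np

    def xcorr(ramp, mainline, max_lag=10):
        """Correlation of ramp flow now with mainline flow `lag` steps later."""
        r = (ramp - ramp.mean()) / ramp.std()
        m = (mainline - mainline.mean()) / mainline.std()
        return {lag: float(np.mean(r[:-lag] * m[lag:]))
                for lag in range(1, max_lag + 1)}

    rng = np.random.default_rng(0)
    ramp = rng.normal(size=200)
    mainline = np.roll(ramp, 3) + 0.3 * rng.normal(size=200)  # 3-step delayed copy
    scores = xcorr(ramp, mainline)
    print(max(scores, key=scores.get))   # -> 3, the dominant travel-time lag
    ```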
  • 86
    Publication Date: 2021-03-11
    Description: Investing in stocks is an important tool for modern people’s financial management, and how to forecast stock prices has become an important issue. In recent years, deep learning methods have successfully solved many forecasting problems. In this paper, we utilize multiple factors for stock price forecasting. News articles and PTT forum discussions are taken as the fundamental analysis, and stock historical transaction information is treated as technical analysis. The state-of-the-art natural language processing tool BERT is used to recognize the sentiment of text, and the long short-term memory (LSTM) neural network, which is good at analyzing time series data, is applied to forecast the stock price from stock historical transaction information and text sentiment. According to experimental results, our proposed models achieve an improvement of 12.05 in the average root mean square error (RMSE).
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 87
    Publication Date: 2021-02-18
    Description: Increased interest in the use of word embeddings, such as word representation, for biomedical named entity recognition (BioNER) has highlighted the need for evaluations that aid in selecting the best word embedding to be used. One common criterion for selecting a word embedding is the type of source from which it is generated; that is, general (e.g., Wikipedia, Common Crawl) or specific (e.g., biomedical literature). Using specific word embeddings for the BioNER task has been strongly recommended, considering that they have provided better coverage and semantic relationships among medical entities. To the best of our knowledge, most studies have focused on improving BioNER task performance by, on the one hand, combining several features extracted from the text (for instance, linguistic, morphological, character embedding, and word embedding itself) and, on the other, testing several state-of-the-art named entity recognition algorithms. The latter, however, do not pay great attention to the influence of the word embeddings, and do not facilitate observing their real impact on the BioNER task. For this reason, the present study evaluates three well-known NER algorithms (CRF, BiLSTM, BiLSTM-CRF) on two corpora (DrugBank and MedLine) using two classic word embeddings, GloVe Common Crawl (of the general type) and Pyysalo PM + PMC (specific), as the only features. Furthermore, three contextualized word embeddings (ELMo, Pooled Flair, and Transformer) are compared in their general and specific versions. The aim is to determine whether general embeddings can perform better than specialized ones on the BioNER task. To this end, four experiments were designed. The first identified the combination of classic word embedding, NER algorithm, and corpus that results in the best performance. The second evaluated the effect of the size of the corpus on performance. The third assessed the semantic cohesiveness of the classic word embeddings and their correlation with respect to several gold standards, while the fourth evaluated the performance of general and specific contextualized word embeddings on the BioNER task. Results show that the classic general word embedding GloVe Common Crawl performed better on the DrugBank corpus, despite having less word coverage and a lower internal semantic relationship than the classic specific word embedding Pyysalo PM + PMC, while among the contextualized word embeddings the specific ones produced the best results. We conclude, therefore, that when using classic word embeddings as features on the BioNER task, the general ones can be considered a good option, whereas when using contextualized word embeddings, the specific ones are the best option.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 88
    Publication Date: 2021-02-03
    Description: Security analysis is an essential activity in security engineering to identify potential system vulnerabilities and specify security requirements in the early design phases. Due to the increasing complexity of modern systems, traditional approaches lack the power to identify insecure incidents caused by complex interactions among physical systems, humans and social entities. By contrast, the System-Theoretic Process Analysis for Security (STPA-Sec) approach views losses as resulting from interactions, focuses on controlling system vulnerabilities instead of external threats, and is applicable to complex socio-technical systems. However, STPA-Sec pays less attention to non-safety, information-security issues (e.g., data confidentiality) and lacks efficient guidance for identifying information security concepts. In this article, we propose a data-flow-based adaptation of STPA-Sec (named STPA-DFSec) to overcome these limitations and elicit security constraints systematically. We use STPA-DFSec and STPA-Sec to analyze a vehicle digital key system and investigate the relationship and differences between the two approaches, their applicability, and their highlights. In conclusion, the proposed approach can identify information-related problems more directly from the data-processing aspect. As an adaptation of STPA-Sec, it can be used with other STPA-based approaches to co-design systems across multiple disciplines under the unified STPA framework.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 89
    Publication Date: 2021-03-11
    Description: Keyword extraction is essential for determining influential keywords in huge documents, as research repositories are becoming more massive in volume day by day. The research community is drowning in data and starving for information. Keywords are the few words that describe the theme of a whole document in a precise way. Many state-of-the-art approaches are available for keyword extraction from large collections of documents, and they can be classified into three types: statistical approaches, machine learning methods, and graph-based methods. The machine learning approaches require a large training dataset that needs to be developed manually by domain experts, which is sometimes difficult to produce when determining influential keywords. This research therefore focuses on enhancing state-of-the-art graph-based methods to extract keywords when a training dataset is unavailable. We first converted a handcrafted dataset, collected from impact-factor journals, into n-gram combinations ranging from unigrams to pentagrams, and also enhanced traditional graph-based approaches. The experiment was conducted on the handcrafted dataset, and all methods were applied to it. Domain experts performed a user study to evaluate the results. The results from every method were evaluated against the user study using precision, recall and f-measure as evaluation metrics. The results show that the proposed method (FNG-IE) performed well, scoring close to the machine learning approaches.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
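    A minimal sketch of the n-gram expansion step described in the record above, generating the unigram-to-pentagram candidate terms; the tokenization is deliberately naive.
    ```python
    def ngrams(tokens, n):
        """All contiguous n-token phrases from a token list."""
        return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    text = "graph based keyword extraction without training data"
    tokens = text.split()
    candidates = {n: ngrams(tokens, n) for n in range(1, 6)}  # 1..5 = uni..pentagram
    print(candidates[2])   # ['graph based', 'based keyword', ...]
    ```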
  • 90
    Publication Date: 2021-02-18
    Description: Cervical intraepithelial neoplasia (CIN) and cervical cancer are major health problems faced by women worldwide. Conventional Papanicolaou (Pap) smear analysis is an effective method to diagnose cervical pre-malignant and malignant conditions by analyzing swab images. Various computer vision techniques can be explored to identify potential precancerous and cancerous lesions by analyzing the Pap smear image. The majority of existing work covers binary classification approaches using various classifiers and convolutional neural networks. However, these suffer from inherent challenges in minute feature extraction and precise classification. We propose a novel methodology to carry out multiclass classification of cervical cells from Whole Slide Images (WSI) with optimal feature extraction. Realizing a ConvNet with a transfer learning technique supports meaningful diagnosis of neoplastic and pre-neoplastic lesions. As the progressive resizing technique (an advanced method for training ConvNets) incorporates prior knowledge of the feature hierarchy and can reuse old computations while learning new ones, the model can carry forward the extracted morphological cell features to subsequent neural network layers iteratively. Superimposing progressive resizing on transfer learning while training the ConvNet models has shown a substantial performance increase. The proposed binary and multiclass classification methodology achieved benchmark scores on the Herlev Dataset. We achieved singular multiclass classification scores for WSI images of the SIPaKMed dataset, that is, accuracy (99.70%), precision (99.70%), recall (99.72%), F-Beta (99.63%), and Kappa scores (99.31%), which surpass the scores obtained through principal methodologies. Grad-CAM based feature interpretation enhances assimilation of the generated results, highlighting the pre-malignant and malignant lesions by visual localization in the images.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 91
    Publication Date: 2021-02-03
    Description: In sports competitions, depending on conditions during the match such as excitement, stress and fatigue, negative situations such as injury or loss of life may occur for players and spectators. Therefore, it is extremely important to constantly check their health. In addition, some strategic analyses are made during the match, and according to their results, the technical team influences the course of the match; the effects can be positive and sometimes negative. In this article, a fog computing and Internet of Things (IoT) based architecture is proposed to produce new technical strategies and to avoid injuries. Players and spectators are monitored with sensors measuring blood pressure, body temperature, heart rate, location, etc. The data obtained from the sensors are processed in the fog layer, and the resulting information is sent to the devices of the technical team and club doctors. In the architecture based on fog computing and IoT, priority processes are computed with low latency. For this, a task management algorithm based on a priority queue and a list of fog nodes is modified in the fog layer. Authentication and data confidentiality are provided with the Federated Lightweight Authentication of Things (FLAT) method used in the proposed model. In addition, using a Software Defined Network controller based on blockchain technology ensures data integrity.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
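    A minimal sketch of priority-based task handling at a fog node, in the spirit of the modified task-management algorithm described above; the priority levels and task names are hypothetical (lower number = more urgent).
    ```python
    import heapq
    import itertools

    counter = itertools.count()        # tie-breaker keeps insertion order
    queue = []

    def submit(priority, task):
        heapq.heappush(queue, (priority, next(counter), task))

    submit(2, "update location dashboard")
    submit(0, "abnormal heart-rate alert")    # highest urgency
    submit(1, "blood-pressure trend report")

    while queue:
        priority, _, task = heapq.heappop(queue)
        print(f"processing p{priority}: {task}")  # the alert is handled first
    ```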
  • 92
    Publication Date: 2021-02-03
    Description: Cryptocurrencies such as Bitcoin (BTC) have seen a surge in value in the recent past and have appeared as a useful investment opportunity for traders. However, their short-term profitability using algorithmic trading strategies remains an open question. In this work, we focus on the short-term profitability of BTC against the euro and the yen over an eight-year period using seven trading algorithms over trading periods of length 15 and 30 days. We use the classical buy and hold (BH) as a benchmark strategy. Rather surprisingly, we found that, on average, the yen is more profitable than BTC and the euro; however, the answer also depends on the choice of algorithm. Reservation price algorithms yield average returns of 7.5% and 10% over 15 and 30 days respectively, the highest among all the algorithms for the three assets. For BTC, all algorithms outperform the BH strategy. We also analyze the effect of transaction fees on the profitability of the algorithms for BTC and observe that for trading periods of length 15, no trading strategy is profitable for BTC, while for trading periods of length 30, only two strategies are profitable.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
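    A minimal sketch of the buy and hold (BH) benchmark used in the record above: convert cash to the asset at the start of the trading period and back at the end, optionally charging a proportional transaction fee. The price series and fee here are illustrative.
    ```python
    def buy_and_hold_return(prices, fee=0.0):
        """Net return of buying at the first price and selling at the last."""
        units = (1 - fee) / prices[0]            # buy at the first price
        final = units * prices[-1] * (1 - fee)   # sell at the last price
        return final - 1.0                       # net return per unit of cash

    btc_eur = [8000, 8200, 7900, 8600, 9100]     # one hypothetical trading window
    print(f"BH return: {buy_and_hold_return(btc_eur, fee=0.002):.2%}")
    ```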
  • 93
    Publication Date: 2021-03-29
    Description: The high volatility of an asset in financial markets is commonly seen as a negative factor. However, short-term trades may entail high profits if traders open and close the correct positions. The high volatility of cryptocurrencies, and in particular of Bitcoin, is what has made cryptocurrency trading so profitable in recent years. The main goal of this work is to compare several frameworks with each other to predict the daily closing Bitcoin price and to investigate which provide the best performance, after rigorous model selection by the so-called k-fold cross-validation method. We evaluated the performance of one-stage frameworks, based only on one machine learning technique, such as the Bayesian Neural Network, the Feed-Forward and the Long Short-Term Memory neural networks, and that of two-stage frameworks formed by the aforementioned neural networks cascaded with Support Vector Regression. Results highlight the higher performance of the two-stage frameworks with respect to the corresponding one-stage frameworks, except for the Bayesian Neural Network. The one-stage framework based on the Bayesian Neural Network has the highest performance, and the order of magnitude of the mean absolute percentage error computed on the price predicted by this framework agrees with those reported in recent literature.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 94
    Publication Date: 2021-03-25
    Description: Firms face an increasingly complex economic and financial environment in which access to international networks and markets is crucial. To be successful, companies need to understand the role of internationalization determinants such as bilateral psychic distance, experience, etc. Cutting-edge feature selection methods are applied in the present paper and compared to previous results to gain deep knowledge about strategies for Foreign Direct Investment. More precisely, evolutionary feature selection, addressed from the wrapper approach, is applied with two different classifiers as the fitness function: Bagged Trees and Extreme Learning Machines. The proposed intelligent system is validated on real-life data from Spanish Multinational Enterprises (MNEs), extracted from databases belonging to the Spanish Ministry of Industry, Tourism, and Trade. As a result, interesting conclusions are derived about the key features driving the internationalization of the companies under study. This is the first time that such outcomes have been obtained by an intelligent system on internationalization data.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 95
    Publication Date: 2021-03-17
    Description: As a promising next-generation network architecture, named data networking (NDN) supports name-based routing and in-network caching to retrieve content in an efficient, fast, and reliable manner. Most studies on NDN have proposed innovative and efficient caching mechanisms and retrieval of content via efficient routing. However, very few studies have targeted the vulnerabilities in the NDN architecture that a malicious node can exploit to perform a content poisoning attack (CPA). Such an attack can pollute in-network caches and disrupt the routing of content, and consequently isolate legitimate content in the network. In the past, several efforts have been made to propose mitigation strategies for the content poisoning attack, but to the best of our knowledge, no specific work has been done to address an emerging attack surface in NDN, which we call an interest flooding attack. Handling this attack surface can make content poisoning attack mitigation schemes more effective, secure, and robust. Hence, in this article, we propose the addition of a security mechanism in the CPA mitigation scheme Name-Key Based Forwarding and Multipath Forwarding Based Inband Probe, in which we block the malicious face of compromised consumers by monitoring the cache-miss ratio values and the queue capacity at the edge routers. The malicious face is blocked when the cache-miss ratio hits a threshold value, which is adjusted dynamically by monitoring the cache-miss ratio and queue capacity values. The experimental results show that we successfully mitigate the vulnerability of the CPA mitigation scheme by detecting and blocking the flooding interface, at the cost of very little verification overhead at the NDN routers.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
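    A minimal sketch of the blocking rule described in the record above: an edge router blocks a consumer-facing interface when its cache-miss ratio crosses a threshold that tightens as the queue fills. The function name and constants are hypothetical, not the paper's calibration.
    ```python
    def should_block(misses, hits, queue_len, queue_cap,
                     base_threshold=0.8, tightening=0.3):
        """Block when the miss ratio exceeds a load-adjusted threshold."""
        total = misses + hits
        if total == 0:
            return False
        miss_ratio = misses / total
        load = queue_len / queue_cap
        threshold = base_threshold - tightening * load   # stricter when queue fills
        return miss_ratio >= threshold

    # A flooding face: almost all interests miss while the queue is loaded.
    print(should_block(misses=95, hits=5, queue_len=80, queue_cap=100))   # True
    print(should_block(misses=10, hits=90, queue_len=20, queue_cap=100))  # False
    ```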
  • 96
    Publication Date: 2021-03-09
    Description: The Alternating Direction Method of Multipliers (ADMM) is a popular and promising distributed framework for solving large-scale machine learning problems. We consider decentralized consensus-based ADMM in which nodes may only communicate with one-hop neighbors, which may cause slow convergence. We investigate the impact of network topology on the performance of ADMM-based learning of a Support Vector Machine using expander and mean-degree graphs, as well as some common modern network topologies. In particular, we investigate to what degree the expansion property of the network influences convergence in terms of iterations, training and communication time, and we suggest which topology is preferable. Additionally, we provide an implementation that makes these theoretical advances easily available. The results show that the convergence of decentralized ADMM-based learning of SVMs is improved using graphs with large spectral gaps and high, homogeneous degrees.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
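    A minimal sketch of the expansion measure underlying the result above: the spectral gap (algebraic connectivity, the second-smallest Laplacian eigenvalue) of a topology, compared here between a ring and a random 4-regular graph.
    ```python
    import networkx as nx
    import numpy as np

    def spectral_gap(g):
        eig = np.sort(nx.laplacian_spectrum(g))
        return eig[1]          # algebraic connectivity; larger = better expander

    n = 20
    ring = nx.cycle_graph(n)
    expander = nx.random_regular_graph(d=4, n=n, seed=1)
    print(f"ring: {spectral_gap(ring):.3f}, 4-regular: {spectral_gap(expander):.3f}")
    ```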
  • 97
    Publication Date: 2021-03-08
    Description: This article presents an approach to solve the inverse kinematics of cooperative mobile manipulators for coordinated manipulation tasks. A self-adaptive differential evolution algorithm is used to solve the inverse kinematics as a global constrained optimization problem. A kinematic model of the cooperative mobile manipulator system is proposed, considering a system with two omnidirectional-platform manipulators with n DOF. An objective function is formulated based on the forward kinematics equations; consequently, the proposed approach does not suffer from singularities because it does not require the inversion of any Jacobian matrix. The design of the objective function also contains penalty functions to handle the joint limit constraints. Simulation experiments are performed to test the proposed approach on coordinated path-tracking tasks. The inverse kinematics solutions show precise and accurate results. The experimental setup considers two mobile manipulators based on the KUKA youBot system to demonstrate the applicability of the proposed approach.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
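    A minimal sketch of inverse kinematics posed as global optimization, as in the record above, using SciPy's stock differential evolution on a toy planar 2-DOF arm in place of the paper's self-adaptive variant and mobile-manipulator kinematics; note the objective uses only forward kinematics, so no Jacobian is inverted.
    ```python
    import numpy as np
    from scipy.optimize import differential_evolution

    L1, L2 = 1.0, 0.8                      # link lengths (illustrative)
    target = np.array([1.2, 0.9])

    def forward(q):                        # forward kinematics of the toy arm
        x = L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1])
        y = L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1])
        return np.array([x, y])

    def objective(q):                      # squared end-effector position error
        return np.sum((forward(q) - target) ** 2)

    bounds = [(-np.pi, np.pi)] * 2         # joint limits as box constraints
    res = differential_evolution(objective, bounds, seed=1, tol=1e-10)
    print(res.x, forward(res.x))           # joint angles reaching the target
    ```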
  • 98
    Publication Date: 2021-03-09
    Description: Due to the increasing size and complexity of many current software systems, the architectural design of these systems has become a considerably complicated task. In this scenario, reference architectures have already proven to be very relevant in supporting the architectural design of systems in diverse critical application domains, such as health, avionics, transportation, and the automotive sector. However, these architectures are described using many different approaches, such as textual descriptions, informal models, and even modeling languages like UML. Hence, practitioners face a difficult decision about the best approach for describing reference architectures. The main contribution of this work is to depict a detailed panorama containing the state of the art (from the literature) and the state of the practice (based on existing reference architectures) of approaches for describing reference architectures. For this, we first examined the existing approaches (e.g., processes, methods, models, and modeling languages) and compared them concerning completeness and applicability. We also examined four well-known, successful reference architectures (AUTOSAR, ARC-IT, IIRA, and AXMEDIS) in view of the approaches used to describe them. As a result, there exists a misalignment between the state of the art and the state of the practice, requiring engagement of the software architecture community, through research collaboration between academia and industry, to propose more suitable means of describing reference architectures and, as a consequence, to promote the sustainability of these architectures.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
  • 99
    Publication Date: 2021-03-12
    Description: The extreme learning machine (ELM) algorithm is widely used in regression and classification problems due to its advantages such as speed and a high performance rate. Different artificial intelligence-based optimization methods and chaotic systems have been proposed for the development of the ELM. However, a generalized solution method with the desired success rate has not been obtained. In this study, a new method is proposed by developing the ELM algorithm for regression problems with discrete-time chaotic systems. The ELM algorithm was improved by testing five different chaotic maps (Chebyshev, iterative, logistic, piecewise, tent). The proposed discrete-time chaotic systems based ELM (DCS-ELM) algorithm was tested on steel-fiber-reinforced self-compacting concrete datasets and four different public datasets, and its performance was compared with the basic ELM algorithm, linear regression, support vector regression, the kernel ELM algorithm and the weighted ELM algorithm. It was observed to give better performance than the other algorithms.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
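    A minimal sketch of two of the discrete-time chaotic maps listed above (logistic and tent), with standard chaotic parameter settings. Such sequences could replace uniform random numbers when initializing ELM hidden-layer weights, which is the general idea rather than the paper's exact scheme.
    ```python
    def logistic_map(x0=0.7, r=4.0, n=10):
        x, seq = x0, []
        for _ in range(n):
            x = r * x * (1 - x)            # logistic recurrence
            seq.append(x)
        return seq

    def tent_map(x0=0.7, mu=2.0, n=10):
        x, seq = x0, []
        for _ in range(n):
            x = mu * x if x < 0.5 else mu * (1 - x)   # tent recurrence
            seq.append(x)
        return seq

    print(logistic_map()[:3])
    print(tent_map()[:3])
    ```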
  • 100
    Publication Date: 2021-03-12
    Description: Breast cancer is one of the leading causes of death in the current age. It often results in subpar living conditions for patients, who have to go through expensive and painful treatments to fight this cancer. One in eight women all over the world is affected by this disease, and almost half a million women annually do not survive this fight and die from it. Machine learning algorithms have proven to outperform existing solutions for the prediction of breast cancer using models built on previously available data. In this paper, a novel approach named BCD-WERT is proposed that utilizes the Extremely Randomized Tree and the Whale Optimization Algorithm (WOA) for efficient feature selection and classification. WOA reduces the dimensionality of the dataset and extracts the relevant features for accurate classification. Experimental results on a state-of-the-art comprehensive dataset demonstrated improved performance in comparison with eight other machine learning algorithms: Support Vector Machine (SVM), Random Forest, Kernel Support Vector Machine, Decision Tree, Logistic Regression, Stochastic Gradient Descent, Gaussian Naive Bayes and k-Nearest Neighbor. BCD-WERT outperformed all of these, with the highest accuracy of 99.30%, followed by SVM at 98.60%. Experimental results also reveal the effectiveness of feature selection techniques in improving prediction accuracy.
    Electronic ISSN: 2376-5992
    Topics: Computer Science
    Published by PeerJ
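    A minimal sketch of the classification stage described in the record above: an Extremely Randomized Trees classifier on a reduced feature set. A simple univariate filter stands in for the paper's WOA-based feature selection, purely for illustration.
    ```python
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import ExtraTreesClassifier
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    clf = make_pipeline(
        SelectKBest(f_classif, k=10),                       # stand-in for WOA selection
        ExtraTreesClassifier(n_estimators=200, random_state=0),
    )
    clf.fit(X_tr, y_tr)
    print(f"test accuracy: {clf.score(X_te, y_te):.3f}")
    ```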