Publication Date:
2022-05-25
Description:
Table 1 summarizes the comparison of precision and recall values for the Naïve
Bayes and Maximum Entropy classification algorithms with various parameter
estimation methods such as GIS, IIS, and L-BFGS on the manually annotated American
Seashell book. The comparison also includes the Decision Tree Learning algorithm
implemented in the Natural Language Toolkit [http://www.nltk.org/].
Description:
A scientific name for an organism can be associated with almost all biological data. Name identification is an important step in many text mining tasks that aim to extract useful information from biological, biomedical, and biodiversity text sources. A scientific name acts as an important metadata element for linking biological information. We present NetiNeti, a machine learning based approach for the identification and discovery of scientific names. The system implementing the approach can be accessed at http://namefinding.ubio.org. We present the comparison results of various machine learning algorithms on our annotated corpus. Naïve Bayes and Maximum Entropy with Generalized Iterative Scaling (GIS) parameter estimation are the top two performing algorithms.
Keywords:
Naïve Bayes, Maximum Entropy classification, Natural Language Toolkit
Repository Name:
Woods Hole Open Access Server
Type:
Dataset
Format:
text/plain