Efficient Classification and Prediction Algorithms for Biomedical Information

Metadata

Handle

http://hdl.handle.net/11134/20002:860639923

Persons

Creator (cre): Kilany, Rania M.

Major Advisor (mja): Ammar, Reda A.

Associate Advisor (asa): Rajasekaran, Sanguthevar

Associate Advisor (asa): Huang, Chun-Hsi (Vincent)

Associate Advisor (asa): Wu, Yufeng

Title

Efficient Classification and Prediction Algorithms for Biomedical Information

Origin Information

Event Place	Storrs, CT
Date Created	2013
Publisher	University of Connecticut

Parent Item

Dissertations

Resource Type

Text

Digital Origin

born digital

Description

Voluminous data sets are being generated on a continual basis in various branches of science and engineering. As a result, the number of scholarly publications has also increased tremendously. For instance, PubMed carries millions of abstracts, and its size keeps growing at a rapid pace. Given such large repositories, one of the challenges for any biologist is to retrieve the information of interest in a short amount of time. In this research we propose novel solutions for such information retrieval problems. One of the goals of this research has been to develop a computational tool that can quickly produce a short list of documents that are likely to contain the information of interest. Information retrieval (IR) is the process of finding information (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored in databases). Information retrieval tools are useful for people from different walks of life, including reference librarians and paralegals. Another popular application is web search. The term "unstructured data" refers to data that does not have a clear, semantically overt structure that is easy for a computer to process. In this research we have developed information retrieval techniques that classify documents into two classes: those that have information pertinent to a specific topic and those that do not. A typical tool that we envision will take as input a set of pre-classified documents (that characterize the information of interest), extract all the keywords from the pre-classified documents, and develop a learner model that is capable of classifying new (unknown or unclassified) documents into two classes. A class 1 document has information of interest and a class 2 document does not. Tools similar to those studied in this research have been reported in the literature. Examples include the TextMine algorithm by Vyas et al., the Gene Selection algorithm by Song and Rajasekaran, and others. We have compared our algorithms with those in the literature and shown that our algorithms yield better results.
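
The envisioned workflow (pre-classified documents in, keywords extracted, a learner model built, new documents assigned to class 1 or class 2) can be illustrated with a minimal sketch. This is not the dissertation's algorithm, which the abstract does not specify; it swaps in a plain multinomial naive Bayes classifier over keyword counts as an assumed stand-in. The function names `tokenize`, `train`, and `classify` and the toy training documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphabetic keyword tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Build keyword counts per class from (text, label) pairs.

    Labels follow the abstract's convention: 1 = of interest, 2 = not.
    Assumes each class has at least one training document.
    """
    word_counts = {1: Counter(), 2: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts[1]) | set(word_counts[2])
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return 1 or 2, whichever class has the higher naive Bayes log score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (1, 2):
        # Log prior: fraction of training documents in this class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # ignore keywords never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical pre-classified documents (toy data, for illustration only).
training = [
    ("gene expression profiling in tumor cells", 1),
    ("protein binding and gene regulation", 1),
    ("stock market prices fell sharply", 2),
    ("football season opens this weekend", 2),
]
model = train(training)
print(classify(model, "regulation of gene expression"))  # 1
print(classify(model, "market prices this season"))      # 2
```

A real system would train on a large labeled corpus and would likely add keyword selection or weighting; the sketch only shows the input and output shape of the two-class task.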

Genre

doctoral dissertations

Organizations

Degree granting institution (dgg): University of Connecticut

Held By

Archives & Special Collections, University of Connecticut Library

Rights Statement

IN COPYRIGHT

Use and Reproduction

These materials are provided for educational and research purposes only.

Local Identifier

OC_d_105

OCLC Number

844961226
