Efficient Classification and Prediction Algorithms for Biomedical Information
Handle: http://hdl.handle.net/11134/20002:860639923
Persons
Creator (cre): Kilany, Rania M.
Major Advisor (mja): Ammar, Reda A.
Associate Advisor (asa): Rajasekaran, Sanguthevar
Associate Advisor (asa): Huang, Chun-Hsi (Vincent)
Associate Advisor (asa): Wu, Yufeng
Title: Efficient Classification and Prediction Algorithms for Biomedical Information
Digital Origin: born digital
Description
Voluminous data sets are being generated continually in many branches of science and engineering. As a result, the number of scholarly publications has also increased tremendously. For instance, PubMed carries millions of abstracts, and its size keeps growing at a rapid pace. Given such large repositories, one of the challenges for any biologist is to retrieve the information of interest in a short amount of time. In this research we propose novel solutions to such information retrieval problems. One goal of this research has been to develop a computational tool that can quickly produce a short list of documents likely to contain the information of interest.

Information retrieval (IR) is the process of finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored in databases). Information retrieval tools are useful for people from different walks of life, including reference librarians and paralegals. Another popular application is web search. The term "unstructured data" refers to data that does not have a clear, semantically overt, easy-for-a-computer structure.

In this research we have developed information retrieval techniques that classify documents into two classes: those that have information pertinent to a specific topic and those that do not. A typical tool that we envision takes as input a set of pre-classified documents (that characterize the information of interest), extracts all the keywords from the pre-classified documents, and develops a learner model capable of classifying new (unknown or unclassified) documents into the two classes. A class 1 document has information of interest and a class 2 document does not. Tools similar to those we study in this research have been reported in the literature; examples include the TextMine algorithm by Vyas et al. and the Gene Selection algorithm by Song and Rajasekaran, among others. We have compared our algorithms with those in the literature and shown that our algorithms yield better results.
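
The workflow described in the abstract (pre-classified documents in, keywords extracted, a learner model trained, new documents sorted into class 1 or class 2) can be illustrated with a small sketch. This is not the thesis's algorithm, which the abstract does not specify; it assumes a multinomial naive Bayes learner with add-one smoothing as a stand-in, and the function names `tokenize`, `train`, and `classify` and the toy training sentences are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic keywords."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Build per-class keyword counts from (text, label) pairs.

    Labels follow the abstract: 1 = has information of interest, 2 = does not.
    Assumes each class has at least one training document.
    """
    word_counts = {1: Counter(), 2: Counter()}
    doc_counts = Counter()
    for text, label in labeled_docs:
        word_counts[label].update(tokenize(text))
        doc_counts[label] += 1
    vocab = set(word_counts[1]) | set(word_counts[2])
    return word_counts, doc_counts, vocab


def classify(model, text):
    """Return 1 or 2 using naive Bayes with add-one smoothing (an assumed learner)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in (1, 2):
        total_words = sum(word_counts[label].values())
        # Start from the log prior of the class.
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # keywords never seen in training are ignored
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        scores[label] = score
    return 1 if scores[1] >= scores[2] else 2


# Toy pre-classified documents (illustrative only, not from the thesis).
training = [
    ("gene expression regulates tumor growth", 1),
    ("protein binding and gene mutation in cancer cells", 1),
    ("stock market prices fell sharply", 2),
    ("the football team won the match", 2),
]
model = train(training)
print(classify(model, "mutation of a tumor gene"))
print(classify(model, "market prices and the football match"))
```

A realistic tool would also need keyword selection, stop-word handling, and evaluation on held-out documents, none of which the sketch attempts.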
Organizations
Degree granting institution (dgg): University of Connecticut
Use and Reproduction: These materials are provided for educational and research purposes only.
Local Identifier: OC_d_105
OCLC Number: 844961226