
MERLIN Project

AI in Biomedicine - Cognitive Computing Lab


MERLIN (MEdical Rule LearnINg) is a joint research project between CCL, Georgia Tech and Emory University. It explores how to automatically extract, encode and reason from medical knowledge found in published articles and internet resources, and how to create expert system rules from the literature.


Project Overview:



AI Decision Support Tools for Nuclear Cardiac Medicine


o  Faculty
     o  Dr. Ashwin Ram
     o  Dr. Ernest V. Garcia (Emory U.)
     o  Dr. Eugene Agichtein (Emory U.)
o  Post-docs
     o  Dr. Baoli Li
o  Students
     o  Saurav Sahay
     o  Bharat Ravisekar
     o  Aditya Devurkar
     o  Ajay Choudhari



Biomedical Knowledge Acquisition, Ontology Learning, Text Classification, Knowledge Representation, Textual Case Based Reasoning, Rule Learning, Expert Systems


The rapidly increasing volume of unstructured biomedical information poses a knowledge integration challenge: building autonomic computing systems that can acquire, represent and learn such knowledge, and reason from it efficiently to aid knowledge discovery and re-use. The construction of automated systems to assist biomedical decision making is impeded by difficulties in formalizing knowledge and in encoding it for use by computer systems. These projects focus on developing efficient methods of information retrieval and extraction, and on building a semantic intelligence infrastructure using language processing, learning and reasoning techniques. This requires the automatic construction of knowledge models and ontologies for representing biological entities and relationships, as well as methods for expressing hypotheses and 'biological inference rules' that facilitate their evaluation against what is already known.

Projects:

1) Textual Case Based Reasoning (TCBR)

We are building a system called NEO that learns and represents the knowledge of a growing medical literature domain and returns documents for queries based on that knowledge (in the form of contextual maps) and on previous query-response episodes (containing experts' opinions and users' result preferences) in the TCBR framework. This semantic network representation serves as the basis for retrieval and extraction tasks using techniques such as graph search and spreading activation.

Keywords: Knowledge Maps, Annotation, Case characteristics, Case Indexing, Feedback, Explanation, Applications
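
As an illustration of the spreading activation technique mentioned above, here is a minimal Python sketch. It is not NEO's implementation: the graph, the edge weights, the decay factor, the threshold and the function name spread_activation are all hypothetical choices made for this example. Activation starts at the query concept and flows along weighted edges, losing energy at each hop.

```python
from collections import defaultdict


def spread_activation(graph, seeds, decay=0.5, threshold=0.05, max_steps=3):
    """Propagate activation from seed nodes through a weighted graph.

    graph: dict mapping node -> list of (neighbor, weight) pairs
    seeds: dict mapping node -> initial activation
    """
    activation = defaultdict(float)
    activation.update(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            for neighbor, weight in graph.get(node, []):
                passed = energy * weight * decay
                # Drop contributions that are too weak to matter.
                if passed >= threshold:
                    next_frontier[neighbor] += passed
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical toy semantic network (illustrative weights only).
graph = {
    "myocardial perfusion": [("SPECT", 0.9), ("ischemia", 0.8)],
    "ischemia": [("coronary artery disease", 0.9)],
    "SPECT": [("nuclear cardiology", 0.7)],
}
scores = spread_activation(graph, {"myocardial perfusion": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
```

Concepts in ranked that are closer to the query concept, or linked to it by stronger edges, come first; a retrieval system could use such scores to order the documents attached to each concept.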

2) Knowledge Acquisition

This project aims to develop an automatic data acquisition module that queries biomedical abstracts, internet search engines, XML and RSS feeds, and knowledge authority sites such as Wikipedia for information about biomedical concepts, and determines which of the retrieved knowledge is correct. Biomedical databases such as PubMed provide Web Service facilities for querying them remotely, and internet search engines also offer APIs for retrieving results programmatically. The project also explores TCBR techniques to populate and evolve a case library of concepts using different feature comparison techniques.

Keywords: Web Mining, XML Document Retrieval, Data Extraction from Web Pages and RSS feeds, TCBR
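
To make the remote querying concrete, the sketch below builds a PubMed search request in Python. It assumes the NCBI E-utilities ESearch endpoint; the page does not say which PubMed web service the project uses, so that choice is an assumption, as is the helper name build_pubmed_search_url. The code only constructs the URL and does not contact the network.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (assumed here; the project may use another service).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, max_results=20):
    """Return an ESearch URL that asks PubMed for article IDs matching a term."""
    params = {
        "db": "pubmed",         # database to search
        "term": term,           # query string, e.g. a biomedical concept
        "retmax": max_results,  # maximum number of IDs to return
        "retmode": "xml",       # response format
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


url = build_pubmed_search_url("myocardial perfusion imaging", max_results=5)
```

Fetching url (for example with urllib.request.urlopen) would return an XML list of PubMed IDs that a later step could use to retrieve the abstracts.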

3) Knowledge Representation

Knowledge Representation (KR) has long been considered one of the principal elements of Artificial Intelligence, and a critical part of all problem solving. We believe the acquired knowledge should be represented in a form that humans can understand, and should cause the system that uses it to behave as if it understands it. Many powerful meta models of semantic networks have been developed, such as the Existential Graphs of Charles S. Peirce, the Conceptual Graphs of John F. Sowa and the Resource Description Framework (RDF) of the World Wide Web Consortium. This project aims to explore more recent KR frameworks, such as RDF Schema, as the semantic network model.

Keywords: Semantic Nets, Semantic Web Languages
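
The idea of a semantic network expressed as RDF-style statements can be sketched in a few lines of Python. This is an illustrative toy, not the project's representation: the TripleStore class, the ex: identifiers and the example facts are hypothetical. It stores subject-predicate-object triples and applies one RDF Schema style inference, namely that an instance of a class is also an instance of every superclass reachable through rdfs:subClassOf.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "obj"])


class TripleStore:
    """A tiny in-memory store of subject-predicate-object statements."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add(Triple(subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the given fields (None is a wildcard)."""
        return [
            t for t in self.triples
            if (subject is None or t.subject == subject)
            and (predicate is None or t.predicate == predicate)
            and (obj is None or t.obj == obj)
        ]

    def superclasses(self, cls):
        """Follow rdfs:subClassOf links transitively."""
        seen, stack = set(), [cls]
        while stack:
            current = stack.pop()
            for t in self.match(subject=current, predicate="rdfs:subClassOf"):
                if t.obj not in seen:
                    seen.add(t.obj)
                    stack.append(t.obj)
        return seen

    def types_of(self, instance):
        """Direct rdf:type classes plus all inherited superclasses."""
        direct = {t.obj for t in self.match(subject=instance, predicate="rdf:type")}
        inferred = set(direct)
        for cls in direct:
            inferred |= self.superclasses(cls)
        return inferred


# Hypothetical example facts.
store = TripleStore()
store.add("ex:SPECT", "rdfs:subClassOf", "ex:NuclearImaging")
store.add("ex:NuclearImaging", "rdfs:subClassOf", "ex:DiagnosticProcedure")
store.add("ex:study42", "rdf:type", "ex:SPECT")
study_types = store.types_of("ex:study42")
```

Here study_types contains ex:SPECT together with the two inherited classes, which is the kind of simple inference a schema layer adds on top of plain triples.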

4) Semantic Parsing and Relational Learning

We treat relationship extraction as a sequence labeling and segmentation problem over observation instances, and learn classifiers for semantic role labeling of entities in text. This research aims to use generative and discriminative graphical models such as HMMs and Conditional Random Fields, along with other classifiers, to extract complex relationships from textual documents by identifying local and global linguistic and semantic features.

Keywords: SRL, Dependency Graphs, Role Labeling
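
To show what sequence labeling with an HMM looks like in practice, the following Python sketch decodes the most likely label sequence for a sentence with the Viterbi algorithm. The label set, the probabilities and the smoothing floor are made-up values for illustration, not parameters learned in this project; a CRF would replace these generative probabilities with feature-based scores but can be decoded in the same way.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most probable state sequence for the observations.

    Probabilities missing from a table fall back to a small floor value,
    and all scores are kept in log space to avoid underflow.
    """
    def logp(table, key):
        return math.log(table.get(key, floor))

    best = [{s: logp(start_p, s) + logp(emit_p[s], observations[0]) for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p] + logp(trans_p[p], s)) for p in states),
                key=lambda pair: pair[1],
            )
            scores[s] = score + logp(emit_p[s], obs)
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return list(reversed(path))


# Hypothetical hand-set model for a toy relation-labeling task.
states = ["ENTITY", "RELATION", "OTHER"]
start_p = {"ENTITY": 0.6, "RELATION": 0.1, "OTHER": 0.3}
trans_p = {
    "ENTITY": {"ENTITY": 0.2, "RELATION": 0.6, "OTHER": 0.2},
    "RELATION": {"ENTITY": 0.7, "RELATION": 0.1, "OTHER": 0.2},
    "OTHER": {"ENTITY": 0.4, "RELATION": 0.3, "OTHER": 0.3},
}
emit_p = {
    "ENTITY": {"aspirin": 0.5, "inflammation": 0.5},
    "RELATION": {"reduces": 0.6, "causes": 0.4},
    "OTHER": {"the": 0.7, "of": 0.3},
}
labels = viterbi(["aspirin", "reduces", "inflammation"], states, start_p, trans_p, emit_p)
```

With these toy numbers the decoder labels the sentence as ENTITY, RELATION, ENTITY, from which a relation triple (aspirin, reduces, inflammation) could be read off.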

5) Domain Ontology Learning

We are building a prototype that automatically constructs domain-specific ontologies directly from a domain corpus using statistical NLP techniques, and that represents and reasons from them in ontology language formats. The challenge is to identify domain terms, their properties, their intension and extension, and the relationships between the terms. We use several techniques from the above projects to create the domain-specific ontologies.

Keywords: Pattern Learning, Dictionary Based Matching, Anaphora and WSD, Rule Based Learning, Ontology Reasoning
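
As a small example of pattern-based extraction of relationships between terms, the Python sketch below applies one fixed lexical pattern ("X such as A, B and C") to propose is-a pairs. This is a hand-written pattern of the kind often called a Hearst pattern, swapped in here for illustration; the project learns its patterns, and the function name extract_is_a and the example sentence are hypothetical. The sketch assumes the list of examples runs to the end of the sentence and that the general term is the single word before "such as".

```python
import re

# One word before "such as" is taken as the general term (hypernym);
# everything up to the next '.' or ';' is taken as the list of examples.
PATTERN = re.compile(r"(\w+) such as ([^.;]+)")
# Split the example list on commas and on "and" / "or".
SPLIT = re.compile(r",\s*(?:and\s+|or\s+)?|\s+and\s+|\s+or\s+")


def extract_is_a(sentence):
    """Return (specific term, general term) pairs found in the sentence."""
    pairs = []
    for match in PATTERN.finditer(sentence):
        hypernym = match.group(1)
        for item in SPLIT.split(match.group(2)):
            item = item.strip()
            if item:
                pairs.append((item, hypernym))
    return pairs


pairs = extract_is_a("Clinicians rely on modalities such as SPECT, PET and MRI.")
```

Each pair is a candidate is-a relationship that would still need filtering, for example by frequency across the corpus, before being added to an ontology.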





Publications:

1. Baoli Li, Neha Sugandh, Ernest V. Garcia, Ashwin Ram. 2007. Adapting Associative Classification to Text Categorization. (To appear) In: Proceedings of the 2007 ACM Symposium on Document Engineering (ACM DocEng 2007). Winnipeg, Manitoba, Canada, August 28-31, 2007.

2. Baoli Li, Joseph Irwin, Ernest V. Garcia, Ashwin Ram. 2007. Machine Learning Based Semantic Inference: Experiments and Observations at RTE-3. (To appear) In: Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing at ACL-2007 (WTEP-2007).

3. Saurav Sahay, Baoli Li, Ernest V. Garcia, Eugene Agichtein, Ashwin Ram. 2007. Domain Ontology Construction from Biomedical Text. In: Proceedings of the 2007 International Conference on Artificial Intelligence (ICAI'07), June 25-28, 2007.

4. Shreekanth Karvaje, Bharat Ravisekar, Saurav Sahay, Baoli Li, Ernest Garcia, Ashwin Ram. Discovering Causal Sentences with Automatically Learned Patterns. In: Poster Proceedings of the International Symposium on Bioinformatics Research and Applications.

5. Saurav Sahay, Sougata Mukherjea, Eugene Agichtein, Ernest V. Garcia, Shamkant Navathe, Ashwin Ram. Discovering Semantic Biomedical Relations utilizing the Web. [accepted in ACM Journal on Transactions on Knowledge Discovery from Data – special issue on Bioinformatics, 2007]

6. Saurav Sahay, Eugene Agichtein, Baoli Li, Ernest V. Garcia, Ashwin Ram. 2007. Semantic Annotation and Inference for Medical Knowledge Discovery. In: NSF Symposium on Next Generation Data Mining.