With the explosion of information in the biomolecular field, there is a pressing need for tools that help biologists retrieve, extract, and relate information and knowledge in the literature and in molecular databases. The Biosemantics Association develops and evaluates such tools, focussing on the elucidation of hidden or implicit knowledge through large-scale meta-analysis of textual documents.
The groups currently address three areas of research:
1. Concept identification and disambiguation algorithms.
Proper recognition of the concepts that characterize a document, together with disambiguation of terms that can have more than one meaning, forms the basis for all subsequent steps in text analysis. We make extensive use of thesauri that contain the concepts relevant for a particular field.
2. Meta-analysis and visualization techniques.
For meta-analysis, we are studying different approaches that relate the information in many (possibly hundreds of thousands of) documents from the literature. Visualization deals with reducing the often multi-dimensional output of the meta-analysis tools to two dimensions that can easily be interpreted.
3. Evaluation of applications in the biological field.
In several studies we are investigating the potential of the developed technology to interconnect genes and proteins and to discover knowledge that is hidden in the literature. The technology is also being evaluated for semi-automated annotation of protein function. Development and evaluation are done in close cooperation with domain experts, in both national and international collaborations.
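As an illustration of the first research area, thesaurus-based concept identification with a simple disambiguation step could be sketched roughly as follows. This is a minimal sketch and not the Association's actual algorithm: the toy thesaurus, the concept identifiers (C1 to C3), the cue words, and the function names are all hypothetical, and the disambiguation rule (pick the candidate concept whose cue words overlap most with the document) is only one of many possible approaches.

```python
import re

# Hypothetical toy thesaurus (assumption, for illustration only):
# concept ID -> preferred name, surface terms, and context cue words.
THESAURUS = {
    "C1": {"name": "prostate-specific antigen",
           "terms": ["PSA", "prostate-specific antigen"],
           "cues": {"prostate", "cancer", "serum"}},
    "C2": {"name": "psoriatic arthritis",
           "terms": ["PsA", "psoriatic arthritis"],
           "cues": {"joint", "skin", "arthritis"}},
    "C3": {"name": "tumor protein p53",
           "terms": ["p53", "TP53"],
           "cues": set()},
}


def build_index(thesaurus):
    """Map each lower-cased term to the list of concept IDs it may denote."""
    index = {}
    for concept_id, entry in thesaurus.items():
        for term in entry["terms"]:
            index.setdefault(term.lower(), []).append(concept_id)
    return index


def identify_concepts(text, thesaurus):
    """Return {matched term: concept ID} for thesaurus terms found in text.

    Ambiguous terms are resolved by counting how many of each candidate's
    cue words occur in the text; ties go to the first candidate listed.
    """
    index = build_index(thesaurus)
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    found = {}
    for term, candidates in index.items():
        # Match the term only when it is not embedded in a longer token.
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
        if not re.search(pattern, text, flags=re.IGNORECASE):
            continue
        if len(candidates) == 1:
            best = candidates[0]
        else:
            best = max(candidates,
                       key=lambda cid: len(thesaurus[cid]["cues"] & words))
        found[term] = best
    return found


if __name__ == "__main__":
    sentence = ("Serum PSA levels were elevated in prostate cancer "
                "patients with mutated p53.")
    print(identify_concepts(sentence, THESAURUS))
```

In this sketch the ambiguous abbreviation is mapped to whichever concept shares more cue words with the surrounding text, so the same term can resolve differently in an oncology abstract and in a rheumatology abstract. A real system would need a far larger thesaurus and a more robust disambiguation model.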