I am a Research Scholar at the Memorial Sloan Kettering Cancer Center in New York, working on the BioPAX pathway exchange model and Pathway Commons.

Post Docs

University of Glasgow

Interfacing proteomic and genomic data. Post Doc in the RASOR project. I was tasked with implementing technologies to interface proteomic and genomic data. The focus of this project is improved data handling, storage and distribution through an integrated LIMS as a foundation for establishing an integrated relational database. On the face of it, this is classic data integration of two heterogeneous data systems. However, given the nature of the data sources (proteomics data and genomics data), actual integration would be minimal, since the overlapping data elements are few. The data are semantically different, so they are not only difficult to integrate physically, but doing so would add little value to the data itself. Since the actual reason for integration is to query the data as a unit, it is more important to data users to have the data in a form that allows querying across these data sets. Semantic integration promises to provide exactly this capability. I am using RDF, RDF-S and OWL to integrate genomic, proteomic and transcriptomic data.
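To make the idea of querying across the data as a unit concrete, here is a minimal sketch in plain Python. It is not the RASOR implementation: the identifiers, the predicates (ex:encodes, ex:identifiedIn) and the helper names are all hypothetical, and a real system would use an RDF store and SPARQL rather than a Python list. It only illustrates how subject-predicate-object statements from two sources can be queried together once they share identifiers.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical statements: the first group could come from a genomic source,
# the second from a proteomic one. All names are illustrative assumptions.
TRIPLES: List[Triple] = [
    ("gene:G1", "rdf:type", "ex:Gene"),
    ("gene:G1", "ex:encodes", "protein:P1"),
    ("gene:G2", "rdf:type", "ex:Gene"),
    ("gene:G2", "ex:encodes", "protein:P2"),
    ("protein:P1", "rdf:type", "ex:Protein"),
    ("protein:P1", "ex:identifiedIn", "experiment:MS1"),
    ("protein:P2", "rdf:type", "ex:Protein"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for subj, pred, obj in triples:
        if (s is None or subj == s) and (p is None or pred == p) and (o is None or obj == o):
            yield (subj, pred, obj)


def genes_with_protein_evidence(triples: List[Triple]) -> List[str]:
    """Return genes whose encoded protein was identified in some experiment."""
    genes = []
    for gene, _, protein in match(triples, p="ex:encodes"):
        # Follow the link from the genomic statement into the proteomic ones.
        if any(True for _ in match(triples, s=protein, p="ex:identifiedIn")):
            genes.append(gene)
    return sorted(genes)


if __name__ == "__main__":
    print(genes_with_protein_evidence(TRIPLES))
```

The two sources are never merged into one schema here; the query simply follows links between statements, which is the property that makes this style of integration attractive when the overlapping data elements are few.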


Integration of available taxonomic hierarchies from online databases to facilitate easy information retrieval from TreeBASE.

The problem my PhD studies addressed was the use of different hierarchies in information retrieval. Differing opinions on taxonomic placement have meant that different taxonomic resources, such as NCBI and ITIS, use different taxonomic hierarchies. While it is very useful to conduct a hierarchical search to retrieve data, for example all Insecta data or all data in the genus Drosophila, in practice this is difficult because the term Insecta in one taxonomic resource can cover very different data from the same term in another. Enabling hierarchical queries in systems such as TreeBASE is therefore a challenge that requires delivering and maintaining taxonomic hierarchies that encompass current taxonomic opinion.

TCl-DB is a data warehouse of taxonomic names and classification hierarchies that can be layered over systems such as TreeBASE to enable hierarchical queries.

The TCl-DB TreeBASE wrapper can be found here.


E-MAIL: anwarn @ mskcc . org
E-MAIL: anwar @ cbio . mskcc . org
SKYPE: anwarnadia


  • Francisella tularensis novicida proteomic and transcriptomic data integration and annotation based on semantic web technologies
    BMC Bioinformatics 2009, 10(Suppl 10):S3 (1 October 2009)
    Nadia Anwar and Ela Hunt

  • Improved data retrieval from TreeBASE via taxonomic and linguistic data enrichment
    BMC Evolutionary Biology 2009, 9:93 (8 May 2009)
    Nadia Anwar and Ela Hunt

  • Semantic Data Integration for Francisella tularensis novicida Proteomic and Genomic Data.
    Semantic Web Applications and Tools for Life Sciences (SWAT4LS), November 2008, Edinburgh, Scotland.
    Nadia Anwar, Ela Hunt, Walter Kolch and Andrew Pitt

  • Taxonomic Support in Systematics, Doctoral Thesis University of Glasgow (2008).
    N. Anwar
    PDF (9.5M)

  • Taxonomy database as an enabling technology for the Tree of Life
    Workshop on Database Issues in Biological Databases (DBiBD), January 2005, Edinburgh, Scotland.
    N. Anwar
    DBiBD Proceedings

  • 12th International Conference on Intelligent Systems for Molecular Biology &
    3rd European Conference on Computational Biology (ISMB/ECCB2004)
    August 2004, Glasgow, Scotland.
    Poster - Taxonomy, Biology's first ontology, and the Tree of Life, Biology's grandest endeavour.
    N. Anwar


    I am currently going through the painful process of writing up my PhD. I haven't decided on a title yet. The official title is "Tooling up for the Tree of Life", but I have also been floating a very vague title, "Phyloinformatics". Essentially, I built a data warehouse of taxonomic names for the purpose of building a taxonomic backbone into TreeBASE. The warehouse contains names from multiple data sources so that there is broad coverage. The database differs from most other taxonomic name servers in that it also stores the individual classifications from each of the data sources, which can be used to perform hierarchical queries. The purpose of this database is to enable a kind of query expansion on taxon names. The best example is when someone submits a search using a vernacular name like birds: this name should translate (expand) to the Latin name (Aves) and hierarchically to include all children of the term.
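The query expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual warehouse code: the source names (source_a, source_b), the toy classifications and the function names are all hypothetical. It shows the two steps, vernacular name to Latin name and then Latin name to all of its children, and how the result depends on which source's classification is used.

```python
from typing import Dict, List, Set

# Hypothetical vernacular-to-Latin lookup (toy data).
VERNACULAR: Dict[str, str] = {"birds": "Aves"}

# Hypothetical per-source classifications, stored as parent -> children.
# Keeping each source's hierarchy separate mirrors the idea of storing
# the individual classifications rather than a single merged tree.
CLASSIFICATIONS: Dict[str, Dict[str, List[str]]] = {
    "source_a": {
        "Aves": ["Passeriformes", "Anseriformes"],
        "Passeriformes": ["Corvidae"],
    },
    "source_b": {
        "Aves": ["Neognathae"],
        "Neognathae": ["Passeriformes"],
    },
}


def descendants(children: Dict[str, List[str]], root: str) -> Set[str]:
    """Collect every name below root in one classification."""
    found: Set[str] = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def expand_query(term: str, source: str) -> List[str]:
    """Translate a vernacular term, then add all children from one source."""
    name = VERNACULAR.get(term.lower(), term)  # fall back to the term itself
    return [name] + sorted(descendants(CLASSIFICATIONS[source], name))


if __name__ == "__main__":
    for source in CLASSIFICATIONS:
        print(source, expand_query("birds", source))
```

A search for "birds" would then be run against the whole expanded list of names rather than the single submitted term.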