I am a Research Scholar at the Memorial Sloan Kettering Cancer Center in New York, working on the BioPAX pathway exchange model and Pathway Commons.

Post Docs

University of Glasgow

Interfacing proteomic and genomic data. Post Doc in the RASOR project, where I was tasked with implementing technologies to interface proteomic and genomic data. The focus of the project is improved data handling, storage and distribution through an integrated LIMS, as the foundation for an integrated relational database. On the face of it, this is classic data integration of two heterogeneous data systems. However, given the nature of the data sources, actual integration would be minimal because proteomic and genomic data share few overlapping data elements. The data are semantically different, so they are not only difficult to integrate physically, but doing so would add little value to the data. Since the real reason for integration is to query the data as a unit, what matters to users is having the data in a form that can be queried across both sources. Semantic integration promises exactly this capability. I am using RDF, RDF-S and OWL to integrate genomic, proteomic and transcriptomic data.
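To make "querying across the data" concrete, here is a minimal sketch in plain Python rather than RDF tooling. It assumes each source can be expressed as subject-predicate-object triples. All identifiers and predicate names (`gene:G1`, `encodes`, `identifiedIn` and so on) are invented for illustration and are not taken from the RASOR data. It shows only that once both sources are triples, merging is a set union, and a query can follow a link from one source into the other.

```python
# Hypothetical triples: identifiers and predicates are invented for illustration.
genomic = {
    ("gene:G1", "encodes", "protein:P1"),
    ("gene:G1", "locatedOn", "chromosome:1"),
}
proteomic = {
    ("protein:P1", "identifiedIn", "sample:S1"),
    ("protein:P1", "hasPeptideCount", "12"),
}

# "Integration" here is just set union: no shared relational schema is needed.
graph = genomic | proteomic


def objects(graph, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the graph."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def samples_for_gene(graph, gene):
    """Follow gene -> protein -> sample, crossing from one source to the other."""
    return sorted(
        sample
        for protein in objects(graph, gene, "encodes")
        for sample in objects(graph, protein, "identifiedIn")
    )
```

Calling `samples_for_gene(graph, "gene:G1")` walks from a genomic triple to a proteomic one. Neither source can answer that question alone.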


Integration of available taxonomic hierarchies from online databases to facilitate easy information retrieval from TreeBASE.

The problem my PhD studies addressed was the use of different hierarchies in information retrieval. Differing opinions on taxonomic placement have meant that different taxonomic resources, such as NCBI and ITIS, use different taxonomic hierarchies. A hierarchical search is a very useful way to retrieve data, for example all Insecta data or all data in the genus Drosophila. In practice this is difficult, because the term Insecta in one taxonomic resource can cover a very different set of data from the same term in another. Enabling hierarchical queries in systems such as TreeBASE is therefore a challenge that requires delivering and maintaining taxonomic hierarchies that encompass current taxonomic opinion.

TCl-DB is a data warehouse of taxonomic names and classification hierarchies that can be layered over systems such as TreeBASE to enable hierarchical queries.
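The effect of competing hierarchies on a hierarchical query can be sketched in a few lines of Python. This is not how TCl-DB is implemented. The two child-to-parent maps below are invented stand-ins for two classification sources, and the placement of Collembola is chosen only to show a disagreement.

```python
# Hypothetical child -> parent maps; real NCBI and ITIS hierarchies differ from these.
source_a = {"Drosophila": "Diptera", "Diptera": "Insecta", "Collembola": "Insecta"}
source_b = {"Drosophila": "Diptera", "Diptera": "Insecta", "Collembola": "Entognatha"}


def descendants(parent_of, taxon):
    """Return every name placed below `taxon` in one hierarchy."""
    found = set()
    for name in parent_of:
        current = name
        seen = set()  # guards against accidental cycles in the input
        while current in parent_of and current not in seen:
            seen.add(current)
            current = parent_of[current]
            if current == taxon:
                found.add(name)
                break
    return found


# The same query term expands to different name sets depending on the hierarchy.
only_in_a = descendants(source_a, "Insecta") - descendants(source_b, "Insecta")
```

With these toy maps, a search for "Insecta" reaches Collembola under the first hierarchy but not the second. This is the kind of mismatch that a layered hierarchy store has to expose or reconcile.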

The TCl-DB TreeBASE wrapper can be found here.


E-MAIL: anwarn @ mskcc . org
E-MAIL: anwar @ cbio . mskcc . org
SKYPE: anwarnadia


  • Francisella tularensis novicida proteomic and transcriptomic data integration and annotation based on semantic web technologies
    BMC Bioinformatics 2009, 10(Suppl 10):S3 (1 October 2009)
    Nadia Anwar and Ela Hunt

  • Improved data retrieval from TreeBASE via taxonomic and linguistic data enrichment
    BMC Evolutionary Biology 2009, 9:93 (8 May 2009)
    Nadia Anwar and Ela Hunt

  • Semantic Data Integration for Francisella tularensis novicida Proteomic and Genomic Data.
    Semantic Web Applications and Tools for Life Sciences (SWAT4LS), November 2008, Edinburgh, Scotland.
    Nadia Anwar, Ela Hunt, Walter Kolch and Andrew Pitt

  • Taxonomic Support in Systematics, Doctoral Thesis University of Glasgow (2008).
    N. Anwar
    PDF (9.5M)

  • Taxonomy database as an enabling technology for the Tree of Life
    Workshop on Database Issues in Biological Databases (DBiBD), January 2005, Edinburgh, Scotland.
    N. Anwar
    DBiBD Proceedings

  • 12th International Conference on Intelligent Systems for Molecular Biology &
    3rd European Conference on Computational Biology (ISMB/ECCB2004)
    August 2004, Glasgow, Scotland.
    Poster - Taxonomy, Biology's first ontology, and the Tree of Life, Biology's grandest endeavour.
    N. Anwar

    You can't have two tables in one model!
    After creating a table you then create a model - this fails since the model already exists and as far as I can tell there is no SDO_RDF.add_to_model blah blah.

    This may not even matter since all the models are stored in the same network and the tables store references to the RDF data that are loaded into the network.

    So with better understanding: now the question is, one table/model for each RDF data file or throw everything together into one enormous table/model?? From the query examples I think I want to have all my data in one model as per

    select x, y, name from TABLE(SDO_RDF_MATCH(
    '(:Tom :hasParent ?y)
    (?y :hasFather ?x)
    (?x :name ?name)',
    SDO_RDF_Models('family'), ...));

    Need to figure out if SDO_RDF_Models('family','order','plan') is possible???
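    As a way of thinking about what a pattern query over several models would mean, here is a small sketch in plain Python. It is not Oracle's implementation and does not settle whether Oracle accepts several model names. The `models` dictionary, the `names` model and all the triples are invented for illustration. The idea is that a query over several models behaves like the same pattern match run over the union of their triples.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every pattern.

    Terms starting with '?' are variables; anything else must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        ok = True
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it agrees with an earlier binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield from match(rest, triples, candidate)


def query(models, names, patterns):
    """Match patterns against the union of the named models."""
    triples = set().union(*(models[n] for n in names))
    return list(match(patterns, triples))


# Hypothetical models: the data is split so that no single model can answer alone.
models = {
    "family": {(":Tom", ":hasParent", ":Mary"), (":Mary", ":hasFather", ":John")},
    "names": {(":John", ":name", "John Smith")},
}
patterns = [
    (":Tom", ":hasParent", "?y"),
    ("?y", ":hasFather", "?x"),
    ("?x", ":name", "?name"),
]
result = query(models, ["family", "names"], patterns)
```

    Here the three patterns only resolve when both models are queried together. This is the behaviour I would hope for from a multi-model query, if it is supported.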