My Work on Protege
From 1998 to 2001, I worked as an AI Researcher at Stanford University (more precisely, at Stanford Medical Informatics, now known as The Center for Biomedical Informatics Research). I worked on the Protege knowledge representation framework.
Knowledge representation, the art of representing real-world information in a form that’s amenable to automated reasoning, is a deep and interesting subject which still fascinates me today. And Protege is still the world’s leading general-purpose knowledge representation framework.
At the time, I believed that the biggest barrier to progress in AI was the glut of tools: every research lab built its own from the ground up (see Survey on Ontology Construction Tools for a comparison of 13 roughly equivalent frameworks). Not only was this enormously wasteful and redundant, it also meant the community was using a large number of under-developed, buggy, and incompatible AI toolchains. As a result, there was very little reuse of either code or knowledge bases, and there were innumerable talks proving that this framework or that framework provided the most elegant expression of the Wine Ontology.
I tried to address this problem by making Protege more palatable to other research / AI groups worldwide. This effort involved three main activities:
- Being on the “open-source” side in the internal conversations. While it might seem like an obvious decision now, there was a lot of uncertainty around whether or not to open-source Protege. But many research labs told us that the big barrier to adoption was having their research rely on a tool that they didn’t control and which might go away. So we open-sourced it and adoption spiked.
- Defining a formal knowledge model and widget API, so that people could extend the app, write plugins, and generally understand the semantics of the underlying knowledge base. I reworked the Protege knowledge model to be OKBC-compliant (see the discussion in When Knowledge Models Collide) and then gave hours upon hours of talks about how the widget API and the knowledge model worked together. The best of these talks were:
  - Formal Aspects of Protege is an 80-slide talk on the underlying knowledge model, culminating in a detailed description of the Protege Axiom Language (PAL, also something I wrote).
  - The Awesome Power of Slot Widgets (Unleashed) is a much shorter talk on how to write a user-interface plugin for Protege.
- Evangelizing what we were doing with Protege and why it made sense for other research groups to adopt Protege (instead of writing or maintaining their own tools). Knowledge Modeling at the Millennium was the best paper I wrote about this, and it led to a plenary talk at the 12th Banff Workshop on Knowledge Acquisition, Modelling, and Management (KAW ’99).