PIMO - a PIM Ontology for the Semantic Desktop
(draft)
Author: Leo Sauermann, DFKI GmbH, Kaiserslautern
Date: current version 2006-07-19, first created 3.12.2005
Technical Report - DRAFT!
Abstract
The Semantic Desktop needs a well-thought-out use of ontologies and ontology languages. Existing ontology languages like RDF/S, OWL, SKOS and Topic Maps are very well suited for certain application areas, but do not fulfill some of the requirements of the Semantic Desktop. In this report, a new ontology language extending RDF/S is proposed: the PIMO ontology language, which addresses the requirements of the Semantic Desktop and uses existing solutions as an inspiration to build a suitable solution. The language contains a core upper ontology that defines basic classes for things, concepts, resources, persons, etc., and stops at these basic entities. The definitions of classes and relations can be extended by PIMO domain ontologies. The core application area of the PIMO language is to allow individual persons to express their own mental models in a structured way; the different mental models can then be integrated based on matching algorithms or on domain ontologies. Based on the core upper ontology elements, each user can extend his personal mental model in an open manner.
Accompanying the description of the ontology is an RDF/S version of the ontology language, created using the popular Protégé tool. An example of the mental model of a user is given; the fictional user "Paul" is described further. With the gnowsis-beta open source software, an implementation based on the PIMO language exists that allows validating ontology files. A web service for that will be provided soon. The ontology language, this document, the open-source reference implementation and the example documents can provide a stable basis for discussions on this topic and allow you to extend your own work.
Purpose and target audience
This report contains a description of the problems faced with personal ontologies, gathers requirements and shows a possible ontology language for personal ontologies. Example data and a user interface mockup give more practical hints. Common problems in the field are not ignored but addressed, and possible solutions are suggested. The target audience is ontology authors and application developers who are interested in creating ontologies for the purpose of personal information management. In addition, researchers from the Semantic Desktop field will find a summary of the practical and theoretical background of our work. The gnowsis developers at DFKI are a good example of such a group.
As technical architecture we will reference the Semantic Web, so you should be familiar with the RDF(S) and OWL standards and know how to build ontologies using OWL or RDF(S). We will also reference some semantic and philosophical ideas from Topic Maps, to illustrate the problems and solutions from a different angle. The reader is expected to have a level of familiarity with RDF(S) that at least corresponds to the tutorial material in [RDF-PRIMER], [RDFS], [OWL]. Knowledge about Topic Maps would be very helpful to understand our ideas [TM-empolis]. For a background on the technical aspects of the Semantic Desktop, you may refer to [Sauermann2003]. We will also refer to philosophical findings and give an introduction to these in this document.
Contents
Introduction
Language used in this document
Requirements in Personal Information Management (PIM)
Communication, Import, and Merge
Analysis of Existing Ontology Languages
The PIMO Solution
Areas in the land of PIMO
Representing Things
Identification in the PIMO
Metadata of a PIMO
Importing domain ontologies - what's with foaf?
Hiding things
Automatic Creation of Things from ResourceManifestations
Validation of a PIMO
User Interface Example
Summary
Acknowledgements
Frequently Asked Questions
References
Footnotes
Introduction
Traditionally, Personal Information Management (PIM) is about managing calendars, contacts, and e-mails. If we make a reality check, every piece of information a person encounters has to be managed. Hence PIM in a digital world is about everything accessed by a person through his or her computer, not just calendars, contacts, and e-mails.
Googling for PIM, we find products like KDE-PIM, which solves problems similar to those addressed by Microsoft Outlook, synchronisation tools like amphire or intellisync, and niche products like the text processor treepad. On wikipedia, the personal information manager roughly covers what can be categorised as "Microsoft Outlook lookalike". To manage information as such, meaning ideas and thoughts, we find products that allow us to express our ideas using bubbles and arrows: mind-manager or mind-mapping tools. They might even be standardized like topic maps [XTM] or use the Resource Description Framework [RDF-PRIMER] like mindraider does. There are integrated research prototypes like haystack which also aim at PIM in a very elegant way.
In 2003, Leo Sauermann started the gnowsis project, which aims at building a so called "Semantic Desktop", a platform that extends the operating system to allow an enhanced PIM, based on Semantic Web technology.
Experiences from the gnowsis Semantic Desktop
Authoring and using ontologies for personal information management is not enough, in the gnowsis project we aimed at bringing the ontology into the daily applications of the end user. Here is a short description of the main ideas behind gnowsis. The Semantic Web is a way of enriching the existing WWW with data. This technology may also be used on a personal computer. Similar to the Web, a server application publishes the available data. This local Semantic Web Server can be used to access the information stored on the PC. Files and other resources can be used like Web resources. They are identified by their URI and links can be created from one document to another. The existing data is available in the RDF syntax and can be used by any application. Interaction between the applications is simplified by these standards.
By taking the Semantic Web technologies and using them on a personal computer, a semantic layer in system architecture is created. Parallel to the existing access to resources using already installed applications, they can be accessed through the Semantic Web server. The resources are identified independent of their file format. Based on this layer, a system-wide classification system can be created. Concepts like private or business are represented once in the system and can be reused in all applications. The class my XML Lecture is represented similar to a group in a web directory (like yahoo), the class can contain files, emails, webpages, etc. All resources can be identified by their URI and can be opened using a browser like program. A crawler similar to google can be built on top of such a system to create a full text index of all existing text data.
The links between local resources are bidirectional, this is an extension of the common WWW principle of unidirectional links. A relation between two resources can be entered at one place into the system and can be found from all connected resources.
As shown by these examples, bringing web technologies to a personal computer can be used in many ways. Accessing the data on the computer through the interface of a Semantic Web server creates a new perspective of existing data, they can be addressed by URIs and organized in web directories etc. (from [Sauermann2003], page 15, with new highlighting).
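The bidirectional links described in the quoted passage can be illustrated with a small sketch. This is not gnowsis code: the class `LinkStore`, its method names and the file URIs are assumptions made up for this example. It only shows the idea that a relation is entered once and can then be found from either of the connected resources.

```python
from collections import defaultdict


class LinkStore:
    """Toy store for links between resources identified by URI (hypothetical)."""

    def __init__(self):
        # Each index maps a URI to a set of (relation, other URI) pairs.
        self._outgoing = defaultdict(set)
        self._incoming = defaultdict(set)

    def add_link(self, source, relation, target):
        # The relation is entered once but indexed from both ends.
        self._outgoing[source].add((relation, target))
        self._incoming[target].add((relation, source))

    def links_of(self, uri):
        """Return every link that touches the given URI, in either direction."""
        found = [(uri, rel, other) for rel, other in self._outgoing[uri]]
        found += [(other, rel, uri) for rel, other in self._incoming[uri]]
        return sorted(found)


# Example URIs are invented for illustration.
REPORT = "file:///home/paul/report.doc"
PHOTO = "file:///home/paul/rome.jpg"

store = LinkStore()
store.add_link(REPORT, "relatedTo", PHOTO)

# The same link is visible from the photo and from the report.
print(store.links_of(PHOTO))
print(store.links_of(REPORT))
```

Keeping two indexes is one simple design choice; a triple store that can be queried by subject or by object would serve the same purpose.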
Existing applications are not replaced but instead enhanced with semantic features. Users can create an ontology and use it from within their normal e-mail client or text processor. Software plugins to these applications can connect to it and relate the current file to another photo using semantic web technology.
Using the standard RDF ontology languages, we came to a dead end when faced with complex problems in the field of PIM. The RDF standard and especially OWL and RDF(S) as ontology languages are not enough for personal information management. During the last years of implementation and testing, these flaws in existing approaches appeared:
- RDF/S and OWL as ontology languages do not provide basic semantic relations. We especially missed part-of relations.
- We employed a mixture of RDF/S and OWL ontologies on the semantic desktop, leading to confusion. For an integrated project, the ontology basis should be the same. But public best-practice (like in foaf) is to mix RDF/S and OWL elements.
- The distinction between a property that just describes a resource and a property that draws relations to other resources is not clear. ObjectProperties of OWL give a syntactic means of writing this down, but the semantics are not clear.
- The next problem often occurs when users begin to work with Protégé and OWL ontologies: they come to a point where a ternary connection is needed (or, in database speak, an N-N relationship with additional properties). They then face a conceptual problem and either use RDF reification to annotate relations or give up on reification (it is painful to implement, cannot be described in ontologies, etc.).
- The representation of concepts is not clearly stated. How do we represent "love"? Or, simpler, the "w3c homepage at www.w3.org" versus the document found at "http://www.w3.org/"? Ontology engineers using RDF/S and OWL are not trained to distinguish between concepts and resources; people from the Topic Map or SKOS community are.
- What does a URI identify? What is the right URI to identify the concept of "Love", and then again - as every person sees the world in his/her own way - how to find this URI? Many different approaches were suggested for identifying concepts using URIs, but no solution can be recommended as best practice.
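To make the problem of relations with additional properties more concrete, here is a minimal sketch of one common workaround: the relationship itself becomes a node that can carry properties. This is an illustration of the problem, not the PIMO solution and not RDF reification; the identifiers (`paul:membership1`, `Membership`, `since`) and the helper names are invented for the example.

```python
# Triples are plain (subject, predicate, object) tuples; all names are made up.
triples = set()


def add_relation_node(node_id, relation_type, participants, **details):
    """Represent one relationship as its own node so it can carry properties."""
    triples.add((node_id, "type", relation_type))
    for role, thing in participants.items():
        triples.add((node_id, role, thing))
    for key, value in details.items():
        triples.add((node_id, key, value))


def describe(node_id):
    """List all (predicate, object) pairs stored for a node."""
    return sorted((p, o) for s, p, o in triples if s == node_id)


# Paul is a member of Acme Inc. since 2005: the "since" value belongs to the
# relationship, not to Paul and not to the organization.
add_relation_node(
    "paul:membership1",
    "Membership",
    {"member": "paul:Paul", "organization": "paul:AcmeInc"},
    since="2005",
)

print(describe("paul:membership1"))
```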
- First and foremost, the Enquire system by Tim Berners-Lee. In its semantic ideas and architecture, this system is quite close to PIMO. Some parts of its ontology are mirrored in our PIMO-language. Enquire led to the WWW, which led to the Semantic Web. One of our goals is to bring Enquire back to the Semantic Web. Luckily, the manual of Enquire is online: [TimblEnquire].
- The OWL ontology language. It features very precise properties like inverseFunctionalProperties or inverseProperties [OWL]
- A paper by Jean Rhomer on 10 years of practical experience in PIM [Rhomer2005] which claims that inverseProperties and other things are a prerequisite for such systems.
- A paper by Huiyong Xiao and Isabel F. Cruz describing a Multi-Ontology Approach for Personal Information Management [Xiao2005]. They suggest to use different ontologies for resources and concepts and then map between them.
- The SKOS ontology language and the ongoing SKOS effort. It shows how concepts can be modeled, starting in the thesauri and taxonomy domain but extensible for other domains [SKOS-core, SKOS-Effort].
- The Topic Map standardisation effort and its end-products. Philosophically seen, this standard offers many more ways to express mental models than OWL or SKOS allow. Merging ontologies, representing documents separated from the ontology, and identification issues are well defined here. The usability of topic maps has been proven in many products that are sold on a growing market [XTM, TM-empolis, TAO, TM-survey].
- Publicly available thesauri, aiming at representing the world in a semantic way. One is the wordnet project, mapped to RDF by Dan Brickley. Semantic explanations of most day-to-day things are available here (example: car). [Wordnet], [Wordnet-RDF]
- Publicly available social information tools like wikipedia. Here we also find nice explanations of the term car, with the advantage that everybody can extend it.
- The whole field of linguistics, cognitive science and philosophy. Sadly, the findings of these important knowledge bases are not used in many Semantic Web projects today, and we have to look at ideas from them to fully understand what problem we really want to solve with a PIM ontology. Gestalttheorie, a constructivist view of the world and other secrets like the relation between things and their representation are such findings.
Finally, we confront the frequently asked questions that will typically come to mind after reading this report, and try to answer them.
Language used in this document
As we cross different ontology languages (RDF, SKOS and Topic Maps) and also refer to cognitive science, reading this document is not so simple if you do not think in the same language as this document was written in.
- thing: The top of our ontology, everything is a thing. Abstract concepts, physical objects, people are things.
- resource: A web page, a file on your hard-disk, any way information can manifest inside a computer system. Resources go beyond normal files: in the RDF world, a resource can also be an address book item, an entry in a database or an appointment in your calendar. In Topic Map terms, this is an occurrence.
- concept: A thing that represents the mental conceptualization of a user inside his information model, stored on a computer. The mental conceptualization is part of the cognitive system of the person; the concept in the information model is the point where such a mental conceptualization can be represented, identified and visualized on a computer system. Note that concepts are subjective to the person: as each human being has his own cognition, the concepts in the computer are tightly coupled to that. The term grounding can be used here: the mental conceptualization is grounded in the concept (for grounding, see also [Haikonen2003]). In Topic Map terms, this is a topic.
Requirements in Personal Information Management (PIM)
Common "Microsoft Outlook and its lookalikes" tools offer a way to write down data to be used for PIM:
- calendar & appointments: date & place, attendees, invitations, recurring events, synchronising
- address book: people, addresses, phone numbers, portrait photos, ...
- process support: project management, todo lists, tasks, etc
- e-mails: sending and receiving e-mails, hopefully integrated with above
- categories: assign colors to anything, named tags, labels like "important" or "can delete this"
Beyond the normal look at PIM, which is address book, calendar, todo lists and e-mails, we want to help people use computers to organize their information life. People organize their files in a very personal way, and often they employ a categorization scheme for their file folders or bookmark folders. The way of classifying files has grown over years and cannot be reproduced by others - to the same extent that people differ, their filesystems differ. On MS-Windows filesystems (without soft links) each file remains in exactly one folder and folder hierarchies cannot loop. This classification system only allows the creation of taxonomies, where one file can be placed in one class. So typically, the folders have partitive semantics: subfolders are part-of the super-folder and cannot be part-of any other folder.
But these schemes only apply to file systems, with the semantic features of RDF we can go beyond these basic features. Based on the previous work mentioned above and other existing ontologies [SKOS-Effort, XTM], we assume that people will need the following features on a Semantic Desktop:
- a representation of abstract concepts: Love, Rome, Acme Inc.
- a representation of concrete, addressable resources: "w3c homepage at www.w3.org"
- a representation of documents: "the document at http://www.w3.org/"
- multiple names for a thing: "Love", "Liebe", "W3", "WWW"
- same name for two different things: "Apache - helicopter", "Apache - software".
- class-subclass relations: a subclass has all properties of the superclass + its own
- class-instance relations
- part-of relations: the city of Rome is part of Italy
- related information: Spaghetti is related to Italy
- data properties to describe details: Rome has a population of 2.8 million
- document-has-topic: the document "http://www.w3.org/2001/sw" is about the "Semantic Web"
- a representation of time: the document was created in 2005. The project started on 1.1.2006
For a detailed look at part-of and has-part relations, see the OWL W3C draft on that topic [OWL-partof].
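Several of the features listed above can be illustrated with a small data-structure sketch. It is not the PIMO language itself: the `Thing` dataclass, its field names and the `paul:` identifiers are assumptions chosen for this example. It shows multiple names for one thing, the same name for two different things, a part-of relation and a data property.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """Hypothetical record for one thing in a personal ontology."""
    uri: str
    types: set = field(default_factory=set)      # class-instance relations
    names: set = field(default_factory=set)      # multiple names per thing
    part_of: set = field(default_factory=set)    # part-of relations
    data: dict = field(default_factory=dict)     # data properties


things = {
    t.uri: t
    for t in [
        Thing("paul:Love", {"Topic"}, {"Love", "Liebe"}),
        Thing("paul:Italy", {"Country"}, {"Italy"}),
        Thing("paul:Rome", {"City"}, {"Rome"}, part_of={"paul:Italy"},
              data={"population": 2_800_000}),
        # Two different things that share the name "Apache".
        Thing("paul:ApacheHelicopter", {"Topic"}, {"Apache"}),
        Thing("paul:ApacheSoftware", {"Topic"}, {"Apache"}),
    ]
}


def find_by_name(name):
    """A name is only a label: it may match several different things."""
    return sorted(uri for uri, t in things.items() if name in t.names)


print(find_by_name("Apache"))
print(find_by_name("Liebe"))
print(things["paul:Rome"].part_of)
```

Because names are kept separate from the identifier, renaming a thing or adding a translation does not change its identity.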
The gnowsis and semantic desktop experience
From the experience of creating gnowsis, and an analysis of our work, we infer these requirements for a PIM system:
- bring existing resources, files, e-mails, appointments, web-sites etc. into the ontology
- separate resources from concepts
- bring the ontology into existing applications via plugins
- publish the ontology via a PIMO API so that the plugins work
- allow the use of concepts throughout all software, similar to tagging
- tagging documents/things using keywords.
- allow the user to annotate any concept using free text and also data properties
- make the use as simple as a wiki and a blog combined
- allow the linking of any concept to any other concept
- populate the ontology automatically by analyzing the user's data
- relate existing file folders and email folders to concepts
- synchronize folder structures with ontology structures
- import external ontologies by copying, connect the copied concepts to the originals.
- try to be compatible with the beauty and buzz of the FOAF-community
The personal file-system of the user is a key point on the Semantic Desktop. There are many semantics to be found in folder structures: "I put project related files in the project folder. Inside the project folder, there is a sub-folder for each project". Here we find first the class "Project" and then instances of this class. Another semantic relation may be that of time: "All files created in the year 2005 are in the 2005 folder. Inside this folder, I have sub-folders for each month. The folder January is then a part-of 2005". A partition based on topics is possible: "I have a folder Topics where I put documents by topic inside the folder. I have the topics NewProducts, CRM, Sales, Accounting, Paragliding". Each file in the folder CRM would then have the topic CRM (customer relationship management). These implicit semantic relations hidden in the file-system, known only to the user, can be expressed explicitly. Files inside a folder called "Marketing" are automatically assigned the semantics of "files having topic Marketing". At DFKI KM Lab we also evaluated different methods of using this information for natural language processing and text indexing; connecting text documents to ontologies opens many doors.
The creation of subclass or instance relationships can also be automated based on folder structures. Assume the user has a folder /myFiles/Topics/ containing many subfolders, and each subfolder represents exactly one topic; instances of the class "Topic" can then be created automatically from these folders. Users can write rules that say what is part of a folder and what is not. We see these in popular applications like iTunes, where you can create smart playlists (see footnote 2). In the upcoming Windows file system WinFS, we find a similar feature. From the scientific view, smart or active folders have been discussed starting with semantic filesystems [SemFs]. We will name these features Active Folders in the discussion.
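The automatic creation of instances from folders can be sketched as follows. This is a minimal illustration under stated assumptions, not the gnowsis implementation: the function `things_from_folders`, the dictionary layout and the example file names are invented, and the rule applied is the simplest one from the text (each direct subfolder of a Topics folder becomes one instance of the class "Topic", and its files get that topic).

```python
import tempfile
from pathlib import Path


def things_from_folders(topics_root, class_name="Topic"):
    """Create one instance of class_name per direct subfolder of topics_root.

    Files directly inside a subfolder are recorded as documents having
    that folder's topic.
    """
    instances = {}
    for folder in sorted(Path(topics_root).iterdir()):
        if not folder.is_dir():
            continue  # loose files in the root do not define a topic
        files = sorted(f.name for f in folder.iterdir() if f.is_file())
        instances[folder.name] = {"type": class_name, "documents": files}
    return instances


# Build a throwaway folder tree so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "myFiles" / "Topics"
    for topic, filename in [("CRM", "customers.txt"), ("Sales", "q1.txt")]:
        (root / topic).mkdir(parents=True)
        (root / topic / filename).write_text("example")
    result = things_from_folders(root)

print(result)
```

A real Active Folder would also need user-defined rules and synchronisation in both directions; both are left out of this sketch.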
To define and implement Active Folders, part of the PIMO has to include the corresponding classes and properties. An API-focused part of the ontology will deal with these.
A very important need that has arisen over the years is the separation between resources and mental concepts. Right from the beginning of the project, also in discussions with others, we felt strongly about separating resources (be they files, e-mails, web-pages) from concepts in the mental model. Although files, web-sites, and e-mails are very important, they are just artifacts that contain and reference our mental models. This can also be seen in the Topic Maps standard, where there are two manifestations of documents. Using the language of this standard, it is possible to say: First, the web site "http://www.dfki.de/" is a topic of interest I work on. I represent it using the topic LeoTopics#www.dfki.de. I can identify it using the field subjectIdentity and resourceRef, an addressable subject. Second, the topic of the company DFKI can be modeled using a topic LeoTopics#DFKI. As the DFKI website (as a document) mentions the topic DFKI, I can connect it via an occurrence. You can find a detailed description of this topic in the Q&A section.
From practical experience we have noticed that many topics are created with mistakes. At some point, the user may notice that "Rome" is not only a Topic, but a city. Therefore, the class of a thing has to be changeable. Also, the user may accidentally create two concepts that mean the same, for example "Rome" and "Roma". It then has to be possible to merge different things, and to delete them. All of these operations require checking the user's actions for validity. If "Rome" has a property that only a "City" can have, then either the type cannot be changed or the property is removed during the process. This checking should be rule-based. This results in the following requirements:
- active folders and automatic creation of things
- changing the type of a thing
- multiple types for one thing (although - maybe not)
- merging two synonymous things (merging two things that mean the same)
- deleting things
- rule-based checking of consistency
- rule-based checking of change actions before they are executed
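The requirements on changing types and merging things can be sketched in a few lines. This is only an illustration under assumptions: the rule table `PROPERTY_DOMAINS`, the dictionary layout of a thing and the function names are hypothetical and not part of the PIMO language. The check runs before the change action is executed, as the last requirement asks.

```python
# Hypothetical rule table: which properties are only allowed on which class.
PROPERTY_DOMAINS = {"population": "City", "mayor": "City"}


def check_type_change(thing, new_type):
    """Return the properties that would become invalid under new_type."""
    return sorted(
        prop for prop in thing["properties"]
        if PROPERTY_DOMAINS.get(prop, new_type) != new_type
    )


def merge(keep, duplicate):
    """Merge two things that mean the same; values of `keep` win on conflict."""
    return {
        "type": keep["type"],
        "names": keep["names"] | duplicate["names"],
        "properties": {**duplicate["properties"], **keep["properties"]},
    }


rome = {"type": "City", "names": {"Rome"}, "properties": {"population": 2_800_000}}
roma = {"type": "Topic", "names": {"Roma"}, "properties": {}}

# Changing Rome from City to Topic is blocked by its City-only property.
blocking = check_type_change(rome, "Topic")
print(blocking)

# Changing Roma from Topic to City is fine, so the two can be merged as a City.
if not check_type_change(roma, "City"):
    merged = merge(rome, roma)
    print(sorted(merged["names"]))
```

Whether a blocking property stops the change or is removed during the change is a policy decision; the sketch only reports the conflict and leaves that choice to the caller.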
Identification Issues - how do we identify the concept "Love", and what does a URI identify?
When people start to use Semantic Web technology to model ontologies, at a certain point they all hit a barrier, concealed in questions like: Which URI do I give to the concept "Love"? The true ontologist wants an answer that fits "the world", meaning that many other people will also agree to use the same URI to identify Love. That is the spirit of shared understanding, and that is also where the problems start. The question is: what URIs are good to identify things, documents, concepts, persons, places, etc.? On top of that, are these URIs useful in a distributed scenario, or are they only useful on a single computer? Do they reflect a public agreement, are they standardized? A PIMO system has to cope with something known as the "web's identity crisis" or "URI crisis". The problem is that we use URIs to identify all of the above: we have URIs for documents, URIs for topics, URIs for everything. But what does a URI identify? Tim Berners-Lee wrote an interesting article to clarify this problem; here is an exercise [TimblUri2002]:
1) What does "http://www.amazon.com/exec/obidos/ASIN/0679600108/qid=1027958807/sr=2-3/ref=sr_2_3/103-4363499-9407855" identify?
- A whale
- "Moby Dick or the Whale" by Herman Melville
- A web page on Amazon offering a book for sale
- A URI string
- All the above
2) When was the thing it identified last changed?
3) Have you read the thing it identifies?
So basically, it is not so easy. Even if you think that you found a URI that identifies something, it may identify too much. That is why the article did not solve the problem, and people discussed it on mailing lists forever, creating the impression that we are all stuck in an infinite loop of arguments. People stated that their URI scheme is better than yours. So Tim Berners-Lee wrote another article in 2005 about the same problem [TimblUri2005]. The proposed solution from the Technical Architecture Group (TAG) is basically this (abbreviated):
"The W3C Technical Architecture group eventually decided to resolve the architectural problem that if an HTTP response code of 200 (a successful retrieval) was given, that indicated that the URI indeed was for an information resource, but with no such response, or with a different code, no such assumption could be made. This compromise resolved the issue, leaving a consistent architecture."
This may serve well from the technical side, but the semantic problem remains:
- Where do we get URIs that identify topics/things/concepts?
- Where is the URI for LOVE?
- Shouldn't we place an HTML website at this URI so that people can click on it? (which the TAG solution seems to discourage).
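The TAG compromise quoted above can be written down as a very small decision rule. The sketch below encodes only what the quote says (code 200 indicates an information resource; any other code, or no response, allows no assumption). The function name and the returned labels are assumptions for illustration, and no network access is performed.

```python
def classify_by_status(status_code):
    """Apply the compromise as summarized in the quote above.

    200 means the URI identifies an information resource (a document);
    any other outcome, including no response (None), allows no assumption.
    """
    if status_code == 200:
        return "information resource"
    return "unknown"


for code in (200, 303, 404, None):
    print(code, "->", classify_by_status(code))
```

The rule says nothing about which URI to choose for a concept like "Love", which is exactly the remaining semantic problem listed above.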
Steve Pepper has written an article on how to cure the web's identity crisis [Pepper2004]. The argumentation there sounds correct. He proposes to solve the distinction of resource versus concept by using Topic Map syntax. Topic Maps do distinguish between addressable subjects and non-addressable subjects. The use of resourceRef or subjectIndicatorRef sends a clear message about the topic in question: it is addressable (it is a document) or it is not addressable (it is a concept), and in the Topic Map community one URI is usually not used in both roles. Also, the Topic Map community treats all concepts in the context of a Topic Map - every resource (be it addressable or not) is first re-instantiated in a local Topic Map using a local identifier (usually something like <topic id="opera">).
Pepper suggests expanding the use of rdf:about with rdf:subject and rdf:indicator to extend the identification. But Pepper did not take the last needed step: to create a new view on RDF where each resource is re-instantiated in the current ontology. He does not propose to create topics versus resources. We tend more towards SKOS: there, each topic is called a Concept and is identified uniquely inside the SKOS ConceptScheme, using the URI of the concept scheme [SKOS-Effort].
Summing the identification problem up:
- To identify concepts, resources, people, books, ... we need to agree on a URI to identify it. There may be multiple URIs possible to identify something.
- For every URI we use, there are multiple ideas the URI identifies.
Philosophical and Linguistic Background Information
Seen through the lens of a constructivist worldview, we assume that people abstract from things perceived in the real world to a level of internal representation. A real world object is perceived as a sensory impression and then triggers an internal representation, which might resolve to a mental concept. For example, the perception of the color red in the real world leads to a signal in the eyes which every human eye decodes the same. But each person may associate it with a mental concept of their own, which may or may not be the same as that of another person. At this point, perception determines how the signal is matched against previous signals and other mental concepts. At least communication of the mental concept can happen via language - then the perception of the red lightwave is written or said as being "a red light", "rouge" or "ein rotes Licht". So, from the philosophical view, we distinguish between the following manifestations:
- real world things (red light waves, a car, the chair I sit on right now)
- perception of real world things
- abstract topics (time, "love", organisations) that people can perceive and express
- mental representation of things and topics
- language we use to express our mental representation
- documents containing this message
- to come: semantic documents that contain the message better!
This model of information representation is relevant for the Semantic Desktop for two reasons:
- each manifestation from above should be considered in the ontology
- when two people communicate, a concept switches manifestations and hopefully less information is lost when we do this right
In semantic web communication, the communication process is different. The message is much more precise than normal verbal or textual communication. Single words from the text can be annotated using the related concepts from the sender's ontology. The annotation of certain words (or the whole message) can be sent together with the message to the receiver. As an example, Paul sends an email to Peter that mentions Paul's concept of Rome in the text. In the packaged metadata sent within the email, Paul's concept of Rome is described using Paul's way of thinking. paul:Rome is linked to a public identifier from a website like wikipedia or wordnet. Thus public thesauri like wikipedia or wordnet can play a role in this scenario as the "Platonic Representation", namely a public identifier that can be used to identify the concepts Paul and Peter talk about. We assume that the cultural backgrounds of Paul and Peter overlap enough that both recognize Wikipedia and can relate to the concepts described there. But this requires that both users have some personal ontology at hand. We hope that this way of communication is superior to normal e-mails today.
Communication, Import, and Merge
From the examples raised in the philosophical requirements, we see the need for communication between and merging of different PIMOs.
If I want to represent IMDB movies in my ontology, I would have to include a movie ontology and the IMDB website somehow. The external movie ontology defines its own public identifiers, etc. When representing the movies in my PIM, they should be linked to the IMDB website to allow identification. When merging with others and communicating, how do we do this?
- importing other RDF(S) ontologies and use their properties and classes in some way
- publish ontologies expressed using PIMO
- publish a message that contains part of a PIMO
- merge ontologies
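One way to approach merging is to align concepts from two personal ontologies through the public identifiers they are linked to, as in the Rome example above. The following is a minimal sketch of that idea only, not a PIMO merge algorithm: the function `align`, the mapping layout and all identifiers, including the Wikipedia-style URLs, are illustrative assumptions.

```python
def align(pimo_a, pimo_b):
    """Pair up concepts from two personal ontologies that are linked to
    the same public identifier."""
    by_public_id = {public: local for local, public in pimo_b.items()}
    return {
        local_a: by_public_id[public]
        for local_a, public in pimo_a.items()
        if public in by_public_id
    }


# Local concept URI -> public identifier it is linked to (all values made up).
paul = {
    "paul:Rome": "http://en.wikipedia.org/wiki/Rome",
    "paul:Love": "http://en.wikipedia.org/wiki/Love",
}
peter = {
    "peter:Roma": "http://en.wikipedia.org/wiki/Rome",
    "peter:Car": "http://en.wikipedia.org/wiki/Car",
}

alignment = align(paul, peter)
print(alignment)
```

Concepts without a shared public identifier stay unmatched; they would need a matching algorithm or a domain ontology, which this sketch does not attempt.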
Non-Functional Requirements
From the technical side, our goal is to find a solution that is scalable and finds wide adoption in PIM applications:
- a simple ontology language (minimal number of classes and properties)
- easy to understand by developers
- easy to understand by end users
- compatible with best practice in current Semantic Web projects.
- documented
- examples of data models
Analysis of Existing Ontology Languages
The existing ontology languages can be compared to some of the requirements that were listed above.
Ontology Languages for PIM
| Features | RDF-Schema | SKOS | Topic Maps |
|---|---|---|---|
| PIM | | | |
| abstract things | = may use rdfs:Class | X concept | X topic |
| addressable things | X resources | X concept | X occurrence, topic |
| documents | X resources | = may use rdfs:Resource | X occurrence |
| naming: different names for one thing | = may use different languages | X different labels | X language, scope, variant names |
| naming: same name for different things | X | X | X |
| class-subclass relation | X | - no inheritance | X |
| class-instance relation | X | X narrowerInstantive (see footnote 1) | X |
| part-of relation | - (suggested for OWL) | X narrowerPartitive (see footnote 1) | - |
| related information | - | X skos:related | X association |
| data properties | X rdf:Property | - | = occurrences with resourceData, but not typed |
| document has topic | - | X skos:subject | X occurrence |
| time | - | - | - |
| Identification | | | |
| identification of documents | X URI | = uses URIs of rdfs:Resource | X resourceRef and xlink |
| identification of topics/things/concepts | X URI | X URI | X URI |
| approach to identify things across ontologies | - | = mapping ontology allows mapping, skos:primarySubject may point to identity | X addressable subjects with resourceRef, non-addressable subjects with subjectIndicatorRef |
| Communication | | | |
| importing other RDF(S) ontologies | X via owl:imports or implicit by namespace | - | = (Topic Maps can be imported via mergeMap directive) |
| publish ontologies | X | X | X |
| merge ontologies | - no explicit rules how to merge semantically | X via mapping ontology | X via mergeMap directive |
Symbols:
- X : feature fulfilled using this technique
- = : feature may be fulfilled
- - : feature not given
The PIMO Solution
In this section we describe the PIMO ontology. It consists of:
- a new ontology language that can be used to express personal mental models. This will be called the PIMO-language
- An initial fill of the PIMO-language with a few basic concepts that many users will need and also programmers will like. It serves as "upper ontology" and contains people, places, topics, documents, time.
- an example ontology of a person called Paul.
- instructions how to implement software with the ontology
- user interface mockups that show how to interact with it
The ontology itself and the accompanying example can be downloaded here: [PIMO-lang, PIMO-example]
Areas in the land of PIMO
If the PIMO of a person is like a landscape, we can find different counties and regions in this mental land. When the user starts to explore the land of his own knowledge, the first area he reaches is Thing. There, the mental models are represented on a conceptual level. It is the land of knowledge, where a user models concepts that the computer can reflect, expressing ideas as near to the mental models as possible. Going further he comes to ResourceManifestation, where the files and web pages start. They are connected to the Things. Then there are the mysterious caves of metadata about the ontology, where only programmers have to find their righteous path: InformationSystemConcept.
- the class Thing and its subclasses: these serve as representation of mental concepts of the user. Here you find people, places, organizations, processes, documents, projects and topics (colloquially: tags). The user can express mental concepts using a semantic PIM application like mindraider, gnowsis or any other application that supports it.
- the class ResourceManifestation and its subclasses: instances of ResourceManifestation handle resources found on digital systems. Files, e-mails, webpages, address book items, BibTeX items and BibTeX files all live here. The user creates instances of these by working in native applications like the filesystem or PIM applications like KDE-PIM or MS Outlook. Programmers are responsible for extending this part of the ontology.
- the class InformationSystemConcept which belongs to the machine room of the developers. Here metadata about the user and his PIMO are collected, namespaces defined, classes for annotations, meta-classes etc. Nobody should have the urge to change this part of the ontology, but programmers are encouraged to use it.
Representing Things
The subclasses of pimo:Thing and their instances are used to represent concepts that come from the user's mental model. To the right you see the basic class structure provided by the ontology. We find classes for a person, a group, a location, document types, organization, event, task, project and topic. If we have a person called Paul, we can create Paul as an instance of Person. If Paul wants to represent his friends Peter and Tim, he can also create them as instances of Person. When they are working in the city of Rome, they can create the thing Rome as being a city. When they plan to start a new branch office of their company in Rome, Paul can create the project BranchOfficeRome as being a project. If Paul has a meeting with others about the status of the project, he can create the ProjectStatusMeetingRome. You may have noticed that the labels given to these things are words without funny characters or spaces. For labeling things, we suggest using wiki-names (footnote 3). You might label things any way you want (from the technical point of view), but for these examples and for practical reasons, wiki-names are much better. In our example, instances of the class pimo:Thing are:
- Paul, Peter, Tim: instances of pimo:Person
- Rome: instance of pimo:City
- BranchOfficeRome: instance of pimo:Project
- ProjectStatusMeetingRome: instance of pimo:Meeting
In Paul's PIMO, these things are identified using URIs from his personal namespace:
- paul:Paul (Paul's concept of himself)
- paul:Peter, paul:Tim, paul:Rome, paul:BranchOfficeRome, paul:ProjectStatusMeetingRome
Paul can also define his own classes and create instances of them:
- paul:BusinessPlan rdfs:subClassOf pimo:Document
- paul:BusinessPlanRomeBranch rdf:type paul:BusinessPlan
Naming and displaying things
Things, classes and also properties can be labeled. The primary way of naming things is the pimo:label. The use of pimo:label is not restricted, you may use any string. There is exactly one pimo:label for a Thing, and it is mandatory. If a thing has more names, the pimo:altLabel property can be used. These alternative names are useful during text search. For various applications, a wiki-name of things may be useful. Wiki-names have to be unique and should not contain any symbols besides alphabetical letters and numbers; whitespace and funny characters should be avoided. Using CamelCase to combine several words into a wiki-word is suggested. Wiki-names help in tagging applications and other simple applications we envision for the semantic web. Also, the wiki-name might be a good candidate for the local name in a URI. To show a thing in a GUI, you can also provide icons to visually represent a thing or all instances of a class. You can assign icons to things, classes and properties. Use the pimo:hasIcon property to connect to icons; the icons have to be identified by a URL pointing to a local file or a web image.
Describing things - DescribingProperty
Once defined, things can be described in more detail using RDF properties. The most important property is the label, which names the thing and allows us to show it in a user interface. Labels are modeled using the rdfs:label property. The label may be any string value, but should be short and descriptive. For practical reasons, we also created a property pimo:label that allows us to label things inside the Protégé editor. Every thing must have a label. In the wiki context, when things are used inside text, wiki-names are needed. Things can also be described in more detail using wiki-text. For this, use instances of pimo:PimoWikiContent and connect them to things using pimo:hasWikiContent. Although plain text would be enough to describe things, wiki-text allows researchers to create wonderful applications on top of the PIMO that will surely surprise us all. Other describing properties can be added by creating sub-properties of pimo:DescribingProperty. Typically, the user will add properties for comments, further descriptions, etc. For example, if the user wants to enter the fact that Rome has 2.8 million inhabitants, he may say:
- paul:Rome - inhabitants - "2.8 mio".
To express this in the PIMO:
- create a new rdf:Property, a sub-property of pimo:describingProperty. The property is identified using the URI paul:inhabitants and has an rdfs:label "inhabitants".
- create the triple: paul:Rome paul:inhabitants "2.8 mio".
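The two steps above can be sketched in a few lines of Python. This is only an illustration, not the gnowsis API: triples are modelled as plain (subject, predicate, object) string tuples in a set, the helper name add_describing_property is hypothetical, and the property is typed as rdf:Property, the class RDF defines for properties.

```python
# Triples are modelled as (subject, predicate, object) tuples of strings.
# This representation and the helper name are assumptions for illustration.

def add_describing_property(model, prop_uri, label):
    """Declare prop_uri as a labelled sub-property of pimo:describingProperty."""
    model.add((prop_uri, "rdf:type", "rdf:Property"))
    model.add((prop_uri, "rdfs:subPropertyOf", "pimo:describingProperty"))
    model.add((prop_uri, "rdfs:label", label))


model = set()
# Step 1: create the property paul:inhabitants.
add_describing_property(model, "paul:inhabitants", "inhabitants")
# Step 2: use it to describe paul:Rome.
model.add(("paul:Rome", "paul:inhabitants", "2.8 mio"))

for triple in sorted(model):
    print(triple)
```

Keeping the property definition and its use in the same model means a validator can later check that every describing property used on a thing has actually been declared.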
Relations between things
The things inside the PIMO can have relations to each other, semantic links. In RDF these are called triples, in Topic Maps they are associations. We decided to support both, but focus on the triples first. A few basic relations between things are assumed:
- partOf - one thing is part of another
- hasTopic - a thing is about another
- related - a thing is related to another
Each relation is paired with an inverse (relation - inverse):
- pimo:partOf - pimo:hasPart
- pimo:hasTopic - pimo:isTopicOf
- pimo:related - pimo:related
In Paul's example, the following triples hold, each together with its inverse:
- paul:BranchOfficeRome - pimo:hasPart - paul:BusinessPlanRomeBranch
- paul:BranchOfficeRome - pimo:isTopicOf - paul:ProjectStatusMeetingRome
- paul:BranchOfficeRome - pimo:related - paul:Rome
- paul:BusinessPlanRomeBranch - pimo:partOf - paul:BranchOfficeRome
- paul:ProjectStatusMeetingRome - pimo:hasTopic - paul:BranchOfficeRome
- paul:Rome - pimo:related - paul:BranchOfficeRome
Custom relations are defined as sub-properties of the basic relations, for example paul:supervises and its inverse paul:isSupervisedBy:
- paul:supervises - rdf:type - rdf:Property
- paul:supervises - rdfs:subPropertyOf - pimo:related
- paul:supervises - rdfs:label - "supervises"
- paul:supervises - protege:inverseProperty - paul:isSupervisedBy
- paul:isSupervisedBy - rdf:type - rdf:Property
- paul:isSupervisedBy - rdfs:subPropertyOf - pimo:related
- paul:isSupervisedBy - rdfs:label - "is supervised by"
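Because every relation has an inverse, an application can store one direction and derive the other. The following Python sketch shows one way to do this, assuming triples are plain string tuples and the inverse pairs are kept in a dictionary. The function name with_inverses is hypothetical and not part of gnowsis.

```python
# Inverse pairs of the basic PIMO relations listed above,
# extended by Paul's custom relation.
INVERSES = {
    "pimo:partOf": "pimo:hasPart",
    "pimo:hasPart": "pimo:partOf",
    "pimo:hasTopic": "pimo:isTopicOf",
    "pimo:isTopicOf": "pimo:hasTopic",
    "pimo:related": "pimo:related",
    "paul:supervises": "paul:isSupervisedBy",
    "paul:isSupervisedBy": "paul:supervises",
}


def with_inverses(triples, inverses=INVERSES):
    """Return the triples plus the inverse statement of every known relation."""
    result = set(triples)
    for s, p, o in triples:
        if p in inverses:
            result.add((o, inverses[p], s))
    return result


stated = {
    ("paul:BranchOfficeRome", "pimo:hasPart", "paul:BusinessPlanRomeBranch"),
    ("paul:BranchOfficeRome", "pimo:isTopicOf", "paul:ProjectStatusMeetingRome"),
    ("paul:BranchOfficeRome", "pimo:related", "paul:Rome"),
}
complete = with_inverses(stated)
```

With the three stated triples as input, the result contains the six triples of Paul's example.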
Complex relations - pimo:Association
If binary relations are not enough, then an instance of the pimo:Association class is needed. For example, the role of Peter in relation to the branch-office project may be more complicated than just "manages the project". Peter may have begun his job sometime during the project and ended it sometime during the project, to be replaced by Jacob. The role Peter had was then only true for a certain amount of time. For this, the pimo:Role class helps, which is a sub-class of pimo:Association. By making an instance of pimo:Role, you can connect a person to a project, tell more about the role the person has, and add further metadata. For example, if another person monitors whether Peter plays his role well, you may add this person to the relation. We will not go into further detail on associations, because we will not support them in the gnowsis beta GUI. If you need more information, please consult the XTM standard documents, where the concept of associations is explained in more detail [XTM, TM-empolis, TAO].
Representing Resources
Files on the hard-disk, web-pages or any other resource from local information systems can be integrated into the PIMO of a user. The ontology does not end at the border of our PIM application, it can be used to connect outside resources. The main class to represent resources is ResourceManifestation. The class itself and its subclasses are considered to be addressable resources in the sense of RDF. We deliberately did not use the term Resource here, as our pimo:Thing is also a resource in the meaning of RDF. On the right you see typical classes of resource manifestations. Note that this scheme can be extended by programmers: if you write an application that brings in a new resource type, create a new class for it. Comprehensive and useful class structures for these ResourceManifestations are currently being developed in the Aperture project, where adapters and extractors are also written that can gather such resources from a system and from the web [Aperture]. The resource structure is deliberately held very simple to be open to such extensions. Instances of resources are not created by the user manually, they should always be created automatically by the system itself. They are identified using the URI that fits best: for a file, the file:// URI scheme, for an e-mail the message-id or IMAP URI. You only have to guarantee that you can find the resource based on its URI and that the URI is unique on the desktop. You may also ensure that the URI is globally unique (in the whole world). The URI may be added to the resource as an additional property pimo:uri, but as the URI is used as RDF identifier anyway, this is just to simplify modeling in Protégé. Examples of such resources are:
- branchinrome - The folder for the project files
  file:///C:/Documents/Paul/documents/projects/branchinrome/
- Picture23.jpg - A photo of the new office in Rome
  file:///C:/myDocuments/Paul/documents/projects/branchinrome/photos/image23.jpg
- Meeting at 29.9.2005 - meeting in Outlook, topic is the branch office in Rome
  gnowsis://paul@example.com/resources/outlook/appointment/00000000ECD4B99358B9814B9DAFE2255CD8AE9A44EF3000
Relations between ResourceManifestation and Things
The main relations between mental concepts, expressed as pimo:Thing, and files, expressed as pimo:ResourceManifestation, are links of the type pimo:occurrence. An occurrence connection means that the mental concept expressed in the Thing X can be found in the document represented by the ResourceManifestation Y. The definition of occurrences and the idea behind them are taken from the XML Topic Maps standard, which defines an occurrence as follows [XTM]:
An occurrence is any information that is specified as being relevant to a given subject.
Adding an occurrence triple between a thing and a ResourceManifestation expresses exactly that. The inverse property of pimo:occurrence is pimo:isOccurrenceOf. These triples are generated by the user quite frequently and can also be created by the system automatically. For example, Paul notes that the website of the city of Rome is an occurrence of his thing paul:Rome by saying:
- paul:Rome - pimo:occurrence - <http://www.comune.roma.it/>
- <http://www.comune.roma.it/> - rdf:type - pimo:Webpage
- keep it simple. Represent the complex in the land of pimo:Thing
- the occurrence is annotated anyway: the resource has a type and some other properties
Here is an example of how a user interface that supports this kind of tagging of e-mails might look. The user enters a few letters of the intended tag, and the system searches for possible things to tag with. The user interface should filter the results to show only the most relevant tags, i.e. reduce them to about seven. In the gnowsis semantic desktop this will be supported by text analysis components.
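A minimal Python sketch of such a tag lookup, under simplifying assumptions: the things are given as a list of wiki-name labels, matching is a case-insensitive prefix test, and alphabetical order stands in for the relevance ranking that text analysis components would provide. The function name suggest_tags is hypothetical.

```python
def suggest_tags(typed, labels, limit=7):
    """Return at most `limit` labels that start with the typed letters."""
    needle = typed.strip().lower()
    if not needle:
        return []
    matches = [label for label in labels if label.lower().startswith(needle)]
    # Alphabetical order is a placeholder for a real relevance ranking.
    return sorted(matches, key=str.lower)[:limit]


labels = ["Paul", "Peter", "Tim", "Rome",
          "BranchOfficeRome", "ProjectStatusMeetingRome"]
print(suggest_tags("p", labels))
# prints ['Paul', 'Peter', 'ProjectStatusMeetingRome']
```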

So how to annotate that e-mail in a fancy way?
An e-mail is by definition a pimo:ResourceManifestation, which can only be annotated using the relations pimo:occurrence and pimo:occurrenceRef. So when you just want to "tag" the e-mail, use these two relations. What happens if you want to say that this e-mail is "related" to another thing or "part-of" this project? Or use custom properties you created? Then you reify the e-mail into an instance of type pimo:Thing (or a suitable subclass). So the first thing to do is to ask the user what kind of document this e-mail is. Is it just a pimo:Document or is it a paul:BusinessPlan? Is it a paul:Invitation? Normally, you would reify text resources as Documents and, for example, address book entries as persons. After reification, the e-mail exists both as a ResourceManifestation and as a Thing. The thing representing the e-mail can be annotated as you like, using your own properties etc. While reifying resources as things, it is useful to attach as much data to the thing as possible. For example, you may relate other persons to the reified e-mail if they are mentioned in the to: and from: fields of the e-mail and already exist as persons, or you might suggest to the user that the software can also reify the persons. More about this below, in section Automatic Creation of Things from ResourceManifestations. The result is:
- email:123123 rdf:type pimo:Email (this existed)
- email:123123 rdfs:label "Hi Paul!".
- paul:MailHiPaul pimo:groundingOccurrence email:123123
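The reification step can be sketched as follows. This is an illustrative Python sketch and not the gnowsis implementation: triples are plain string tuples, the function name reify is hypothetical, the class of the new thing (here pimo:Document) is whatever the user chose, and the label "MailHiPaul" is an assumed wiki-name. The check keeps pimo:groundingOccurrence one-to-one, as the report asks for further below.

```python
def reify(model, resource_uri, thing_uri, thing_class, label):
    """Create a Thing for a resource and ground it in that resource."""
    for s, p, o in model:
        if p == "pimo:groundingOccurrence" and (s == thing_uri or o == resource_uri):
            raise ValueError("grounding occurrence must stay one-to-one")
    model.update({
        (thing_uri, "rdf:type", thing_class),
        (thing_uri, "rdfs:label", label),
        (thing_uri, "pimo:groundingOccurrence", resource_uri),
    })


# The e-mail already exists as a ResourceManifestation.
model = {
    ("email:123123", "rdf:type", "pimo:Email"),
    ("email:123123", "rdfs:label", "Hi Paul!"),
}
reify(model, "email:123123", "paul:MailHiPaul", "pimo:Document", "MailHiPaul")
```

After this call, custom relations such as pimo:partOf can be attached to paul:MailHiPaul, while email:123123 stays a pure resource.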
The personal namespace
Every person has - as already explained in the philosophical background - a personal view of the world. So the mental models inside one's mind are subjective, personal. No two people conceptualize the concept "Rome" the same way. This is reflected in the PIMO by employing a personal namespace for each user. The personal namespace could be any namespace that can be referenced and whose resources can be downloaded via a protocol. Sadly, the HTTP protocol is not a good candidate for the personal namespace. Why? If Paul used the namespace http://www.example.com/~paul/pimo# to represent his mental models, this would require that he is always online while modeling. Or, if he models offline, it would require him to have a website called www.example.com, which he has to buy and host. Then he would have to install a web-service there that serves his ontology. In the real world, most people do not have a website, and those who have one do not keep it up to date. Also, if a change in Paul's life separates him from www.example.com, his data would be lost (or at least frozen in its last state).
For the semantic desktop and our gnowsis implementation (at least the March 2006 version), we decided to use the gnowsis URI scheme to identify concepts and resources of individuals. The reason is that we do not use HTTP as a communication protocol but instead use the peer-to-peer protocol XMPP (jabber). This allows any user to host the RDF resources wherever they want: if they are online, via the jabber protocol, if they are offline, via the jabber server. For this and other reasons Frank Osterfeld and Malte Kiesel created the advanced jabber server nabu [ref-nabu]. Even if the users do not employ the possibility of P2P communication, the idea of the namespaces is practical.
The URI scheme functions as follows:
gnowsis://[jabber id]/resources/pimo/
Example: gnowsis://paul@example.com/resources/pimo/
The different parts are:
- gnowsis - a semantic desktop implementation
- jabber id - the jabber-id of the user (typically username@jabberhost)
- resources - the place for rdf resources
- pimo - the pimo of the user
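As a small illustration, the namespace and the URI of a thing can be assembled from the jabber id and a wiki-name. The Python helper names below are hypothetical, and appending the wiki-name directly to the namespace is an assumption based on the prefixed names (such as paul:Rome) used in this report.

```python
def pimo_namespace(jabber_id):
    """Build the personal PIMO namespace for a jabber id."""
    return "gnowsis://" + jabber_id + "/resources/pimo/"


def thing_uri(jabber_id, wiki_name):
    """Build the URI of a thing, using its wiki-name as local name."""
    return pimo_namespace(jabber_id) + wiki_name


print(pimo_namespace("paul@example.com"))
# prints gnowsis://paul@example.com/resources/pimo/
print(thing_uri("paul@example.com", "Rome"))
# prints gnowsis://paul@example.com/resources/pimo/Rome
```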
Identification in the PIMO
Things and classes of things inside the user's PIMO are identified using URIs from the user's PIMO namespace. These are clearly identifiers for the concepts behind them. Referring to the article by Booth, we can say that RDF URIs used to identify instances of pimo:Class, pimo:Thing, and pimo:Slot identify the concept represented by these [Booth2003]. As the URIs in the personal information management domain identify the concept bound to the subjective view of the user, the namespace is bound to the person. We suggested using Jabber-ids as part of the namespace; they are a technical identification of the user, like an e-mail address. See the section above.
Public identification for communicating things via the internet can be achieved using the pimo:occurrenceRef property. This points to a URI of a resource that can be used to publicly identify the thing. For the personal Thing "Rome (a City, a Location)" from Paul's PIMO, the website of Rome and the Wikipedia entry can identify the city:
    @prefix paul: <gnowsis://paul@example.com/resources/pimo/> .
    @prefix pimo: <http://ontologies.opendfki.de/repos/ontologies/pim/pimo#> .
    @prefix wikipediaen: <http://en.wikipedia.org/wiki/> .
    paul:Rome a pimo:City ;
        pimo:occurrenceRef wikipediaen:Rome, <http://www.comune.roma.it/> .
If the identifier of a thing is not an addressable URI (= locator) but only an identifier, we recommend using the property pimo:identifierRef or sub-properties of it. Books can be identified using their ISBN number in this field. They can also be identified using an occurrenceRef to the webpage of the book at the amazon.com website, to allow this common practice of identification.
To identify persons, either the property pimo:jabberId or a pimo:mbox pointing to the e-mail address of the person can be used. URIs for e-mail addresses are inspired by the foaf project. Because publishing e-mail addresses raises the problem of spammers, the foaf project suggests using the sha1-sum of the e-mail address instead. We added both properties to the PIMO ontology (as sub-properties) to be compatible with this important feature of foaf. Additionally, we suggest always using the lowercase version of an e-mail address. If a person has two e-mail addresses, prefer the shortest one (in character length) but try to add all of them. So people can be identified in one of these ways:
- paul:Peter pimo:occurrenceRef mailto:peter@example.com
- paul:Peter pimo:mbox_sha1sum "975ce9cdf7784ab91280d6a0be434efe8fe3f102"
- paul:Peter pimo:mbox mailto:peter@example.com
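A sketch of how the sha1-sum could be computed in Python. The FOAF specification defines foaf:mbox_sha1sum as the SHA1 hash of the full mailto: URI, so the sketch hashes "mailto:" plus the address. Lowercasing follows the suggestion made in this report, and the function name mbox_sha1sum is hypothetical.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA1 hex digest of the mailto: URI of an e-mail address."""
    # Lowercasing is the convention suggested in this report, not part of FOAF.
    mailto = "mailto:" + email.strip().lower()
    return hashlib.sha1(mailto.encode("utf-8")).hexdigest()


digest = mbox_sha1sum("Peter@Example.com")
print(digest)  # 40 hexadecimal characters
```

Because the address is normalized first, two PIMOs that spell the same address with different capitalization still produce the same identifier.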
If there is no useful URI scheme for a resource, we suggest that you as developer make up a URI scheme. Many successful software projects have done so (for example, mozilla with their XUL URIs). If you need URIs to identify resources, and have no clue what to do, you can use a sub-space inside the gnowsis URI space, which was made up by us for the gnowsis system. See above for more details and contact the author.
Metadata of a PIMO - the pimo:PersonalInformationModel and pimo:metaDefines
The class pimo:PersonalInformationModel is used to represent instances of a PIMO. For example, Paul's PIMO is itself represented inside the model by an instance of pimo:PersonalInformationModel. This instance is used to link the created things and relations to the PIMO. Every instance of thing, for example paul:Rome, was created inside the PIMO, and to note this, a link between the PIMO and the Thing can be made, called pimo:metaDefines.
The instance inside Paul's model is identified using a URI from Paul's namespace: paul:PIMOInstance. Explicitly, we would find these triples:
- paul:PIMOInstance - rdf:type - pimo:PersonalInformationModel
- paul:PIMOInstance - pimo:hasNamespace - paul:
- paul:PIMOInstance - pimo:metaDefines - paul:Peter, paul:Tim, paul:Rome, paul:BranchOfficeRome, paul:ProjectStatusMeetingRome
Importing domain ontologies - what's with foaf?
Luckily, you are not alone. Other people might write very useful classes and relations for the PIMO language that you might want to import. Also, existing RDFS vocabularies like foaf-relationships may be interesting. We call these ontologies "domain ontologies", as they are often called so in the literature.
There are two ways to represent and get such a domain ontology:
- the domain ontology is expressed using the PIMO - import it
- the domain ontology is expressed using RDFS or OWL - convert it to PIMO, then import it.
Let us look at the conversion process to PIMO first and then how you import a PIMO domain ontology. The conversion from RDF(S) is straightforward, you have to:
- add an instance of pimo:DomainOntology. You may reuse an existing instance of owl:Ontology if you like. For example, foaf already defines an owl:Ontology instance for its metadata.
- All properties have to be sub-properties of one of pimo:partOf, pimo:hasTopic, pimo:related, or pimo:describingProperty.
- Each property needs an inverse property.
- Each class has to be subclassed from pimo:Thing. Please choose wisely to be compatible with the current PIMO.
- Each property and class has to have a pimo:metaDefines. Basically, each and every mentioned resource that is used as a subject in a triple has to have a pimo:metaIsDefinedBy.
Importing a domain ontology into a user's PIMO is then practically done by downloading the domain ontology and storing it in the user's PIMO storage. You should be able to safely mix domain ontologies into the PIMO storage, as each resource in a domain ontology must be tagged by pimo:metaIsDefinedBy. After importing, you add a relation between the user's instance of PersonalInformationModel and the imported ontology using pimo:metaImportedDomainOntology to say that you imported it. Attention: if the imported ontology X declares via pimo:metaImports that it needs another ontology Y, then you must import Y as well. All or nothing.
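The "all or nothing" rule amounts to computing the transitive closure of pimo:metaImports before anything is stored. A minimal Python sketch, assuming the metaImports statements are available as a dictionary from ontology to required ontologies and that the downloadable ontologies are known as a set. The function name import_closure and both data structures are hypothetical.

```python
def import_closure(root, meta_imports, available):
    """Return every ontology that must be imported together with root.

    Raises LookupError before anything is imported if one is missing.
    """
    needed = set()
    stack = [root]
    while stack:
        ontology = stack.pop()
        if ontology in needed:
            continue  # already handled, also protects against import cycles
        if ontology not in available:
            raise LookupError("cannot import %s: %s is missing" % (root, ontology))
        needed.add(ontology)
        stack.extend(meta_imports.get(ontology, ()))
    return needed


meta_imports = {"X": ["Y"], "Y": ["Z"]}
print(sorted(import_closure("X", meta_imports, {"X", "Y", "Z"})))
# prints ['X', 'Y', 'Z']
```

Only when the closure could be computed completely should the ontologies be stored and linked with pimo:metaImportedDomainOntology.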
Note that you cannot import one person's PIMO into another person's PIMO. Just as you cannot look into the thoughts of other people, we neither support nor encourage importing PIMOs. They are private to the user, and if the user decides to tell something to others, parts may be communicated. Importing the whole PIMO of Paul into the PIMO of Peter is discouraged.
Mapping to synonymous things and classes
When domain ontologies are imported, the user might notice that things represented in his ontology are also represented in the domain ontology. For example, the class "Car" might already be represented both in your ontology and in a domain ontology you download. As the two classes are kept in different ontologies, it might not be suitable to merge them. It is then possible to say that they are just "other conceptualizations" of the same concept. For this we provide mappings that point to these other conceptualizations, namely to map classes, instances and properties:
- pimo:hasOtherConceptualization - maps classes
- pimo:hasOtherRepresentation - maps instances
- pimo:hasOtherSlot - maps properties (slots)
User interfaces should show the user's representation primarily and link to the other representations. The user's point of view is always the right one.
Merging, deleting, and changing the type of a thing
What happens if the user notices that two things are the same? This will happen when algorithms create things automatically and fail to find the matching existing thing, for example when a person is extracted from the data of an e-mail and created as a new instance of pimo:Person although the person already existed as an instance (but labeled using another name).
Merging two things should reuse as much of the data as possible. If two things A and B are merged into thing C:
- One of A or B should be deleted, C should have the URI of A or B (for simplicity). Preferably a URI that contains the new label as local name. The URI of the deleted resource should be forgotten.
- C should have all classes of A and B as a union.
- C should have all relations, occurrences, etc of A and B
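The merge rules above can be sketched in Python by rewriting every triple that mentions the dropped URI. This is an illustration under assumptions: triples are plain string tuples, the function name merge_things is hypothetical, and conflicts such as two labels on the merged thing are left for the application or the user to resolve.

```python
def merge_things(model, keep, drop):
    """Merge thing `drop` into thing `keep` and return the new model.

    All types, relations and occurrences of both things end up on `keep`,
    and the URI of `drop` no longer appears anywhere.
    """
    merged = set()
    for s, p, o in model:
        s = keep if s == drop else s
        o = keep if o == drop else o
        merged.add((s, p, o))
    return merged


model = {
    ("paul:Peter", "rdf:type", "pimo:Person"),
    ("paul:Peter", "rdfs:label", "Peter"),
    ("paul:PeterM", "rdf:type", "pimo:Person"),
    ("paul:PeterM", "pimo:occurrence", "email:123123"),
    ("paul:BranchOfficeRome", "pimo:related", "paul:PeterM"),
}
merged = merge_things(model, keep="paul:Peter", drop="paul:PeterM")
```

Since the model is a set, statements that become identical after the rewrite, such as the two rdf:type triples here, collapse into one.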
This leads us to changing the type of a thing and to multiple types. Changing the types of a thing may imply that its properties and relations are not valid anymore, if the properties are strongly typed. It is suggested to keep existing properties and relations, even if the ontology is then not valid. If the user changes the type back to the old type, the properties are valid again, and the user may also correct the errors manually. The user interface should remind the user of possible errors but should not enforce the correction. Weak typing and ontology-by-examples are the main principles behind this strategy, together with the rule that "the user is always right".
Deleting a class entails that instances are typed to the super-class of the deleted class. The topmost class is pimo:Thing. Classes defined in the pimo language cannot be deleted.
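A Python sketch of deleting a class under these rules. The retyping of instances follows the text. Re-parenting sub-classes of the deleted class to its super-class is an additional assumption of this sketch, as are the function name delete_class and the test for the "pimo:" prefix to recognize classes of the PIMO language.

```python
def delete_class(model, cls):
    """Return a copy of the model without cls, its instances retyped."""
    if cls.startswith("pimo:"):
        raise ValueError("classes of the PIMO language cannot be deleted")
    supers = {o for s, p, o in model if s == cls and p == "rdfs:subClassOf"}
    supers = supers or {"pimo:Thing"}  # fall back to the topmost class
    result = set()
    for s, p, o in model:
        if s == cls:
            continue  # drop the definition of the class itself
        if o == cls and p in ("rdf:type", "rdfs:subClassOf"):
            # Retype instances (and, as an assumption, re-parent sub-classes).
            result.update((s, p, sup) for sup in supers)
        elif o != cls:
            result.add((s, p, o))
    return result


model = {
    ("paul:BusinessPlan", "rdfs:subClassOf", "pimo:Document"),
    ("paul:BusinessPlan", "rdfs:label", "BusinessPlan"),
    ("paul:BusinessPlanRomeBranch", "rdf:type", "paul:BusinessPlan"),
}
after = delete_class(model, "paul:BusinessPlan")
```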
Deleting an instance also deletes all relations to and from this instance, including occurrence relations. Note that pimo:metaCreationSupportedBy links may also be deleted by this, with the result that an algorithm may soon create the instance again (if the user configured somewhere that instances are created automatically). See below in section Automatic Creation of Things from ResourceManifestations. In this case, it may be better to use the pimo:metaHidden property to hide the instance.
Hiding things
Sometimes, things may be outdated or not needed anymore, but they should remain in the PIMO to be searchable. To do this, the user may hide a thing, class or property. Even the default classes like pimo:Person can be hidden, with one exception: pimo:Thing must not be hidden.
The user interface should clearly state that "some things are hidden", to allow the user to find the resources again. Also, search results may choose to rank hidden objects very low or give indications about them. Hiding things may also be initiated automatically, for example by an algorithm that hides things that have not been used for a longer time. To hide something, use this triple:
- paul:Tim pimo:metaHidden "true".
Automatic Creation of Things from ResourceManifestations
It is obvious that many things that a user can think of are already represented in his resources. Paul already has many entries in his address book, aren't they all candidates for pimo:Thing instances? We know that many algorithms and approaches exist to automatically generate or extend ontologies based on data in text files or other data sources. For the PIMO, one of the requirements from gnowsis is to create active folders, or active things. Automatically creating things from ResourceManifestations is needed to make the semantic desktop work. For this, we create a few new relations that connect things to resources and define the existing relations better.
Semantic Relations between Things and ResourceManifestations
You are already familiar with the pimo:occurrence and pimo:occurrenceRef relations. These connect things to resources in which the thing appears. In the context of automatically creating things, the created thing can be connected to the resource via the pimo:metaCreationSupportedBy property. If a PIM concept or instance was generated due to the existence of one or more resources, this property is the link to these elements. It binds, for example, an address book card to the person instance or a PDF file to the instance of "contract". It is also important for scanning resources: if a new resource is found, did it already support the creation of a PIMO-thing? The property pimo:metaCreationSupportedBy is intentionally not a sub-property of pimo:occurrence. If the resource is also an occurrence, use both properties to relate the two.
An interesting question remains how a URI and a resource correlate to the thoughts and concepts in the mind of the user. From the user's point of view, it is important to correlate a thought in the mind to a resource in the computer. First there is a thought in the user's mind that leads to the decision to search for a resource in a computer; then the user has to find the resource that his thought is related to.
This relation reaches beyond a simple occurrence or a pimo:metaCreationSupportedBy. When the idea represented in the thing is also represented in the resource, they are bound together semantically. The idea that is represented by the pimo:Thing is the same idea represented by the pimo:ResourceManifestation. For this, the property pimo:groundingOccurrence is reserved. This relation should be used in a one-to-one manner: one pimo:Thing is linked to one or zero pimo:ResourceManifestation, and one pimo:ResourceManifestation is linked to one or zero pimo:Thing using pimo:groundingOccurrence. The thought is like an electrical current that originates in the user's mind and searches for a place to discharge, a resource that matches the thought. We defined the grounding resource as follows [Sauermann2003]:
For a thought in the mind of the user there may exist a grounding resource. It is the resource that is associated best to the thought. The symbolic representation of a thought is the grounding resource.
Under discussion: There can only be one grounding resource for a thought?
The term grounding was also mentioned in Georg Dorffner's book "Konnektionismus". Concepts and symbols can be designed in a way that they can be experienced in states of a computer system. He noted that the term was used in the community of artificial intelligence and knowledge theory researchers. [Dorffner1991, page vi]. The reason to speak of a grounding is that the stream of thoughts continues to flow when it has reached the grounding, now the user can follow the links in the PIMO in addition to the associations in the mind. Technically, the pimo:groundingOccurrence is a sub-property of pimo:occurrence, hence a grounding is also an occurrence. The user interface should treat the grounding resource in a special way, perhaps integrating the metadata of the resource or highlighting the link.
The implementation of these features will follow, then we can also provide an RDF-Schema to configure active folders in a general way. The extension of the PIMO to represent these active folders will be part of the upcoming PIMO-api ontology. Look out for it in the next months. Contact the developers of gnowsis at some time.
Validation of a PIMO
The semantics of the PIMO language allow us to verify the integrity of the data. In normal RDF/S semantics, verification is not possible. For example, setting the domain of the property knows to the class Person, and then using this property on an instance Rome Business Plan of class Document creates, using RDF/S, the new information that the Document is also a Person. In the PIMO language, domain and range restrictions are used to validate the data. The following rules describe what is validated in the PIMO; a formal description is given in the gnowsis implementation's PIMO rule file.
- All relating properties need inverse properties.
- Check domain and range of relating and describing properties.
- Check domain and range for rdf:type statements
- Cardinality restrictions using the Protégé statements
- rdfs:label is mandatory for instances of "Thing" and classes
- Every resource that is used as object of a triple has to have a rdf:type set. This is a prerequisite for checking domains and ranges.
- All statements that have a predicate that has an inverse defined require another triple in the model representing the inverse statement.
The rules above check for semantic modeling errors made by programmers or human users.
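The difference between the RDF/S reading and the validating reading can be sketched in a few lines. This is not the gnowsis rule file, only an illustration under stated assumptions: triples are plain tuples, the schema dictionaries (including knows being its own inverse) are hypothetical, class hierarchies and cardinalities are ignored, and the label check is applied to every typed resource.

```python
RDF_TYPE, RDFS_LABEL = "rdf:type", "rdfs:label"

# Hypothetical schema for one relating property (names are illustrative).
DOMAIN = {"knows": "Person"}
RANGE = {"knows": "Person"}
INVERSE = {"knows": "knows"}


def types_of(triples, resource):
    """All classes that a resource is explicitly typed with."""
    return {o for s, p, o in triples if s == resource and p == RDF_TYPE}


def rdfs_entail_domain(triples):
    """RDF/S reading: a domain adds type statements, it never rejects data."""
    inferred = {(s, RDF_TYPE, DOMAIN[p]) for s, p, o in triples if p in DOMAIN}
    return inferred - set(triples)


def validate(triples):
    """Validating reading in the spirit of the rules above: report violations."""
    facts = set(triples)
    errors = []
    for s, p, o in sorted(facts):
        if p in DOMAIN and DOMAIN[p] not in types_of(facts, s):
            errors.append(f"domain: {s} is not a {DOMAIN[p]} ({p})")
        if p in RANGE:
            if not types_of(facts, o):
                errors.append(f"untyped object: {o}")
            elif RANGE[p] not in types_of(facts, o):
                errors.append(f"range: {o} is not a {RANGE[p]} ({p})")
        if p in INVERSE and (o, INVERSE[p], s) not in facts:
            errors.append(f"inverse: missing ({o}, {INVERSE[p]}, {s})")
    # Simplification: every typed resource needs a label.
    for s in sorted({s for s, p, o in facts if p == RDF_TYPE}):
        if not any(x == s and p == RDFS_LABEL for x, p, o in facts):
            errors.append(f"label: {s} has no rdfs:label")
    return errors


# The example from the text: "knows" used on a Document.
model = [
    ("Paul", RDF_TYPE, "Person"),
    ("Paul", RDFS_LABEL, "Paul"),
    ("RomeBusinessPlan", RDF_TYPE, "Document"),
    ("RomeBusinessPlan", RDFS_LABEL, "Rome Business Plan"),
    ("RomeBusinessPlan", "knows", "Paul"),
]
inferred = rdfs_entail_domain(model)
errors = validate(model)
```

Under the RDF/S reading, inferred contains the statement that RomeBusinessPlan is a Person. Under the validating reading, errors reports the domain violation and the missing inverse triple, which is the behaviour the rules above ask for.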
Storing a PIMO in real life applications
For real-life applications (like the gnowsis beta version) we suggest separating the storage of resources from the storage of the PIMO ontologies. All automatically extracted information about resources (e.g. from a local desktop search engine) should be in one store. All hand-crafted or algorithmically created data from the PIMO should be stored in a capable RDF database.
User Interface Example

Summary
An analysis of existing ontology languages shows that many problems are solved, but in different ontologies and in different practical ways. The experience from the gnowsis workshops, implementations and many other papers has helped to create a useful ontology for the Semantic Desktop. The suggested PIMO ontology builds a simple and practical ground for future personal information management applications. A practical example and guidelines on how to create instances of the PIMO ontology should allow you to implement the PIMO and extend it according to your own ideas. The ontology itself and the accompanying example can be downloaded here: [PIMO-lang, PIMO-example]
With the described PIMO ontology and its practical description, I hope to give a good technical and philosophical ground for the upcoming gnowsis beta implementation.
Some parts are still missing, but will be addressed in upcoming publications on the PIMO.
Acknowledgements
First of all, thanks to my wife Ingrid Brunner, who allowed me to write this report mostly on my weekends. Just now, she is waiting for me to come home to go for dinner.
In the DFKI Knowledge Management group, we had much input from the EPOS project and its team. Many thanks to Ludger Van Elst, who created the PIM-Basic ontology, which defines classes and topics, and the PIM-Upper ontology, which defines detailed classes for people, groups, events, process support, and possible associations to link instances of them. Heiko Maus and Sven Schwarz gave input in many ongoing discussions of the existing ontology languages. Lars Zapf is an expert on Topic Maps. Andreas Dengel, our director, repeatedly insisted on keeping it simple and usable and is supervising our work. The scenario for the semantic annotation of meetings was thoroughly researched by Man Luo, a diploma student of Leo Sauermann. Also many thanks to Frank Osterfeld and Norberto Fernández for proofreading and error-correction.
Input from outside the DFKI and the core gnowsis project came from our friends in Berlin, where we repeatedly held gnowsis workshops. Leading there, Richard Cyganiak and Anja Jentzsch gave much input regarding what they expect from gnowsis; thanks also to all other participants in the gnowsis workshops. Max Völkel from Karlsruhe also discussed the topic with the authors.
Frequently Asked Questions
Q: This is so cool, how can I use that for myself?
Thank you. We are currently coding these ideas into an open-source project called gnowsis. It will be part of other software, too, so you might find it somewhere else. If you want to use the PIMO, just download the gnowsis beta 0.9 from www.gnowsis.org in April 2006. Do not download anything before the beta; PIMO is not part of the alpha. If you are a developer, go to http://gnowsis.opendfki.de and check the status of the project. If you want to use the PIMO in your own applications, go ahead. The gnowsis project helps you as a Java reference implementation.
You can also contact the author. The author is also happy if you cite his other publications.
Q: This is all so technical, where is the big theory behind it?
Please first read the recommended literature indicated in the abstract: [RDF-PRIMER], [RDFS], [OWL], [TM-empolis]. The theoretical background is that we use the ideas of Topic Maps and remix them with RDF, SKOS and personal experience from the EPOS project and gnowsis.
This is intentionally technical, because we think that ontologies only work when you have covered different dimensions:
- ontology itself - you need to have an RDF(S) writeup of your ontology
- accompanying homepage - this document
- example data - see [PIMO-example]
- code that uses the ontology - see http://gnowsis.opendfki.de
Q: Why did you use RDF-Schema as basis and not OWL-Lite, OWL-DL or SKOS? Or Topic Maps?
Our reasons were all inspired by the principle of keeping it simple.
- OWL Full would allow something to be a class and an individual at once (OWL-DL does not). We need this, but it is simpler to achieve with RDF-S.
- AnnotationProperty and OntologyProperty were not needed.
- As we focus on a simple frame-like structure with explicit classes, we don't use much of the existence/inference features of OWL-DL or OWL Full.
- We were mostly fond of InverseProperty and the nice support for it in Protégé already on RDF-S level.
- RDF-Schema does not support InverseProperty, but this is our only real need for OWL.
- SKOS does not support properties (like mbox).
Other ontologies were created out of other needs.
Q: You proposed to identify concepts like "Rome" using URIs from Wikipedia. This might be a sound solution from the technical side, but first of all these URIs might change, and second the URIs at Wikipedia identify web pages and not the concepts behind them. Isn't this half-baked?
True, this solution is far from perfect. Please don't suggest that we use the namespace YOU have invented to identify concepts. And don't suggest that we just "change the url of wikipedia a little, so that it is a conceptual uri". We won't just add ".concept" at the end or some other silly thing at the front. People have suggested various solutions for this over the last years, and Leo Sauermann gets nervous when you mention these suggestions.
The real problem is how to identify abstract concepts in software systems. That we use URIs as syntax is a start, but the scheme and meaning of the URIs also have to be considered. Others have proposed solutions that also look nice, see [IdentificationIssues]. But from my point of view most of the other solutions either don't scale or are too artificial to be understood and broadly used. What we propose is that the most popular websites can be used to say "this public resource can be used to identify the concept X", because they are popular and people can relate the individual pages on such a website to the concepts behind them. For broad adoption of the Semantic Web, using popular websites can help, as normal, non-technical people can relate to them. For movies, people can understand that a page on the Internet Movie Database (IMDB) describes exactly one movie and that it may be used to identify the movie. Many movie and cinema websites link to IMDB anyway when content authors want their users to know more about a certain movie. Individual persons use the links on their weblogs