DAML+OIL Layers

May 28 2001

1.0 Introduction

DAML+OIL has its roots in description logic, which originally developed as a formalization of frame-based systems. However, it is not easy to see how currently popular modeling languages such as UML or Entity-Relationship relate to DAML+OIL, and support for the full DAML+OIL language requires a full-fledged description logic inference engine. It seems unlikely that every application will use a description logic engine. Furthermore, large-scale applications (100k classes and more) require simple inference mechanisms. Thus large-scale applications need to be built upon software that is not able to support the complete DAML+OIL language.

To avoid fragmentation and facilitate the exchange of ontologies, defining several layers of DAML+OIL seems worthwhile. Language users and tool vendors can then specify their level of compliance with the full DAML+OIL specification.

In this paper we propose language features which can be supported within large-scale ontology-based applications without using an inference engine.

1.1 Classification or Constraint Checking?

An important use for schemata is to check instance data for compliance with the schema. However, within the description logic framework the focus is on classification of class definitions and instance data, not on checking instance data for compliance with the schema.

The goal of this style guide is to provide a useful language that

- allows the definition of simple ontologies
- enables checking of integrity constraints for instances
- can be deployed in large-scale applications (> 100k concepts)
- can be deployed by average programmers

Checking ontologies for inconsistencies (and thus providing the expressive power to create inconsistencies) is not our primary concern. Rather we use an extensional interpretation of classes and check the instances of the classes for inconsistencies.

Notation: By rules we mean Horn rules plus well-founded negation. A domain-independent full first-order logic formula is a formula where the domain of interpretation can be changed without changing the truth value of the formula (as long as the interpretation of constants, functions and predicates is not changed). An example of a formula that is not domain independent: EXISTS X (not p(X)). An integrity constraint is a domain-independent full first-order logic formula. The comp operator is the completion operator known from logic programming, which completes <- to <->. We write an RDF triple syntactically as subject[predicate->object].

Def.: Let D be a set of triples and rules operating on the triples. Let IC be a set of integrity constraints. D fulfills IC iff comp(D) |= IC.

Given a DAML+OIL ontology, the next step is to identify the set of rules and integrity constraints that a DAML+OIL ontology has to fulfill. We therefore divide a DAML+OIL ontology into two subsets of formulas: one subset is interpreted as deduction rules, the other as integrity constraints specifying requirements for a corresponding set of instance data. Please note that, for efficiency reasons, checking of integrity constraints is unlikely to be done by a general-purpose inference engine. Instead it is assumed that specialized procedures will perform the checking.
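
The split into deduction rules and integrity constraints can be illustrated with a minimal Python sketch. It is not part of the proposal: triples are modeled as plain (subject, predicate, object) tuples, rules are restricted to negation-free Horn rules evaluated by a naive fixpoint, and the class and property names (Dog, Animal, name) are hypothetical.

```python
# Assumption: a triple subject[predicate->object] is a Python tuple.

def close(facts, rules):
    """Apply the deduction rules until no new triple is derived."""
    closed = set(facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(closed)
        if derived <= closed:
            return closed
        closed |= derived

def fulfills(facts, rules, constraints):
    """Check every integrity constraint against the closed set of triples."""
    closed = close(facts, rules)
    return all(constraint(closed) for constraint in constraints)

def subclass_typing(triples):
    # Deduction rule: O[type->T] <- S[subClassOf->T] and O[type->S]
    return {(o, "type", t)
            for (o, p1, s) in triples if p1 == "type"
            for (s2, p2, t) in triples if p2 == "subClassOf" and s2 == s}

def every_animal_has_name(triples):
    # Hypothetical constraint: each instance of Animal has a name value.
    animals = {s for (s, p, o) in triples if p == "type" and o == "Animal"}
    named = {s for (s, p, o) in triples if p == "name"}
    return animals <= named

facts = {
    ("Dog", "subClassOf", "Animal"),
    ("rex", "type", "Dog"),
    ("rex", "name", "Rex"),
}
print(fulfills(facts, [subclass_typing], [every_animal_has_name]))
```

Because the constraint is evaluated only after the rules have been applied, rex counts as an Animal and must carry a name; removing the name triple makes the check fail.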

2.0 Layers

2.1 DAML+OIL: Level 0

DAML+OIL Level 0 is essentially identical to RDF Schema (with the semantic clarifications and restrictions that DAML+OIL defines).

2.2 DAML+OIL: Level 1

DAML+OIL Level 1 focuses primarily on the enabling of integrity constraint checking. The language allows the definition of constraints, but does not regard them as a means to do classification. Therefore we focus on the semantics for constraint checking, not for classification.
Named classes are classes identified by a URI.

Class Elements for a given class c1 may contain:
- rdfs:subClassOf. Allowed values are named classes (e.g. c2). The transitivity of the subClassOf relationship is regarded as a deduction rule, and a fact c1[rdfs:subClassOf->c2] is added to the factbase.
```
FORALL X, Y, Z 
           X[subClassOf->Z] <- X[subClassOf->Y] and Y[subClassOf->Z].
FORALL O,T   O[type->T] <-
      EXISTS S   (S[subClassOf->T] AND O[type->S]).
```
- daml:sameClassAs. Allowed values are named classes (e.g. c2). Equivalence axioms are added to the deduction rules, and a fact c1[daml:sameClassAs->c2] is added to the factbase. Please note that for each variable X denoting a class it is also necessary to add X[sameClassAs->Y] to the rules.
```
FORALL X 
	X[sameClassAs->X].

FORALL X,Y 
	X[sameClassAs->Y] <- Y[sameClassAs->X].

FORALL X,Y,Z
	X[sameClassAs->Z] <- X[sameClassAs->Y] and Y[sameClassAs->Z].
```
- daml:equivalent. Allowed values are resources. Same rules as above.
- zero or more enumeration elements (e.g. inst1, inst2, etc.). Enumeration elements are added to the factbase and are used only as a constraint, not as deduction rules.
```
<- FORALL X 
	X[type->c1] -> X[sameIndividualAs->inst1] or X[sameIndividualAs->inst2]  or ...
```
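
As an illustration of the enumeration constraint above, the following Python sketch groups individuals linked by sameIndividualAs (treated as reflexive, symmetric and transitive, like the sameClassAs rules) and reports the instances of c1 that are not equivalent to any enumerated element. The tuple representation of triples and the helper names are assumptions made for the example.

```python
def equivalence_classes(pairs):
    """Return a find() function mapping each name to its group representative."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b
    return find

def enumeration_violations(triples, cls, members):
    """Instances of cls that are not the same individual as any enumerated member."""
    same = [(s, o) for (s, p, o) in triples if p == "sameIndividualAs"]
    find = equivalence_classes(same)
    allowed = {find(m) for m in members}
    return sorted(s for (s, p, o) in triples
                  if p == "type" and o == cls and find(s) not in allowed)

triples = {
    ("a", "type", "c1"), ("a", "sameIndividualAs", "inst1"),
    ("b", "type", "c1"), ("inst2", "sameIndividualAs", "b"),
    ("z", "type", "c1"),
}
print(enumeration_violations(triples, "c1", ["inst1", "inst2"]))
```

The enumeration is used here only to check instances; nothing is derived from it, in line with its role as a constraint.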
Property Restrictions (daml:Restriction)
- daml:toClass (with only a named class as value). Integrity Constraint:
  A toClass constraint is fulfilled if, for the property in question p and instances of the domain class c1, all values of p are instances of the range class defined by the toClass constraint (e.g. c2).
```
<- FORALL Y,X 
          Y[type->c1;p->X] -> X[type->c2].
```
- daml:hasValue. Integrity Constraint:
  A daml:hasValue constraint is fulfilled, if for the property in question p and instances of domain class c, at least one value of p is the value v defined by the hasValue constraint.
```
<- FORALL X
          X[type->c] -> X[p->v].
```
- daml:hasClass (with just a named class as value). Integrity Constraint:
  A daml:hasClass constraint is fulfilled, if for the property in question p and instances of domain class c1, at least one value of p is an instance of the expression defined by the hasClass constraint (e.g. c2).
```
<- FORALL X
          X[type->c1] -> EXISTS V (X[p->V] and V[type->c2]).
```
- daml:cardinality. Integrity Constraint: The daml:cardinality constraint is fulfilled for the property in question p and an instance of the domain class c if p has exactly the number of values specified by the constraint. Note that the underlying logical language needs an aggregation operator to be able to express this integrity constraint.
- daml:maxCardinality. Integrity Constraint.
- daml:minCardinality. Integrity Constraint.
- daml:cardinalityQ (with just a named class as qualifier). Integrity Constraint.
- daml:maxCardinalityQ (with just a named class as qualifier). Integrity Constraint.
- daml:minCardinalityQ (with just a named class as qualifier). Integrity Constraint.
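
The property restrictions above can be checked by simple specialized procedures rather than by a general inference engine. The Python sketch below is one possible reading, assuming that triples are tuples, that all deduction rules have already been applied, and that different names denote different individuals (otherwise counting values for daml:cardinality would be wrong). The function names and the sample data are hypothetical.

```python
def instances_of(triples, cls):
    return {s for (s, p, o) in triples if p == "type" and o == cls}

def values(triples, subject, prop):
    return {o for (s, p, o) in triples if s == subject and p == prop}

def check_to_class(triples, c1, prop, c2):
    # daml:toClass: every value of prop on an instance of c1 is an instance of c2
    members = instances_of(triples, c2)
    return all(values(triples, x, prop) <= members
               for x in instances_of(triples, c1))

def check_has_value(triples, c, prop, v):
    # daml:hasValue: every instance of c has v among its values of prop
    return all(v in values(triples, x, prop) for x in instances_of(triples, c))

def check_has_class(triples, c1, prop, c2):
    # daml:hasClass: every instance of c1 has at least one value in c2
    members = instances_of(triples, c2)
    return all(values(triples, x, prop) & members
               for x in instances_of(triples, c1))

def check_cardinality(triples, c, prop, n):
    # daml:cardinality: the aggregation is a count of distinct values
    return all(len(values(triples, x, prop)) == n
               for x in instances_of(triples, c))

data = {
    ("p1", "type", "Person"),
    ("p1", "hasPet", "rex"),
    ("rex", "type", "Dog"),
}
print(check_to_class(data, "Person", "hasPet", "Dog"),
      check_cardinality(data, "Person", "hasPet", 1))
```

The max and min variants differ from check_cardinality only in the comparison operator, and the qualified variants count only the values that are instances of the qualifier class.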
Property elements (rdf:Property)
- rdfs:subPropertyOf. The following deduction rules are added to the rules.
```
FORALL O,V 
          O[subPropertyOf->V] <- EXISTS W (O[subPropertyOf->W] 
          AND W[subPropertyOf->V]).
FORALL O,P,V 
          O[P->V] <- EXISTS S (S[subPropertyOf->P] AND O[S->V]). 
```
- rdfs:domain (named class as value). Integrity Constraint.
- rdfs:range (named class as value). Integrity Constraint.
- daml:samePropertyAs. Deduction Rule.
- daml:equivalentTo. Deduction Rule.
- daml:inverseOf. Deduction Rule.
- daml:TransitiveProperty. Deduction Rule.
- daml:UniqueProperty. Integrity Constraint.
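
The property elements classified as deduction rules can be evaluated together in one fixpoint loop. The Python sketch below covers rdfs:subPropertyOf, daml:inverseOf and daml:TransitiveProperty; it assumes triples are tuples and that inverse pairs and transitive properties are passed in as plain arguments. All property names in the sample data are hypothetical.

```python
def derive_property_facts(triples, transitive=(), inverses=()):
    """Naive fixpoint over subPropertyOf, inverseOf and TransitiveProperty."""
    closed = set(triples)
    while True:
        new = set()
        sub = {(s, o) for (s, p, o) in closed if p == "subPropertyOf"}
        # O[subPropertyOf->V] <- O[subPropertyOf->W] and W[subPropertyOf->V]
        new |= {(a, "subPropertyOf", c)
                for (a, b) in sub for (b2, c) in sub if b == b2}
        # O[P->V] <- S[subPropertyOf->P] and O[S->V]
        new |= {(o, sup, v)
                for (o, p, v) in closed for (s, sup) in sub if p == s}
        # inverseOf: O[Q->S] <- S[P->O], in both directions
        for prop, inv in inverses:
            new |= {(o, inv, s) for (s, p, o) in closed if p == prop}
            new |= {(o, prop, s) for (s, p, o) in closed if p == inv}
        # TransitiveProperty: A[T->C] <- A[T->B] and B[T->C]
        for t in transitive:
            new |= {(a, t, c)
                    for (a, p1, b) in closed if p1 == t
                    for (b2, p2, c) in closed if p2 == t and b2 == b}
        if new <= closed:
            return closed
        closed |= new

family = {
    ("hasMother", "subPropertyOf", "hasParent"),
    ("hasParent", "subPropertyOf", "hasAncestor"),
    ("ann", "hasMother", "eve"),
    ("eve", "hasMother", "lil"),
}
closed = derive_property_facts(family,
                               transitive=("hasAncestor",),
                               inverses=(("hasParent", "hasChild"),))
print(("ann", "hasAncestor", "lil") in closed)
```

The integrity constraints in the same list (rdfs:domain, rdfs:range, daml:UniqueProperty) would then be checked against the returned set rather than contributing new triples.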
Instances
- daml:sameIndividualAs. Deduction Rule.
- daml:equivalentTo. Deduction Rule.
- daml:differentIndividualFrom (using the computation model of Horn rules, this would be the default case).

2.3 DAML+OIL: Level 2

Level 2 is identical to the existing DAML+OIL language.

3.0 Removals from DAML+OIL Layer 1

Complex Class Expressions

Axioms (Disjoint Declarations)

4.0 Proposed Changes

Unique Name Assumption

Problem description:

The unique name assumption is often quite useful when dealing with large datasets, since different names usually identify different entities (e.g. in catalogs). It would therefore often be useful to make the unique name assumption the default case and to state equivalences as exceptions. Not adopting the unique name assumption as the default case requires a large number of daml:differentIndividualFrom statements.

Proposed Solution:

Dan Connolly suggested using the daml:UniqueProperty element together with the fact that literals are unique.
@@@ Does this also work for classes?
Two instances that have different literals as values of a unique property (e.g. a label) must be different.

<rdf:Description rdf:about="http://www.w3.org/2001/04dun/colors#colorName">
   <rdf:type rdf:resource="http://www.daml.org/2001/03/daml+oil#UniqueProperty"/> 
   <rdfs:domain rdf:resource="http://www.w3.org/2001/04dun/colors#Color"/> 
   <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#label"/> 
 </rdf:Description>


<rdf:Description rdf:about="http://www.w3.org/2001/04dun/colors#green">
  <rdf:type rdf:resource="http://www.w3.org/2001/04dun/colors#Color"/> 
  <colorName>green</colorName> 
</rdf:Description>

<rdf:Description rdf:about="http://www.w3.org/2001/04dun/colors#orange">
  <rdf:type rdf:resource="http://www.w3.org/2001/04dun/colors#Color"/> 
  <colorName>orange</colorName> 
</rdf:Description>
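
How difference follows from a unique property can be sketched in a few lines of Python. The sketch assumes that triples are tuples, that each subject carries exactly one literal value for the unique property, and it uses short names in place of the full URIs from the example; the function name is hypothetical.

```python
def distinct_by_unique_property(triples, prop):
    """Pairs of subjects that must differ because their literals for prop differ."""
    value_of = {s: o for (s, p, o) in triples if p == prop}
    subjects = sorted(value_of)
    return {(a, b)
            for i, a in enumerate(subjects)
            for b in subjects[i + 1:]
            if value_of[a] != value_of[b]}

colors = {
    ("green", "type", "Color"), ("green", "colorName", "green"),
    ("orange", "type", "Color"), ("orange", "colorName", "orange"),
}
print(distinct_by_unique_property(colors, "colorName"))
```

No daml:differentIndividualFrom statement is needed for the two colors; the differing colorName literals are enough.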

Property Restrictions

Viewing property restrictions as class expressions is an unusual viewpoint for people used to object-oriented or frame-based modeling. This was also a major point of criticism of DAML+OIL. The reference description says: "Notice that the restrictedBy element which was associated with slot-restrictions in earlier versions of the language has now been removed, since it is completely synonymous with subClassOf."

While this might be true, users of OO languages have an intuitive understanding of what restrictions are, but not of how they relate to subclasses.

Therefore the restrictedBy element is reintroduced into the language.

Complex Datatypes

Complex XML Schema datatypes currently do not have an RDF graph representation. Handling them therefore requires separate, complex processing, storage and query software. If complex datatypes had a graph representation, the existing RDF infrastructure would be able to process and query data structure definitions.

Languages Header

Problem description:
The ontology header defines a node of type Ontology. This node is unrelated (in the RDF model) to the classes defined in that ontology, so posing the query "Retrieve all classes defined in that ontology" (and similar queries) is not possible by just querying the RDF graph.

Proposed Solution:
Dan Connolly suggested using the property "rdfs:isDefinedBy" to indicate the ontology in which classes or properties are defined.

Example:

<Ontology rdf:about="http://www.daml.org/2001/03/daml+oil-ex.daml">
   <versionInfo>$Id: reference.html,v 1.10 2001/04/11 16:27:53 mdean Exp    $</versionInfo>
   <rdfs:comment>An example ontology</rdfs:comment>
   <imports rdf:resource="http://www.daml.org/2001/03/daml+oil"/>
</Ontology>


<daml:Class rdf:ID="Animal">
  <rdfs:label>Animal</rdfs:label>
  <rdfs:isDefinedBy rdf:resource="http://www.daml.org/2001/03/daml+oil-ex.daml"/>
  <rdfs:comment>
    This class of animals is illustrative of a number of ontological idioms.
  </rdfs:comment>
</daml:Class>
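
With rdfs:isDefinedBy in place, the query "Retrieve all classes defined in that ontology" becomes a simple lookup. The Python sketch below uses only the standard library and walks the XML element tree directly; it is not a full RDF parser and assumes that classes are written as daml:Class elements with an rdf:ID attribute. The second class, Plant, is a hypothetical addition without an isDefinedBy link.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DAML = "http://www.daml.org/2001/03/daml+oil#"
ONTOLOGY = "http://www.daml.org/2001/03/daml+oil-ex.daml"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:daml="{DAML}">
  <daml:Class rdf:ID="Animal">
    <rdfs:label>Animal</rdfs:label>
    <rdfs:isDefinedBy rdf:resource="{ONTOLOGY}"/>
  </daml:Class>
  <daml:Class rdf:ID="Plant"/>
</rdf:RDF>"""

def classes_defined_by(xml_text, ontology_uri):
    """IDs of daml:Class elements linked to ontology_uri via rdfs:isDefinedBy."""
    root = ET.fromstring(xml_text)
    found = []
    for cls in root.iter(f"{{{DAML}}}Class"):
        for link in cls.findall(f"{{{RDFS}}}isDefinedBy"):
            if link.get(f"{{{RDF}}}resource") == ontology_uri:
                found.append(cls.get(f"{{{RDF}}}ID"))
    return found

print(classes_defined_by(DOC, ONTOLOGY))
```

A triple-based store would answer the same question by selecting all subjects of rdfs:isDefinedBy triples whose object is the ontology node.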