12. hdrug_feature: The Hdrug Feature Library

The feature library provides extensive possibilities to compile feature equations into Prolog terms, and to view such compiled Prolog terms as feature-structures. The motivation for such an approach might be that you want feature structures for readability on the one hand, but Prolog terms and Prolog unification of such terms for effiency reasons internally. The package is heavily influenced by the work of Chris Mellish.

Types

Before feature structures can be compiled into terms, a number of type declarations need to be specified. The declarations that need to be defined are top/1, type/3 and at/1. These three definitions define a type hierarchy. This hierarchy has the shape of a tree. The top/1 definition defines the daughter nodes of the root of the tree. This root is always called `top'.

Attributes can be attached to a single type in the type hierarchy. If a type is associated with an attribute then this attribute is inherited by all of its subtypes. The top node of the type hierarchy can be seen as a variable. You can not specify any attributes for this type. The type/3 predicate defines for a given type (first argument) a list of subtypes (second argument) and a list of attributes (third argument).

The at/1 definitions define terminals of the tree that do not introduce attributes. It is an abbreviation of a type/3 definition in which the second and third argument are both the empty list.

As an example, consider the following type tree definition:

top([boolean,sign,cat]).
type(boolean,[+,-],[]).
at(+).
at(-).
type(sign,[],[cat,phon,sem]).
type(cat,[noun,verb],[agr]).
type(noun,[],[pro]).
type(verb,[],[aux,inv,subj]).

If this type definition is consulted by Hdrug, and if the directive:

:- type_compiler.

is called, then it is possible to view the type definition by choosing the `view type tk' menu. This gives rise to a tree on the canvas as type.png

The meaning of such a type tree can be understood as follows. The class of objects is divided in three mutually exclusive subclasses, called boolean, sign and cat. Objects of type boolean can be further subdivided into classes + or -. Objects of type sign can be further specified for a cat, phon or sem attribute.

The meaning of this type tree can also be understood by looking at the way in which objects of a certain type are represented as Prolog terms. This is illustrated as tree.png

Equational constraints

If the type definition is compiled, then the following predicates can be used: <=>/2, =>/2, ==>/2. The first predicate equates two paths, the second predicate assigns a type to a path, and the third predicate assigns an arbitrary Prolog term to a path.

A path is a Prolog term followed by a sequence of attributes, seperated by a colon (:). Therefore, given the previous example of a type tree, we can have the following equational constraint:

X:cat => noun.
    X = sign(_H,cat(noun(_G,_F),_E,_D),_C,_B,_A)
Y:cat:agr <=> Y:cat:subj:cat:agr.
    Y = sign(_O,cat(verb(_N,_M,_L,sign(_K,cat(_J,_E,_I),
                               _H,_G,_F)),_E,_D),_C,_B,_A)
Z:phon ==> [jan,kust,marie].
sign(_D,_C,[jan,kust,marie],_B,_A)

Lists

You can add (ordinary Prolog) lists to your type tree by the simple definition:

list_type(HeadAtt,TailAtt).

This will allow the use of attributes HeadAtt and TailAtt for referring to parts of lists. Furthermore, lists of typed objects will be shown appropriately. For example:

[-user].
| list_type(h,t).
| {user consulted, 20 msec 48 bytes}
^D
yes
| ?- X:t:h:cat => verb.

X = [_A,sign(_K,cat(verb(_J,_I,_H,_G),_F,_E),_D,_C,_B)|_L] ?

yes
| ?- X:t:h:cat => verb, show(fs,latex,[value(X)]).
....

X = [_A,sign(_K,cat(verb(_J,_I,_H,_G),_F,_E),_D,_C,_B)|_L] ?

Extensionality

Direct subtypes of type `top' are represented using an extra variable position. This is to make sure that objects are only identical if they have been unified. For some types this does not make much sense. Types that you want to consider as `extensional' in this way are to be declared with the predicate extensional/1. Boolean types (cf. below) are extensional by default. Providing an intentional/1 definition makes a boolean type intensional.

The following example illustrates the difference. Without the extensional predicate we have:

X:inv => -, X:aux => -, tty_fs(X).
{verb}
|aux <B> {-}
|inv <B>.

After declaring that boolean and `-' be extensional types (and recompiling the type tree), we get:

X:inv => -, X:aux => -, tty_fs(X).
{verb}
|aux {-}
|inv {-}.

The difference is that Hdrug does not show explicitly that the values of aux and inv are the same in the second example. This is redundant information because objects of extensional types always are the same if they have the same information content.

Unify_except

The library provides the predicates unify_except/3, unify_except_l/3 and overwrite/4. The first argument takes two feature terms and a path. The first and second argument are unified except for the value at the path}.

As an example (assuming the simple type system given above), we might have:

| ?- unify_except(X,Y,cat:agr).

X = sign(_G,cat(_F,_E,_D),_C,_B,_A),
Y = sign(_G,cat(_F,_H,_D),_C,_B,_A) ?

The predicate unify_except_l is similar, except it takes a list of paths rather than a single path as its third argument. Finally, the predicate overwrite/4 can be understood by looking at its definition:

overwrite(FS,FS2,Path,Type) :-
unify_except(FS2,FS,Path),
FS2:Path => Type.

Find_type

The meta-logical predicates find_type/2 and find_type/3 can be used to get the most specific type of a feature term. The first argument is the feature term, the second argument is a list of most specific types (for simple usage just consider the first element of this list). The optional third argument is a list of attributes that are appropriate for this type. For example:

| ?- X:agr <=> X:subj:agr, find_type(X,[Y|_]).

X = cat(verb(_G,_F,_E,cat(_D,_B,_C)),_B,_A),
Y = verb ?

It is clear that find_type/2,3 are meta-logical predicates by looking at the following example, where the conjuncts are swapped:

| ?- find_type(X,[Y|_]), X:agr <=> X:subj:agr.

X = cat(verb(_G,_F,_E,cat(_D,_B,_C)),_B,_A),
Y = top ? ;

Disjunction and Negation over Atomic Values

A special mechanism is provided for atomic values to allow for disjunction and negation over such atomic values. These atomic values are not declared in the type-system as shown above, but rather they are introduced by the predicate boolean_type/2. The first argument of this predicate is an identifier, the second argument of this predicate is a list of lists that is understood as a set product. For example, agreement features could be defined as:

boolean_type(agr,[[1,2,3],[sg,pl],[mas,fem,neut]])

So valid and fully specified values for agreement consist of an element from each of the three lists. The syntax for type-assignment is extended to include disjunction (';'), conjunction ('&') and negation ('~') of types. For example, to express that X has either singular masculine or not-second person agreement, we simply state:

X => ( sg & mas ; ~2 ).

The following example illustrates the use of this package:

| ?- [-user].
| boolean_type(agr,[[1,2,3],[sg,pl],[mas,fem,neut]]).
| {user consulted, 10 msec 368 bytes}

yes
| ?- type_compiler.

yes
| ?- X => ( sg & mas ; ~2 ).

X = agr(0,_L,_K,_J,_I,_H,_G,_G,_G,_G,_G,_F,_F,_E,_D,_C,_B,_A,1) ?

The example shows how complex terms are created for such boolean types. This is useful because disjunction and negation can be handled by ordinary unification in this way. Luckily the pretty printing routines will turn such complex turns back into something more readible:

| ?- X: agr => (sg & mas ; ~2 & neut), show(fs,latex,[value(X)]).

12.1. Hook Predicates

This section lists the hook predicates used by the hdrug_feature library.

12.1.1. top(Subtypes)

Defines all sub-types of top as a list of atoms.

12.1.2. type(Type,Subtypes,Attributes)

Defines a Type with Subtypes and Attributes. In general, Subtypes is a list of list of types. If a list of types [T0..Tn] is given, then this is automatically converted to [[T0..Tn]].

12.1.3. at(Type)

Type is an atomic type, i.e. without any sub-types and without any attributes.

12.1.4. list_type(Head,Tail)

Declares Head and Tail to be the attributes to refer to the head and the tail of objects of type `list'.

12.1.5. extensional(Type)

Declares Type to be an extensional type, i.e. no extra variable is added to objects of this type; extensional objects are identical if they have the same value for each of their attributes. Intensional objects are identical only if they have been unified.

12.1.6. boolean_type(Type,Model)

Declares Type to be a boolean type with Model as its model (list of list of atoms). For instance, boolean_type(agr, [[1,2,3], [sg,pl], [mas,fem,neut]]) defines that agr is such a boolean type.

12.1.7. intensional(Type)

Type must be a boolean type. Boolean types are extensional by default, unless this predicate is defined for them.

12.2. Predicates

This section lists the predicates exported by the hdrug_feature library.

12.2.1. hdrug_feature:pretty_type(Type)

pretty prints information on Type. Types should have been compiled with hdrug_feature:type_compiler.

12.2.2. hdrug_feature:find_type(?Term,-Types[,-Atts])

Types will be bound to the list of most informatives sub-types of Term; Atts will be bound to the list of all attributes of Term. Meta-logical. Types should have been compiled with hdrug_feature:type_compiler.

12.2.3. hdrug_feature:unify_except(T1,T2,Path)

T1 and T2 are Prolog terms. Path is a sequence of attributes separated by colons. The predicate evaluates T1:Path and T2:Path (in order to ensure that Path is consistent with both objects. Furthermore, T1 and T2 are unified except for the values at T1:Path and T2:Path. Types should have been compiled with hdrug_feature:type_compiler.

12.2.4. hdrug_feature:unify_except_l(T1,T2,ListOfPaths)

Similar to unify_except, except that the third argument now is a list of paths. T1 and T2 are Prolog terms. Each path in ListOfPaths is a sequence of attributes separated by colons. The predicate evaluates for each Path, T1:Path and T2:Path (in order to ensure that Path is consistent with both objects. Furthermore, T1 and T2 are unified except for all values at T1:Path and T2:Path for Path in ListOfPaths. Types should have been compiled with hdrug_feature:type_compiler.

12.2.5. hdrug_feature:overwrite(T1,T2,Path,Type)

Abbreviation for unify_except(T1,T2,Path), T2:Path => Type; i.e. T1 and T2 are identical, except that T2:Path is of type Type. Types should have been compiled with hdrug_feature:type_compiler.

12.2.6. hdrug_feature:(ObjPath => Type)

This predicate evaluates ObjPath, and assigns Type to the result (i.e. the result is unified with the Prolog term representation of Type). ObjPath is a Prolog term followed by a (possibly empty) list of attributes separated by the colon :. A path such as X:syn:head:cat refers to the cat attribute of the head attribute of the syn attribute of X. Type must be a type (Prolog atom) or a boolean expression of boolean types. Types should have been compiled with hdrug_feature:type_compiler.

12.2.7. hdrug_feature:(ObjPath /=> Type)

This predicate evaluates ObjPath, and ensures that it is not of type Type (i.e. the result is not allowed to subsume the Prolog term representation of Type). ObjPath is a Prolog term followed by a (possibly empty) list of attributes separated by the colon :. A path such as X:syn:head:cat refers to the cat attribute of the head attribute of the syn attribute of X. Type must be a type (Prolog atom) or a boolean expression of boolean types. Types should have been compiled with hdrug_feature:type_compiler. The implementation of this construct uses delayed evaluation.

12.2.8. hdrug_feature:(ObjPath ==> Term)

This predicate evaluates ObjPath, and unifies Term with the result. ObjPath is a Prolog term followed by a (possibly empty) list of attributes separated by the colon :. A path such as X:syn:head:cat refers to the cat attribute of the head attribute of the syn attribute of X. Term is an arbitrary Prolog term. This predicate is often used to include arbitrary Prolog terms inside feature structures. You can define a hook predicate catch_print_error/3 in order to define pretty printing for such terms. Types should have been compiled with hdrug_feature:type_compiler.

12.2.9. hdrug_feature:(ObjPathA <=> ObjPathB)

This predicate evaluates PathA and PathB, and unifies the results. ObjPathA and ObjPathB each is a Prolog term followed by a (possibly empty) list of attributes separated by the colon :. A path such as X:syn:head:cat refers to the cat attribute of the head attribute of the syn attribute of X. Types should have been compiled with hdrug_feature:type_compiler.

12.2.10. hdrug_feature:(PathA <?=?> PathB)

This predicate uses the if_defined/2 construct in order to unify two paths, provided each of the two paths is defined. It is defined by:

A <?=?> B :-
if_defined(A,Val),
if_defined(B,Val).

12.2.11. hdrug_feature:is_defined(Path,Bool)

This predicate evaluates Path. If this is possible (i.e. the attributes are all appropriate) then Bool=yes. Otherwise Bool=no.

12.2.12. hdrug_feature:if_defined(Path,Val[,Default])

This predicate evaluates Path, and unifies the result with Val. If the path cannot be evaluated (for instance because a feature is used which is not appropriate for the given type) then the predicate succeeds (in the binary case) or unifies Val with Default (in the ternary case). For example:

if_defined(X:head:subcat,List,[]),

could be used as part of the definition of a valence principle, in order to obtain the list value of the subcat attribute. However, for categories which have no subcat attribute, List is instantiated to [].

12.2.13. hdrug_feature:type_compiler[(Module)]

Compiles type declarations (loaded in Module or user) into definitions for the predicates =>/2, <=>/2, ==>/2, unify_except/3, overwrite/4. The type declarations consist of definitions for the hook predicates top/1, at/1, type/3, list_type/2, extensional/1, boolean_type/2, intensional/1. The top/1 declaration is required.

top(Subtypes) is an abbreviation for type(top,[Subtypes],[]).

at(Type) is an abbreviation for type(Type,[],[]).

type(Type,[T0,..,Tn],Atts), where each Ti is atomic, is an abbrevation for type(Type,[[T0,..,Tn]],Atts).

Each type is specified by a list (conjunction) of lists (exclusive disjunctions) of subtypes and a list of attributes.

Objects of type type(Type,[[A1..An],[B1..Bn],...,[Z1..Zn]],[Att1..Attn]) will be represented by the Prolog term Type(Ai',Bi',..,Zi',Att1',..,Attn',_)

For example, the declaration

type(sign,[[basic,complex],[nominal,verbal]],[mor,sem])

implies that everything of type sign is represented with a term sign(BorC,NorV,Mor,Sem,_) where the first argument represents the first sub-type (basic or complex and any associated information with these subtypes), the second argument represents the second subtype (nominal or verbal), the third argument represents the value of the `mor' attribute, and the fourth argument represents the value of the `sem' attribute. The fifth argument is introduced in order that such objects are `intensional': objects are identical only if they have been unified.

Assumptions:

`top' has no appropriate features, will always be denoted with Variable bottom has no appropriate features, will not be denoted -> failure hence top is only specified along one `dimension' (use top/1).

Other types can be further specified along several dimensions, hence can have more than one subtype, at the same time. Subtypes of a type are mutually exclusive (in the example above, you cannot be both nominal and verbal).

All types describe intensional objects (as in PATR II). For this purpose, during compilation an extra argument position is added to which you cannot refer. You can use extensional/1 for a specific type in order that this extra position is not added.

Boolean types.

The technique discussed in Chris Mellish' paper in Computational Linguistics is available to be able to express boolean combinations of simple types. First, boolean types are declared using the hook predicate boolean_type(Type,ListOfLists). For example, the declaration

boolean_type(agr,[[1,2,3],[sg,pl],[mas,fem,neut]])

declares that objects of type `agr' are elements of the cross-product of {1,2,3} x {sg,pl} x {mas,fem,neut}. Instead of simple types, boolean combinations are now allows, using the operators & for conjunction, ~ for negation and ; for disjunction.

?- X => (sg & ~fem ; pl).

X = agr(0,_A,_B,_C,_C,_D,_E,_F,_G,_H,_H,_I,_J,_K,_L,_M,_M,_N,1) ?