Tasks:
_deste, _np and _vorfeld
words, lemmas
Adding attributes, example:
dact_attrib \
-m macros.txt \
infile.dact \
outfile.dact \
'//node[%PQ_vorfeld%]' 'VORFELD="true"' \
'//node[%PQ_np%]' 'NP="true"' \
'//node[node[@graad="comp"] and node[@lemma=("hoe", "deste") or (node[@lemma="des"] and node[@lemma="te"])]]' 'DESTE="true"'
Difference in run times without and with the helper attribute. Every search was run with dbxml_match on the Alpino Treebank,
with output sent to /dev/null:
time dbxml_match -m macros.txt cdb.dact '//node[%PQ_vorfeld%]' > /dev/null
time dbxml_match -m macros.txt cdb.dact '//node[@VORFELD]' > /dev/null
time dbxml_match -m macros.txt cdb.dact '//node[%PQ_np%]' > /dev/null
time dbxml_match -m macros.txt cdb.dact '//node[@NP]' > /dev/null
time dbxml_match -m macros.txt cdb.dact '//node[node[@graad="comp"] and node[@lemma=("hoe", "deste") or (node[@lemma="des"] and node[@lemma="te"])]]' > /dev/null
time dbxml_match -m macros.txt cdb.dact '//node[@DESTE]' > /dev/null
| search term | without helper attribute | with helper attribute |
|---|---|---|
| vorfeld | 1:46 | 0:10 |
| np | 1:08 | 0:49 |
| deste | 0:00.20 | 0:00.04 |
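The speed-up per search term can be worked out from the table. A minimal sketch, assuming the times are in minutes:seconds (with hundredths for deste); the helper name `to_seconds` is only illustrative:

```python
def to_seconds(stamp: str) -> float:
    """Convert an 'm:ss' or 'm:ss.cc' stamp to seconds (assumed format)."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


# Timings copied from the table: (without, with) helper attribute.
timings = {
    "vorfeld": ("1:46", "0:10"),
    "np": ("1:08", "0:49"),
    "deste": ("0:00.20", "0:00.04"),
}

# Ratio of the slow run to the fast run for each search term.
speedups = {
    name: to_seconds(without) / to_seconds(with_attr)
    for name, (without, with_attr) in timings.items()
}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x faster")
```

Under that assumption, vorfeld gains by far the most, deste a moderate factor, and np only a little.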
Simplifying queries. From this:
select count(distinct(sentid)) as zinnen, count(sentid) as items from ( match ... return distinct n.sentid, n.id ) as foo
… to this:
match ... with distinct n.sentid as sentid, n.id as id return count(distinct(sentid)) as zinnen, count(sentid) as items
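What both forms compute can be sketched in plain Python. The rows below are made up for illustration, since the `match ...` part is elided here, and the sketch assumes sentid is never null:

```python
# Hypothetical (sentid, id) rows standing in for the elided `match ...` result.
rows = [
    ("s1", 3),
    ("s1", 3),  # the same node matched twice
    ("s1", 7),
    ("s2", 1),
]

# with distinct n.sentid as sentid, n.id as id
pairs = set(rows)

# return count(distinct(sentid)) as zinnen, count(sentid) as items
zinnen = len({sentid for sentid, _ in pairs})
items = len(pairs)

print(zinnen, items)  # prints "2 3"
```

The deduplication step is the same in both versions; the simplified query only moves it from a subquery into a `with` clause.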