
Work meeting with GvN.

Tasks:

  1. EarleyParser

Baseline test, 1000 sentences that are parsed without guessing:

   Precision          Recall       Crossing brackets
 Min.   :0.2000   Min.   :0.2143   Min.   :0.00000  
 1st Qu.:0.3734   1st Qu.:0.3874   1st Qu.:0.03065  
 Median :0.4682   Median :0.4759   Median :0.08959  
 Mean   :0.5107   Mean   :0.5086   Mean   :0.09545  
 3rd Qu.:0.6000   3rd Qu.:0.5964   3rd Qu.:0.15408  
 Max.   :1.0000   Max.   :0.9737   Max.   :0.30556  

This is considerably worse than with the earlier grammar without POS nodes.

Compare, in ~kleiweg/Earley/alpino:

../tview -c -k okdata.parse | less -r   # parse by Alpino
../tview -k basetest.parse | less -r    # parse by Earley
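
These notes do not record how the per-sentence scores were computed. As a rough illustration only, the sketch below shows one common way to get labeled bracket precision, recall, and a crossing-bracket rate for a single sentence. The function names, the (label, start, end) representation, and the choice to report crossing brackets as a fraction of the proposed brackets are all assumptions, not the actual evaluation script.

```python
def crosses(a, b):
    """True if spans a=(s1, e1) and b=(s2, e2) overlap without one nesting in the other."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 < e1 < e2 or s2 < s1 < e2 < e1


def score(gold, test):
    """Score one sentence.

    gold, test: sets of (label, start, end) constituents (hypothetical format).
    Returns (precision, recall, crossing_rate).
    Note: using sets collapses duplicate constituents, e.g. unary chains
    that repeat the same label over the same span.
    """
    matched = len(gold & test)
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0

    # A proposed bracket counts as crossing if it crosses any gold span,
    # regardless of label.
    gold_spans = {(s, e) for _, s, e in gold}
    crossing = sum(
        any(crosses((s, e), g) for g in gold_spans) for _, s, e in test
    )
    crossing_rate = crossing / len(test) if test else 0.0
    return precision, recall, crossing_rate


# Made-up five-word sentence, purely for illustration.
gold = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
test = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 4), ("PP", 3, 5)}
print(score(gold, test))
```

In this made-up example two of the four proposed brackets match the gold brackets, and one proposed bracket, (2, 4), crosses the gold span (3, 5).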


1000 sentences with unknown words, test with guessing according to method 2:

Time: over 37 hours (with the earlier grammar only 9½ hours)

OK only
   Precision           Recall        Crossing brackets
 Min.   :0.05882   Min.   :0.01887   Min.   :0.0000   
 1st Qu.:0.29187   1st Qu.:0.28713   1st Qu.:0.1143   
 Median :0.35870   Median :0.35211   Median :0.1797   
 Mean   :0.37640   Mean   :0.36027   Mean   :0.1728   
 3rd Qu.:0.44000   3rd Qu.:0.41905   3rd Qu.:0.2286   
 Max.   :0.96154   Max.   :0.84615   Max.   :0.5833   

OK + FAILED + UNKNOWN
   Precision           Recall       Crossing brackets
 Min.   :0.05882   Min.   :0.0000   Min.   :0.0000   
 1st Qu.:0.29224   1st Qu.:0.2857   1st Qu.:0.1123   
 Median :0.35910   Median :0.3508   Median :0.1791   
 Mean   :0.38076   Mean   :0.3577   Mean   :0.1715   
 3rd Qu.:0.44350   3rd Qu.:0.4183   3rd Qu.:0.2281   
 Max.   :1.00000   Max.   :0.8462   Max.   :0.5833   

Fail:     0.7%


1000 sentences with unknown words, test with word categories from Alpino.

Time: 2 hours, 10 minutes.

OK only
   Precision          Recall       Crossing brackets
 Min.   :0.2264   Min.   :0.1905   Min.   :0.0000   
 1st Qu.:0.4348   1st Qu.:0.4394   1st Qu.:0.0641   
 Median :0.5067   Median :0.5075   Median :0.1282   
 Mean   :0.5320   Mean   :0.5254   Mean   :0.1213   
 3rd Qu.:0.6032   3rd Qu.:0.5929   3rd Qu.:0.1781   
 Max.   :1.0000   Max.   :0.9444   Max.   :0.3210   

OK + FAILED + UNKNOWN
   Precision          Recall       Crossing brackets
 Min.   :0.2264   Min.   :0.0000   Min.   :0.00000  
 1st Qu.:0.4418   1st Qu.:0.4137   1st Qu.:0.04041  
 Median :0.5273   Median :0.4936   Median :0.11747  
 Mean   :0.5765   Mean   :0.4755   Mean   :0.10980  
 3rd Qu.:0.6540   3rd Qu.:0.5833   3rd Qu.:0.17078  
 Max.   :1.0000   Max.   :0.9444   Max.   :0.32099  

Fail:     9.5%
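
The tables above have the layout of a six-number summary (minimum, quartiles, mean, maximum), as printed by R's summary(). As a hedged sketch, a summary of this shape can be produced from a list of per-sentence scores as follows. The function name and the sample scores are hypothetical. Scoring a failed parse with recall 0 is an assumption, made only because the combined tables show a minimum recall of 0.0000.

```python
import statistics


def six_number_summary(scores):
    """Return min, quartiles, mean and max for a list of per-sentence scores.

    method="inclusive" uses the same interpolation as R's default
    quantile type. Needs at least two scores.
    """
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return {
        "Min.": min(scores),
        "1st Qu.": q1,
        "Median": median,
        "Mean": statistics.fmean(scores),
        "3rd Qu.": q3,
        "Max.": max(scores),
    }


# Made-up recall scores: parsed sentences only, then with two failed
# parses added at recall 0 (assumed convention).
ok_only = [0.40, 0.50, 0.55, 0.60, 0.70]
with_failed = ok_only + [0.0, 0.0]

for name, scores in [("OK only", ok_only), ("OK + FAILED", with_failed)]:
    print(name)
    for key, value in six_number_summary(scores).items():
        print(f" {key:8}:{value:.4f}")
```

Under that assumption, adding failed sentences lowers the minimum and mean recall while the maximum stays the same, which is the pattern in the recall columns above.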


CategoryParsing