
See also: AlpinoTokenizerPerl

installation

ln -s /net/aistaff/vannoord/z/Alpino/Tokenization/libtok_no_breaks.c .
perl -p -e 's/QDATUM|new_t_accepts|qentry|qinit|qinsert|qpeek|qremove|queue|replace_from_queue|resize_buf|t_accepts|transition_struct|trans|unknown_symbol/$&1/g' \
    /net/aistaff/vannoord/z/Alpino/Tokenization/libtok.c > libtok1.c
python3 setup.py install --prefix $HOME
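The perl one-liner above appends a `1` to a fixed list of identifiers in libtok.c, presumably so that libtok1.c and libtok_no_breaks.c can be compiled into the same module without clashing symbol names (that motivation is an assumption, the page does not state it). As a minimal sketch, the same substitution can be written in Python. The function name `rename_symbols` is hypothetical and not part of the package.

```python
import re

# Same alternation, in the same order, as the perl command above.
# Like the perl version it uses no word boundaries, so it also matches
# these names when they occur inside longer identifiers.
SYMBOLS = re.compile(
    r"QDATUM|new_t_accepts|qentry|qinit|qinsert|qpeek|qremove|queue|"
    r"replace_from_queue|resize_buf|t_accepts|transition_struct|trans|"
    r"unknown_symbol"
)


def rename_symbols(source: str) -> str:
    """Append '1' to every match, equivalent to perl's s/.../$&1/g."""
    return SYMBOLS.sub(lambda m: m.group(0) + "1", source)


if __name__ == "__main__":
    import sys

    # Usage: python3 rename.py < libtok.c > libtok1.c
    sys.stdout.write(rename_symbols(sys.stdin.read()))
```

Because alternatives are tried left to right at each position, `transition_struct` is matched before the shorter `trans`, exactly as in the perl command.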

usage

import AlpinoTokenizer

zin = "Dit is een test. En dit ook! Etc., enz."

print(AlpinoTokenizer.tokenize(zin))        # without newlines (default)
print(AlpinoTokenizer.tokenize(zin, False)) # without newlines
print(AlpinoTokenizer.tokenize(zin, True))  # with newlines

# run the module's built-in test
AlpinoTokenizer.test()
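When `tokenize` is called with newlines enabled, the result can be post-processed into a list of sentences. The sketch below assumes that the returned string has one sentence per line with tokens separated by spaces; that format is an assumption and is not confirmed on this page. The helper name `split_sentences` is hypothetical.

```python
def split_sentences(tokenized: str) -> list:
    """Split tokenizer output into a list of token lists.

    Assumes one sentence per line and space-separated tokens
    (an assumption about the output format). Empty lines are skipped.
    """
    return [line.split() for line in tokenized.splitlines() if line.strip()]


if __name__ == "__main__":
    try:
        import AlpinoTokenizer
    except ImportError:
        # The extension module is not installed; nothing to demonstrate.
        AlpinoTokenizer = None

    if AlpinoTokenizer is not None:
        zin = "Dit is een test. En dit ook! Etc., enz."
        for tokens in split_sentences(AlpinoTokenizer.tokenize(zin, True)):
            print(tokens)
```

Keeping the parsing in a separate function makes it possible to test it on plain strings without the compiled module.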

attachments

Download everything: AlpinoTokenizerPython.tar.gz

  • AlpinoTokenizer.c (2017-08-16 19:14:19, 2.1 KB)
  • AlpinoTokenizer.h (2017-08-16 19:14:19, 0.1 KB)
  • AlpinoTokenizer.i (2017-08-16 19:54:18, 2.5 KB)
  • README.txt (2017-08-16 22:06:57, 0.3 KB)
  • setup.py (2017-08-16 15:26:06, 1.0 KB)
  • test.py (2017-08-16 19:40:46, 0.3 KB)


CategoryAlpino CategoryCorpora CategoryPython CategorySwig