Thai Word Segmentation Testing

The procedure for measuring the performance of Thai word segmentation software is as follows:

1. Download our test data (members only), which covers the following genres:

  • Article
  • Encyclopedia
  • News
  • Novel

2. Use your software to segment the data. Produce twelve output files, one for each genre, encoded in UTF-8 only.
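As a rough illustration of step 2, the sketch below reads each input file, segments it, and writes the result back out as UTF-8. Everything in it is an assumption made for illustration: the `segment` function is a placeholder for your own software, the `|` word delimiter and the directory layout are hypothetical, and the required output format should be checked against the test data itself.

```python
from pathlib import Path

# Assumption: words in the output are separated by "|".
# Check the test data for the delimiter actually required.
DELIMITER = "|"


def segment(text: str) -> list[str]:
    """Placeholder for your own segmenter; here it only splits on whitespace."""
    return text.split()


def segment_file(src: Path, dst: Path) -> None:
    """Segment one input file line by line and write the result as UTF-8."""
    lines_out = []
    for line in src.read_text(encoding="utf-8").splitlines():
        lines_out.append(DELIMITER.join(segment(line)))
    dst.write_text("\n".join(lines_out) + "\n", encoding="utf-8")


def segment_all(input_dir: Path, output_dir: Path) -> list[Path]:
    """Segment every .txt file in input_dir, producing one output file per input."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(input_dir.glob("*.txt")):
        dst = output_dir / src.name
        segment_file(src, dst)
        written.append(dst)
    return written
```

Passing `encoding="utf-8"` explicitly on both read and write avoids depending on the platform default, which is not UTF-8 on every system.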

3. Test each output file (.txt) with the BEST Evaluation Tool.
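This page does not say which scores the BEST Evaluation Tool reports. As a hedged sketch of how word segmentation output is commonly scored, the example below compares the word spans of a reference line and a hypothesis line and computes precision, recall, and F1. The function names and the `|` delimiter are assumptions for illustration, not the tool's actual interface.

```python
def word_spans(segmented: str, delimiter: str = "|") -> set[tuple[int, int]]:
    """Return the (start, end) character span of each word, ignoring delimiters."""
    spans = set()
    start = 0
    for word in segmented.split(delimiter):
        if not word:
            continue  # skip empty pieces from leading, trailing, or doubled delimiters
        end = start + len(word)
        spans.add((start, end))
        start = end
    return spans


def score(reference: str, hypothesis: str, delimiter: str = "|") -> tuple[float, float, float]:
    """Compute word-level precision, recall, and F1 for one pair of lines."""
    ref = word_spans(reference, delimiter)
    hyp = word_spans(hypothesis, delimiter)
    correct = len(ref & hyp)  # words whose start and end both match
    precision = correct / len(hyp) if hyp else 0.0
    recall = correct / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

A word counts as correct only when both of its boundaries match the reference, so merging two reference words into one hypothesis word loses credit for both.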