Class ExtractReuters
- java.lang.Object
  - org.apache.lucene.benchmark.utils.ExtractReuters

public class ExtractReuters extends Object
Splits the Reuters SGML documents into simple text files containing: Title, Date, Dateline, Body.
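
To illustrate the kind of splitting described above, here is a minimal, self-contained sketch. It is not the implementation of ExtractReuters: the class name `ReutersSplitSketch`, the method `extractFields`, the regular expressions, and the sample record are all assumptions made for illustration, and it only pulls the four named fields out of a single in-memory record rather than reading `.sgm` files or writing output files.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT part of Lucene: shows one way to pull the
// Title, Date, Dateline and Body fields out of one SGML record.
public class ReutersSplitSketch {
    // Field names assumed from the class description (Title, Date, Dateline, Body).
    private static final String[] FIELDS = {"TITLE", "DATE", "DATELINE", "BODY"};

    /** Returns the trimmed text of each field found in the given record. */
    public static Map<String, String> extractFields(String record) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String field : FIELDS) {
            // DOTALL lets a field such as BODY span several lines;
            // the reluctant (.*?) stops at the first closing tag.
            Pattern p = Pattern.compile(
                "<" + field + ">(.*?)</" + field + ">", Pattern.DOTALL);
            Matcher m = p.matcher(record);
            if (m.find()) {
                out.put(field, m.group(1).trim());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample record, for illustration only.
        String record = "<REUTERS><DATE>26-FEB-1987</DATE><TEXT>"
            + "<TITLE>SAMPLE TITLE</TITLE>"
            + "<DATELINE>NEW YORK, Feb 26 -</DATELINE>"
            + "<BODY>First line.\nSecond line.</BODY></TEXT></REUTERS>";
        // Print one "FIELD: value" entry per extracted field.
        extractFields(record).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

Building one pattern per field keeps the sketch easy to read; a single alternation pattern would avoid rescanning the record, at the cost of slightly more bookkeeping.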
Constructor Summary
Constructors
Constructor                                        Description
ExtractReuters(File reutersDir, File outputDir)
Method Summary
Modifier and Type    Method                       Description
void                 extract()
protected void       extractFile(File sgmFile)    Override if you wish to change what is extracted
static void          main(String[] args)