I am looking for a reader for the Gigaword corpus (distributed by the LDC). The corpus is in SGML format. Can anyone suggest an existing Gigaword corpus reader that extracts the relevant text from these Gigaword files? I found that the LingPipe library had a Gigaword corpus reader, but it is now deprecated and the library no longer supports it.
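
To make the request concrete, here is a rough sketch of the kind of reader I mean. It assumes the usual Gigaword layout: gzip-compressed files containing concatenated `<DOC id="..." type="...">` elements, each with an optional `<HEADLINE>` and a `<TEXT>` body split into `<P>` paragraphs. The names `parse_gigaword` and `read_gigaword_file` and the sample document are made up for illustration; this is not the LingPipe reader, and a regex approach like this may need adjusting for a particular Gigaword release.

```python
import gzip
import html
import re

# Assumed layout: <DOC id="..." type="..."> ... </DOC>, repeated with no root element.
DOC_RE = re.compile(
    r'<DOC\s+id="([^"]*)"\s+type="([^"]*)"[^>]*>(.*?)</DOC>', re.DOTALL
)
HEADLINE_RE = re.compile(r"<HEADLINE>(.*?)</HEADLINE>", re.DOTALL)
TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)
P_RE = re.compile(r"<P>(.*?)</P>", re.DOTALL)


def _clean(fragment):
    """Decode entities such as &amp; and collapse runs of whitespace."""
    return " ".join(html.unescape(fragment).split())


def parse_gigaword(sgml):
    """Yield one dict per <DOC> element found in an SGML string."""
    for match in DOC_RE.finditer(sgml):
        doc_id, doc_type, body = match.groups()
        headline = HEADLINE_RE.search(body)
        text = TEXT_RE.search(body)
        raw = text.group(1) if text else ""
        paragraphs = [_clean(p) for p in P_RE.findall(raw)]
        # Some documents may have no <P> markup; fall back to the whole body.
        if not paragraphs and raw.strip():
            paragraphs = [_clean(raw)]
        yield {
            "id": doc_id,
            "type": doc_type,
            "headline": _clean(headline.group(1)) if headline else None,
            "paragraphs": paragraphs,
        }


def read_gigaword_file(path):
    """Read one gzip-compressed Gigaword file (hypothetical path) into dicts."""
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        yield from parse_gigaword(handle.read())


# Made-up sample document, not taken from the corpus.
SAMPLE = """<DOC id="XYZ_ENG_20000101.0001" type="story" >
<HEADLINE>
Example headline
</HEADLINE>
<TEXT>
<P>
First paragraph &amp; more.
</P>
<P>
Second
paragraph.
</P>
</TEXT>
</DOC>
"""

for doc in parse_gigaword(SAMPLE):
    print(doc["id"], doc["type"], doc["headline"])
    for paragraph in doc["paragraphs"]:
        print("  ", paragraph)
```

This reads each file fully into memory, which is simple but may be heavy for the larger files; an existing, maintained reader that streams documents would still be preferable.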

asked Jun 29 '12 at 08:36

Lancelot


User submitted content is under Creative Commons: Attribution - Share Alike; Other things copyright (C) 2010, MetaOptimize LLC.