Group: NLPA

1. NLPA

org.nlpa » nlpa (GPL)

NLPA is a framework designed to operate in conjunction with BDP4J (https://github.com/sing-group/bdp4j). It can extract text from Twitter, YouTube comments, text files, raw email files (.eml), and WARC (Web Archive) files. The extracted text can be preprocessed into a Dataset using task (org.bdp4j.pipe.Pipe) definitions. The framework includes more than 30 preprocessing tasks for transforming the text.
Last Release on Jul 26, 2021