Package: boilerpipeR
Version: 1.3
Date: 2015-05-07
Title: Interface to the Boilerpipe Java Library
Author: See AUTHORS file.
Maintainer: Mario Annau <mario.annau@gmail.com>
Imports: rJava
Suggests: RCurl
Description: Generic extraction of the main text content from HTML files;
    removes ads, sidebars, and headers using the boilerpipe Java library
    (http://code.google.com/p/boilerpipe/). The boilerpipe extraction
    heuristics perform robustly across a wide range of web site templates.
License: Apache License (== 2.0)
URL: https://github.com/mannau/boilerpipeR
BugReports: https://github.com/mannau/boilerpipeR/issues
NeedsCompilation: no
Packaged: 2015-05-10 17:37:41 UTC; mario
Repository: CRAN
Date/Publication: 2015-05-11 00:20:25
