Public Notes on histre
harlantwood/html_massage
github.com
"Massages HTML how you want to: sanitize tags, remove headers and footers, convert to plain text."
"Summary
Remove headers, footers, and navigation, and strip to only the "content" part of the HTML
Sanitize tags, removing JavaScript and styling
Convert HTML to Markdown, plain text, or sanitized HTML"
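
To make the summary above concrete, here is a minimal sketch of the same idea in plain Ruby. It does not use html_massage and says nothing about how that gem is implemented: `naive_html_to_text` and `DROPPED_BLOCKS` are hypothetical names, and the regex-based approach is an assumption chosen only to keep the example dependency-free. Regexes are not a robust way to parse real-world HTML.

```ruby
require "cgi"

# Hypothetical helper, NOT part of html_massage: a naive illustration of
# "drop headers/footers/navigation, remove scripts and styling, convert to text".
DROPPED_BLOCKS = %w[script style header footer nav].freeze

def naive_html_to_text(html)
  text = html.dup

  # Remove whole blocks (opening tag, contents, closing tag) for each listed element.
  DROPPED_BLOCKS.each do |tag|
    text.gsub!(%r{<#{tag}\b[^>]*>.*?</#{tag}>}mi, " ")
  end

  # Replace every remaining tag with a space, keeping only the text between tags.
  text.gsub!(/<[^>]+>/, " ")

  # Decode basic entities such as &amp; and collapse runs of whitespace.
  CGI.unescapeHTML(text).gsub(/\s+/, " ").strip
end

sample = <<~HTML
  <html><head><style>p { color: red; }</style></head>
  <body>
    <nav><a href="/">Home</a></nav>
    <header><h1>Site title</h1></header>
    <p>Hello &amp; <b>welcome</b> to the page</p>
    <script>alert("hi")</script>
    <footer>Copyright</footer>
  </body></html>
HTML

puts naive_html_to_text(sample)
# prints "Hello & welcome to the page"
```

A real project would more likely rely on an HTML parser than on regular expressions, since nested or malformed markup defeats patterns like these.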
#Ruby #repos #starred #parsing #html #html_parsing #parsers #pub