Defuddle is an open-source library I built to extract the main content and metadata from web pages. It can also return the content as Markdown.
I built Defuddle while working on Obsidian Web Clipper[1] (also MIT-licensed) because Mozilla's Readability appears to be mostly abandoned and didn't work well on many sites.
Defuddle is also available as a CLI:
https://github.com/kepano/defuddle-cli
[1] https://github.com/obsidianmd/obsidian-clipper
Comments URL: https://news.ycombinator.com/item?id=44067409