Show HN: Firecrawl-Simple – Stable fork of Firecrawl optimized for self-hosting

Firecrawl Simple is a stripped-down, stable version of Firecrawl optimized for self-hosting and ease of contribution.

The upstream Firecrawl repo contains the following blurb:

>This repository is in development, and we're still integrating custom modules into the mono repo. It's not fully ready for self-hosted deployment yet, but you can run it locally.

Firecrawl's API surface and general functionality were ideal for our Trieve sitesearch product, but we needed a version ready for s

6mo | Hacker News
RD-TableBench – Accurately evaluating table extraction

Hey HN!

A ton of document parsing solutions have been coming out lately, each claiming SOTA with little evidence. A lot of these turned out to be LLM or LVM wrappers that hallucinate frequently on complex tables.

We just released RD-TableBench, an open benchmark to help teams evaluate extraction performance for complex tables. The benchmark includes a variety of challenging scenarios including scanned tables, handwriting, language detection, merged cells, and more.

We employed an indep

6mo | Hacker News
Show HN: Whirlwind – Async concurrent hashmap for Rust

Hey HN, this is Will and David from Fortress (https://news.ycombinator.com/item?id=41426998).

We use a lot of async Rust internally, and created this library out of a need for an async-aware concurrent hashmap since there weren’t many available in the Rust ecosystem.

Whirlwind is a sharded HashMap with a fully asynchronous API. Just as dashmap is a replacement for std::sync::RwLock

6mo | Hacker News
