Show HN: Data Bonsai: a Python package to clean your data with LLMs

I've been using LLMs to clean data for my fine-tuning projects, and decided to build a package for it as a side project. Check it out here: https://github.com/databonsai/databonsai

Some features:

- categorization (labelling), transformation and decomposition (text into structured format)

- validates LLM outputs

- batch mode batches up the inputs/outputs so you don't send the prompt (schema, few-shot examples) for eve
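
To make the batching and validation ideas concrete, here is a minimal sketch in plain Python. It is not databonsai's actual API: `build_prompt`, `categorize_batch` and `fake_llm` are hypothetical names made up for illustration, and `fake_llm` is a keyword-matching stand-in for a real model call. The point is only that the schema goes out once per batch instead of once per input, and that every returned label is checked against the allowed categories.

```python
from typing import Callable, Dict, List


def build_prompt(categories: Dict[str, str], batch: List[str]) -> str:
    """Build one prompt that carries the schema once for a whole batch."""
    schema = "\n".join(f"- {name}: {desc}" for name, desc in categories.items())
    items = "\n".join(f"{i + 1}. {text}" for i, text in enumerate(batch))
    return (
        "Assign exactly one category to each numbered input.\n"
        f"Categories:\n{schema}\n"
        f"Inputs:\n{items}\n"
        "Answer with one category name per line, in input order."
    )


def categorize_batch(
    inputs: List[str],
    categories: Dict[str, str],
    llm: Callable[[str], str],
    batch_size: int = 3,
) -> List[str]:
    """Label inputs in batches and validate every label the model returns."""
    labels: List[str] = []
    for start in range(0, len(inputs), batch_size):
        batch = inputs[start:start + batch_size]
        reply = llm(build_prompt(categories, batch))
        batch_labels = [line.strip() for line in reply.splitlines() if line.strip()]
        # Validation 1: one label per input, otherwise the batch is misaligned.
        if len(batch_labels) != len(batch):
            raise ValueError(
                f"expected {len(batch)} labels, got {len(batch_labels)}"
            )
        # Validation 2: every label must be one of the allowed categories.
        for label in batch_labels:
            if label not in categories:
                raise ValueError(f"unknown category: {label!r}")
        labels.extend(batch_labels)
    return labels


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): labels by keyword."""
    lines = prompt.split("Inputs:\n")[1].splitlines()
    numbered = [line for line in lines if line[:1].isdigit()]
    return "\n".join(
        "Weather" if "rain" in line.lower() else "Sports" for line in numbered
    )


CATEGORIES = {
    "Weather": "Forecasts and weather events",
    "Sports": "Matches, teams and athletes",
}
headlines = [
    "Heavy rain expected tonight",
    "Local team wins the cup",
    "Rain delays the final",
    "Striker signs new contract",
]
labels = categorize_batch(headlines, CATEGORIES, fake_llm, batch_size=3)
print(labels)  # ['Weather', 'Sports', 'Weather', 'Sports']
```

With four inputs and `batch_size=3`, this makes two model calls instead of four. Raising on a bad label is just the simplest choice for a sketch; a real implementation might retry the batch or fall back to one-by-one calls.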

