Evaluating LLMs for my personal use case