Hey HN!
This is Adil, Salman, and Jose, and we’re behind archgw [1]: an intelligent proxy server designed as an edge and AI gateway for agents, one that natively knows how to handle prompts, not just network traffic. We’ve made several sweeping changes since our last post, so we’re sharing the project again.
A bit of background on why we built this project. Building AI agent demos is easy, but making something production-ready involves a lot of repeated low-level plumbing work that everyone ends up doing. You’re applying guardrails to make sure unsafe or off-topic requests don’t get through. You’re clarifying vague input so agents don’t make mistakes. You’re routing prompts to the right expert agent based on context or task type. You’re writing integration code to quickly and safely add support for new LLMs. And every time a new framework hits the market or gets updated, you’re validating or re-implementing that same logic, again and again.
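For a concrete (and deliberately toy) picture of that plumbing, here is the sort of ad-hoc Python that tends to get copy-pasted into every agent app. Every function, keyword, and threshold below is a hypothetical placeholder for illustration only, not part of archgw:

```python
# Illustrative only: the kind of per-app plumbing each agent ends up re-implementing.
BLOCKED_TOPICS = {"medical advice", "legal advice"}

def passes_guardrails(prompt: str) -> bool:
    # Placeholder for an unsafe / off-topic check.
    return not any(topic in prompt.lower() for topic in BLOCKED_TOPICS)

def needs_clarification(prompt: str) -> bool:
    # Placeholder for a vagueness check before handing off to an agent.
    return len(prompt.split()) < 3

def route(prompt: str) -> str:
    # Placeholder for task-based routing to an "expert" agent.
    return "billing_agent" if "invoice" in prompt.lower() else "general_agent"

def handle(prompt: str) -> str:
    if not passes_guardrails(prompt):
        return "Sorry, I can't help with that."
    if needs_clarification(prompt):
        return "Could you share a bit more detail?"
    return f"forwarding to: {route(prompt)}"

print(handle("Where is my invoice for March?"))
```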
Putting all that low-level plumbing code in a framework gets messy to manage and harder to update and scale. Low-level plumbing isn't business logic. That’s why we built archgw: an intelligent proxy server that handles prompts on ingress and egress and offers several related capabilities from a single software service. It lives outside your app runtime, so you can keep your business logic clean and focus on what matters. Think of it like a service mesh, but for AI agents.
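Because it sits outside the app as a proxy, adopting it is mostly a matter of pointing your existing LLM client at the gateway instead of the provider. A minimal sketch, assuming archgw exposes an OpenAI-compatible endpoint on localhost:12000 and that the model name below is mapped to an upstream LLM in the gateway's config (the port, model alias, and mapping are assumptions; see the docs [6]):

```python
from openai import OpenAI

# Point the app at the gateway rather than the LLM provider directly.
client = OpenAI(
    base_url="http://localhost:12000/v1",  # assumed local archgw listener
    api_key="not-used-by-the-app",         # upstream provider keys live in the gateway config
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # an alias the gateway is assumed to map to an upstream LLM
    messages=[{"role": "user", "content": "Summarize my open support tickets."}],
)
print(resp.choices[0].message.content)
```

The point of the design is that the application code stays identical to what it was before; guardrails, routing, and provider failover happen in the proxy.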
Prior to building archgw, the team spent time building Envoy [2] at Lyft, API Gateway at AWS, and specialized NLP models at Microsoft Research, and working on safety at Meta. archgw was born out of the belief that the rule-based, single-purpose tools that handle resiliency, processing, and routing of prompts should move into a dedicated infrastructure layer for agents, built on the battle-tested foundation of Envoy Proxy.
The intelligence in archgw comes from our fast task-specific LLMs [3], which handle things like agent routing and handoff, guardrails, and preference-based intelligent LLM calling. Here are some additional details about the open source project. archgw is written in Rust, and the request path has three main parts:
* Listener subsystem, which handles downstream (ingress) and upstream (egress) request processing.
* Prompt handler subsystem, where archgw decides whether an incoming request is safe via its prompt_guard hooks and identifies where to forward the conversation via its prompt_target primitive.
* Model serving subsystem, the interface that hosts all the lightweight LLMs engineered in archgw and offers a framework for things like hallucination detection for those models.
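To make the prompt_guard and prompt_target primitives a bit more concrete, here is a rough sketch of the kind of declarative configuration the prompt handler works from, written as a Python dict purely for illustration. The real config file, field names, and options may differ, so treat everything below as an approximation and check the docs [6]:

```python
# Illustrative sketch of the prompt-handling primitives described above.
# Field names are approximations, not the authoritative archgw schema.
config = {
    "prompt_guards": {
        # ingress-side checks applied before any agent or upstream LLM sees the prompt
        "input_guards": {"jailbreak": {"on_error": "reject"}},
    },
    "prompt_targets": [
        {
            # where the conversation is forwarded once the user's intent is identified
            "name": "get_weather",
            "description": "Answer questions about current weather conditions",
            "parameters": [{"name": "city", "type": "str", "required": True}],
            "endpoint": {"name": "weather_api", "path": "/weather"},
        },
    ],
}
```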
We loved building this open source project, and we believe this infra primitive will help developers build faster, safer, and more personalized agents without all the manual prompt engineering and systems integration work needed to get there. We’d love for other developers to use and improve Arch. Please give it a shot and leave feedback here or on our Discord channel [4]. There is also a quick demo of the project in action [5], our public docs are at [6], and our models are available at [7].
[1] https://github.com/katanemo/archgw
[2] https://www.envoyproxy.io/
[3] https://huggingface.co/collections/katanemo/arch-function-66...
[4] https://discord.com/channels/1292630766827737088/12926307682...
[5]