After a long week of coding, you might assume San Francisco’s builders would retreat into the Bay Area’s mountains, beaches or vibrant clubbing scene. But in reality, when the week stops, the AI hackathons begin.
In the last few years, San Francisco has exploded with AI hackathons. On any given Saturday or Sunday, technologists give talks on the latest advances in AI, network and — most importantly — build ideas into working demonstrations. Sometimes, hackathons offer prizes in the form of cash or cloud credits, but the real winners walk away with the inkling of a startup.
“There’s no better place in the world to build the most ambitious project of your life than San Francisco,” says Agency AI co-founder Alex Reibman. “You frequently see tons of competitions — like hackathons — but they’re not competing against each other. It’s just as much collaborative as it is competitive.”
Last summer at a San Francisco hackathon, Reibman decided to try his hand at building AI agents that could scrape the web. Agents are a hot topic in Silicon Valley as the AI boom peaks. The term isn’t precisely defined, but generally describes AI-based bots that can perform tasks automatically, using interfaces and services that were not originally designed to be automated — a kind of replacement for mundane tasks that used to require human intervention.
But Reibman immediately ran into a problem. “They sucked,” he said in an interview. “The agents failed like 30 to 40% of the time, and often in unexpected ways.”
To fix that, Reibman’s team built internal debugging tools to see where their agents were going wrong. They got the agents to work a little better, but the debugging tools themselves stole the show and won the hackathon.
“I started showing the tools at a bunch of hackathons and events in San Francisco, and people started asking for access to them,” said Reibman. “That was basically the confirmation I needed: Instead of building an agent ourselves, we should build tools to make it easier to build agents.”
So Reibman started Agency alongside his co-founders Adam Silverman and Shawn Qiu, offering tools to observe what AI agents are actually doing and catch where they’re going wrong. A year later, those tools have become Agency’s core product, the AgentOps platform, which is now used by thousands of teams monthly, Reibman tells TechCrunch. The startup has raised $2.6 million in pre-seed funding, led by 645 Ventures and Afore Capital.
Chief operating officer Adam Silverman tells TechCrunch that AgentOps is like “multi-device management for agents,” analyzing everything the agent does to ensure it doesn’t go rogue.
“You want to understand whether your agent is going to go rogue and identify what limitations you can put in place,” said Silverman in an interview. “A lot of the work is being able to visually see where your guardrails exist, and whether the agent abides by them, before tossing them into production.”
The startup partners with Cohere and Mistral, AI model developers that also offer agent creation services, so that customers can use the AgentOps dashboard to see how agents interact with the world and how much each one costs. Agency is model-agnostic, meaning it works with several different AI agent frameworks, and is integrated with popular tools such as Microsoft’s AutoGen, crewAI and AutoGPT.
Beyond the AgentOps dashboard, Agency also offers consulting services (Reibman was previously at consulting firm EY) to help businesses get started building agents. Agency wouldn’t name any customers, but said that hedge funds, consultants and marketing firms are using its tools.
For example, Reibman says Agency helped create an AI agent that writes blog posts about companies the customer is working with. Now, the same customer uses the AgentOps dashboard to track the agent’s performance and costs.
Major players like OpenAI and Google are likely to build out their agent products in the coming months, and AI startups like Agency have to figure out how to work alongside those advancements, not against them.
“There’s so many layers in the stack, it’s not likely the LLM provider will try to capture all of them,” said Reibman. “OpenAI and Anthropic are building the agent builders, but there’s all these layers around it to make sure you have a production-ready code base.”