A new company is emerging from stealth today with backing from Google’s AI-focused venture fund to help businesses compile their open-source AI infrastructure and reduce their engineering overheads.
Cake integrates and secures more than 100 components for enterprises, spanning categories such as data source adapters (e.g. Apache Hadoop), data ingestion (e.g. Apache Kafka), data labelling (e.g. Label Studio), vector and graph databases (e.g. Milvus or Neo4j), and generative AI APIs and related tools (e.g. Anthropic), among many others.
This hints at why Cake is called what it is — it takes the various “layers” that constitute the AI stack, and integrates them into a more digestible, production-ready format suitable for business.
‘Big picture problem’
Founded out of New York in 2022 by Misha Herscu (CEO) and Skyler Thomas (CTO) — pictured above — Cake launched last year and is already working with customers like AI bioscience startup Altis Labs and data intelligence insurtech Ping. However, the company hasn’t been making much noise in public until now.
On top of its formal unveiling today, Cake said it has raised $13 million since its inception. This includes $3 million in pre-seed funding through its formative couple of years, and a recent $10 million seed round led by Google’s Gradient Ventures.
“We haven’t been super secretive; we’ve just been building, and working with customers,” Herscu explained to TechCrunch in an interview last week.
Previously, Herscu founded an AI company called McCoy Medical Technologies that was focused on machine learning infrastructure for radiology, and sold it in 2017 to IT vendor TeraRecon. He later joined New York VC firm Primary Venture Partners as “operator in residence,” where he pursued his next venture by chatting with hundreds of data science and AI executives.
“I did over 200 customer discovery calls, asking what their biggest pain points and bottlenecks are,” Herscu said. “The biggest problem wasn’t a single part of the stack, such as setting up a vector database or data pipeline. It was that there are a ton of different components across a very rich ecosystem. How do you go about integrating everything reliably, and making it production ready?”
This is what Herscu refers to as the “big picture problem,” and is where his new business enters the fray.
Cake is all about making sense of the myriad open-source components that constitute the modern AI stack, and providing bundled, managed, open-source AI infrastructure for small teams. This isn’t about building a business around a single open-source project as countless companies have done; instead, it’s about assembling and serving a curated selection of open-source projects across an entire stack and making it run smoothly.
Let’s say a large financial services company has millions of documents containing complex financial data, and it wants to run RAG (retrieval-augmented generation) against these files to improve the quality of responses to natural-language queries. If an off-the-shelf product isn’t up to the task, or is unsuitable for compliance reasons, the company would have to build its own system by installing and stitching together multiple components. That’s a time-consuming endeavor that Cake can take care of.
Elsewhere, a hospital might need to construct a secure system for analyzing images from CT scans, or an e-commerce company might want to upgrade its recommendation engine. These are all potential use-cases for Cake.
“We do run the gamut, but I’d say our sweet spot is definitely when companies are going beyond what you can do with a simple, off-the-shelf product,” Herscu said.
Parallel development
Cake’s CTO Thomas previously worked at IBM as a chief architect, and more recently he was a distinguished engineer and director of strategy at Hewlett Packard Enterprise, which acquired MapR, a company he had previously worked at.
Thomas says he has worked across hundreds of projects through the years, with large and smaller customers, and he noticed a trend permeating pretty much all of them: every one was using open-source tools in some way, many of them fresh out of research labs. Still, using them in the enterprise wasn’t easy.
“It takes a huge amount of time for even the largest enterprises to take what’s coming out of the labs and integrate it into what they do,” Thomas told TechCrunch. “A lot of that is because most of it isn’t ready for the enterprise — it might not have authentication and authorization, and enterprises have to do that themselves.”
There are parallels to what Cake is striving for here. In Europe, we have the likes of Finnish Aiven, a $2 billion unicorn, which is doing something similar but with a focus on data infrastructure. Perhaps the most obvious comparison would be Red Hat, which IBM acquired for $34 billion and is best known for its enterprise-grade Linux operating system (RHEL).
“In the early days of Linux, there were thousands of open-source packages that everyone wanted to use, but weren’t integrated and weren’t secure,” Thomas said. “There just wasn’t a support model for it, and so the Red Hats of the world made Linux safe for the enterprise. We want to do a similar thing for AI today.”
While there are plans to eventually introduce a hosted version of Cake, for now companies have to run it in their own environments. For many, this won’t be an issue because data privacy stipulations mean they can’t send data outside their own systems anyway. But a hosted version might be appealing to organizations with lower compliance obligations.
“It is actually easier for us if we can control the cloud,” Herscu added.
Aside from lead investor Gradient, Cake’s seed round saw participation from its pre-seed investor Primary Venture Partners, as well as Alumni Ventures, Friends & Family Capital, Correlation Ventures, and Firestreak Ventures.
The hitherto unannounced $10 million seed round, which closed back in April, is indicative not only of the founders’ backgrounds but also of the company’s traction. Herscu said that the company is already looking toward its next financing round, with tentative plans to raise again around the middle of 2025.
“From a traction standpoint, we look more like a Series A company already. We were able to get there pretty quickly,” Herscu said. “When we go to the Series A, it’ll probably look more like a Series B.”