Meet Twin Labs, a Paris-based startup that wants to build an automation product for repetitive tasks, such as onboarding new employees to all your internal services, reordering items when you’re running out of stock, downloading financial reports across several SaaS products, reaching out to potential prospects and more.
“Twin’s starting point is a science-fiction idea. We saw the development of the technical capabilities of LLMs — foundation models. And the question we asked ourselves was whether we’d be able to duplicate ourselves by training an AI agent on the way we perform our tasks,” Twin Labs co-founder and CEO Hugo Mercier told me.
In Twin Labs’ case, the most interesting thing isn’t what they’re doing — improving internal processes — but how they’re doing it. The company relies on multimodal models with vision capabilities, such as GPT-4 with Vision (GPT-4V), to replicate what humans usually do.
Before landing on multimodal models, Twin Labs first tried to develop autonomous agents using traditional LLMs. “We’ve tested lots of things, we’ve implemented research papers, we’ve tested open-source GitHub repositories. Overall, the conclusion is that LLMs are completely unreliable. This means that LLMs are making the wrong decisions,” Mercier said. “In the end, the task isn’t done.”
According to him, GPT-4V has been trained on a lot of different software interfaces and the underlying code bases, which unlocked new possibilities. “When you show an interface, it understands the feature behind the button,” Mercier said.
Unlike Zapier and other automation products, Twin Labs doesn’t rely on APIs or require designing complicated multi-step processes. Instead, Twin Labs works more like a web browser. The tool can automatically load web pages, click on buttons and enter text.
For instance, if you’re hiring someone, you might need to add this person’s information in your payroll system, send an invitation to Slack, create a Google Workspace account and invite your new employee to create an account with the healthcare insurance provider.
Companies usually keep a long list of tasks and just go through the list every time there’s a new team member. These tasks aren’t complicated but it’s extremely important to do them well, in the right order and with some specific checkboxes ticked. That’s why it’s going to be important to be able to train Twin Labs’ AI assistant using screen recording and natural language descriptions.
But the startup isn’t there yet — it is working toward this vision. Hugo Mercier and Joao Justi, the two co-founders, spent the last six months building a prototype of this product. They also raised $3 million in pre-seed funding from Betaworks, Motier Ventures and many angel investors, such as Florian Douetteau (Dataiku), Thomas Wolf (Hugging Face), Charles Gorintin (Alan), Mehdi Ghissassi (DeepMind), Romain Huet (OpenAI), Irwan Bello (OpenAI), Romuald Elie (DeepMind), Yan-David Erlich (Weights & Biases), Olivier Pomel (Datadog), Rodolphe Saadé (CMA CGM), Thibaud Elziere (Hexa), Quentin Nickmans (Hexa), Philippe Corrot (Mirakl) and Rand Hindi (Snips, Zama).
There are still many challenges ahead for Twin Labs’ autonomous agent system. For example, completing a task costs quite a bit of money — but API and infrastructure costs are rapidly going down in the AI space. Twin Labs will first ship a product with a library of pre-trained tasks to make sure that they work well. After that, the startup expects that it will open up its platform so that clients can create their own tasks.
While many people associate AI products with a chatbot interface, Twin Labs’ approach is interesting as it’s an innovative way to interact with AI models. “We really wanted to get down to the nitty-gritty of what people do on a day-to-day basis, and how we can take over some of the things that are actually a bit of a hassle for them,” Mercier said.