In 2016 at TechCrunch Disrupt New York, several of the original developers behind what became Siri unveiled Viv, an AI platform that promised to connect various third-party apps to perform just about any task. The pitch was enticing – but never fully realized. Samsung then acquired Viv, integrating a simplified version of the technology into its Bixby voice assistant.
Six years later, a new team claims to have cracked the code for a universal AI assistant – or at least to have come a little closer. That team is Adept, a product lab that emerged from stealth today with $65 million in funding and is – in the words of its founders – "build[ing] general intelligence that allows humans and computers to work together creatively to solve problems."
It's a lofty goal. But Adept's co-founders – CEO David Luan, CTO Niki Parmar, and chief scientist Ashish Vaswani – sum up their ambition as perfecting an "overlay" within computers that works using the same tools people do. This overlay will be able to respond to commands such as "generate a monthly compliance report" or "draw stairs between these two points in this blueprint," Adept claims, all using existing software like Airtable, Photoshop, Tableau, and Twilio to get the work done.
"We are training a neural network to use all the software tools in the world, building on the vast amount of existing capabilities that people have already created," Luan told TechCrunch in an email interview. "With Adept, you can focus on the work you love most and ask our [system] to take on other tasks… We expect the system to be a good student and to be highly coachable, becoming more helpful and aligned with every human interaction."
From Luan's description, what Adept is building sounds a bit like robotic process automation (RPA) – software robots that leverage a combination of automation, computer vision, and machine learning to automate repetitive tasks like filling out forms and responding to emails. But the team insists that its technology is far more sophisticated than what RPA vendors like Automation Anywhere and UiPath offer today.
"We are building a general system that helps people do things in front of their computers: a universal AI collaborator for every knowledge worker… We are training a neural network to use all the software tools in the world, building on the vast amount of existing capabilities that people have already created," Luan said. "We believe that the ability of AI to read and write text will continue to be valuable, but that the ability to do things on a computer will be far more valuable for businesses… Text-trained models can write beautiful prose, but they can't act in the digital world. You can't ask [them] to book you a flight, write a check to a vendor, or conduct a scientific experiment. True general intelligence requires models that can not only read and write, but act when people ask them to do something."
Adept isn't the only one exploring this idea. In a February paper, scientists at Alphabet-backed DeepMind described what they call a "data-driven" approach to teaching AI to control computers. By having an AI observe the keyboard and mouse commands of people completing "instruction-following" computer tasks, such as booking a flight, the scientists were able to teach the system to perform more than a hundred such tasks with "human-level" accuracy.
Not coincidentally, DeepMind co-founder Mustafa Suleyman recently teamed up with LinkedIn co-founder Reid Hoffman to launch Inflection AI, which, like Adept, aims to use AI to help humans work more effectively with computers.
Adept's apparent differentiator is a brain trust of AI researchers drawn from DeepMind, Google, and OpenAI. Vaswani and Parmar helped create the Transformer, an AI architecture that has garnered considerable attention in recent years. Dating back to 2017, the Transformer has become the architecture of choice for natural language tasks, demonstrating an ability to summarize documents, translate between languages, and even classify images and analyze biological sequences.
Among other systems, OpenAI's GPT-3 language generator was built on the Transformer architecture.
"Over the next several years, everyone piled onto the Transformer, using it to quickly solve decades-old problems. When I was leading engineering at OpenAI, we evolved the Transformer into GPT-2 (the predecessor of GPT-3) and then GPT-3," Luan said. "Google's efforts to scale Transformer models yielded [the AI architecture] BERT, which powers Google search. And several teams, including members of our founding team, have trained Transformers that can write code. DeepMind has even shown that the Transformer works for protein folding (AlphaFold) and StarCraft (AlphaStar). Transformers have made general intelligence tangible for our field."
At Google, Luan was the overall technical lead for what he describes as the "big model effort" at Google Brain, one of the tech giant's preeminent research divisions. There, he trained increasingly large Transformers with the goal of eventually creating a single general model to power all machine learning use cases, but his team hit a clear limit: the best results came from models designed to excel in specific areas, such as analyzing medical records or answering questions about particular topics.
"Since the beginning of the field, we have wanted to build models with a flexibility similar to that of human intelligence – ones that can work for a wide variety of tasks… Machine learning has seen more progress in the past five years than in the previous 60," Luan said. "Historically, long-term AI work has been the domain of large tech companies, whose concentration of talent and compute has been unmatched. Looking ahead, we believe the next era of AI breakthroughs will require solving problems at the heart of human-computer collaboration."
Whatever the final form of its product and business model, can Adept succeed where others have failed? If so, the windfall could be substantial. According to Markets and Markets, the market for business process automation technologies – technologies that streamline enterprise workloads in customer-facing and back-office environments – will grow from $9.8 billion in 2020 to $19.6 billion by 2026. A 2020 survey by process automation vendor Camunda (a biased source, granted) found that 84% of organizations anticipate increased investment in process automation due to industry pressures, including the rise of remote work.
"Adept's technology seems plausible in theory, [but] talking about Transformers needing to be 'able to act' strikes me as a bit of a misdirection," Mike Cook, an AI researcher at the Knives & Paintbrushes research collective, which is not affiliated with Adept, told TechCrunch via email. "Transformers are designed to predict the next thing in a sequence of things, that's all. To a Transformer, it makes no difference whether that prediction is a letter in a text, a pixel in an image, or an API call in a piece of code. So this innovation doesn't seem any more likely to lead to artificial general intelligence than anything else, but it could well produce AI better suited to assisting with simple tasks."
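Cook's point – that sequence prediction is agnostic to what the "tokens" are – can be illustrated with a toy sketch. This is nothing like a real Transformer (it is a simple bigram frequency counter), and the API-call names are made up for illustration; it only shows that the same next-item machinery applies equally to words and to software actions:

```python
# Illustrative toy only: a bigram counter, not a Transformer.
# It shows that next-item prediction doesn't care what the tokens are.
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count which token most often follows each token."""
    follows = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            follows[cur][nxt] += 1
    return follows

def predict_next(model, token):
    """Return the most frequent successor of `token`."""
    return model[token].most_common(1)[0][0]

# The same machinery works whether the tokens are words...
text_logs = [["the", "cat", "sat"], ["the", "cat", "ran"]]
# ...or hypothetical API calls recorded from a workflow:
api_logs = [
    ["search_flights", "select_flight", "confirm"],
    ["search_flights", "select_flight", "confirm"],
    ["search_flights", "select_flight", "enter_payment"],
]

text_model = train_bigram(text_logs)
api_model = train_bigram(api_logs)
print(predict_next(text_model, "the"))           # -> "cat"
print(predict_next(api_model, "select_flight"))  # -> "confirm"
```

Whether a predictor trained this way exhibits understanding of the tasks, rather than just their surface statistics, is exactly the open question Cook raises.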
It is true that the cost of training advanced AI systems is lower than it once was. With a fraction of OpenAI's funding, recent startups including AI21 Labs and Cohere have managed to build models comparable to GPT-3 in terms of capabilities.
Meanwhile, continued innovations in multimodal AI – AI that can understand the relationships between images, text, and more – have put a system capable of translating requests into a wide range of computer commands within the realm of possibility. So has work like OpenAI's InstructGPT, a technique that improves the ability of language models such as GPT-3 to follow instructions.
Cook's main concern is how Adept trained its AI systems. He notes that one reason Transformer models have had such success with text is that there is an abundance of sample text to learn from. A product like Adept's would presumably need lots of examples of tasks successfully completed in applications (e.g., Photoshop) paired with textual descriptions, and that data doesn't occur naturally in the world.
In the February DeepMind study, the scientists wrote that, to collect training data for their system, they had to pay 77 people to perform more than 2.4 million demonstrations of computer tasks.
"[T]he training data was probably created artificially, which raises many questions about who was paid to create it, whether it will scale to other domains in the future, and whether the trained system will have the kind of depth that other Transformer models have," Cook said. "It's [also] by no means a 'pathway to general intelligence'… This might make it perform better in some areas, but it will likely perform worse than a system trained explicitly for a particular task and application."
Even the best-laid roadmaps can run into unforeseen technical challenges, especially where AI is concerned. But Luan is confident in Adept's founding senior talent, which includes Google's former model production infrastructure lead (Kelsey Schroeder) and one of the engineers behind Google's original production speech recognition model (Anmol Gulati).
"Although general intelligence is often described in the context of replacing humans, that is not our North Star. Instead, we believe AI systems should be built with people at the center," Luan said. "We want to give everyone access to increasingly sophisticated AI tools that help them achieve their goals in collaboration with the tool; our models are designed to work hand in hand with people. Our vision is one where people stay in the driver's seat: discovering new solutions, making more informed decisions, and gaining more time for the work we really want to do."
Greylock and Addition co-led Adept's funding round, which also saw participation from Root Ventures and angels including Behance founder Scott Belsky, Airtable founder Howie Liu, Chris Re, Tesla Autopilot head Andrej Karpathy, and Sarah Meyohas.