Judging by the hype across the media, driven by the fast pace of change in the AI industry, one could be tempted to think that everything’s coming up roses, with generative LLMs being the panacea for all technological challenges.
Emil Ștețco, founder and CEO of Zetta Cloud:
Having worked in the AI industry as an “engipreneur” (a combination of engineer and entrepreneur) for over a decade, I’d like to share some of my thoughts on the current AI LLMs landscape and our vision for building the future here at Zetta Cloud. The content that follows will blend narrative discourse with bullet points, numbers, and facts, reflecting my engineering inclination, which prevents me from staying in the narrative mode for an extended period of time. I’ll cut to the chase, without any unnecessary elaboration.
GPU Challenges for AI Startups
Here are some undisputed facts about Artificial Intelligence Large Language Models (LLMs):
- Training AI LLMs not only requires data but also heavily relies on GPUs.
- The cost of GPU cards remains very high, a hurdle for many AI startups seeking the resources they need to develop successful products.
- Other AI startups are building on top of existing mammoth models using APIs, potentially jeopardizing their own essential IP and creating a significant dependency on these third-party models.
This blog post offers an excellent deep dive into the industry’s GPU demand. Here are a few figures that give a good sense of the scale we are talking about (stats as of July 2023):
- GPT-4 was likely trained on somewhere between 10,000 and 25,000 A100s.
- Meta has about 21,000 A100s, Tesla has about 7,000 A100s, and Stability AI has about 5,000 A100s.
- Falcon-40B was trained on 384 A100s.
- Inflection used 3,500 H100s for their GPT-3.5 equivalent model.
You can do the simple math (an NVIDIA A100 GPU is priced at approximately $10,000 per unit) to understand how huge the costs are for just a small (but essential) portion of the infrastructure needed to train mammoth LLMs.
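The back-of-the-envelope math can be sketched in a few lines, assuming the ~$10,000 per A100 figure above (hardware only, excluding power, networking, hosting, and engineering costs):

```python
# Rough, illustrative hardware-cost estimate for GPU clusters of
# various sizes, assuming ~$10,000 per NVIDIA A100.
A100_UNIT_PRICE = 10_000  # USD, approximate

def cluster_cost(num_gpus: int, unit_price: int = A100_UNIT_PRICE) -> int:
    """Return the approximate GPU hardware cost in USD."""
    return num_gpus * unit_price

for name, gpus in [("Falcon-40B training", 384),
                   ("Stability AI fleet", 5_000),
                   ("GPT-4 (low estimate)", 10_000)]:
    print(f"{name}: {gpus:,} GPUs ≈ ${cluster_cost(gpus):,}")
```

Even the smallest of these clusters (Falcon-40B’s 384 A100s) lands at roughly $3.8 million in GPUs alone, and the low GPT-4 estimate at around $100 million.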
Having said that, are we in danger of going back to the “Age of AI Dictatorship” (see our Manifesto for more insights), where AI is controlled by a few big tech companies that can afford the plethora of GPUs needed to train the mammoth generative models consumed by many?
Before tentatively answering this question, let us look closer into the matter while adding the other end of the language model spectrum, the AI Expert Models, into the picture.
A Closer Look at AI Expert Models
Unlike the vast and generalized Large Language Models (LLMs), AI Expert Models are designed with a specific focus in mind. They are tailored to excel in particular domains or tasks such as named entity recognition, sentiment analysis, and various text classification tasks like spam detection or topic categorization. These Expert Models often provide more precise and efficient solutions within those specialized areas. Their key advantages include:
- Resource Efficiency: Unlike LLMs, Expert Models can be trained and operated on more modest hardware, including commodity equipment. This makes them accessible to a broader range of developers and organizations.
- High Volume Processing: Expert Models can process large amounts of domain-specific data efficiently, offering a targeted approach that often outperforms generalized models in specific tasks.
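To make the resource-efficiency point concrete, here is a toy illustration of how lightweight an Expert Model can be: a bag-of-words Naive Bayes sentiment classifier in pure Python, trained on a hypothetical four-example dataset. A real system would use a proper library and far more data, yet would still train comfortably on commodity hardware:

```python
import math
from collections import Counter, defaultdict

# Tiny, hypothetical training set -- a stand-in for real labeled data.
TRAIN = [
    ("great product love it", "pos"),
    ("excellent service very happy", "pos"),
    ("terrible experience waste of money", "neg"),
    ("awful support very disappointed", "neg"),
]

class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        self.word_counts = defaultdict(Counter)   # label -> word frequencies
        self.label_counts = Counter()             # label -> document count
        self.vocab = set()
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)
        return self

    def predict(self, text):
        scores = {}
        total_docs = sum(self.label_counts.values())
        for label in self.label_counts:
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(self.label_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)

model = NaiveBayesSentiment().fit(TRAIN)
print(model.predict("love the excellent service"))        # → pos
print(model.predict("terrible waste very disappointed"))  # → neg
```

The entire model fits in a few kilobytes and trains in microseconds, which is exactly the contrast with GPU-hungry LLMs that the bullets above describe.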
In the following sections, we will explore the advantages of LLMs, the potential synergy between Expert Models and LLMs, and the emerging role of No-Code AI platforms. Together, these insights will help us understand the evolving landscape of AI and how we can mitigate the risk of returning to an era of AI centralization.
The Advantages of LLMs

- Versatility: Unlike Expert Models designed for specific tasks, LLMs can handle a wide range of tasks without task-specific training data.
- Dynamic Text Generation: One of the standout advantages of LLMs in the realm of text generation is their ability to produce contextually relevant and coherent long-form content.
The Synergy Between Expert Models and LLMs

The AI landscape is marked by the coexistence of specialized Expert Models and versatile Large Language Models (LLMs). Each brings its unique strengths to the table, and when combined, they can redefine the boundaries of what AI can achieve.
Expert Models, tailored for specific tasks, often require fewer computational resources compared to their larger counterparts. This makes them cost-effective and environmentally friendly, especially when deployed at scale. They can handle vast amounts of data pertaining to their domain with remarkable speed and accuracy. This makes them indispensable for tasks that demand real-time processing, such as financial transactions or open-source intelligence (OSINT) systems.
On the other hand, LLMs are designed to understand and generate a wide array of content. Their broad training data allows them to tackle diverse tasks, from answering trivia questions to crafting narratives. They excel in producing contextually relevant and coherent content.
However, the true potential of AI is realized when we harness the resource efficiency and processing power of Expert Models with the adaptability and Generative AI capabilities of LLMs. Imagine a system where one or more Expert Models quickly process domain-specific data throughout a discovery phase and then hand it over to an LLM to craft a detailed, contextually relevant report or response. Such a synergy not only optimizes performance but also ensures a broader application spectrum.
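The hand-off described above can be sketched as a two-stage pipeline. All function names here are hypothetical, and the LLM call is stubbed with a template, since the actual API depends on the provider:

```python
# Sketch of an Expert Model -> LLM pipeline (illustrative only).

def expert_extract_entities(text: str) -> list[str]:
    # Stand-in for a fast, domain-specific Expert Model (e.g. NER).
    # Here: a naive rule -- capitalized tokens are treated as entities.
    return [tok for tok in text.split() if tok[0].isupper()]

def llm_generate_report(entities: list[str]) -> str:
    # Stub for a generative LLM call (in practice, an API request to a
    # hosted model); replaced here with a template for illustration.
    return "Report on entities: " + ", ".join(entities)

def pipeline(document: str) -> str:
    # Discovery phase: the Expert Model filters the raw data quickly.
    entities = expert_extract_entities(document)
    # Generation phase: the LLM turns the findings into readable prose.
    return llm_generate_report(entities)

print(pipeline("Zetta Cloud announced a partnership in Romania"))
# → Report on entities: Zetta, Cloud, Romania
```

The design point is the division of labor: the cheap Expert Model runs over high-volume data, and the expensive generative step only sees the distilled results.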
The Emerging Role of No-Code AI Platforms

However, setting up a machine learning system is not easy. First, there’s the whole matter of gathering data. Then you’ve got to either pick a solid model or build one from scratch. You might need to hunt down a tool to tag the data and set up a data pipeline. Once that’s done, you’ve got to dive into training the model, tune the hyperparameters, sort out any data glitches, and then, the big step: get a server ready for deployment and make sure your model plays nice with it. This whole process used to take a significant amount of time, often months.
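The stages just listed can be condensed into a skeleton like the following. Every function name is a hypothetical stand-in, not a real framework; the point is only to show how many distinct steps a team must wire together by hand:

```python
# A condensed sketch of the classic ML workflow described above.

def gather_data():
    # 1. Gather (already labeled, in this toy case) data.
    return [("win a free prize now", "spam"), ("meeting at noon", "ham")]

def train_model(labeled_rows):
    # 2-3. Pick or build a model and train it; here just a stand-in dict.
    return {"trained_on": len(labeled_rows)}

def tune_hyperparameters(model):
    # 4. Tweak hyperparameters, fix data glitches (a no-op in this sketch).
    return model

def deploy(model):
    # 5. Ship the model to a server that will accept it.
    return f"deployed model trained on {model['trained_on']} examples"

def build_ml_system():
    data = gather_data()
    model = tune_hyperparameters(train_model(data))
    return deploy(model)

print(build_ml_system())  # → deployed model trained on 2 examples
```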
But here’s where the game changes. What if we introduce a no-code AI platform into the mix? Instead of navigating each step individually, this platform streamlines everything. With a user-friendly interface, it covers everything from data collection to model deployment. This means anyone, even without deep technical knowledge, can harness the combined power of Expert Models and LLMs, making the once daunting task of setting up ML systems as straightforward as a few clicks.
There’s a rising concern that major tech corporations might dominate the AI landscape, leading to what some term an “AI Dictatorship.” This scenario implies that only those with extensive computational resources, like many GPUs, would spearhead significant AI initiatives. So, are we in danger of going back to the “Age of AI Dictatorship”? Yes, we are, but there’s a silver lining. AI is becoming more democratic through open-source models and tools that allow rapid productization of AI engines on commodity hardware. The no-code approach to machine learning serves as a bridge, guiding users from the initial phase of data collection to the final stages of model deployment. Thus, even those without an in-depth AI background can harness potent AI capabilities. With such tools at our disposal, the intent is clear: decentralize AI’s power, ensuring it’s not a privilege limited to just the tech giants but a resource accessible to a broader audience.
This is our way to make sure everyone has a chance to benefit from AI.