Selection of a foundation model
Foundation models are Large Language Models (LLMs) pre-trained on extensive datasets, capable of handling a multitude of downstream tasks. Given that training these models from scratch is highly complex, time-consuming, and extremely costly, only a select few institutions possess the necessary resources to do so.
To illustrate the scale of this challenge: a 2020 study by Lambda Labs estimated that training OpenAI's GPT-3 model (175 billion parameters) would take roughly 355 years and cost approximately $4.6 million on a single Tesla V100 cloud instance.
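As a rough sanity check, the two figures quoted above are mutually consistent: spreading $4.6 million over 355 single-GPU years implies an hourly rate of about $1.5, in line with V100 cloud pricing at the time (the exact rate Lambda Labs assumed is not stated here, so the per-hour price below is inferred, not quoted).

$$
355\ \text{years} \times 8{,}760\ \tfrac{\text{hours}}{\text{year}} \approx 3.1 \times 10^{6}\ \text{GPU-hours}
$$

$$
\frac{\$4.6 \times 10^{6}}{3.1 \times 10^{6}\ \text{GPU-hours}} \approx \$1.5\ \text{per GPU-hour}
$$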
At present, the AI community is experiencing what is being termed its "Linux moment". Developers currently have to choose between two types of foundation models, proprietary and open-source, weighing trade-offs in performance, cost, usability, and flexibility.