Data management
In traditional MLOps, we often deal with data-intensive ML models. Training a neural network from scratch requires a large volume of labeled data, and even fine-tuning a pre-trained model typically requires several hundred samples. Large datasets inevitably contain imperfections, so data cleaning remains a critical part of the ML development process.
In LLMOps, fine-tuning works much as it does in MLOps. Prompt engineering, however, introduces a zero-shot or few-shot learning paradigm: the model is guided by a small number of carefully selected samples, so the priority shifts from large volumes of potentially imperfect data to small sets of high-quality, meticulously curated examples.
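To make the contrast concrete, here is a minimal sketch of few-shot prompt construction. The task (sentiment classification), the example reviews, and the template are hypothetical; the point is that each of the handful of examples carries significant weight, so they are hand-picked and verified rather than bulk-collected and cleaned.

```python
# Hypothetical curated few-shot examples: a small, hand-verified set
# replaces the thousands of labeled rows a fine-tuning run would need.
FEW_SHOT_EXAMPLES = [
    {"text": "The battery lasts all day.", "label": "positive"},
    {"text": "The screen cracked within a week.", "label": "negative"},
    {"text": "Shipping took longer than promised.", "label": "negative"},
]


def build_few_shot_prompt(query: str, examples=FEW_SHOT_EXAMPLES) -> str:
    """Assemble a classification prompt from a few curated examples."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for ex in examples:
        lines.append(f"Review: {ex['text']}")
        lines.append(f"Sentiment: {ex['label']}")
        lines.append("")
    # The query is appended in the same format, leaving the label
    # for the LLM to complete.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


print(build_few_shot_prompt("Works exactly as advertised."))
```

Because the whole "dataset" is visible in the prompt, a single mislabeled or ambiguous example can skew every prediction, which is why data quality dominates data volume in this setting.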