Experimentation

In MLOps, the experimental process remains relatively consistent, whether you're training a model from scratch or fine-tuning a pre-existing one. You track various inputs, such as the model's architecture, hyperparameters, and data augmentations, and evaluate outputs, such as performance metrics.
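For example, a conventional training or fine-tuning experiment might be logged like this. This is a minimal sketch assuming MLflow as the tracking backend; the parameter values and the training helper are illustrative placeholders, not a prescribed setup:

```python
# Minimal sketch of classic MLOps experiment tracking, assuming MLflow as the
# tracking backend. train_and_evaluate() stands in for any real training loop.
import mlflow

def train_and_evaluate(architecture: str, learning_rate: float, augmentation: str) -> dict:
    """Placeholder for an actual training run; returns evaluation metrics."""
    return {"accuracy": 0.91, "f1": 0.88}

params = {
    "architecture": "resnet50",     # model architecture
    "learning_rate": 1e-3,          # hyperparameter
    "augmentation": "random_crop",  # data augmentation strategy
}

with mlflow.start_run(run_name="baseline"):
    mlflow.log_params(params)               # inputs: architecture, hyperparameters, augmentations
    metrics = train_and_evaluate(**params)
    mlflow.log_metrics(metrics)             # outputs: performance metrics
```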

However, LLMOps introduces a critical choice: should you rely on prompt engineering or fine-tuning? Fine-tuning in LLMOps follows much the same trajectory as training in MLOps, but prompt engineering requires a distinct experimental setup centered on managing and optimizing prompts. The challenge lies not only in crafting effective prompts but also in systematically tracking and evaluating their performance, as sketched below. This shift underscores how experimentation is evolving in the era of large language models.
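Prompt experiments can be tracked with the same discipline, except the prompt text itself becomes the versioned input. The sketch below assumes MLflow again for logging; call_llm and score_output are hypothetical stand-ins for your model client and evaluation logic:

```python
# Minimal sketch of prompt-engineering experimentation: each prompt variant is the
# tracked input, and its evaluation score is the output. call_llm() and
# score_output() are hypothetical placeholders.
import mlflow

prompt_variants = {
    "v1_terse": "Summarize the following support ticket in one sentence:\n{ticket}",
    "v2_structured": (
        "You are a support analyst. Summarize the ticket below as:\n"
        "- Issue:\n- Impact:\n- Next step:\n\n{ticket}"
    ),
}

def call_llm(prompt: str) -> str:
    """Placeholder for an actual LLM API call."""
    return "stub completion"

def score_output(output: str) -> float:
    """Placeholder for an evaluation metric (human rating, LLM-as-judge, etc.)."""
    return 0.5

for name, template in prompt_variants.items():
    with mlflow.start_run(run_name=name):
        mlflow.log_text(template, "prompt.txt")  # the prompt itself is the experiment input
        output = call_llm(template.format(ticket="Example ticket text"))
        mlflow.log_metric("quality_score", score_output(output))  # output: prompt performance
```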
