Day 12 — Fine-Tuning vs Prompting
This is the question that trips up most freshers in AI interviews:
"When would you fine-tune a model versus just improving the prompt?"
Most students give a vague answer about "when you need better performance." That is not an answer. Today you will learn the actual framework.
The Mental Model
Think of a pre-trained LLM like GPT-4 as a highly educated generalist. It knows a lot about everything.
Prompting is like giving this generalist detailed instructions before a task. "You are a medical professional. Answer questions using simple language. Always recommend seeing a doctor for serious symptoms." The person does not change — you are just giving them context and instructions.
Fine-tuning is like sending the generalist back to school for specialisation. After training, they think differently about a specific domain. The model's weights — its actual learned knowledge — change.
When Prompting Is Enough (Most of the Time)
Use prompting when:
The task is well-defined and the base model understands the domain
If you want GPT-4 to write cover letters in a specific format, a detailed system prompt with examples is almost always enough. The model already knows what a cover letter is. It just needs formatting instructions.
You need to change tone, style, or persona
"You are a strict technical interviewer at a product company. Ask one question at a time. Give brief feedback after each answer." This is entirely a prompting job.
The task changes frequently
If you need different behaviour for different use cases, prompting lets you switch instantly. Fine-tuning creates a fixed model for a specific use case.
Budget matters
Prompting costs nothing beyond the API call. Fine-tuning on GPT-4 costs hundreds to thousands of dollars and takes days. Even fine-tuning smaller open-source models requires GPUs.
Practical example: At resumeportfolio.in, the career coach, cover letter generator, and mock interviewer are all prompt-based. No fine-tuning. A well-crafted system prompt with the student's portfolio data as context achieves results that would cost thousands to replicate through fine-tuning.
When Fine-Tuning Actually Makes Sense
Use fine-tuning when:
The task requires knowledge the base model does not have
A model trained on internet data knows nothing about your company's internal documents, proprietary data, or domain-specific terminology. If you need the model to speak fluently in your specific domain — medical diagnosis using your hospital's terminology, legal analysis using your firm's case history — fine-tuning helps.
Note: RAG (which you built on Day 9) solves a lot of this without fine-tuning. Use RAG first.
You need consistent output format at scale
If you are processing 100,000 documents and need every output in an exact JSON schema, fine-tuning creates a model that reliably produces that format. Prompting works for this too, but fine-tuning reduces errors at scale.
Latency and cost at very high volume
Fine-tuning a smaller model (7B parameters) to match the quality of a larger model (70B) for a specific task reduces inference cost significantly. At millions of API calls per day, this matters.
The task involves a very specific style or voice
If you are building a product where outputs must sound exactly like a specific person or brand — not approximately, but exactly — fine-tuning on examples of that voice produces better results than prompting.
The Decision Tree
Does the base model understand the domain?
↓ No
Use RAG to add knowledge — try prompting with retrieved context first
↓ Still not good enough after RAG
Consider fine-tuning
↓ Yes (model understands domain)
Is the issue tone, format, or persona?
↓ Yes → Prompting
↓ No
Is the issue consistency at very high volume?
↓ Yes → Consider fine-tuning small model
↓ No → Prompting with better examples
The honest answer in 90% of cases: try prompting first. Add few-shot examples (showing the model 3-5 examples of good input-output pairs in the prompt). If that is not enough, try RAG. If that is not enough, then consider fine-tuning.
How to Explain This in an Interview
Interviewer: "When would you fine-tune versus prompt?"
You: "I use a decision framework. First, I check if the base model already understands the domain — if it does, I start with prompting and few-shot examples. If the issue is adding external knowledge, I use RAG before considering fine-tuning. Fine-tuning makes sense when I need consistent specialised output at high volume, when the domain is proprietary, or when I need a smaller model to match a larger one's quality for cost reasons. In practice, most tasks that seem to need fine-tuning can be solved with a well-designed prompt and good retrieval."
Key Terms
| Term | Meaning |
|---|---|
| Fine-tuning | Updating a model's weights by training on new examples |
| Prompting | Giving the model instructions without changing its weights |
| Few-shot prompting | Showing the model 3-5 examples in the prompt |
| RAG | Adding external knowledge at inference time (no weight updates) |
| Inference | Running the model to get an output |
| Parameters | The numbers that define what a model knows |
Day 12 of 15 — AI Survival Kit for Engineers