Prompt engineering

Your prompt is the entire contract between you and the AI. A vague prompt gets vague results. Specific instructions with examples produce consistent output. Right now, summaries vary in format, length, and sometimes include unwanted metadata. Let's fix that.

Outcome

Refine the summarizeReviews prompt to produce consistent, well-formatted summaries that match review sentiment and follow a specific structure.

Fast Track

Calculate averageRating and add tone guidance (1-2: negative, 3: neutral, 4-5: positive)
Add 3 few-shot examples showing "Customers like..." format with pros/cons structure
Add response cleanup: .trim().replace(/^"/, "").replace(/"$/, "").replace(/[\[\(]\d+ words[\]\)]/g, "")

Hands-on Exercise 2.3

Improve the prompt for production-quality summaries:

Requirements:

Calculate average rating to determine tone
Add 3 few-shot examples showing ideal summary format
Include tone guidance (negative/neutral/positive) based on ratings
Specify output constraints (length, format, what to avoid)
Clean up the AI response (trim quotes, remove word counts)

Implementation hints:

Few-shot examples teach by showing, not telling
Use .reduce() to calculate average rating
Use regex to clean common AI formatting artifacts
Keep examples in the prompt consistent with your requirements
Add maxOutputTokens and temperature parameters

Solution

Update lib/ai-summary.ts:

lib/ai-summary.ts

import { generateText } from "ai";
import { Product } from "./types";
 
export async function summarizeReviews(product: Product): Promise<string> {
  const averageRating =
    product.reviews.reduce((acc, review) => acc + review.stars, 0) /
    product.reviews.length;
 
  const prompt = `Write a summary of the reviews for the ${
    product.name
  } product. The product's average rating is ${averageRating} out of 5 stars.
 
Your goal is to highlight the most common themes and sentiments expressed by customers.
If multiple themes are present, try to capture the most important ones.
If no patterns emerge but there is a shared sentiment, capture that instead.
Try to use natural language and keep the summary concise.
Use a maximum of 4 sentences and 30 words.
Don't include any word count or character count.
No need to reference which reviews you're summarizing.
Do not reference the star rating in the summary.
 
Start the summary with "Customers like…" or "Customers mention…"
 
Here are 3 examples of good summaries:
Example 1: Customers like the quality, space, fit and value of the sport equipment bag case. They mention it's heavy duty, has lots of space and pockets, and can fit all their gear. They also appreciate the portability and appearance. That said, some disagree on the zipper.
Example 2: Customers like the quality, ease of installation, and value of the transport rack. They mention that it holds on to everything really well, and is reliable. Some complain about the wind noise, saying it makes a whistling noise at high speeds. Opinions are mixed on fit, and performance.
Example 3: Customers like the quality and value of the insulated water bottle. They say it keeps drinks cold for hours and the lid seals well. Some customers have different opinions on size and durability.
 
Hit the following tone based on rating:
- 1-2 stars: negative
- 3 stars: neutral
- 4-5 stars: positive
 
The customer reviews to summarize are as follows:
${product.reviews
    .map((review, i) => `Review ${i + 1}:\n${review.review}`)
    .join("\n\n")}`;
 
  try {
    const { text } = await generateText({
      model: "anthropic/claude-sonnet-4.5",
      prompt,
      maxOutputTokens: 1000,
      temperature: 0.75,
    });
 
    // Clean up the response
    return text
      .trim()
      .replace(/^"/, "")
      .replace(/"$/, "")
      .replace(/[\[\(]\d+ words[\]\)]/g, "");
  } catch (error) {
    console.error("Failed to generate summary:", error);
    throw new Error("Unable to generate review summary. Please try again.");
  }
}

Breaking Down the Improvements

1. Calculate average rating:

const averageRating =
  product.reviews.reduce((acc, review) => acc + review.stars, 0) /
  product.reviews.length;

The average is interpolated into the prompt. It gives the AI context and, combined with the tone guidance, determines the tone of the summary.
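The calculation above divides by product.reviews.length, which yields NaN for a product with no reviews, and it interpolates an unrounded value such as 4.333333333333333 into the prompt. A minimal sketch of a guarded version follows; the StarReview type and the averageStars helper are hypothetical names for illustration, not part of the lesson's code.

```typescript
// Hypothetical minimal shape; the lesson's real Product type lives in lib/types.ts.
type StarReview = { stars: number };

// Returns the mean star rating rounded to one decimal place,
// or null when there are no reviews (avoids the NaN from dividing by zero).
function averageStars(reviews: StarReview[]): number | null {
  if (reviews.length === 0) return null;
  const total = reviews.reduce((acc, review) => acc + review.stars, 0);
  return Math.round((total / reviews.length) * 10) / 10;
}

console.log(averageStars([{ stars: 5 }, { stars: 4 }, { stars: 4 }])); // 4.3
console.log(averageStars([])); // null
```

Returning null leaves the caller to decide what to do with an unreviewed product, for example skipping the AI call entirely.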

2. Clear constraints:

Use a maximum of 4 sentences and 30 words.
Don't include any word count or character count.
Do not reference the star rating in the summary.
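Constraints stated in a prompt are requests, not guarantees, so it can help to check the response against them. The sketch below is one rough way to do that; checkSummary is a hypothetical helper, and its sentence and word counting is deliberately naive (it splits on punctuation and whitespace).

```typescript
type SummaryCheck = { ok: boolean; problems: string[] };

// Checks a summary against the constraints stated in the prompt.
function checkSummary(summary: string): SummaryCheck {
  const problems: string[] = [];
  const words = summary.trim().split(/\s+/).filter(Boolean);
  // Naive sentence split: decimals such as "3.0" are counted as a boundary.
  const sentences = summary.split(/[.!?]+/).filter((s) => s.trim().length > 0);

  if (!/^Customers (like|mention)\b/.test(summary.trim())) {
    problems.push("does not start with the required phrase");
  }
  if (words.length > 30) problems.push(`too many words: ${words.length}`);
  if (sentences.length > 4) problems.push(`too many sentences: ${sentences.length}`);
  if (/\d+\s+words/i.test(summary)) problems.push("contains a word count");
  if (/\bstars?\b/i.test(summary)) problems.push("mentions the star rating");

  return { ok: problems.length === 0, problems };
}

console.log(checkSummary("Customers like the quiet motor. Some mention the setup is hard."));
```

A check like this is probably most useful for logging how often the model drifts from the format, rather than for rejecting responses outright.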

3. Few-shot examples:

Here are 3 examples of good summaries:
Example 1: Customers like the quality, space, fit and value...
Example 2: Customers like the quality, ease of installation...
Example 3: Customers like the quality and value...

These teach the AI the exact format and style you want.
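One way to keep the examples easy to review and swap is to store them as data and build the prompt section from the array. This is only a sketch: formatFewShot and exampleSummaries are hypothetical names, and the lesson's solution writes the examples inline in the template string instead.

```typescript
// Example summaries kept as data; these two are copied from the lesson's prompt.
const exampleSummaries: string[] = [
  "Customers like the quality, space, fit and value of the sport equipment bag case. They mention it's heavy duty, has lots of space and pockets, and can fit all their gear. They also appreciate the portability and appearance. That said, some disagree on the zipper.",
  "Customers like the quality and value of the insulated water bottle. They say it keeps drinks cold for hours and the lid seals well. Some customers have different opinions on size and durability.",
];

// Builds the few-shot section of the prompt, numbering each example.
function formatFewShot(examples: string[]): string {
  const header = `Here are ${examples.length} examples of good summaries:`;
  const lines = examples.map((example, i) => `Example ${i + 1}: ${example}`);
  return [header, ...lines].join("\n");
}

console.log(formatFewShot(exampleSummaries));
```

Because the header count comes from the array length, adding or removing an example cannot leave the prompt claiming the wrong number of examples.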

4. Tone guidance:

Hit the following tone based on rating:
- 1-2 stars: negative
- 3 stars: neutral
- 4-5 stars: positive
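The tone table is written for whole stars, while the average is usually fractional (for example 4.3), so the model has to decide where a value like 2.5 belongs. A possible alternative is to resolve the tone in code and pass a single word to the prompt. The sketch below assumes the average is rounded to the nearest whole star first; that cut-off and the toneForRating name are assumptions, not something the lesson specifies.

```typescript
type Tone = "negative" | "neutral" | "positive";

// Assumed cut-offs: the average is rounded to the nearest whole star,
// so 2.4 -> 2 (negative), 2.5 -> 3 (neutral), 3.5 -> 4 (positive).
function toneForRating(averageRating: number): Tone {
  const stars = Math.round(averageRating);
  if (stars <= 2) return "negative";
  if (stars === 3) return "neutral";
  return "positive";
}

console.log(toneForRating(3.0)); // "neutral"
console.log(toneForRating(4.3)); // "positive"
```

With this approach the prompt could carry one line such as `Use a ${toneForRating(averageRating)} tone.` in place of the three-row table.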

5. Model parameters:

maxOutputTokens: 1000, // Upper bound on output length
temperature: 0.75,     // Balance creativity and consistency

6. Response cleanup:

return text
  .trim()                              // Remove whitespace
  .replace(/^"/, "")                   // Remove leading quote
  .replace(/"$/, "")                   // Remove trailing quote
  .replace(/[\[\(]\d+ words[\]\)]/g, ""); // Remove word counts like "(30 words)"
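The order of the steps matters. With the chain above, a response such as "Customers like the fit." (30 words), wrapped in quotes and followed by a word count, keeps its trailing quote, because the quote is no longer at the end of the string when the trailing-quote regex runs. A minimal sketch of a variant that strips the word count first is shown below; cleanSummary is a hypothetical helper name.

```typescript
// Strips common formatting artifacts from a model response.
function cleanSummary(raw: string): string {
  return raw
    .replace(/\s*[\[(]\d+ words[\])]/g, "") // drop "(30 words)" or "[30 words]" and the space before it
    .trim() // remove surrounding whitespace
    .replace(/^"/, "") // remove a leading quote
    .replace(/"$/, "") // remove a trailing quote
    .trim(); // remove whitespace that was inside the quotes
}

console.log(cleanSummary('"Customers like the fit." (30 words)')); // Customers like the fit.
```

The function only handles the artifacts named in this lesson; other patterns, such as single quotes or Markdown emphasis, would need their own rules.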

Try It

Save the file and visit a product page
Compare before/after on /mower:
- Before: "The Mower3000 receives mixed reviews. Rating: 3.0 stars. (45 words)"
- After: "Customers mention the Mower3000 is quiet and autonomous but struggles with slopes and boundary wire setup. Some love it, others find it misses spots. Opinions are mixed on reliability."
Test different sentiments:
- /mower (mixed, ~3.0) - Should be neutral
- /ecoBright (~4.0) - Should be positive
- /aquaHeat (~4.3) - Should be positive
Check consistency - Refresh the same page multiple times:
- Always starts with "Customers like..." or "Customers mention..."
- No word counts or star ratings in output
- Consistent length and format
Check AI Gateway dashboard:
- Token usage per request ~800-1000 tokens
- Cost ~$0.002 per summary
- Consistent response times

Prompt Engineering Techniques Used

Few-Shot Prompting: Show 3 examples instead of only describing the format. Models tend to reproduce a demonstrated format more reliably than a described one.

Constraint Specification: Be explicit about what you don't want ("Don't include any word count"), not just what you do want.

Tone Mapping: Connect data (average rating) to desired output (tone). The AI uses this context to adjust language.

Output Formatting: Specify the starting phrase ("Customers like...") to ensure consistency across all summaries.

Parameter Tuning:

maxOutputTokens: 1000 caps output length
temperature: 0.75 balances creativity with consistency (0.0 = most deterministic, 1.0 = more varied)

Understanding Temperature

Temperature controls randomness:

Temperature	Behavior	Best For
0.0 - 0.3	Deterministic, focused	Code, facts, structured data
0.4 - 0.7	Balanced	Summaries, explanations
0.8 - 1.0	Creative, varied	Creative writing, brainstorming

For review summaries: 0.75 sits between the balanced and creative bands. It gives enough variation to sound natural while keeping the format consistent.
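To see why lower temperatures give more focused output, it can help to look at the textbook formulation, in which the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The sketch below illustrates that idea only; how a given provider implements sampling internally is not covered by this lesson, and softmaxWithTemperature is a hypothetical helper.

```typescript
// Converts raw scores (logits) into probabilities, scaled by temperature.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    // Treat 0 as greedy decoding: all probability goes to the highest score.
    const best = logits.indexOf(Math.max(...logits));
    return logits.map((_, i) => (i === best ? 1 : 0));
  }
  const scaled = logits.map((x) => x / temperature);
  const maxScaled = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - maxScaled));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const candidateScores = [2, 1, 0]; // made-up scores for three candidate tokens
console.log(softmaxWithTemperature(candidateScores, 0.2)); // strongly favors the first token
console.log(softmaxWithTemperature(candidateScores, 1.0)); // spreads probability more evenly
```

At low temperature almost all probability lands on the top-scoring token, which is why repeated runs look alike; at higher temperature the alternatives are sampled more often.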

Token Usage Comparison

Before (basic prompt):

Input tokens: ~300
Output tokens: ~80
Total: ~380 tokens
Cost: ~$0.0011

After (engineered prompt):

Input tokens: ~600 (longer prompt with examples)
Output tokens: ~100 (slightly longer summaries)
Total: ~700 tokens
Cost: ~$0.0021

Worth it? Yes. $0.001 extra per summary for consistent, production-quality output is a great trade-off.
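The cost figures above can be reproduced with simple arithmetic. The sketch below assumes a flat rate of $3 per million tokens, which happens to match both totals; that rate is an assumption made for illustration, and real pricing usually differs between input and output tokens, so check your provider's current rates.

```typescript
// Assumed flat rate for illustration; not taken from any price list.
const ASSUMED_USD_PER_MILLION_TOKENS = 3;

// Estimates the cost of one request from its token counts.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return ((inputTokens + outputTokens) * ASSUMED_USD_PER_MILLION_TOKENS) / 1e6;
}

const costBefore = estimateCostUsd(300, 80); // basic prompt, about $0.0011
const costAfter = estimateCostUsd(600, 100); // engineered prompt, $0.0021
console.log((costAfter - costBefore).toFixed(4)); // "0.0010"
```

At this assumed rate the extra $0.001 per summary works out to roughly $1 per thousand summaries.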

Commit

git add lib/ai-summary.ts
git commit -m "feat(ai): improve summaries with prompt engineering"
git push

Done-When

Average rating calculated for tone guidance
Few-shot examples added to prompt
Tone guidance based on ratings
Output constraints specified
Response cleanup removes quotes and word counts
All summaries start with "Customers like..." or "Customers mention..."
Summaries adapt tone to match review sentiment

What's Next

Your summaries are now consistent and production-ready. In the next lesson, you'll replace the blocking generateText call with streamText to show users content as it's generated—word by word—instead of waiting for the full response.
