Prompt engineering

Your prompt is the entire contract between you and the AI. A vague prompt gets vague results. Specific instructions with examples produce consistent output. Right now, summaries vary in format, length, and sometimes include unwanted metadata. Let's fix that.

Outcome

Refine the summarizeReviews prompt to produce consistent, well-formatted summaries that match review sentiment and follow a specific structure.

Fast Track

  1. Calculate averageRating and add tone guidance (1-2: negative, 3: neutral, 4-5: positive)
  2. Add 3 few-shot examples showing "Customers like..." format with pros/cons structure
  3. Add response cleanup: .trim().replace(/^"/, "").replace(/"$/, "").replace(/[\[\(]\d+ words[\]\)]/g, "")

Hands-on Exercise 2.3

Improve the prompt for production-quality summaries:

Requirements:

  1. Calculate average rating to determine tone
  2. Add 3 few-shot examples showing ideal summary format
  3. Include tone guidance (negative/neutral/positive) based on ratings
  4. Specify output constraints (length, format, what to avoid)
  5. Clean up the AI response (trim quotes, remove word counts)

Implementation hints:

  • Few-shot examples teach by showing, not telling
  • Use .reduce() to calculate average rating
  • Use regex to clean common AI formatting artifacts
  • Keep examples in the prompt consistent with your requirements
  • Add maxOutputTokens and temperature parameters

Solution

Update lib/ai-summary.ts:

lib/ai-summary.ts
import { generateText } from "ai";
import { Product } from "./types";
 
export async function summarizeReviews(product: Product): Promise<string> {
  const averageRating =
    product.reviews.reduce((acc, review) => acc + review.stars, 0) /
    product.reviews.length;
 
  const prompt = `Write a summary of the reviews for the ${
    product.name
  } product. The product's average rating is ${averageRating} out of 5 stars.
 
Your goal is to highlight the most common themes and sentiments expressed by customers.
If multiple themes are present, try to capture the most important ones.
If no patterns emerge but there is a shared sentiment, capture that instead.
Try to use natural language and keep the summary concise.
Use a maximum of 4 sentences and 30 words.
Don't include any word count or character count.
No need to reference which reviews you're summarizing.
Do not reference the star rating in the summary.
 
Start the summary with "Customers like…" or "Customers mention…"
 
Here are 3 examples of good summaries:
Example 1: Customers like the quality, space, fit and value of the sport equipment bag case. They mention it's heavy duty, has lots of space and pockets, and can fit all their gear. They also appreciate the portability and appearance. That said, some disagree on the zipper.
Example 2: Customers like the quality, ease of installation, and value of the transport rack. They mention that it holds on to everything really well, and is reliable. Some complain about the wind noise, saying it makes a whistling noise at high speeds. Opinions are mixed on fit, and performance.
Example 3: Customers like the quality and value of the insulated water bottle. They say it keeps drinks cold for hours and the lid seals well. Some customers have different opinions on size and durability.
 
Hit the following tone based on rating:
- 1-2 stars: negative
- 3 stars: neutral
- 4-5 stars: positive
 
The customer reviews to summarize are as follows:
${product.reviews
    .map((review, i) => `Review ${i + 1}:\n${review.review}`)
    .join("\n\n")}`;
 
  try {
    const { text } = await generateText({
      model: "anthropic/claude-sonnet-4.5",
      prompt,
      maxOutputTokens: 1000,
      temperature: 0.75,
    });
 
    // Clean up the response
    return text
      .trim()
      .replace(/^"/, "")
      .replace(/"$/, "")
      .replace(/[\[\(]\d+ words[\]\)]/g, "");
  } catch (error) {
    console.error("Failed to generate summary:", error);
    throw new Error("Unable to generate review summary. Please try again.");
  }
}

Breaking Down the Improvements

1. Calculate average rating:

const averageRating =
  product.reviews.reduce((acc, review) => acc + review.stars, 0) /
  product.reviews.length;

The average is interpolated into the prompt, where it gives the AI the context it needs to pick a tone.
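
One edge case the snippet above does not cover: for a product with no reviews, the division is 0 / 0 and the result is NaN, which would then be interpolated into the prompt. A minimal sketch of a guarded version follows. The `Review` shape, the `averageStars` name, and the one-decimal rounding are assumptions for illustration, not part of the lesson's code.

```typescript
// Hypothetical review shape; the lesson's real type lives in lib/types.
type Review = { stars: number; review: string };

// Mean star rating rounded to one decimal, or null when there are no reviews.
function averageStars(reviews: Review[]): number | null {
  if (reviews.length === 0) return null; // avoid 0 / 0 = NaN
  const total = reviews.reduce((acc, r) => acc + r.stars, 0);
  return Math.round((total / reviews.length) * 10) / 10;
}
```

A caller could skip summary generation entirely when the result is null, rather than asking the model to summarize nothing.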

2. Clear constraints:

Use a maximum of 4 sentences and 30 words.
Don't include any word count or character count.
Do not reference the star rating in the summary.

3. Few-shot examples:

Here are 3 examples of good summaries:
Example 1: Customers like the quality, space, fit and value...
Example 2: Customers like the quality, ease of installation...
Example 3: Customers like the quality and value...

These teach the AI the exact format and style you want.
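
If the examples are likely to change, it may help to keep them in an array and build this part of the prompt from it, so the count and numbering stay in sync. A small sketch; the `fewShotBlock` helper is hypothetical and not part of the lesson's solution.

```typescript
// Builds the few-shot section of the prompt from a list of example summaries.
function fewShotBlock(examples: string[]): string {
  const lines = examples.map((example, i) => `Example ${i + 1}: ${example}`);
  return [`Here are ${examples.length} examples of good summaries:`, ...lines].join("\n");
}

const block = fewShotBlock([
  "Customers like the quality and value of the insulated water bottle. They say it keeps drinks cold for hours and the lid seals well. Some customers have different opinions on size and durability.",
]);
```

The returned string can be interpolated into the prompt template in place of the hard-coded examples.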

4. Tone guidance:

Hit the following tone based on rating:
- 1-2 stars: negative
- 3 stars: neutral
- 4-5 stars: positive
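
The bands above are stated for whole stars, while an average is usually fractional (3.6, say), so the model has to decide which band applies. One alternative is to resolve the band in code and pass a single tone word into the prompt. A sketch of that idea; the `toneFor` name and the 2.5 and 3.5 cut-offs are assumptions, not values from the lesson.

```typescript
type Tone = "negative" | "neutral" | "positive";

// Maps a fractional average rating onto one of the three tone bands.
function toneFor(averageRating: number): Tone {
  if (averageRating < 2.5) return "negative";
  if (averageRating < 3.5) return "neutral";
  return "positive";
}

// Example of a prompt line built from the resolved tone.
const toneLine = `Write the summary in a ${toneFor(4.3)} tone.`;
```

With these cut-offs, an average of 3.0 maps to neutral and 4.0 or 4.3 maps to positive.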

5. Model parameters:

maxOutputTokens: 1000,      // Limit output length
temperature: 0.75,    // Balance creativity and consistency

6. Response cleanup:

return text
  .trim()                              // Remove whitespace
  .replace(/^"/, "")                   // Remove leading quote
  .replace(/"$/, "")                   // Remove trailing quote
  .replace(/[\[\(]\d+ words[\]\)]/g, ""); // Remove word counts like "(30 words)"
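
Pulling the chain into a helper makes it testable in isolation. The sketch below is a variation on the lesson's cleanup, not the same code: it strips the word-count marker first and trims again, so a response like `"Customers like it." (30 words)` also loses its quotes. The `cleanSummary` name and the optional `s` in `words?` are assumptions.

```typescript
// Removes common formatting artifacts from a generated summary.
function cleanSummary(text: string): string {
  return text
    .replace(/\s*[\[(]\d+ words?[\])]/g, "") // drop markers like "(30 words)" or "[1 word]"
    .trim()                                   // remove surrounding whitespace
    .replace(/^"/, "")                        // remove a leading quote
    .replace(/"$/, "")                        // remove a trailing quote
    .trim();                                  // remove whitespace exposed by the quote removal
}
```

Ordering matters here: if the quotes are stripped before the marker, a trailing quote that sits in front of the marker is no longer at the end of the string and survives.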

Try It

  1. Save the file and visit a product page

  2. Compare before/after on /mower:

    • Before: "The Mower3000 receives mixed reviews. Rating: 3.0 stars. (45 words)"
    • After: "Customers mention the Mower3000 is quiet and autonomous but struggles with slopes and boundary wire setup. Some love it, others find it misses spots. Opinions are mixed on reliability."
  3. Test different sentiments:

    • /mower (mixed, ~3.0) - Should be neutral
    • /ecoBright (~4.0) - Should be positive
    • /aquaHeat (~4.3) - Should be positive
  4. Check consistency - Refresh the same page multiple times:

    • Always starts with "Customers like..." or "Customers mention..."
    • No word counts or star ratings in output
    • Consistent length and format
  5. Check AI Gateway dashboard:

    • Token usage per request ~800-1000 tokens
    • Cost ~$0.002 per summary
    • Consistent response times
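
The consistency checks in step 4 can also be written as a small function and run against a handful of generated summaries instead of eyeballing each refresh. A rough sketch; the `checkSummary` name, the problem strings, and the naive sentence split are all assumptions.

```typescript
// Returns a list of problems found in a summary; an empty list means it passed.
function checkSummary(summary: string): string[] {
  const problems: string[] = [];
  if (!/^Customers (like|mention)\b/.test(summary)) problems.push("wrong opening phrase");
  if (/\d+ words?\b/i.test(summary)) problems.push("contains a word count");
  if (/\b\d(\.\d)? (out of 5|stars?)\b/i.test(summary)) problems.push("mentions a star rating");
  // Naive sentence count: split on ., ! or ? followed by whitespace or end of text.
  const sentences = summary.split(/[.!?](?:\s+|$)/).filter((s) => s.trim().length > 0);
  if (sentences.length > 4) problems.push("more than 4 sentences");
  return problems;
}
```

Run against the "Before" text from step 2, this reports the opening phrase, the word count, and the star rating; the "After" text returns an empty list.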

Prompt Engineering Techniques Used

Few-Shot Prompting: Show 3 examples instead of describing the format. Models generally follow a format more reliably when shown examples than when given a description alone.

Constraint Specification: Be explicit about what you don't want ("Don't include any word count"), not just what you do want.

Tone Mapping: Connect data (average rating) to desired output (tone). The AI uses this context to adjust language.

Output Formatting: Specify the starting phrase ("Customers like...") to ensure consistency across all summaries.

Parameter Tuning:

  • maxOutputTokens: 1000 caps output length
  • temperature: 0.75 balances creativity with consistency (0.0 = deterministic, 1.0 = creative)

Understanding Temperature

Temperature controls randomness:

Temperature   Behavior                Best For
0.0 - 0.3     Deterministic, focused  Code, facts, structured data
0.4 - 0.7     Balanced                Summaries, explanations
0.8 - 1.0     Creative, varied        Creative writing, brainstorming

For review summaries: 0.75 sits between the balanced and creative ranges, giving enough variation to sound natural while keeping format and length mostly consistent.

Token Usage Comparison

Before (basic prompt):

  • Input tokens: ~300
  • Output tokens: ~80
  • Total: ~380 tokens
  • Cost: ~$0.0011

After (engineered prompt):

  • Input tokens: ~600 (longer prompt with examples)
  • Output tokens: ~100 (slightly longer summaries)
  • Total: ~700 tokens
  • Cost: ~$0.0021

Worth it? Yes. $0.001 extra per summary for consistent, production-quality output is a great trade-off.
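
To see how the difference scales, the per-summary estimates above can be multiplied out. A small sketch using only the figures from this section; the 10,000-summary volume is a made-up number for illustration.

```typescript
const costBefore = 0.0011; // USD per summary, basic prompt (estimate from above)
const costAfter = 0.0021; // USD per summary, engineered prompt (estimate from above)

// Extra spend in USD for a given number of summaries, rounded to cents.
function extraCost(summaries: number): number {
  return Math.round((costAfter - costBefore) * summaries * 100) / 100;
}

console.log(extraCost(10000)); // 10
```

At that volume the engineered prompt adds roughly $10 in total, which is the same trade-off expressed at scale.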

Commit

git add lib/ai-summary.ts
git commit -m "feat(ai): improve summaries with prompt engineering"
git push

Done-When

  • Average rating calculated for tone guidance
  • Few-shot examples added to prompt
  • Tone guidance based on ratings
  • Output constraints specified
  • Response cleanup removes quotes and word counts
  • All summaries start with "Customers like..." or "Customers mention..."
  • Summaries adapt tone to match review sentiment

What's Next

Your summaries are now consistent and production-ready. In the next lesson, you'll replace the blocking generateText call with streamText to show users content as it's generated, word by word, instead of waiting for the full response.

