Project: Automated Structured Content Summarizer
In this project, we will build a content processing pipeline that takes raw articles, parses the text, and returns structured JSON containing a title, a summary, a category, and keywords.
1. Schema Definition Design
We enforce these attributes inside our output JSON Schema:
- title: A cleanly formatted headline.
- summary: A concise summary paragraph.
- category: A string restricted to predefined values.
- keywords: An array of lowercase strings representing keyword tags.
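The attributes above can also be mirrored on the TypeScript side, so that parsed output is checked before the rest of the application relies on it. The following is a minimal sketch, not part of the original pipeline: the names `CATEGORIES`, `ArticleSummary`, and `isArticleSummary` are hypothetical, and the category values are assumed to match the enum used in the schema in the next section.

```typescript
// Hypothetical names for illustration only; category values are assumed
// to match the enum in the JSON Schema.
const CATEGORIES = ["Engineering", "Product Design", "Marketing", "Security"] as const;
type Category = (typeof CATEGORIES)[number];

interface ArticleSummary {
  title: string;
  summary: string;
  category: Category;
  keywords: string[];
}

// Runtime guard: checks that an unknown value has the four required
// fields with the expected types and an allowed category.
function isArticleSummary(value: unknown): value is ArticleSummary {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.title === "string" &&
    typeof v.summary === "string" &&
    typeof v.category === "string" &&
    (CATEGORIES as readonly string[]).includes(v.category) &&
    Array.isArray(v.keywords) &&
    v.keywords.every((k) => typeof k === "string")
  );
}
```

A guard like this is mostly useful at trust boundaries, for example when summaries are read back from a cache or queue rather than straight from the API response.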
2. Implementing the Summarizer Pipeline
Create the backend parsing utility function:
// src/services/summarizer.ts
import { openai } from "../lib/openai";

// Structured Outputs definition. Strict mode requires every property to be
// listed in `required` and `additionalProperties` to be false.
const SummaryOutputSchema = {
  name: "article_summary",
  strict: true,
  schema: {
    type: "object",
    properties: {
      title: { type: "string" },
      summary: { type: "string", description: "A three-sentence breakdown of key takeaways." },
      category: {
        type: "string",
        enum: ["Engineering", "Product Design", "Marketing", "Security"]
      },
      keywords: {
        type: "array",
        items: { type: "string" },
        description: "Max 5 lowercase keyword tags."
      }
    },
    required: ["title", "summary", "category", "keywords"],
    additionalProperties: false
  }
};

export async function summarizeText(articleText: string) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [
        {
          role: "system",
          content: "You are a professional research compiler. Parse the raw text and structure it."
        },
        { role: "user", content: articleText }
      ],
      // Force response structure compliance
      response_format: {
        type: "json_schema",
        json_schema: SummaryOutputSchema
      }
    });
    const outputJsonString = response.choices[0].message.content;
    // A refusal or an empty completion leaves content null, so fail early
    // with a clear message instead of a generic JSON parse error
    if (!outputJsonString) {
      throw new Error("Model returned no content");
    }
    // JSON.parse can still throw (e.g. on truncated output); the catch below handles it
    return JSON.parse(outputJsonString);
  } catch (err: any) {
    console.error("Summarization pipeline failed:", err.message);
    return null;
  }
}
3. Database Sync Integration
When a user submits an article link:
- Fetch the raw text content.
- Call summarizeText to generate the structured metadata payload.
- Save the result directly into PostgreSQL tables using Prisma:
const metadata = await summarizeText(rawText);
if (metadata) {
  await prisma.article.create({
    data: {
      title: metadata.title,
      summary: metadata.summary,
      category: metadata.category,
      tagsList: metadata.keywords, // Directly save the validated array
    }
  });
}
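The schema states the "Max 5 lowercase keyword tags" rule only in a description, which guides the model but is not a structural constraint the way the category enum is. A minimal sketch of a normalization step that could run before saving, assuming a hypothetical helper named `normalizeKeywords` (not part of the original code):

```typescript
// Hypothetical helper for illustration; not part of the original pipeline.
// Enforces in code the "max 5 lowercase tags" rule that the schema only
// expresses as a description.
function normalizeKeywords(keywords: string[], maxTags = 5): string[] {
  const seen = new Set<string>();
  for (const raw of keywords) {
    const tag = raw.trim().toLowerCase();
    // A Set drops duplicates while keeping first-seen order
    if (tag.length > 0) seen.add(tag);
  }
  return Array.from(seen).slice(0, maxTags);
}
```

With this in place, the save step could pass `normalizeKeywords(metadata.keywords)` as the `tagsList` value instead of the raw array, so stored tags stay lowercase, unique, and capped regardless of what the model returns.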