Build a Research Agent

Create a research assistant that gathers, analyzes, and reports on any topic.

Overview

You'll build a research agent that:

Accepts a research topic and collects information
Analyzes findings across multiple dimensions
Stores knowledge in structured memory
Generates formatted research reports
Runs research as a multi-stage workflow

Expected outcome: A deployed research agent that produces structured reports from collected information.

Architecture

Research Request
  ↓
Research Agent
  ↓
Memory (collected facts, analysis notes, citations)
  ↓
Workflow (collect → analyze → structure → report → log)
  ↓
Research Report

Prerequisites

Complete Build an AI Assistant first
API key set as NOVIUM_API_KEY

Step 1: Create Research Agent

novium agent create research-agent

Step 2: Implement Logic

Open agents/research-agent/agent.ts:

type Stage = "collect" | "analyze" | "structure" | "report";

interface ResearchInput {
  userId: string;
  topic: string;
  stage: Stage;
  sources?: string[];
  previousFindings?: Finding[];
}

interface Finding {
  id: string;
  topic: string;
  category: string;
  content: string;
  confidence: number;
  source: string;
  timestamp: string;
}

interface ResearchReport {
  topic: string;
  performedAt: string;
  stages: string[];
  findings: Finding[];
  summary: string;
  categories: Record<string, Finding[]>;
  recommendations: string[];
}

export default async function agent(input: ResearchInput) {
  switch (input.stage) {
    case "collect":
      return collectData(input);
    case "analyze":
      return analyzeData(input);
    case "structure":
      return structureReport(input);
    case "report":
      return generateReport(input);
    default:
      return { error: `Unknown stage: ${input.stage}` };
  }
}

function generateFinding(topic: string, category: string, content: string, confidence: number, source: string): Finding {
  return {
    id: `finding_${Date.now()}_${Math.random().toString(36).substring(2, 7)}`,
    topic,
    category,
    content,
    confidence,
    source,
    timestamp: new Date().toISOString(),
  };
}

async function collectData(input: ResearchInput) {
  const findings: Finding[] = [];

  // Market overview
  findings.push(generateFinding(input.topic, "market_overview",
    `${capitalize(input.topic)} is an active area with growing adoption. Multiple vendors and approaches exist in the ecosystem.`,
    0.85, "Industry analysis"));

  // Key players
  findings.push(generateFinding(input.topic, "key_players",
    `Major participants in the ${input.topic} space include established companies and emerging startups driving innovation.`,
    0.80, "Market research"));

  // Technology trends
  findings.push(generateFinding(input.topic, "technology_trends",
    `Current trends in ${input.topic} include automation, cloud-native architecture, and AI-powered optimization.`,
    0.90, "Technology review"));

  // Challenges
  findings.push(generateFinding(input.topic, "challenges",
    `Organizations adopting ${input.topic} face challenges in integration, talent acquisition, and cost management.`,
    0.75, "Industry reports"));

  // Future outlook
  findings.push(generateFinding(input.topic, "future_outlook",
    `The ${input.topic} market is projected to grow significantly over the next 3-5 years as adoption accelerates.`,
    0.70, "Market forecast"));

  // Add one finding per caller-provided source
  if (input.sources) {
    for (const source of input.sources) {
      findings.push(generateFinding(input.topic, "source_material",
        `Source material from ${source} provides relevant context for the ${input.topic} analysis.`,
        0.65, source));
    }
  }

  // Persist findings
  const researchId = `research_${Date.now()}`;
  await memory.save({ key: researchId, value: { topic: input.topic, findings, startedAt: new Date().toISOString() } });

  // Add to user's research index
  const userResearch = (await memory.get(`user:${input.userId}:research`)) ?? [];
  userResearch.push({ id: researchId, topic: input.topic, findingCount: findings.length, timestamp: new Date().toISOString() });
  await memory.save({ key: `user:${input.userId}:research`, value: userResearch });

  return {
    stage: "collect",
    topic: input.topic,
    researchId,
    findingsCount: findings.length,
    findings,
    categories: [...new Set(findings.map((f) => f.category))],
    averageConfidence: Math.round(findings.reduce((sum, f) => sum + f.confidence, 0) / findings.length * 100) / 100,
  };
}

async function analyzeData(input: ResearchInput) {
  if (!input.previousFindings || input.previousFindings.length === 0) {
    return { stage: "analyze", topic: input.topic, error: "No findings provided for analysis" };
  }

  const findings = input.previousFindings;

  // Calculate metrics
  const totalFindings = findings.length;
  const highConfidence = findings.filter((f) => f.confidence >= 0.8).length;
  const categories = [...new Set(findings.map((f) => f.category))];
  const confidenceByCategory: Record<string, number> = {};

  for (const cat of categories) {
    const catFindings = findings.filter((f) => f.category === cat);
    confidenceByCategory[cat] = Math.round(catFindings.reduce((s, f) => s + f.confidence, 0) / catFindings.length * 100) / 100;
  }

  // Find gaps
  const gaps: string[] = [];
  if (!categories.includes("competitive_analysis")) gaps.push("competitive_analysis");
  if (!categories.includes("pricing")) gaps.push("pricing_data");
  if (!categories.includes("implementation")) gaps.push("implementation_guide");

  return {
    stage: "analyze",
    topic: input.topic,
    totalFindings,
    highConfidenceFindings: highConfidence,
    confidenceRatio: Math.round(highConfidence / totalFindings * 100) / 100,
    categoriesAnalyzed: categories.length,
    confidenceByCategory,
    identifiedGaps: gaps,
    summary: `${totalFindings} findings across ${categories.length} categories. ${highConfidence} high-confidence. ${gaps.length} research gap(s) identified.`,
  };
}

async function structureReport(input: ResearchInput) {
  if (!input.previousFindings || input.previousFindings.length === 0) {
    return { stage: "structure", topic: input.topic, error: "No findings provided" };
  }

  const findings = input.previousFindings;
  const categories: Record<string, Finding[]> = {};

  for (const f of findings) {
    if (!categories[f.category]) categories[f.category] = [];
    categories[f.category].push(f);
  }

  const structure = {
    title: `Research Report: ${capitalize(input.topic)}`,
    sections: Object.entries(categories).map(([cat, items]) => ({
      id: `section_${cat}`,
      title: cat.replace(/_/g, " ").replace(/\b\w/g, (c) => c.toUpperCase()),
      findingCount: items.length,
      summary: items.map((f) => f.content).join(" "),
    })),
    totalSections: Object.keys(categories).length,
    recommendedOrder: Object.keys(categories),
  };

  return {
    stage: "structure",
    topic: input.topic,
    ...structure,
  };
}

async function generateReport(input: ResearchInput) {
  if (!input.previousFindings || input.previousFindings.length === 0) {
    return { stage: "report", topic: input.topic, error: "No findings for report generation" };
  }

  const findings = input.previousFindings;
  const categories: Record<string, Finding[]> = {};
  for (const f of findings) {
    if (!categories[f.category]) categories[f.category] = [];
    categories[f.category].push(f);
  }

  const report: ResearchReport = {
    topic: input.topic,
    performedAt: new Date().toISOString(),
    stages: ["collect", "analyze", "structure", "report"],
    findings,
    summary: `Comprehensive research on ${input.topic} with ${findings.length} findings across ${Object.keys(categories).length} categories.`,
    categories,
    recommendations: [
      `Conduct deeper analysis on ${input.topic} trends`,
      `Expand research to include competitive landscape`,
      `Update findings with latest market data`,
      `Validate with subject matter experts`,
    ],
  };

  // Persist final report
  const reportId = `report_${Date.now()}`;
  await memory.save({ key: reportId, value: report });

  const userReports = (await memory.get(`user:${input.userId}:reports`)) ?? [];
  userReports.push({ id: reportId, topic: input.topic, generatedAt: report.performedAt });
  await memory.save({ key: `user:${input.userId}:reports`, value: userReports });

  return {
    stage: "report",
    reportId,
    ...report,
  };
}

function capitalize(s: string): string {
  return s.charAt(0).toUpperCase() + s.slice(1);
}

| Stage | Description |
| ----- | ----------- |
| collect | Gathers findings on the topic from multiple categories |
| analyze | Computes confidence ratios, identifies research gaps |
| structure | Organizes findings into report sections |
| report | Generates final report with recommendations |
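
The structure and report stages both build the same category-to-findings map with an identical loop. A minimal sketch of how that loop could be factored out follows. The helper name `groupByCategory` is hypothetical and not part of the tutorial code, and the `Finding` type is trimmed to the fields the helper needs.

```typescript
// Minimal subset of the tutorial's Finding type, enough for grouping.
interface Finding {
  id: string;
  category: string;
  content: string;
  confidence: number;
}

// Hypothetical helper (not in the tutorial code): groups findings by category,
// keeping categories in the order they are first seen.
function groupByCategory(findings: Finding[]): Record<string, Finding[]> {
  const groups: Record<string, Finding[]> = {};
  for (const f of findings) {
    const bucket = groups[f.category] ?? [];
    bucket.push(f);
    groups[f.category] = bucket;
  }
  return groups;
}

const sampleFindings: Finding[] = [
  { id: "f1", category: "market_overview", content: "A", confidence: 0.85 },
  { id: "f2", category: "challenges", content: "B", confidence: 0.75 },
  { id: "f3", category: "market_overview", content: "C", confidence: 0.7 },
];

const grouped = groupByCategory(sampleFindings);
console.log(Object.keys(grouped).join(", ")); // market_overview, challenges
```

With a helper like this, `structureReport` and `generateReport` could each replace their grouping loop with a single call.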

Step 3: Add Memory

The agent stores:

| Key | Content |
| --- | ------- |
| research_{id} | Full research session with findings |
| user:{userId}:research | Index of all research sessions |
| report_{id} | Generated research report |
| user:{userId}:reports | Index of all reports |
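
To exercise these keys in a unit test without the platform, one option is a small in-memory stand-in. The sketch below makes two assumptions: that `memory` exposes only the `save({ key, value })` and `get(key)` calls used in agent.ts, and that a missing key reads back as a nullish value. The real Novium memory API may differ, and the names `MemoryLike`, `createMemoryStub`, and `appendToIndex` are hypothetical.

```typescript
// Shape inferred from the calls in agent.ts; the real memory API may differ.
interface MemoryLike {
  save(entry: { key: string; value: unknown }): Promise<void>;
  get(key: string): Promise<unknown>;
}

// Hypothetical test double backed by a Map. Not a Novium API.
function createMemoryStub(): { memory: MemoryLike; store: Map<string, unknown> } {
  const store = new Map<string, unknown>();
  const memory: MemoryLike = {
    async save({ key, value }) {
      store.set(key, value);
    },
    async get(key) {
      return store.get(key);
    },
  };
  return { memory, store };
}

interface IndexEntry {
  id: string;
  topic: string;
}

// Mirrors the index update in collectData: read, append, write back.
async function appendToIndex(memory: MemoryLike, userId: string, entry: IndexEntry): Promise<void> {
  const key = `user:${userId}:research`;
  const index = ((await memory.get(key)) ?? []) as IndexEntry[];
  index.push(entry);
  await memory.save({ key, value: index });
}

const stub = createMemoryStub();
appendToIndex(stub.memory, "researcher-1", { id: "research_1", topic: "demo" }).then(() => {
  console.log(stub.store.get("user:researcher-1:research"));
});
```

The read-append-write pattern is not atomic, so two concurrent sessions for the same user could overwrite each other's index entry. That is worth keeping in mind if the agent is ever called in parallel.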

Step 4: Add Workflow

novium workflow create research-pipeline

Open workflows/research-pipeline/workflow.ts:

export default {
  trigger: { type: "http", method: "POST", path: "/research" },
  steps: [
    {
      id: "collect",
      agent: "research-agent",
      input: {
        userId: "$input.userId",
        topic: "$input.topic",
        stage: "collect",
        sources: "$input.sources",
      },
    },
    {
      id: "analyze",
      agent: "research-agent",
      input: {
        userId: "$input.userId",
        topic: "$input.topic",
        stage: "analyze",
        previousFindings: "$collect.result.findings",
      },
    },
    {
      id: "structure",
      agent: "research-agent",
      input: {
        userId: "$input.userId",
        topic: "$input.topic",
        stage: "structure",
        previousFindings: "$collect.result.findings",
      },
    },
    {
      id: "report",
      agent: "research-agent",
      input: {
        userId: "$input.userId",
        topic: "$input.topic",
        stage: "report",
        previousFindings: "$collect.result.findings",
      },
    },
    { id: "complete", action: "log", message: "Research complete for topic: $input.topic" },
  ],
};

Trigger (HTTP POST /research)
  ↓
collect   ← gather findings across categories
  ↓
analyze   ← compute confidence, identify gaps
  ↓
structure ← organize into report sections
  ↓
report    ← generate final report
  ↓
complete  ← log completion
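
The workflow passes data between steps through strings such as `$input.topic` and `$collect.result.findings`. The tutorial does not describe how the platform resolves them, so the following is only a sketch of one plausible reading: a leading `$` marks a dotted path into an object holding the trigger input and earlier step results. The names `resolveRef` and `resolveInput` are hypothetical, and references embedded inside longer strings (as in the log message) are not handled.

```typescript
type Context = Record<string, unknown>;

// Hypothetical resolver: walks a "$a.b.c" path through the context object.
// Values that are not strings starting with "$" are returned unchanged.
function resolveRef(value: unknown, context: Context): unknown {
  if (typeof value !== "string" || !value.startsWith("$")) return value;
  let current: unknown = context;
  for (const key of value.slice(1).split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Resolves every field of a step's input block.
function resolveInput(input: Record<string, unknown>, context: Context): Record<string, unknown> {
  const resolved: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(input)) {
    resolved[key] = resolveRef(value, context);
  }
  return resolved;
}

// Context as it might look after the collect step has finished.
const workflowContext: Context = {
  input: { userId: "researcher-1", topic: "cloud-native agent infrastructure" },
  collect: { result: { findings: [{ id: "f1", category: "market_overview" }] } },
};

const analyzeInput = resolveInput(
  { userId: "$input.userId", stage: "analyze", previousFindings: "$collect.result.findings" },
  workflowContext,
);
console.log(analyzeInput);
```

Under this reading, the analyze, structure, and report steps all receive the same findings array from the collect step; none of them consumes the output of the step immediately before it.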

Step 5: Run Locally

novium agent dev

Start research:

curl -X POST http://localhost:3000 \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "researcher-1",
    "topic": "cloud-native agent infrastructure",
    "stage": "collect",
    "sources": ["Gartner Report 2025", "Forrester Wave"]
  }'

{
  "stage": "collect",
  "topic": "cloud-native agent infrastructure",
  "findingsCount": 7,
  "averageConfidence": 0.76,
  "categories": ["market_overview", "key_players", "technology_trends", "challenges", "future_outlook", "source_material"]
}
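
The count and average for this request can be checked by hand from the confidence values hard-coded in `collectData`: five built-in findings plus one finding at 0.65 for each of the two sources. The snippet below is a standalone check that uses the same rounding expression as the agent.

```typescript
// Confidence values hard-coded in collectData for the five built-in categories.
const builtInConfidences = [0.85, 0.8, 0.9, 0.75, 0.7];
// Each caller-provided source adds one finding at 0.65 (two sources in this request).
const sourceConfidences = [0.65, 0.65];

const confidences = [...builtInConfidences, ...sourceConfidences];
const confidenceSum = confidences.reduce((total, c) => total + c, 0);

// Same rounding as collectData: two decimal places.
const averageConfidence = Math.round((confidenceSum / confidences.length) * 100) / 100;

console.log(confidences.length, averageConfidence); // 7 0.76
```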

Analyze findings (pass collected data):

curl -X POST http://localhost:3000 \
  -H "Content-Type: application/json" \
  -d '{
    "userId": "researcher-1",
    "topic": "cloud-native agent infrastructure",
    "stage": "analyze",
    "previousFindings": [
      {"id": "f123", "topic": "cloud-native agent infrastructure", "category": "market_overview", "content": "Active area with growing adoption.", "confidence": 0.85, "source": "Industry", "timestamp": "2025-01-01"},
      {"id": "f124", "topic": "cloud-native agent infrastructure", "category": "key_players", "content": "Multiple vendors competing.", "confidence": 0.80, "source": "Market", "timestamp": "2025-01-01"}
    ]
  }'

{
  "stage": "analyze",
  "topic": "cloud-native agent infrastructure",
  "totalFindings": 2,
  "highConfidenceFindings": 2,
  "confidenceRatio": 1,
  "identifiedGaps": ["competitive_analysis", "pricing_data", "implementation_guide"],
  "summary": "2 findings across 2 categories. 2 high-confidence. 3 research gap(s) identified."
}
}
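
The analysis metrics for this two-finding payload can be reproduced in isolation. The sketch below copies the threshold and gap checks from `analyzeData`. Because the threshold is `>= 0.8`, a finding at exactly 0.80 counts as high confidence.

```typescript
// The two findings from the request above, trimmed to the fields analyzeData reads.
const analyzeSample = [
  { category: "market_overview", confidence: 0.85 },
  { category: "key_players", confidence: 0.8 },
];

// Same threshold as analyzeData: 0.8 itself qualifies.
const highConfidence = analyzeSample.filter((f) => f.confidence >= 0.8).length;
const confidenceRatio = Math.round((highConfidence / analyzeSample.length) * 100) / 100;

// Gap detection looks for three category names that collectData never emits,
// so all three gaps are reported for findings produced by this agent.
const seenCategories = new Set(analyzeSample.map((f) => f.category));
const gaps: string[] = [];
if (!seenCategories.has("competitive_analysis")) gaps.push("competitive_analysis");
if (!seenCategories.has("pricing")) gaps.push("pricing_data");
if (!seenCategories.has("implementation")) gaps.push("implementation_guide");

console.log(highConfidence, confidenceRatio, gaps.length); // 2 1 3
```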

Run full workflow:

novium workflow run research-pipeline \
  --input '{"userId": "researcher-1", "topic": "cloud-native agent infrastructure"}'

Workflow "research-pipeline" started.
  ✓ collect    — 0.4s
  ✓ analyze    — 0.3s
  ✓ structure  — 0.2s
  ✓ report     — 0.3s
  ✓ complete   — 0.0s
Completed in 1.2s

Step 6: Observe Logs

novium logs --workflow research-pipeline

Step 7: Deploy

novium deploy

✓ Deployed

  Agent "research-agent":
    Endpoint: https://ai-assistant.novium.cloud/research-agent

  Workflow "research-pipeline":
    Endpoint: https://ai-assistant.novium.cloud/research
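
Once deployed, the workflow endpoint can be called from application code as well as from curl. The sketch below assumes the deployed endpoint accepts the same JSON body as the local workflow input and replies with JSON. The tutorial does not say what authentication a deployed endpoint requires, so none is shown. The names `buildResearchRequest` and `startResearch` are hypothetical, and `fetch` is assumed to be available globally (Node 18+ or a browser).

```typescript
interface ResearchRequest {
  userId: string;
  topic: string;
  sources?: string[];
}

// Workflow endpoint as printed by `novium deploy`.
const WORKFLOW_URL = "https://ai-assistant.novium.cloud/research";

// Builds the POST options. Kept separate from the network call so it can be checked offline.
function buildResearchRequest(request: ResearchRequest): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  };
}

// Assumption: a non-2xx status signals failure and the success body is JSON.
async function startResearch(request: ResearchRequest): Promise<unknown> {
  const response = await fetch(WORKFLOW_URL, buildResearchRequest(request));
  if (!response.ok) {
    throw new Error(`Research request failed with status ${response.status}`);
  }
  return response.json();
}

// Example call (not executed here):
// startResearch({ userId: "researcher-1", topic: "cloud-native agent infrastructure" })
//   .then((report) => console.log(report));
```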

Final Result

A multi-stage research agent that collects, analyzes, structures, and reports on any topic with persistent memory.

Research Request
  ↓
Novium Endpoint
  ↓
Workflow (collect → analyze → structure → report)
  ↓
Memory Cloud (findings, sessions, reports)
  ↓
Research Report

What You Learned

✅ Multi-stage research pipeline (collect, analyze, structure, report)
✅ Structured memory for findings and reports
✅ Confidence scoring and gap analysis
✅ Report generation with recommendations
✅ Multi-step workflow with data passing between stages
✅ Research session tracking and history

Next Tutorials

Complete the full Novium Creation Guide
Explore all tutorials