Use cases

Ways teams are building with Pellumin.

Six concrete scenarios — with code, architecture, and the outcome you should expect — covering the patterns we see most.

Customer Support

AI support copilot that grounds every answer

Cut median response time in half — without giving up on accuracy.

Read scenario →
Autonomous Agents

A research agent that delivers reports, not bullet points

Plan → search → rank → synthesize, all behind a single endpoint.

Read scenario →
RAG / LLM apps

Live RAG that doesn't go stale

Skip the crawler, skip the vector DB, skip the re-indexing cron.

Read scenario →
Content & SEO

Briefing material your writers actually use

Hand every writer a structured, cited brief — generated in 20 seconds.

Read scenario →
Market Intelligence

A daily intelligence digest, on autopilot

Wake up to a digest of competitors, pricing changes, and industry moves.

Read scenario →
Internal Tools

Live web search for your internal copilots

Give every internal tool — dashboards, briefings, ops — a current-information superpower.

Read scenario →
Customer Support

AI support copilot that grounds every answer

Cut median response time in half — without giving up on accuracy.

Built for
Support engineers · CX leaders · Product teams
First-reply time: 12 min
With Pellumin: 3 min

The problem

Your support team answers the same questions every day. Your docs cover most of them, but agents have to dig through five tabs to find the exact answer. Meanwhile, AI suggestion tools confidently hallucinate features that don't exist.

Why this is hard to build yourself

  • Static FAQ tools go stale within weeks of every release.
  • Public-LLM autocomplete invents endpoints, parameters, pricing — and your agents have to catch every one.
  • Building your own retrieval stack means standing up a crawler, an embedder, a vector store, and a query layer — a multi-quarter project.

How it works with Pellumin

  • Send each incoming ticket to Smart Search, scoped to your docs domain.
  • Display the cited answer and source cards directly in the agent's inbox.
  • Reply in two clicks, or one if the answer is clean enough to send as-is.

Architecture

  1. Inbound webhook (e.g. Zendesk) → your service (receiver sketched below)
  2. Your service → Smart Search with `q = site:docs.example.com {ticket}`
  3. Render answer + source list in the agent UI
  4. Agent reviews, edits if needed, sends

Implementation (Python)

import os, httpx

API = "https://api.pellumin.com"
KEY = os.environ["PELLUMIN_API_KEY"]
DOCS = "docs.example.com"

def support_suggestion(ticket: str) -> dict:
    """Return a draft answer + source citations for a support ticket."""
    r = httpx.get(
        f"{API}/api/summary",
        params={"q": f"site:{DOCS} {ticket}"},
        headers={"Authorization": f"Bearer {KEY}"},
        timeout=20,
    )
    r.raise_for_status()
    return r.json()

suggestion = support_suggestion("How do I rotate my API key?")
# suggestion["answer"]  → markdown draft, ready to paste
# suggestion["results"] → docs links to send the user

Outcomes

Setup time: Under a day
Per-reply cost: ≈ $0.016
Answer freshness: Real-time (every release)
Citations included: Always

Autonomous Agents

A research agent that delivers reports, not bullet points

Plan → search → rank → synthesize, all behind a single endpoint.

Built for
Founders shipping AI products · Agent platform builders · R&D teams
Build vs. integrate: 1 quarter
With Pellumin: 1 afternoon

The problem

Your agent needs to produce structured, citable research on demand — not regurgitated paragraphs from a stale corpus. Stitching together a planner, multiple search calls, ranking, and extraction yourself burns weeks of engineering for a feature you'd rather not own.

Why this is hard to build yourself

  • Multi-step research is genuinely complex: planning, parallel I/O, ranking, extraction, synthesis — each a moving target.
  • Most search APIs return links, not answers. You still own the entire post-processing pipeline.
  • Token costs creep up as your agent spirals through iterations.

How it works with Pellumin

  • Hand a topic to Deep Search; let it plan its own sub-questions and run them in parallel.
  • Receive a structured Markdown report with a fixed shape and inline citations.
  • Show progress in your UI by reading sub-questions and source rankings from the response.

Architecture

  1. User asks a multi-faceted question
  2. Your agent decides depth is needed → calls Deep Search (routing heuristic sketched below)
  3. Deep Search internally plans 4–5 sub-questions, runs parallel searches, ranks the merged pool, extracts the top sources, and synthesizes a report
  4. You render the markdown report + citation chips in your UI

Implementation (Python)

import os, asyncio, httpx

API = "https://api.pellumin.com"
KEY = os.environ["PELLUMIN_API_KEY"]

async def deep_research(topic: str) -> dict:
    async with httpx.AsyncClient(timeout=90) as c:
        r = await c.get(
            f"{API}/api/research",
            params={"q": topic},
            headers={"Authorization": f"Bearer {KEY}"},
        )
        r.raise_for_status()
        return r.json()

report = asyncio.run(
    deep_research("compare modern rust web frameworks for production")
)
print(report["answer"])              # full markdown report
print(report["follow_up_questions"]) # planned sub-questions
print(len(report["results"]))        # ranked sources
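
The "decide depth is needed" step is yours to define. Here is a minimal sketch of one possible routing heuristic, building on the `deep_research` helper above; the keyword list and length threshold are illustrative assumptions, not a recommended policy, and quick lookups fall back to Smart Search (`/api/summary`):

# Illustrative routing for architecture step 2 (the heuristic is an assumption):
# broad or comparative questions escalate to Deep Search, the rest use Smart Search.
DEPTH_HINTS = ("compare", "landscape", "state of", "pros and cons", "research")

def needs_depth(question: str) -> bool:
    q = question.lower()
    return len(q.split()) > 12 or any(hint in q for hint in DEPTH_HINTS)

async def answer(question: str) -> dict:
    if needs_depth(question):
        return await deep_research(question)
    async with httpx.AsyncClient(timeout=30) as c:
        r = await c.get(
            f"{API}/api/summary",
            params={"q": question},
            headers={"Authorization": f"Bearer {KEY}"},
        )
        r.raise_for_status()
        return r.json()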

Outcomes

Eng. weeks saved: 6–8 per agent
Per-report cost: ≈ $0.08
Time to first report: 10–30 s
Sources per report: ~10 ranked

RAG / LLM apps

Live RAG that doesn't go stale

Skip the crawler, skip the vector DB, skip the re-indexing cron.

Built for
AI app builders · Embedded chatbot teams · Product engineers
Setup: 8–12 weeks
With Pellumin: 1 afternoon

The problem

Your chat product needs grounding in current information. Building a continuous-crawl + vector pipeline costs months of work and ongoing infra spend — for a feature that's only valuable when the data is fresh.

Why this is hard to build yourself

  • Vector indexes need re-embedding every time the source moves, which is constantly.
  • Cold storage of HTML, embeddings, and metadata stacks up quickly.
  • You're paying to refresh content most users never query.

How it works with Pellumin

  • Call Smart Search per turn for grounded answers, fetching live web content only when the user actually asks.
  • For deeper questions, escalate to Deep Search and cache the result for 24 hours.
  • No vector DB, no crawler, no embedding cost — you only pay for what users ask about.

Architecture

  1. User sends a chat message
  2. Your service routes: factual lookup → Smart Search (~2s), multi-faceted topic → Deep Search (~20s)
  3. Pass the cited answer plus the source array into your AI's context window
  4. Render the response with source cards in the UI

Implementation (TypeScript)

const API = "https://api.pellumin.com";
const KEY = process.env.PELLUMIN_API_KEY!;

async function groundedChat(turn: string) {
  // 1. Get a cited answer from the live web.
  const sr = await fetch(
    `${API}/api/summary?q=${encodeURIComponent(turn)}`,
    { headers: { Authorization: `Bearer ${KEY}` } }
  );
  const grounding = await sr.json();

  // 2. Inline the answer + sources into your AI's context.
  return {
    reply: grounding.answer,
    sources: grounding.results,
  };
}

const out = await groundedChat("What changed in TLS 1.3 vs 1.2?");

Outcomes

Infra to maintain: Zero (just the API)
Data freshness: Real-time, per-query
Cost per chat turn: ≈ $0.016
Citations rendered: Inline + source list

Content & SEO

Briefing material your writers actually use

Hand every writer a structured, cited brief — generated in 20 seconds.

Built for
Content teams · SEO leads · Editorial managers
Brief-writing time: 60 min
With Pellumin: 5 min

The problem

Your writers spend the first hour of every article doing the same shallow research — five tabs of competitor posts, three news pieces, and a Wikipedia skim. The output varies wildly; some briefs are great, most are thin.

Why this is hard to build yourself

  • Manual research is the slowest, least-loved part of the workflow.
  • Outsourced research has unpredictable quality and review burden.
  • Templated AI tools regurgitate content from training data, missing this week's developments.

How it works with Pellumin

  • Drop the working title into Deep Search.
  • Receive a structured Markdown brief with sub-topics, sources, and citations.
  • Paste straight into the writer's doc, or post to Slack.

Architecture

  1. Writer adds a working title to a Google Sheet / Notion DB
  2. A nightly cron calls Deep Search for each new title (batch loop sketched below)
  3. The Markdown brief is appended to the row, with source URLs as a list
  4. Writer opens the brief and starts with the research already done

Implementation (Python)

import os, sys, httpx

API = "https://api.pellumin.com"
KEY = os.environ["PELLUMIN_API_KEY"]

def brief(title: str) -> str:
    r = httpx.get(
        f"{API}/api/research",
        params={"q": title},
        headers={"Authorization": f"Bearer {KEY}"},
        timeout=120,
    )
    r.raise_for_status()
    data = r.json()
    out = [f"# Working brief — {data['query']}", ""]
    out.append(data["answer"])
    out.append("\n## Sources to cite")
    for src in data["results"]:
        out.append(f"- [{src['title']}]({src['url']})")
    return "\n".join(out)

print(brief(sys.argv[1]))
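
The nightly cron from step 2 can be as simple as looping `brief()` over whatever titles are new. Here is a minimal sketch, assuming a plain `titles.txt` input and a `briefs/` output directory as stand-ins for the Google Sheet / Notion read-and-append, which you would wire up with their own APIs:

# Sketch of the nightly batch (architecture steps 2–3). titles.txt and briefs/
# are assumptions standing in for your Sheet / Notion integration.
import pathlib, re

def run_batch(title_file: str = "titles.txt", out_dir: str = "briefs") -> None:
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    for title in pathlib.Path(title_file).read_text().splitlines():
        title = title.strip()
        if not title:
            continue
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        (out / f"{slug}.md").write_text(brief(title))

Swap the file read for your Sheet or Notion client and the file write for a row update; the Deep Search call in the middle stays the same.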

Outcomes

Time to brief: ≈ 20 seconds
Cost per brief: ≈ $0.08
Citations included: 10+ ranked sources
Consistency: Same shape every brief

Market Intelligence

A daily intelligence digest, on autopilot

Wake up to a digest of competitors, pricing changes, and industry moves.

Built for
Founders · PMs · Strategy & ops
Morning routine: 45 min
With Pellumin: 5 min

The problem

You want to know what's happening in your space without spending an hour every morning on it. Most monitoring tools either bury you in noise or stay too high-level to be useful.

Why this is hard to build yourself

  • Generic news aggregators surface everything except the signal you actually need.
  • Custom monitoring stacks are brittle: keyword-only matches miss context.
  • Synthesizing a digest still takes 30+ minutes of human time.

How it works with Pellumin

  • Define a small set of recurring research questions (competitors, regulatory shifts, pricing).
  • Schedule a daily job that runs Deep Search for each.
  • Receive synthesized, cited briefs by email or Slack every morning.

Architecture

  1. Cron job (or GitHub Action) runs daily at 7 AM
  2. For each topic in your watchlist → Deep Search
  3. Reports concatenated into a single Markdown digest
  4. Posted to a Slack channel / sent via email (Slack sketch below)

Implementation (Python)

# scripts/daily_intel.py — run from cron
import os, datetime, httpx

API = "https://api.pellumin.com"
KEY = os.environ["PELLUMIN_API_KEY"]

TOPICS = [
    "latest pricing updates from our top 5 competitors",
    "industry hiring trends this week",
    "regulatory changes affecting our market",
]

def section(topic: str) -> str:
    r = httpx.get(f"{API}/api/research", params={"q": topic},
                  headers={"Authorization": f"Bearer {KEY}"}, timeout=120)
    r.raise_for_status()
    data = r.json()
    return f"## {topic}\n\n{data['answer']}\n"

today = datetime.date.today().isoformat()
print(f"# Market Intelligence — {today}\n")
for t in TOPICS:
    print(section(t))
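
To deliver the digest (step 4) rather than print it, the simplest route is a Slack incoming webhook. This is a minimal sketch, assuming the webhook URL lives in a hypothetical SLACK_WEBHOOK_URL environment variable; email delivery would follow the same pattern:

def post_digest(markdown: str) -> None:
    """Post the finished digest to Slack; fall back to stdout if unset."""
    webhook = os.environ.get("SLACK_WEBHOOK_URL")   # assumption: set in the cron env
    if not webhook:
        print(markdown)
        return
    httpx.post(webhook, json={"text": markdown}, timeout=30).raise_for_status()

post_digest(
    f"# Market Intelligence — {today}\n\n" + "\n".join(section(t) for t in TOPICS)
)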

Outcomes

Setup: Single script + cron
Daily cost: ≈ $0.24 (3 topics)
Time saved: 30 min / day
Format: Same Markdown every day

Internal Tools

Live web search for your internal copilots

Give every internal tool — dashboards, briefings, ops — a current-information superpower.

Built for
IT · Internal platform teams · Ops & analytics
Time to live: Weeks
With Pellumin: An hour

The problem

Internal tools at most companies are stuck in the past: stale data, manual lookups, and a dozen open tabs. Embedding live web search makes them feel modern — but routing through your model provider, paying their search markup, and dealing with their rate limits is fiddly.

Why this is hard to build yourself

  • Internal traffic spikes irregularly (Monday morning, everyone checks the dashboard).
  • Cost predictability matters: finance wants to know what each dashboard costs.
  • Compliance teams need to see what was queried, when, by whom.

How it works with Pellumin

  • Add Smart Search to any dashboard with a single endpoint call.
  • Per-workspace usage logs make finance and compliance happy.
  • Per-credit pricing makes cost forecasts predictable.

Architecture

  1. Internal tool calls Smart Search through a thin server-side proxy
  2. The workspace audit log records every query (visible to admins)
  3. Dashboard renders the cited answer + source cards

Implementation (TypeScript)

// Next.js Route Handler (server-side)
import { NextRequest, NextResponse } from "next/server";

const API = "https://api.pellumin.com";
const KEY = process.env.PELLUMIN_API_KEY!;

export async function GET(req: NextRequest) {
  const q = req.nextUrl.searchParams.get("q") ?? "";
  const r = await fetch(
    `${API}/api/summary?q=${encodeURIComponent(q)}`,
    { headers: { Authorization: `Bearer ${KEY}` } }
  );
  return NextResponse.json(await r.json());
}

Outcomes

Audit log: Per-user, per-call
Cost predictability: Flat per credit
Setup: Server route only
Key safety: Stays server-side

One API. Pick the pattern that fits.

1,000 free credits a month — no card required.

Start free →