Documentation Index

Fetch the complete documentation index at: https://docs.mavera.io/llms.txt

Use this file to discover all available pages before exploring further.

Scenario

Claude’s 1M token context window can process entire annual reports, 10-K filings, or industry whitepapers in a single call — no chunking, no summarization chains, no lost context. This job feeds a full document to Claude Opus 4.6 for deep analysis, extracts audience segments and decision-maker profiles, then creates Mavera Custom Personas grounded in real market intelligence. Flow: Read document → Anthropic POST /v1/messages (Claude Opus 4.6, full document) → Extract segments → Mavera POST /personas → Data-driven personas

Code

import os, time, json, anthropic, requests

MV = os.environ["MAVERA_API_KEY"]
MV_BASE = "https://app.mavera.io/api/v1"
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}
client = anthropic.Anthropic()

with open("acme_annual_report_2025.txt") as f:
    document_text = f.read()
print(f"Document loaded: {len(document_text):,} chars (~{len(document_text)//4:,} tokens)")

# 1. Claude Opus analyzes the full document
analysis = client.messages.create(
    model="claude-opus-4-6-20250725", max_tokens=4096,
    messages=[{"role": "user",
        "content": "Market intelligence analyst. Analyze this document and extract:\n\n"
            "1. **Audience Segments** 2. **Pain Points** 3. **Decision Criteria** "
            "4. **Communication Preferences** 5. **Budget Indicators**\n\n"
            "Return JSON: [{name, role, industry, pain_points, decision_criteria, "
            "communication_style, budget_range, key_quotes}]\n\n"
            f"DOCUMENT:\n{document_text}"}],
)
raw_output = analysis.content[0].text
print(f"Claude complete — {analysis.usage.input_tokens:,} input, {analysis.usage.output_tokens:,} output tokens")

# 2. Parse extracted personas
try:
    start = raw_output.index("[")
    end = raw_output.rindex("]") + 1
    personas_data = json.loads(raw_output[start:end])
except (ValueError, json.JSONDecodeError):
    personas_data = [{"name": "Extracted Persona", "description": raw_output[:500]}]
print(f"Extracted {len(personas_data)} persona segments")

# 3. Create Mavera personas
created = []
for p in personas_data[:6]:
    desc = (f"Role: {p.get('role', 'N/A')}. Industry: {p.get('industry', 'N/A')}. "
        f"Pain points: {', '.join(p.get('pain_points', [])[:3])}. "
        f"Style: {p.get('communication_style', 'professional')}. Budget: {p.get('budget_range', 'N/A')}.")
    mv = requests.post(f"{MV_BASE}/personas", headers=MV_H, json={
        "name": p.get("name", "Document Persona"), "description": desc,
    }).json()
    created.append({"id": mv["id"], "name": p.get("name")})
    print(f"  Created: {p.get('name')} → {mv['id']}")
    time.sleep(0.3)

print(f"\nPersonas created: {len(created)}")
for c in created:
    print(f"  • {c['name']} ({c['id']})")

Example Output

Document loaded: 847,293 chars (~211,823 tokens)
Claude complete — 211,823 input, 2,847 output tokens
Extracted 4 persona segments
  Created: VP of Operations → per_8a3f2c
  Created: Chief Financial Officer → per_1b7d4e
  Created: IT Director (Mid-Market) → per_9c2a1f
  Created: Procurement Manager → per_4d8e3b

Error Handling

Claude Opus 4.6 supports 1M input tokens. A 500-page annual report is ~200K tokens — well within limits. For documents exceeding 1M tokens, split at natural boundaries (chapters, sections) and run multiple calls.
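One way to do that split, as a minimal sketch: it assumes sections begin with a "## " heading (adjust the delimiter to match your documents) and reuses the rough chars-divided-by-4 token estimate from the job above. The helper name is illustrative, not part of either API.

```python
def split_at_boundaries(text, max_tokens=900_000):
    """Split a document at section headings so each chunk stays under
    a token budget (len(chunk) // 4 as a rough token estimate).
    The "\n## " delimiter is an assumption; adjust for your documents."""
    sections = text.split("\n## ")
    chunks, current = [], ""
    for sec in sections:
        candidate = current + "\n## " + sec if current else sec
        if len(candidate) // 4 > max_tokens and current:
            # Adding this section would blow the budget; start a new chunk.
            chunks.append(current)
            current = sec
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent through the same analysis call, with the extracted segment lists merged afterward.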
Tier 1 allows 50 req/min. This job uses 1 Claude call + up to 6 Mavera calls. The anthropic SDK retries 429 errors automatically with exponential backoff. For raw HTTP, add retry logic with 30-60s waits.
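A minimal retry-with-backoff sketch for those raw HTTP calls (the SDK already handles this for the Claude call). The `with_retries` helper and the response shape are illustrative, not part of the Mavera or Anthropic APIs:

```python
import random
import time

def with_retries(fn, max_attempts=4, base_wait=30):
    """Call fn() and retry on HTTP 429, doubling the wait each attempt
    (capped at 60s) with a little jitter. fn must return an object with
    a .status_code attribute, e.g. a requests.Response."""
    for attempt in range(max_attempts):
        resp = fn()
        if resp.status_code != 429:
            return resp
        wait = min(base_wait * (2 ** attempt), 60) + random.uniform(0, 1)
        time.sleep(wait)
    return resp  # still rate-limited after all attempts
```

Usage would wrap each Mavera call, e.g. `with_retries(lambda: requests.post(f"{MV_BASE}/personas", headers=MV_H, json=payload))`.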
Claude may wrap JSON in markdown code fences. The substring extraction handles this. If parsing still fails, retry with a system prompt that says “return raw JSON, no markdown.”
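A stricter fallback parser might strip the fences explicitly before extracting the array, along these lines (`parse_json_block` is a hypothetical helper, not part of the job above):

```python
import json
import re

def parse_json_block(raw):
    """Remove a leading ```json / trailing ``` fence if present,
    then parse the first [...] array found in the text."""
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    start = cleaned.index("[")
    end = cleaned.rindex("]") + 1
    return json.loads(cleaned[start:end])
```

It raises `ValueError` when no array is present, so it slots into the same `try/except` fallback as the inline extraction in the job.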