# Third-party: requests (HTTP client); os/json/time are stdlib.
import os, json, requests, time
# Credentials are mandatory — a missing variable raises KeyError at startup (fail fast).
TF = os.environ["TYPEFORM_TOKEN"]  # Typeform personal access token
MV = os.environ["MAVERA_API_KEY"]  # Mavera API key
TF_BASE = "https://api.typeform.com"  # Typeform REST API root
MB = "https://app.mavera.io/api/v1"  # Mavera API root
TF_H = {"Authorization": f"Bearer {TF}"}  # auth header for Typeform calls
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}  # headers for Mavera calls
# Form id is optional with a placeholder default — presumably meant to be overridden; TODO confirm.
FORM_ID = os.environ.get("TYPEFORM_FORM_ID", "your_form_id")
# 1. Get form structure so answers can later be labeled by question title.
form_resp = requests.get(f"{TF_BASE}/forms/{FORM_ID}", headers=TF_H)
# Fail fast on a bad token / unknown form id instead of crashing later on
# a missing key in an error payload.
form_resp.raise_for_status()
form = form_resp.json()
# Map field id -> human-readable question title (fall back to the id).
fields = {f["id"]: f.get("title", f["id"]) for f in form.get("fields", [])}
print(f"Form: {form.get('title', 'Untitled')} ({len(fields)} fields)")
# 2. Pull all responses. Typeform pages with a `before` token and rate-limits
#    at ~2 requests/second, so we throttle between pages.
PAGE_SIZE = 1000  # Typeform's maximum page_size
all_responses = []
params = {"page_size": PAGE_SIZE}
while True:
    r = requests.get(f"{TF_BASE}/forms/{FORM_ID}/responses",
                     headers=TF_H, params=params)
    if r.status_code == 429:
        # Rate-limited: wait as long as the server asks (Retry-After header,
        # defaulting to 1s when absent), then retry once.
        time.sleep(float(r.headers.get("Retry-After", 1)))
        r = requests.get(f"{TF_BASE}/forms/{FORM_ID}/responses",
                         headers=TF_H, params=params)
    r.raise_for_status()
    items = r.json().get("items", [])
    all_responses.extend(items)
    if len(items) < PAGE_SIZE:
        # Short (or empty) page means we've reached the end.
        break
    # Page backwards: request responses submitted before the last one seen.
    params["before"] = items[-1].get("token")
    time.sleep(0.6)  # stay under the 2 req/sec limit
print(f"Total responses: {len(all_responses)}")
# 3. Extract structured answer data
def extract_answer(answer):
    """Flatten one Typeform answer object into a display string.

    The Responses API tags each answer with a ``type`` whose value names the
    key holding the payload (e.g. ``{"type": "text", "text": "..."}``).

    Args:
        answer: one element of a response's ``answers`` array.
    Returns:
        A human-readable string; falls back to ``str(answer)`` for any
        answer type not handled explicitly.
    """
    atype = answer.get("type", "")
    if atype == "choice":
        return answer.get("choice", {}).get("label", "")
    if atype == "choices":
        # Multi-select answers carry {"labels": ["A", "B"]} — a list of plain
        # strings, not dicts (calling .get on them would raise AttributeError).
        return ", ".join(answer.get("choices", {}).get("labels", []))
    if atype in ("text", "email", "url", "date", "phone_number", "file_url"):
        # These types all store their value under a key equal to the type name.
        return answer.get(atype, "")
    if atype in ("number", "rating"):
        # Ratings are delivered in the "number" field as well.
        return str(answer.get("number", ""))
    if atype == "boolean":
        return "Yes" if answer.get("boolean") else "No"
    return str(answer)
# Flatten each raw response into {question title: answer string}; responses
# with no answers at all are dropped.
respondents = []
for record in all_responses:
    parsed_answers = {}
    for entry in record.get("answers", []):
        fid = entry.get("field", {}).get("id", "")
        # Prefer the human-readable title; fall back to the raw field id.
        parsed_answers[fields.get(fid, fid)] = extract_answer(entry)
    if parsed_answers:
        respondents.append(parsed_answers)
# 4. Build summary for Mave: summarize up to 15 fields over a sample of at
#    most 200 respondents to keep the prompt compact.
from collections import Counter  # hoisted — was re-imported on every loop iteration

sample_size = min(len(respondents), 200)
sample = respondents[:sample_size]
field_summaries = {}
for field_title in list(fields.values())[:15]:
    values = [row.get(field_title, "") for row in sample if row.get(field_title)]
    if not values:
        continue  # nobody answered this question in the sample
    if len(set(values)) <= 20:
        # Low-cardinality field (choices/ratings): show a frequency table.
        counts = Counter(values).most_common(10)
        field_summaries[field_title] = f"Top answers: {', '.join(f'{v} ({c})' for v, c in counts)}"
    else:
        # High-cardinality (free text): show a few raw examples instead.
        field_summaries[field_title] = f"Sample: {'; '.join(values[:10])}"
summary_block = "\n".join(f"**{k}**: {v}" for k, v in field_summaries.items())
# 5. Mave segment discovery: ask the model to cluster respondents into
#    segments based on the field summaries.
seg_resp = requests.post(f"{MB}/mave/chat", headers=MV_H, json={
    "message": f"""Analyze {len(respondents)} survey responses from "{form.get('title', 'Survey')}".
FIELD SUMMARIES ({sample_size} sample):
{summary_block}
Tasks:
1) Identify 3-5 distinct audience segments based on answer patterns
2) For each segment: name, size estimate (%), key characteristics, pain points, goals, preferred communication style
3) Suggest demographic and psychographic attributes for persona creation
4) Note any surprising correlations or unexpected patterns
Format as structured JSON with a "segments" array."""
})
# Surface auth/quota failures here instead of a confusing empty-content result.
seg_resp.raise_for_status()
segments = seg_resp.json()
print("=== Segment Discovery ===")
content = segments.get("content", "")
print(content[:2000])
# 6. Create personas from discovered segments — first recover the JSON array
#    from the (possibly markdown-wrapped) model reply.
def _extract_segments(text):
    """Best-effort parse of a JSON segments array embedded in free text.

    Returns a list (possibly empty); raises ValueError/JSONDecodeError when
    no parsable JSON is found.
    """
    start = text.find("[")
    if start == -1:
        # No bare array; look for one following a "segments" key. Guard the
        # key lookup: find("[", -1) would wrongly search from the string's end.
        key_pos = text.find('"segments"')
        start = text.find("[", key_pos) if key_pos != -1 else -1
    end = text.rfind("]") + 1
    if 0 <= start < end:
        data = json.loads(text[start:end])
    else:
        # No bracketed array at all — maybe the whole reply is a JSON object.
        data = json.loads(text)
    if isinstance(data, dict):
        data = data.get("segments", [])
    return data if isinstance(data, list) else []

try:
    parsed = _extract_segments(content)
except (json.JSONDecodeError, ValueError):
    print("Could not parse segments — creating personas from analysis text")
    parsed = []
# 7. Create one Mavera persona per discovered segment.
personas = []
for seg in parsed:
    if not isinstance(seg, dict):
        continue  # skip malformed entries (e.g. the model returned bare strings)
    # The model's key names vary; accept the common alternates.
    name = seg.get("name", seg.get("segment_name", "Unknown"))
    desc = seg.get("description", seg.get("characteristics", ""))
    pct = seg.get("size_percent", seg.get("percentage", "?"))
    r = requests.post(f"{MB}/personas", headers=MV_H, json={
        "name": f"TF Survey: {name}",
        "description": (
            f"Discovered from {form.get('title', 'survey')} ({len(respondents)} responses). "
            f"Est. {pct}% of audience. {desc}"
        ),
        "demographic": seg.get("demographic", {}),
        "psychographic": {
            "pain_points": seg.get("pain_points", []),
            "goals": seg.get("goals", []),
            "communication_style": seg.get("communication_style", ""),
        },
    })
    r.raise_for_status()
    persona_id = r.json()["id"]  # parse the response body once
    personas.append({"name": name, "id": persona_id, "pct": pct})
    print(f"Created: {name} ({pct}%) → {persona_id}")
    time.sleep(0.3)  # light throttle between persona creations
print(f"\nDiscovered {len(personas)} segments from {len(respondents)} responses")