Open-Ended Response → Focus Group Questions

Scenario

Your Typeform survey includes open-ended questions that generated hundreds of free-text responses. Reading them all is impractical, and keyword analysis misses nuance. This job extracts the open-ended answers, sends a sample of them to Mavera Chat for theme identification (up to 50 answers per field and 100 in total, each truncated to 200 characters), then creates a Focus Group with targeted questions based on the discovered themes. The result is a deep dive into the why behind each theme, with synthetic personas probing the nuances your survey couldn’t capture.

Flow: Typeform responses → Filter open-ended fields → Mavera Chat: “Identify themes” → Parse themes → POST /api/v1/focus-groups with theme-specific questions → Qualitative depth on each theme

Architecture

Code

import os, json, requests, time
from openai import OpenAI

TF = os.environ["TYPEFORM_TOKEN"]
MV = os.environ["MAVERA_API_KEY"]
TF_BASE = "https://api.typeform.com"
MB = "https://app.mavera.io/api/v1"
TF_H = {"Authorization": f"Bearer {TF}"}
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}

FORM_ID = os.environ.get("TYPEFORM_FORM_ID", "your_form_id")
PERSONA_IDS = os.environ.get("PERSONA_IDS", "").split(",")

# 1. Get form and identify open-ended fields
form = requests.get(f"{TF_BASE}/forms/{FORM_ID}", headers=TF_H).json()
open_fields = [f for f in form.get("fields", [])
               if f.get("type") in ("long_text", "short_text")]
print(f"Open-ended fields: {len(open_fields)}")

# 2. Pull responses
responses = []
params = {"page_size": 1000}
while True:
    r = requests.get(f"{TF_BASE}/forms/{FORM_ID}/responses",
        headers=TF_H, params=params)
    if r.status_code == 429:
        time.sleep(1)
    else:
        r.raise_for_status()
        data = r.json()
        responses.extend(data.get("items", []))
        if len(data.get("items", [])) < 1000:
            break
        params["before"] = data["items"][-1]["token"]
    time.sleep(0.6)

# 3. Extract open-ended answers grouped by field
open_field_ids = {f["id"] for f in open_fields}
field_titles = {f["id"]: f.get("title", f["id"]) for f in open_fields}
answers_by_field = {fid: [] for fid in open_field_ids}

for resp in responses:
    for ans in resp.get("answers", []):
        fid = ans.get("field", {}).get("id", "")
        if fid in open_field_ids and ans.get("type") == "text":
            text = ans.get("text", "").strip()
            if text and len(text) > 10:
                answers_by_field[fid].append(text)

# 4. Identify themes with Mavera Chat
mavera = OpenAI(api_key=MV, base_url=MB)

all_text = []
for fid, answers in answers_by_field.items():
    title = field_titles[fid]
    for ans in answers[:50]:
        all_text.append(f"[{title}] {ans[:200]}")

# Cap the prompt at 100 excerpts and report the count actually sent.
sample = all_text[:100]

theme_result = mavera.responses.create(
    model="mavera-1",
    input=[{"role": "user", "content": f"""Analyze these {len(sample)} open-ended survey responses.

Identify 5-7 distinct themes. For each theme:
- Theme name (2-4 words)
- Frequency estimate (what % of responses mention it)
- Representative quotes (3 examples)
- Underlying sentiment (positive, negative, mixed)
- Key insight

RESPONSES:
{chr(10).join(sample)}

Return as JSON: {{"themes": [{{"name": "...", "frequency": "...", "quotes": ["..."], "sentiment": "...", "insight": "..."}}]}}"""}],
)

theme_content = theme_result.output[0].content[0].text
print("=== Discovered Themes ===")
print(theme_content[:1500])

# 5. Parse themes and create focus group questions
try:
    json_str = theme_content[theme_content.find("{"):theme_content.rfind("}")+1]
    themes = json.loads(json_str).get("themes", [])
except (json.JSONDecodeError, ValueError):
    themes = []

focus_questions = []
for theme in themes[:5]:
    name = theme.get("name", "Unknown")
    sentiment = theme.get("sentiment", "mixed")
    quotes = theme.get("representative_quotes", theme.get("quotes", []))
    quote_sample = quotes[0] if quotes else "N/A"

    focus_questions.append(
        f'Survey respondents mentioned "{name}" — e.g., "{quote_sample[:100]}". '
        f"How does this resonate with your experience? What would you add?"
    )

focus_questions.append("Which of these themes matters most to you? Why?")
focus_questions.append("What's missing from these themes? What topic should we have asked about?")

# 6. Run Focus Group
if not PERSONA_IDS or PERSONA_IDS == [""]:
    p = requests.post(f"{MB}/personas", headers=MV_H, json={
        "name": "TF Survey Respondent",
        "description": "Represents the typical respondent of this Typeform survey.",
    }).json()
    PERSONA_IDS = [p["id"]]

fg_resp = requests.post(f"{MB}/focus-groups", headers=MV_H, json={
    "name": f"Theme Deep-Dive: {form.get('title', 'Survey')}",
    "persona_ids": PERSONA_IDS,
    "questions": focus_questions,
    "context": f"Based on analysis of {len(responses)} survey responses. Themes discovered: {', '.join(t.get('name','') for t in themes[:5])}",
    "responses_per_persona": 3,
})
fg_resp.raise_for_status()  # surface HTTP errors here instead of a KeyError on fg['id'] below
fg = fg_resp.json()

# 7. Poll for results
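for _ in range(20):
    time.sleep(5)
    result = requests.get(f"{MB}/focus-groups/{fg['id']}",
        headers=MV_H).json()
    if result.get("status") == "completed":
        break
else:
    # The loop ended without a "completed" status (20 polls x 5 s = 100 s).
    print("Focus group did not complete within 100 seconds; results may be incomplete.")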

print(f"\nFocus Group: {fg['id']}")
for resp in result.get("responses", []):
    print(f"\n[{resp.get('persona_name', '?')}] {resp.get('question', '')[:70]}...")
    print(f"  → {resp.get('answer', '')[:250]}")
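
The polling loop in step 7 can also be factored into a small helper that makes the timeout case explicit. The following is a minimal sketch, not part of the job above: `wait_for_completion` and its parameters are hypothetical names, the fetch function is injected so the helper can be tried without network access, and the "running" status in the stand-in replies is an assumption (the script above only checks for "completed").

```python
import time
from typing import Callable, Optional

def wait_for_completion(fetch: Callable[[], dict], attempts: int = 20,
                        delay: float = 5.0) -> Optional[dict]:
    """Call fetch() until it reports status 'completed'; return None on timeout."""
    for _ in range(attempts):
        result = fetch()
        if result.get("status") == "completed":
            return result
        time.sleep(delay)
    return None

# Stand-in for requests.get(...).json(): completes on the third poll.
# The "running" status value is an assumption for illustration only.
_replies = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "completed", "responses": []},
])
done = wait_for_completion(lambda: next(_replies), delay=0)
```

In the script, `fetch` would wrap the same GET request used in step 7, and a `None` return signals that the caller should stop rather than print an unfinished result.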

Example Output

=== Discovered Themes ===
1. "Tool Consolidation" (42%) — "I use 7 different tools and none talk to each other"
2. "Time to Value" (38%) — "We spent 3 months onboarding our last platform"
3. "Pricing Transparency" (31%) — "Hidden fees killed our budget mid-year"
4. "Team Adoption" (27%) — "I love it but my team won't switch from spreadsheets"
5. "Data Security" (19%) — "SOC 2 is table stakes, we need more"

Focus Group: fg_tf_themes_4k

[Growth-Stage Operator] Respondents mentioned "Tool Consolidation"...
  → Absolutely. The cognitive overhead of context-switching between 7 tools
    is worse than any single tool's limitations. What I'd add: it's not just
    about features — it's about having one place to think.

[Enterprise Evaluator] Which theme matters most to you? Why?
  → Data Security, hands down. Tool consolidation is a nice-to-have, but
    a security incident is existential. I'd also add "Vendor Risk Assessment
    Burden" — every new tool means another 6-week security review.

Error Handling

Short responses add noise

Responses of 10 characters or fewer (e.g., “N/A”, “none”) are filtered out. Adjust the threshold based on your survey’s typical response quality.
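
A stricter filter might also drop common placeholder answers that happen to exceed the length threshold. The sketch below shows one way to do that; `is_substantive`, `PLACEHOLDERS`, and the `min_chars` default are hypothetical names chosen for illustration and are not part of the job above.

```python
# Hypothetical helper: the function name and placeholder list are illustrative assumptions.
PLACEHOLDERS = {"n/a", "na", "none", "nothing", "no comment", "idk"}

def is_substantive(text: str, min_chars: int = 10) -> bool:
    """Return True if a free-text answer is worth sending for theme analysis."""
    cleaned = text.strip()
    # Same rule as the main script: keep only answers longer than min_chars.
    if len(cleaned) <= min_chars:
        return False
    # Also drop longer placeholder answers such as "No comment."
    return cleaned.lower().rstrip(".!") not in PLACEHOLDERS

answers = [
    "N/A",
    "none",
    "No comment.",
    "I use 7 different tools and none talk to each other",
]
kept = [a for a in answers if is_substantive(a)]
```

Raising `min_chars` suits surveys where short answers are mostly filler; lowering it suits surveys where terse answers still carry signal.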

Theme count depends on response volume

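With fewer than 50 open-ended responses, Mavera may only find 2-3 themes. The code handles variable theme counts: it uses at most the first five themes, and if the reply cannot be parsed as JSON it falls back to an empty theme list, so only the two general questions are asked.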
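
The parsing step can be isolated so that a short or malformed theme list is handled in one place. This is a minimal sketch under the assumption that the model reply contains a JSON object with a "themes" list, as the prompt requests; `parse_themes` and `max_themes` are hypothetical names, and the sample reply is made up for illustration.

```python
import json

def parse_themes(raw: str, max_themes: int = 5) -> list:
    """Extract up to max_themes theme dicts from a model reply; return [] on failure."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return []
    try:
        payload = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    themes = payload.get("themes", []) if isinstance(payload, dict) else []
    if not isinstance(themes, list):
        return []
    # Keep only dict entries so later .get() calls cannot fail.
    return [t for t in themes if isinstance(t, dict)][:max_themes]

# Illustrative reply with surrounding prose, as chat models often produce.
reply = 'Here are the themes: {"themes": [{"name": "Pricing Transparency"}, {"name": "Team Adoption"}]}'
themes = parse_themes(reply)
if len(themes) < 3:
    print(f"Only {len(themes)} themes found; consider collecting more responses.")
```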

Focus Group question length

Questions derived from themes can be long. The code truncates each quote excerpt to 100 characters to prevent focus group prompt overflow.
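
Slicing at a fixed character count can cut a quote mid-word. The sketch below is one possible alternative that trims at the last space before the limit and marks the cut; `build_question` and `max_quote_chars` are hypothetical names, and the question wording only approximates the template used in the script.

```python
def build_question(theme_name: str, quote: str, max_quote_chars: int = 100) -> str:
    """Build one focus group question with a bounded quote excerpt."""
    excerpt = quote.strip()
    if len(excerpt) > max_quote_chars:
        # Cut at the last space before the limit so words are not split,
        # then mark the cut (the excerpt may exceed the limit by the 3 dots).
        cut = excerpt[:max_quote_chars].rsplit(" ", 1)[0]
        excerpt = cut + "..."
    return (
        f'Survey respondents mentioned "{theme_name}", e.g., "{excerpt}". '
        "How does this resonate with your experience? What would you add?"
    )

short_q = build_question("Time to Value", "We spent 3 months onboarding our last platform")
long_q = build_question("Tool Consolidation", "word " * 50)
```

Quotes at or under the limit pass through unchanged, so only unusually long answers are shortened.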
