Behavioral Cohort → Focus Group

Scenario

You’ve identified two behavioral cohorts in Amplitude: users who onboarded within 24 hours and users who took more than 7 days. The two groups have radically different retention curves, but you don’t know why their experiences differ. You map each cohort to a Mavera persona and run a Focus Group that explores how fast and slow onboarders experience the product: what drives urgency, what causes delay, and what each group needs.

Architecture

Code

import os, requests, time
from collections import Counter

AMP_KEY = os.environ["AMPLITUDE_API_KEY"]
AMP_SECRET = os.environ["AMPLITUDE_SECRET_KEY"]
MV = os.environ["MAVERA_API_KEY"]
MB = "https://app.mavera.io/api/v1"
MH = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}
amp_auth = (AMP_KEY, AMP_SECRET)

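# Comma-separated Amplitude user IDs. Blank entries are dropped so an unset variable yields [], not [""].
FAST_IDS = [u.strip() for u in os.environ.get("FAST_ONBOARD_IDS", "").split(",") if u.strip()]
SLOW_IDS = [u.strip() for u in os.environ.get("SLOW_ONBOARD_IDS", "").split(",") if u.strip()]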

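def fetch_activity(user_id, max_retries=5):
    # Retry a bounded number of times on 429 instead of recursing without limit.
    for _ in range(max_retries):
        r = requests.get(
            "https://amplitude.com/api/2/useractivity",
            auth=amp_auth,
            params={"user": user_id},
            timeout=30,
        )
        if r.status_code != 429:
            break
        time.sleep(int(r.headers.get("Retry-After", 10)))
    if r.status_code != 200:
        # Any other failure (including exhausted retries) is treated as "no data" for this user.
        return {"events": [], "properties": {}}
    payload = r.json()
    data = payload.get("userData", {})
    # Accept events either nested under userData or at the top level of the response.
    events = data.get("events") or payload.get("events", [])
    return {"events": events, "properties": data.get("userProperties", {})}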

def profile_cohort(user_ids, label):
    events_all = Counter()
    platforms = Counter()
    countries = Counter()
    event_counts = []

    for uid in user_ids[:20]:
        data = fetch_activity(uid)
        user_events = [e.get("event_type", "") for e in data["events"] if not e.get("event_type", "").startswith("$")]
        events_all.update(user_events)
        event_counts.append(len(user_events))
        props = data["properties"]
        if props.get("platform"):
            platforms[props["platform"]] += 1
        if props.get("country"):
            countries[props["country"]] += 1
        time.sleep(1)

    return {
        "label": label,
        "n": len(user_ids),
        "sampled": min(len(user_ids), 20),
        "avg_events": sum(event_counts) / max(len(event_counts), 1),
        "top_events": events_all.most_common(8),
        "platforms": platforms.most_common(3),
        "countries": countries.most_common(5),
    }

fast_profile = profile_cohort(FAST_IDS, "Fast Onboarders (<24h)")
slow_profile = profile_cohort(SLOW_IDS, "Slow Onboarders (>7d)")

fast_persona = requests.post(f"{MB}/personas", headers=MH, json={
    "name": "Amplitude: Fast Onboarder (<24h)",
    "description": (
        f"Users who completed onboarding within 24 hours. "
        f"Avg events: {fast_profile['avg_events']:.0f}. "
        f"Top actions: {', '.join(e for e, _ in fast_profile['top_events'][:5])}. "
        f"Platforms: {', '.join(p for p, _ in fast_profile['platforms'])}."
    ),
    "psychographic": {
        "onboarding_speed": "fast",
        "avg_events": fast_profile["avg_events"],
        "top_actions": [e for e, _ in fast_profile["top_events"][:5]],
    },
}).json()
time.sleep(0.3)

slow_persona = requests.post(f"{MB}/personas", headers=MH, json={
    "name": "Amplitude: Slow Onboarder (>7d)",
    "description": (
        f"Users who took 7+ days to complete onboarding. "
        f"Avg events: {slow_profile['avg_events']:.0f}. "
        f"Top actions: {', '.join(e for e, _ in slow_profile['top_events'][:5])}. "
        f"Platforms: {', '.join(p for p, _ in slow_profile['platforms'])}."
    ),
    "psychographic": {
        "onboarding_speed": "slow",
        "avg_events": slow_profile["avg_events"],
        "top_actions": [e for e, _ in slow_profile["top_events"][:5]],
    },
}).json()

def format_profile(prof):
    events = ", ".join(f"{e} ({c})" for e, c in prof["top_events"][:6])
    return f"  N={prof['n']}, sampled={prof['sampled']}, avg events={prof['avg_events']:.0f}\n  Top: {events}"

context = f"""Two behavioral cohorts from Amplitude onboarding data:

FAST ONBOARDERS (completed onboarding in <24 hours):
{format_profile(fast_profile)}

SLOW ONBOARDERS (took >7 days to complete onboarding):
{format_profile(slow_profile)}

Fast onboarders retain at 68% (30d). Slow onboarders retain at 22% (30d)."""

fg = requests.post(f"{MB}/focus-groups", headers=MH, json={
    "name": "Amplitude: Onboarding Speed Cohort Study",
    "persona_ids": [fast_persona["id"], slow_persona["id"]],
    "questions": [
        "Walk me through your first day with a new software tool. What makes you complete setup quickly versus putting it off?",
        "When you signed up, what was your immediate goal? Were you trying to solve a specific problem or just exploring?",
        "What would have made you complete onboarding faster? Be specific — was something confusing, unnecessary, or missing?",
        "If a tool required you to invite teammates during onboarding, would that speed you up or slow you down? Why?",
        "Rank what matters most in your first session: (1) Seeing sample data, (2) Completing a real task, (3) Customizing settings, (4) Reading documentation, (5) Watching a tutorial. Explain your #1.",
    ],
    "context": context,
    "responses_per_persona": 2,
}).json()

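# Poll every 5 seconds for up to ~150 seconds (30 attempts).
for _ in range(30):
    time.sleep(5)
    data = requests.get(f"{MB}/focus-groups/{fg['id']}", headers=MH).json()
    if data.get("status") == "completed":
        break
else:
    # The loop ran out of attempts without seeing "completed".
    print("Focus group did not complete within ~150s; results below may be partial.\n")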

print(f"Focus Group: {data.get('id')} — {data.get('status')}\n")
for resp in data.get("responses", []):
    print(f"[{resp.get('persona_id','?')}] {resp.get('question','')[:80]}")
    print(f"  → {resp.get('answer','')[:300]}\n")
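
The script prints raw persona IDs, while the comparison of interest is how the two cohorts answer the same question. The following is a minimal sketch of one way to regroup the responses, assuming each response carries the `persona_id`, `question`, and `answer` fields that the print loop reads. `group_by_question` and the `labels` mapping are hypothetical helpers, not part of the Mavera API, and the sample data is illustrative only.

```python
from collections import defaultdict

def group_by_question(responses, labels):
    """Group focus-group responses by question, tagging each answer with a cohort label.

    `responses` is a list of dicts with persona_id / question / answer keys.
    `labels` maps persona IDs to display names; unknown IDs fall back to the raw ID.
    """
    grouped = defaultdict(list)
    for resp in responses:
        pid = resp.get("persona_id", "?")
        grouped[resp.get("question", "")].append(
            (labels.get(pid, pid), resp.get("answer", ""))
        )
    return dict(grouped)

# Illustrative data only; real responses would come from data.get("responses", []).
sample = [
    {"persona_id": "p_fast", "question": "Q1", "answer": "I skip the tour."},
    {"persona_id": "p_slow", "question": "Q1", "answer": "I come back days later."},
    {"persona_id": "p_fast", "question": "Q2", "answer": "Invites speed me up."},
]
labels = {"p_fast": "Fast Onboarder", "p_slow": "Slow Onboarder"}

for question, answers in group_by_question(sample, labels).items():
    print(question)
    for label, answer in answers:
        print(f"  [{label}] {answer}")
```

In the script above, `labels` could be built from the created personas, for example by mapping `fast_persona["id"]` to "Fast Onboarder" and `slow_persona["id"]` to "Slow Onboarder".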

Example Output

Focus Group: fg_amp_cohort_9k2a — completed

[Fast Onboarder] Walk me through your first day with new software
  → I have a problem to solve TODAY. I sign up, skip the welcome tour,
    go straight to the feature I need. If I can't do something useful in
    10 minutes, I'll try the next tool. Setup wizards that force me
    through 8 steps before I can do anything are the fastest way to
    lose me — let me jump to the thing I came for.

[Slow Onboarder] Walk me through your first day with new software
  → I usually sign up during a meeting or after reading a blog post,
    then forget about it for a few days. When I come back, I've lost
    context. I need the tool to remind me why I signed up and pick up
    where I left off, not start from scratch.

[Fast Onboarder] If a tool required teammate invites during onboarding
  → Speed up. If I can invite my team early, I look like the person
    who found the solution. But ONLY if the invite step takes <30
    seconds and I can add them by email without configuring permissions.

[Slow Onboarder] Rank first-session priorities
  → #1: Seeing sample data. I don't have my own data ready on day 1.
    If the product shows me what it looks like with real content, I
    can evaluate it without doing any work. Empty states kill momentum.

Error Handling

Cohort user IDs

You must supply user IDs for each cohort. Export these from Amplitude cohorts (Amplitude → Cohorts → Export CSV) or query via the Behavioral Cohorts API. The User Activity endpoint does not accept cohort IDs directly.
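
If you export the cohorts as CSV, the IDs can be read from the file rather than pasted into environment variables. This is a sketch under assumptions: the column name `user_id` is a guess about the export layout, and `load_user_ids` is a hypothetical helper. Check the header of your own export before relying on it.

```python
import csv
import io

def load_user_ids(csv_file, column="user_id", limit=None):
    """Read unique, non-empty user IDs from a CSV file object, preserving order."""
    seen = set()
    ids = []
    for row in csv.DictReader(csv_file):
        uid = (row.get(column) or "").strip()
        if uid and uid not in seen:
            seen.add(uid)
            ids.append(uid)
            if limit is not None and len(ids) >= limit:
                break
    return ids

# Illustrative export; the header names are assumptions about the CSV layout.
export = io.StringIO("user_id,country\nu1,US\nu2,DE\n,FR\nu1,US\n")
print(load_user_ids(export))  # ['u1', 'u2']
```

With a real file, open it with `newline=""` as the `csv` module recommends, pass the file object to `load_user_ids`, and set `limit=20` to match the sample size used by `profile_cohort`.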

Rate limiting per user lookup

Each /api/2/useractivity call counts against the 360/hour limit. For 20 users per cohort (40 total), this uses 40 of your 360 hourly queries. Add 1-second delays between calls.
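
The budget arithmetic above can be checked before a run. The sketch below assumes the limit is a simple count of 360 calls per hour, as stated; `lookup_budget` is a hypothetical helper. It also reports the spacing a continuous, hour-long run would need, which is longer than the 1-second delay that suffices for a short 40-call batch.

```python
def lookup_budget(users_per_cohort, cohorts=2, hourly_limit=360):
    """Return (calls_needed, calls_remaining, min_spacing_seconds).

    min_spacing_seconds is the delay between calls that would keep a
    continuous run under the hourly limit.
    """
    calls = users_per_cohort * cohorts
    if calls > hourly_limit:
        raise ValueError(f"{calls} lookups exceed the {hourly_limit}/hour limit")
    return calls, hourly_limit - calls, 3600 / hourly_limit

calls, remaining, spacing = lookup_budget(20)
print(calls, remaining, spacing)  # 40 320 10.0
```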

Focus Group polling duration

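2 personas × 5 questions × 2 responses = 20 total responses. Allow 90–150s for completion. The loop polls 30 times at 5-second intervals, which gives about 150s, so raise the attempt count if you add personas or questions.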
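
The response count and polling window can be made explicit with a small helper. This is a sketch, not the Mavera client: `poll_until_complete` and `expected_responses` are hypothetical names, the "running" status is an assumed placeholder for any non-completed state, and the status sequence is simulated so the example runs without network access.

```python
import time

def expected_responses(personas, questions, responses_per_persona):
    """Total answers the focus group should produce."""
    return personas * questions * responses_per_persona

def poll_until_complete(fetch_status, attempts=30, interval=5.0, sleep=time.sleep):
    """Call fetch_status() up to `attempts` times, sleeping `interval` seconds before each call.

    Returns the last payload and a flag saying whether it reached "completed".
    """
    payload = {}
    for _ in range(attempts):
        sleep(interval)
        payload = fetch_status()
        if payload.get("status") == "completed":
            return payload, True
    return payload, False

print(expected_responses(2, 5, 2))  # 20

# Simulated status sequence; a real fetch_status would GET the focus group by ID.
statuses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
payload, done = poll_until_complete(lambda: next(statuses), sleep=lambda s: None)
print(payload["status"], done)  # completed True
```

Returning the flag alongside the payload lets the caller distinguish a finished run from one that simply ran out of attempts.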

What’s Next

Amplitude Integration

Back to Amplitude integration overview

Pricing Focus Group

Validate willingness-to-pay with revenue-driven personas

Focus Groups API

Full reference for POST /api/v1/focus-groups

Personas API

Full reference for POST /api/v1/personas
