Documentation Index

Fetch the complete documentation index at: https://docs.mavera.io/llms.txt

Use this file to discover all available pages before exploring further.

Scenario

Your product’s YouTube videos have hundreds of comments containing raw, unsolicited audience language — objections, praise, feature requests, and emotional reactions you can’t get from surveys. This job pulls up to 200 comments via commentThreads.list, then sends them to a Mavera research persona through Chat. The persona segments commenters into audience archetypes, identifies dominant sentiment themes, and surfaces the language patterns your marketing should mirror. The result is persona validation grounded in authentic audience voice.

Architecture

Code

import os, requests, time

YT = os.environ["YOUTUBE_API_KEY"]
MV = os.environ["MAVERA_API_KEY"]
YT_BASE = "https://www.googleapis.com/youtube/v3"
MV_BASE = "https://app.mavera.io/api/v1"
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}

VIDEO_ID = "dQw4w9WgXcQ"  # Replace with your video ID

# 1. Fetch up to 200 comments (1 quota unit per page, up to 100 comments per page)
comments = []
page_token = None
while len(comments) < 200:
    params = {
        "key": YT, "videoId": VIDEO_ID,
        "part": "snippet", "maxResults": 100,
        "order": "relevance", "textFormat": "plainText",
    }
    if page_token:
        params["pageToken"] = page_token

    r = requests.get(f"{YT_BASE}/commentThreads", params=params, timeout=30)
    if r.status_code == 403:
        print("Comments disabled or API quota exceeded")
        break
    r.raise_for_status()
    data = r.json()

    for item in data.get("items", []):
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        comments.append({
            "text": snippet["textDisplay"][:500],
            "likes": snippet.get("likeCount", 0),
            "published": snippet.get("publishedAt", ""),
        })

    page_token = data.get("nextPageToken")
    if not page_token:
        break

print(f"Collected {len(comments)} comments")

# 2. Create a research analyst persona
persona_resp = requests.post(f"{MV_BASE}/personas", headers=MV_H, json={
    "name": "YouTube Audience Analyst",
    "description": (
        "Senior qualitative researcher specializing in digital audience analysis. "
        "Expert at identifying audience segments from unstructured text, detecting "
        "sentiment patterns, and extracting the language that resonates with each segment. "
        "Thinks in terms of jobs-to-be-done and psychographic clusters, not just demographics."
    ),
}, timeout=30)
persona_resp.raise_for_status()
persona = persona_resp.json()

# 3. Build comment block sorted by engagement
comments.sort(key=lambda c: -c["likes"])
comment_block = "\n".join(
    f"[{c['likes']} likes] {c['text'][:300]}"
    for c in comments[:200]
)

# 4. Analyze via Mave Chat with persona context
analysis_resp = requests.post(f"{MV_BASE}/mave/chat", headers=MV_H, json={
    "persona_id": persona["id"],
    "message": f"""Analyze these {len(comments)} YouTube comments as audience segments.

COMMENTS:
{comment_block}

Produce:
1. **Audience Segments** (3-5 clusters): Name each segment, estimate its share of comments, describe their motivation, and quote 2-3 representative comments verbatim.
2. **Sentiment Distribution**: % positive / neutral / negative with examples.
3. **Language Patterns**: Exact phrases, slang, and emotional triggers your marketing should adopt.
4. **Objections & Concerns**: Recurring pain points or skepticism themes.
5. **Content Opportunities**: Topics commenters ask about that you haven't covered.
6. **Persona Validation**: Do these segments match typical B2C buyer personas? Where do they diverge?""",
}, timeout=120)
analysis_resp.raise_for_status()
analysis = analysis_resp.json()

print(f"\nAUDIENCE ANALYSIS — {len(comments)} comments from {VIDEO_ID}")
print("=" * 60)
print(analysis.get("content", "")[:2500])
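The `requests` calls above assume the first attempt succeeds. For transient failures (429s, 5xx responses, network hiccups), a small generic retry helper can wrap any of them; this is a sketch, and `with_retry`, `attempts`, and `base_delay` are illustrative names, not part of either API:

```python
import time

def with_retry(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

Wrap a call as `with_retry(lambda: requests.get(url, params=params, timeout=30))`. Keep `attempts` small for the YouTube calls to avoid burning quota on a request that will keep failing.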

Example Output

Collected 200 comments

AUDIENCE ANALYSIS — 200 comments from dQw4w9WgXcQ
============================================================

## Audience Segments

### 1. Power Users (32% of comments)
Motivation: Already own the product, seeking advanced tips and validation.
Quotes: "Been using this for 6 months — game changer for my workflow"
         "Can you do a deep dive on the API integration?"

### 2. Skeptical Evaluators (28%)
Motivation: Comparing alternatives, need proof before commitment.
Quotes: "How does this compare to [Competitor]? Switching costs are real"
         "Nice demo but show me real results from actual customers"

### 3. Aspirational Followers (22%)
Motivation: Follow the brand for inspiration, not yet in buying mode.
Quotes: "This is so cool, saving for when I can afford it"
         "Love the energy in this video, keep making content like this"

### 4. Feature Requesters (12%)
Motivation: Active users pushing for specific capabilities.
Quotes: "PLEASE add dark mode" / "When is the mobile app coming?"

### 5. Trolls & Off-Topic (6%)
Filtered — spam, unrelated memes, or single-emoji reactions.

## Sentiment Distribution
Positive: 58% | Neutral: 26% | Negative: 16%

## Language Patterns
- "game changer" (14 uses) — adopt as social proof language
- "finally" (9 uses) — signals long-awaited solution framing
- Question format "but does it...?" — address preemptively in ads

## Objections
- Price sensitivity (11 comments mention cost)
- Competitor comparison requests (8 comments)
- "Is this just another [category]?" — differentiation needed

Error Handling

Videos with disabled comments return HTTP 403 with reason commentsDisabled. Check the video's snippet.liveBroadcastContent and fall back to a different video.
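A 403 can also mean an exhausted quota, and the error reason is what distinguishes the two cases. A sketch, assuming the standard YouTube Data API v3 error envelope (`{"error": {"errors": [{"reason": ...}]}}`); verify the shape against your actual responses:

```python
def classify_youtube_403(resp):
    """Extract the error reason (e.g. 'commentsDisabled', 'quotaExceeded')
    from a YouTube Data API v3 error response."""
    try:
        return resp.json()["error"]["errors"][0].get("reason", "unknown")
    except (ValueError, KeyError, IndexError, TypeError):
        return "unknown"
```

In the fetch loop above, the `r.status_code == 403` branch could call `classify_youtube_403(r)` and only fall back to another video when the reason is `commentsDisabled`.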
Each commentThreads.list page costs 1 quota unit, and the maxResults ceiling is 100. Fetching 200 comments across 2 pages therefore costs 2 units, which is very efficient.
Non-English comments may skew analysis. Add &searchTerms= or filter by detected language before sending to Mave if your audience is monolingual.
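One crude way to filter by language before sending to Mave is an ASCII-ratio check; `looks_english` is an illustrative helper, and this heuristic will misclassify emoji-heavy English comments, so a real language-detection library is more reliable:

```python
def looks_english(text, threshold=0.9):
    """Crude heuristic: fraction of characters in the ASCII range."""
    if not text:
        return False
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    return ascii_chars / len(text) >= threshold

# Example: keep only comments that pass the heuristic
sample = [
    {"text": "Been using this for 6 months"},
    {"text": "これは素晴らしいです"},
]
english_only = [c for c in sample if looks_english(c["text"])]
```

Apply the same filter to the `comments` list before building `comment_block` if your audience is monolingual.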