Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.mavera.io/llms.txt

Use this file to discover all available pages before exploring further.

Scenario

Your Greenhouse pipeline has thousands of candidates across sources (LinkedIn, referrals, careers page, agencies) and stages (applied, phone screen, onsite, offer, hired, rejected). Each segment has a distinct experience of your employer brand. You extract candidates grouped by source and stage outcome, build Mavera personas representing each segment, then run Focus Groups to test your employer messaging before publishing it. Flow: Greenhouse GET /candidates → Filter by source/stage → Aggregate traits → Mavera POST /personas → POST /focus-groups → Employer messaging validation

Architecture

Code

import os, requests, time, base64
from collections import defaultdict

# Credentials are taken from the environment; both raise KeyError if unset,
# which fails fast before any network call is made.
GH_KEY = os.environ["GREENHOUSE_API_KEY"]
MV = os.environ["MAVERA_API_KEY"]
# Greenhouse Harvest API (read access to candidates/applications).
GH_BASE = "https://harvest.greenhouse.io/v1"
# Mavera API (personas + focus groups).
MV_BASE = "https://app.mavera.io/api/v1"
# Mavera uses a standard Bearer token.
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}

# Greenhouse uses HTTP Basic auth with the API key as the username and an
# empty password — the trailing ':' before base64-encoding is required.
gh_auth = base64.b64encode(f"{GH_KEY}:".encode()).decode()
GH_H = {"Authorization": f"Basic {gh_auth}"}

# Only these candidate sources and terminal application statuses are
# segmented below; everything else is ignored.
SOURCES = ["LinkedIn", "Referral", "Careers Page"]
OUTCOMES = ["hired", "rejected"]

def gh_get(path, params=None, max_retries=5):
    """GET a Greenhouse Harvest endpoint and return the decoded JSON body.

    Retries on HTTP 429, honoring the Retry-After header when Greenhouse
    provides one (falling back to 10 seconds), for at most ``max_retries``
    attempts. The original implementation recursed indefinitely on 429 and
    ignored Retry-After entirely.

    Args:
        path: API path beginning with "/", appended to GH_BASE.
        params: Optional dict of query parameters.
        max_retries: Maximum number of attempts before surfacing the 429.

    Returns:
        Parsed JSON response (list or dict, depending on the endpoint).

    Raises:
        requests.HTTPError: For any non-2xx status, including a 429 that
            persists past ``max_retries`` attempts.
    """
    for _ in range(max_retries):
        r = requests.get(
            f"{GH_BASE}{path}", headers=GH_H, params=params or {}, timeout=30
        )
        if r.status_code != 429:
            break
        # Greenhouse sends Retry-After on rate limits; default to 10s if
        # the header is missing or malformed.
        try:
            delay = float(r.headers.get("Retry-After", 10))
        except (TypeError, ValueError):
            delay = 10.0
        time.sleep(delay)
    r.raise_for_status()
    return r.json()

# 1. Pull candidates and their applications, 100 per page, capped at ~500.
candidates = []
page = 1
while True:
    # Stop once we have enough candidates for segmentation.
    if len(candidates) >= 500:
        break
    batch = gh_get("/candidates", {"per_page": 100, "page": page})
    # An empty page means we've exhausted the pipeline.
    if not batch:
        break
    candidates.extend(batch)
    page += 1

# 2. Group by source × outcome
segments = defaultdict(list)
for cand in candidates:
    applications = cand.get("applications", [])
    for application in applications:
        # Some applications lack a source object entirely; treat as Unknown.
        src = (application.get("source", {}) or {}).get("public_name", "Unknown")
        outcome = application.get("status", "active")
        # Only tracked sources with a terminal outcome are segmented.
        if src not in SOURCES or outcome not in OUTCOMES:
            continue
        full_name = f"{cand.get('first_name','')} {cand.get('last_name','')}"
        segments[(src, outcome)].append({
            "name": full_name,
            "title": cand.get("title", ""),
            "company": cand.get("company", ""),
            "application_count": len(applications),
        })

# 3. Create Mavera personas per segment
persona_ids = []
for (source, outcome), members in segments.items():
    # Skip tiny segments — a persona built from <3 candidates is noise.
    if len(members) < 3:
        continue
    # Up to 5 distinct, non-empty titles/companies to characterize the segment.
    titles = list({m["title"] for m in members if m["title"]})[:5]
    companies = list({m["company"] for m in members if m["company"]})[:5]

    # Human-readable segment label, e.g. "LinkedIn → Rejected" (matches the
    # Example Output section; the separator was lost in extraction).
    label = f"{source} → {outcome.title()}"
    resp = requests.post(f"{MV_BASE}/personas", headers=MV_H, json={
        "name": f"GH: {label}",
        "description": (
            f"Candidates from {source} who were {outcome}. "
            f"N={len(members)}. Titles: {', '.join(titles[:3])}. "
            f"Companies: {', '.join(companies[:3])}."
        ),
        "demographic": {"job_titles": titles},
        "psychographic": {
            "source": source,
            "outcome": outcome,
            "mindset": f"Candidate who was {outcome} via {source}",
        },
    }, timeout=30)
    # Fail fast with a real HTTP error instead of a KeyError on p["id"].
    resp.raise_for_status()
    p = resp.json()
    persona_ids.append({"id": p["id"], "label": label})
    print(f"Persona: {p['id']} → {label} ({len(members)} candidates)")
    time.sleep(0.3)  # modest pacing to stay under Mavera rate limits

# 4. Run Focus Group with employer messaging
EMPLOYER_MESSAGE = """Join a team that ships fast, learns faster, and celebrates wins together.
We offer competitive comp, unlimited PTO, and a culture where engineers own their roadmap.
"Best decision I ever made." — Senior Engineer, 2 years"""

# Every persona answers the same question set against the message under test;
# responses_per_persona=3 yields three simulated answers per persona/question.
fg_resp = requests.post(f"{MV_BASE}/focus-groups", headers=MV_H, json={
    "name": "Employer Brand Messaging Test",
    "persona_ids": [p["id"] for p in persona_ids],
    "questions": [
        "How authentic does this employer message feel on a scale of 1-10?",
        "Would this message make you more or less likely to apply? Why?",
        "What specific claim feels most credible? Least credible?",
        "How would you describe this company's culture to a friend based on this message?",
    ],
    "context": EMPLOYER_MESSAGE,
    "responses_per_persona": 3,
}, timeout=30)
# Surface API errors here rather than as a KeyError on fg["id"] below.
fg_resp.raise_for_status()
fg = fg_resp.json()

# 5. Poll for results (up to ~100s: 20 polls × 5s)
data = {}
for _ in range(20):
    time.sleep(5)
    poll = requests.get(f"{MV_BASE}/focus-groups/{fg['id']}", headers=MV_H, timeout=30)
    poll.raise_for_status()  # don't silently parse an error body
    data = poll.json()
    if data.get("status") == "completed":
        break
else:
    # Loop exhausted without completing — warn instead of silently printing
    # whatever partial payload the last poll returned.
    print(f"Warning: focus group not completed after polling; status={data.get('status')}")

for resp in data.get("responses", []):
    # Map the response's persona_id back to its human-readable segment label.
    label = next((p["label"] for p in persona_ids if p["id"] == resp.get("persona_id")), "?")
    print(f"\n[{label}] {resp.get('question','')[:60]}")
    print(f"  → {resp.get('answer','')[:250]}")

Example Output

{
  "personas_created": 6,
  "segments": [
    { "source": "LinkedIn", "outcome": "hired", "n": 82 },
    { "source": "LinkedIn", "outcome": "rejected", "n": 234 },
    { "source": "Referral", "outcome": "hired", "n": 45 },
    { "source": "Referral", "outcome": "rejected", "n": 67 },
    { "source": "Careers Page", "outcome": "hired", "n": 31 },
    { "source": "Careers Page", "outcome": "rejected", "n": 112 }
  ],
  "focus_group_sample": [
    {
      "persona": "LinkedIn → Rejected",
      "question": "How authentic does this employer message feel?",
      "answer": "5/10. 'Unlimited PTO' is a red flag — usually means nobody actually takes it. The engineer quote feels planted."
    },
    {
      "persona": "Referral → Hired",
      "question": "Would this make you more likely to apply?",
      "answer": "More likely. The 'own their roadmap' line matches what my referrer told me. Consistency matters."
    }
  ]
}

Error Handling

Greenhouse returns 429 with a Retry-After header. The code sleeps 10 seconds on 429. For bulk pulls (1000+ candidates), add exponential backoff and paginate with per_page=100.
Not all applications have a source object. The code defaults to "Unknown". Check your Greenhouse → Configure → Sources to ensure sources are assigned to all job boards.
Valid statuses: active, rejected, hired. Custom stages show under current_stage rather than top-level status.
Greenhouse uses HTTP Basic with the API key as username and an empty password. Always encode as base64(key + ':') — the trailing colon is required.