Documentation Index

Fetch the complete documentation index at: https://docs.mavera.io/llms.txt

Use this file to discover all available pages before exploring further.

Scenario

Enhance Mavera Speak sessions by assigning custom ElevenLabs voices to different personas. Each persona gets a unique voice, creating an immersive multi-voice audio experience.

Flow: Mavera GET /personas → Match to ElevenLabs voices → POST /mave/chat (in-character) → ElevenLabs TTS per persona → Multi-voice audio

Code

import os, requests, time
EL_KEY = os.environ["ELEVENLABS_API_KEY"]
EL_BASE = "https://api.elevenlabs.io/v1"
MV = os.environ["MAVERA_API_KEY"]
MV_BASE = "https://app.mavera.io/api/v1"
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}
os.makedirs("speak_session_audio", exist_ok=True)
VOICE_POOL = [
    {"name": "Rachel", "id": "21m00Tcm4TlvDq8ikWAM", "type": "professional female"},
    {"name": "Drew", "id": "29vD33N1CtxCmqQRPOHJ", "type": "confident male"},
    {"name": "Clyde", "id": "2EiwWnXFnvU5JabPnv8n", "type": "warm authoritative"},
    {"name": "Domi", "id": "AZnzlk1XvdvUeBnXmlld", "type": "energetic female"},
    {"name": "Dave", "id": "CYw3kZ02Hs0563khs1Fj", "type": "casual conversational"},
]

# 1. Retrieve Mavera personas
resp = requests.get(f"{MV_BASE}/personas", headers=MV_H).json()
personas = (resp if isinstance(resp, list) else resp.get("data", []))[:5]
print(f"Personas: {len(personas)}")

# 2. Map personas to voices
mappings = []
for i, p in enumerate(personas):
    v = VOICE_POOL[i % len(VOICE_POOL)]
    mappings.append({"persona_id": p["id"], "name": p.get("name", f"Persona {i+1}"),
                     "voice_id": v["id"], "voice_name": v["name"]})
    print(f"  {p.get('name', 'N/A'):30s}→ {v['name']} ({v['type']})")

TOPIC = "What makes a marketing campaign truly memorable in 2026?"
tracks = []

for m in mappings:
    # 3. Generate in-character content
    chat = requests.post(f"{MV_BASE}/mave/chat", headers=MV_H, json={
        "message": f"You are {m['name']}. Respond to this discussion topic in 3-4 sentences, "
            f"staying in character. Speak naturally as in a roundtable.\n\nTopic: {TOPIC}",
        "persona_id": m["persona_id"],
    }).json()
    content = chat.get("content", "")
    if not content:
        print(f"  [{m['name']}] returned no content, skipping")
        continue
    print(f"\n[{m['name']}]: {content[:120]}...")
    time.sleep(1)

    # 4. Convert to audio with matched voice
    tts = requests.post(f"{EL_BASE}/text-to-speech/{m['voice_id']}",
        headers={"xi-api-key": EL_KEY, "Content-Type": "application/json"},
        json={"text": content, "model_id": "eleven_multilingual_v2",
              "voice_settings": {"stability": 0.45, "similarity_boost": 0.75,
                                 "style": 0.4, "use_speaker_boost": True}})
    if tts.status_code == 200:
        safe = m["name"].lower().replace(" ", "-").replace(",", "")[:30]
        path = f"speak_session_audio/{safe}.mp3"
        with open(path, "wb") as f:
            f.write(tts.content)
        tracks.append({"persona": m["name"], "voice": m["voice_name"],
                       "size_kb": len(tts.content) // 1024, "path": path})
        print(f"  → {path} ({len(tts.content) // 1024} KB)")
    else:
        print(f"  TTS failed for {m['name']}: HTTP {tts.status_code}")
    time.sleep(2)

print(f"\nTotal: {len(tracks)} tracks, {sum(t['size_kb'] for t in tracks)} KB")

Example Output

  VP of Marketing       → Rachel → vp-of-marketing.mp3 (67 KB)
  Product Manager       → Drew   → product-manager.mp3 (72 KB)
  Early-Stage Founder   → Clyde  → early-stage-founder.mp3 (58 KB)
  Gen Z Consumer        → Domi   → gen-z-consumer.mp3 (63 KB)

[VP of Marketing]: "Memorability comes from emotional resonance backed by data..."
[Gen Z Consumer]: "If it doesn't feel authentic, I scroll past in 0.3 seconds..."

Total: 4 tracks, 260 KB

Error Handling

Pre-made voice IDs are generally stable but not guaranteed. Verify each mapped ID with GET /voices before starting, and if a mapped voice is missing, fall back to the next available voice in the pool.
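A minimal sketch of that verification-and-fallback step. The `resolve_voice` helper is hypothetical, and the GET /voices response shape ({"voices": [{"voice_id": ...}]}) is an assumption to confirm against the current ElevenLabs API reference:

```python
def resolve_voice(wanted_id, available_ids, pool):
    """Return wanted_id if it still exists, else the first pool voice that does."""
    if wanted_id in available_ids:
        return wanted_id
    for v in pool:  # fall back to the next available voice in the pool
        if v["id"] in available_ids:
            return v["id"]
    raise RuntimeError("No voice in the pool is available")

# With live data, available_ids would come from the voices endpoint, e.g.:
#   resp = requests.get(f"{EL_BASE}/voices", headers={"xi-api-key": EL_KEY}).json()
#   available_ids = {v["voice_id"] for v in resp["voices"]}
```

Run this check once before the main loop so a retired voice fails fast instead of mid-session.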
If there are more personas than pool entries, the modulo mapping reuses voices. Expand the pool by fetching all voices from GET /voices and selecting based on gender/accent metadata.
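A sketch of metadata-based selection. The `labels` dict with keys like "gender" and "accent" is assumed from typical GET /voices responses; check the fields your account actually returns before filtering on them:

```python
def pick_by_labels(voices, **wanted):
    """Return voices whose labels match every requested key/value pair.

    Example: pick_by_labels(voices, gender="female", accent="american")
    """
    return [v for v in voices
            if all(v.get("labels", {}).get(k) == val for k, val in wanted.items())]
```

Filtering up front lets you build a persona-specific pool (e.g. only female voices for female personas) instead of cycling through a fixed list of five.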
Concatenate the tracks with ffmpeg: ffmpeg -i "concat:track1.mp3|silence.mp3|track2.mp3" -c copy session.mp3. Generate the silence gap with: ffmpeg -f lavfi -i anullsrc=r=44100:cl=mono -t 1 silence.mp3.
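The ffmpeg command above can be built programmatically from the tracks list the script already collects. The `concat_command` helper is a hypothetical convenience, not part of either API:

```python
def concat_command(paths, silence="silence.mp3", out="session.mp3"):
    """Build the ffmpeg concat-protocol command, inserting a silence gap
    between consecutive tracks."""
    interleaved = f"|{silence}|".join(paths)
    return ["ffmpeg", "-y", "-i", f"concat:{interleaved}", "-c", "copy", out]

# With the tracks list from the main script:
#   import subprocess
#   subprocess.run(concat_command([t["path"] for t in tracks]), check=True)
```

The concat protocol with -c copy avoids re-encoding, which works here because every track and the silence file are MP3s with the same sample rate.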