Scenario

TikTok is an audio-first platform — the same visual with different audio can produce wildly different engagement. This job takes video ad variants that share the same visual but use different audio tracks (trending sound vs. original music vs. voiceover-only), runs Video Analysis on each, and compares emotional intensity, mood congruence, and pacing alignment. The result tells you exactly which audio treatment maximizes emotional response for your visual content.

Architecture

Code

import os, requests, time

MV = os.environ["MAVERA_API_KEY"]
MV_BASE = "https://app.mavera.io/api/v1"
MV_H = {"Authorization": f"Bearer {MV}", "Content-Type": "application/json"}

VIDEO_VARIANTS = [
    {"path": "ads/product_demo_trending_sound.mp4", "label": "Trending Sound", "audio": "Espresso remix — high energy, beat drops at cuts"},
    {"path": "ads/product_demo_original_music.mp4", "label": "Original Music", "audio": "Custom lo-fi track — chill, ambient, brand-composed"},
    {"path": "ads/product_demo_voiceover.mp4", "label": "Voiceover Only", "audio": "Founder narration — direct, educational, no background music"},
    {"path": "ads/product_demo_asmr.mp4", "label": "ASMR / Product Sounds", "audio": "Product interaction sounds — tapping, pouring, unboxing"},
]

# 1. Upload all variants and run analysis
analyses = []
for variant in VIDEO_VARIANTS:
    with open(variant["path"], "rb") as f:
        resp = requests.post(f"{MV_BASE}/assets",
            headers={"Authorization": f"Bearer {MV}"},  # no Content-Type: requests sets the multipart boundary
            files={"file": (variant["label"] + ".mp4", f, "video/mp4")},
        )
    resp.raise_for_status()
    upload = resp.json()

    analysis = requests.post(f"{MV_BASE}/video-analysis", headers=MV_H, json={
        "asset_id": upload["id"],
        "analysis_types": ["emotional_arc", "mood_congruence", "pacing", "hook_score", "audio_impact"],
        "metadata": {"label": variant["label"], "audio_description": variant["audio"]},
    }).json()

    analyses.append({"id": analysis["id"], "label": variant["label"], "audio": variant["audio"]})
    time.sleep(0.5)

# 2. Poll all analyses
results = []
for a in analyses:
    for _ in range(30):
        time.sleep(3)
        status = requests.get(f"{MV_BASE}/video-analysis/{a['id']}", headers=MV_H).json()
        if status.get("status") in ("completed", "failed"):
            break
    if status.get("status") != "completed":
        print(f"WARNING: analysis for {a['label']} ended with status {status.get('status')!r}")
    r = status.get("results", {})
    results.append({
        "label": a["label"],
        "audio": a["audio"],
        "emotional_intensity": r.get("emotional_arc", {}).get("intensity_avg", 0),
        "peak_emotion": r.get("emotional_arc", {}).get("peak_emotion", "N/A"),
        "peak_timestamp": r.get("emotional_arc", {}).get("peak_timestamp", 0),
        "mood_congruence": r.get("mood_congruence", {}).get("score", 0),
        "pacing_score": r.get("pacing", {}).get("score", 0),
        "hook_score": r.get("hook_score", {}).get("score", 0),
        "audio_energy": r.get("audio_impact", {}).get("energy", 0),
    })

# 3. Comparative report
results.sort(key=lambda x: -x["emotional_intensity"])

print("SOUND/MUSIC IMPACT ANALYSIS — Same Visual, Different Audio")
print("=" * 70)
print(f"{'Variant':<22} {'Emotion':<10} {'Mood Fit':<10} {'Pacing':<10} {'Hook':<8} {'Audio E'}")
print("-" * 70)
for r in results:
    print(f"{r['label']:<22} {r['emotional_intensity']:.1f}/10   {r['mood_congruence']:.1f}/10   {r['pacing_score']:.1f}/10   {r['hook_score']}/100  {r['audio_energy']:.1f}")

# 4. Recommendation
best = results[0]
worst = results[-1]
delta = best["emotional_intensity"] - worst["emotional_intensity"]
print(f"\nWINNER: {best['label']}")
print(f"  Emotional intensity: {best['emotional_intensity']:.1f}/10 (peak: {best['peak_emotion']} at {best['peak_timestamp']}s)")
print(f"  Mood congruence: {best['mood_congruence']:.1f}/10")
print(f"  vs worst ({worst['label']}): +{delta:.1f} emotional intensity")
print(f"\nRECOMMENDATION: Use '{best['label']}' audio treatment for this visual.")
if best["mood_congruence"] < 6:
    print(f"  ⚠ Mood congruence is low ({best['mood_congruence']:.1f}). Audio energy may not match visual tone — test with audience.")

Example Output

SOUND/MUSIC IMPACT ANALYSIS — Same Visual, Different Audio
======================================================================
Variant               Emotion   Mood Fit  Pacing    Hook    Audio E
----------------------------------------------------------------------
Trending Sound        8.4/10    7.2/10    9.1/10    89/100  9.0
ASMR / Product Sounds 7.1/10    8.8/10    6.5/10    72/100  3.2
Voiceover Only        6.3/10    7.9/10    7.0/10    68/100  2.1
Original Music        5.8/10    6.1/10    5.4/10    55/100  5.5

WINNER: Trending Sound
  Emotional intensity: 8.4/10 (peak: excitement at 3.8s)
  Mood congruence: 7.2/10
  vs worst (Original Music): +2.6 emotional intensity

RECOMMENDATION: Use 'Trending Sound' audio treatment for this visual.

KEY INSIGHT: Trending Sound wins on emotion (+2.6 over Original Music)
and pacing (+3.7) because beat drops align with visual cuts. However,
ASMR scores highest on mood congruence (8.8) — the product sounds
feel most authentic to the visual. Consider ASMR for organic posts
and Trending Sound for paid amplification.

Error Handling

Ensure all variants share the exact same visual edit. If visuals differ even slightly (different color grades, trimmed frames), the audio comparison is contaminated. Use a single exported visual with separate audio mixes.
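One way to guarantee an identical visual, assuming ffmpeg is available, is to mux each audio mix onto a single exported visual instead of re-exporting per variant. A sketch (the file paths are illustrative, not from this job):

```python
import shlex

def mux_commands(visual, audio_mixes):
    """Build one ffmpeg command per audio mix: copy the shared visual
    stream untouched and swap in only the audio track."""
    cmds = []
    for audio, out in audio_mixes:
        cmds.append([
            "ffmpeg", "-y",
            "-i", visual,                  # input 0: the single exported visual
            "-i", audio,                   # input 1: this variant's audio mix
            "-map", "0:v", "-map", "1:a",  # video from input 0, audio from input 1
            "-c:v", "copy",                # no video re-encode, so visuals stay bit-identical
            "-shortest",
            out,
        ])
    return cmds

# Illustrative paths; adjust to your export layout, then run each
# command with subprocess.run(cmd, check=True).
for cmd in mux_commands("ads/visual_master.mp4",
                        [("audio/trending_mix.m4a", "ads/product_demo_trending_sound.mp4"),
                         ("audio/lofi_mix.m4a", "ads/product_demo_original_music.mp4")]):
    print(shlex.join(cmd))
```

Because `-c:v copy` skips video re-encoding, every variant carries byte-identical video frames, which removes the visual as a confound.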
The audio_impact analysis type may require a specific Mavera plan tier. If unavailable, use emotional_arc + pacing as proxies for audio effect.
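If audio_impact is unavailable on your plan, a rough proxy can be blended from fields the script already collects. The 60/40 weighting below is our own assumption, not a Mavera-defined formula; calibrate it against any variants where you do have audio_impact data:

```python
def audio_proxy_score(result, w_emotion=0.6, w_pacing=0.4):
    """Blend emotional intensity and pacing (both 0-10 scales) into a
    single 0-10 stand-in for audio effect. Weights are assumptions."""
    return w_emotion * result["emotional_intensity"] + w_pacing * result["pacing_score"]

# e.g. the Trending Sound row from the example output below:
print(round(audio_proxy_score({"emotional_intensity": 8.4, "pacing_score": 9.1}), 2))  # 8.68
```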
TikTok ads can be up to 500MB. Mavera asset uploads may have lower limits. Compress to 720p/1080p before uploading. H.264 codec recommended.
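A pre-upload guard can catch oversized files before the API rejects them. The 100MB cap below is an assumed placeholder; confirm the real limit for your Mavera plan:

```python
import os

MAX_UPLOAD_MB = 100  # assumed cap, not documented here; check your plan

def check_upload_size(path, max_mb=MAX_UPLOAD_MB):
    """Return (ok, size_mb); suggest an H.264 1080p transcode when over the cap."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > max_mb:
        print(f"{path}: {size_mb:.0f}MB exceeds {max_mb}MB cap; compress first, e.g.:")
        print(f"  ffmpeg -i {path} -vf scale=-2:1080 -c:v libx264 -crf 23 -c:a copy compressed_{os.path.basename(path)}")
    return size_mb <= max_mb, size_mb
```

Run this over `VIDEO_VARIANTS` before the upload loop and skip (or transcode) any variant that fails the check.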
