Likes lie. People double-tap while scrolling in bed, half-watching, half-asleep. A heart costs nothing and means almost nothing. Comments are different. Comments require a stop. They require typing on a phone keyboard. They require a person to feel something strongly enough to interrupt their scroll and broadcast that feeling under their own name.
That is why customer success teams, brand managers, and product researchers who want honest signal from TikTok stop counting hearts and start reading comments. The problem is volume. A modest viral video pulls 4,000 comments. A product-launch post from a megabrand pulls 80,000. No human team can read that. You need a pipeline.
This guide shows you how to build one end to end using the TikLiveAPI /post-comments/ and /post-comment-replies/ endpoints, Python, language detection, a transformer sentiment model, aspect extraction, and a Postgres schema you can query from any BI tool. By the end you will have a working system that turns raw TikTok comment threads into a sentiment dashboard your CEO can read in 30 seconds.
TikLiveAPI returns TikTok comment threads as structured JSON. The base URL is https://api.tikliveapi.com and every request authenticates with the header X-Api-Key: YOUR_API_KEY. Two endpoints carry the entire dataset for this pipeline.
Top-level comments come from /post-comments/. You pass the video URL, a count, and a cursor. The response looks like this:
{
  "comments": [
    {
      "id": "7301234567890123456",
      "video_id": "7298765432109876543",
      "text": "Honestly the new formula smells so much better than v1",
      "create_time": 1714512345,
      "digg_count": 142,
      "reply_total": 8,
      "user": { "uniqueId": "skinfan22", "nickname": "Mara", "region": "US" },
      "status": 1
    }
  ],
  "total": 4821,
  "cursor": 20,
  "hasMore": true
}
Key things to notice. The comment identifier is id (not cid, a common gotcha). The numeric engagement signal you care about is digg_count (likes on the comment itself). The signal that tells you a comment sparked conversation is reply_total. Pagination uses a numeric cursor, and hasMore is a camelCase boolean.
Replies live behind /post-comment-replies/ and take video_id and comment_id. The reply objects use the same id field but do not include reply_total (replies-to-replies are not exposed). This matters when you write your schema and your fetcher because you cannot recurse infinitely; you stop at depth two.
Sentiment analysis is only useful when scoped. Pick one of four scopes and stick to it for the duration of your study.
sort_by=2 for date order. Pull the latest 100 videos every day.

For this tutorial we will focus on a product-launch scenario with a fixed list of seed URLs.
The cursor pagination is straightforward. Page until hasMore is false, sleep briefly between calls to respect the 200 requests per minute rate limit, and persist every comment you see.
import os
import time

import requests

API_BASE = "https://api.tikliveapi.com"
HEADERS = {"X-Api-Key": os.environ["TIKLIVE_API_KEY"]}

def fetch_all_comments(video_url, page_size=50):
    """Page through /post-comments/ until hasMore is false."""
    comments, cursor = [], 0
    while True:
        r = requests.get(
            f"{API_BASE}/post-comments/",
            headers=HEADERS,
            params={"url": video_url, "count": page_size, "cursor": cursor},
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        batch = data.get("comments", [])
        comments.extend(batch)
        if not data.get("hasMore"):
            break
        # Fall back to a computed offset if the response omits the cursor.
        cursor = data.get("cursor", cursor + page_size)
        time.sleep(0.35)  # ~170 req/min, well under the 200 rpm ceiling
    return comments
One subtle point. The cursor returned by /post-comments/ is an integer offset, not an opaque token. If a request times out you can resume from the last successful cursor without losing position.
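Because the cursor is a plain offset, a fetcher can checkpoint it after every successful page and retry from the same spot after a failure. The following is a minimal sketch of that idea, not part of the pipeline above: fetch_page is a hypothetical callable that wraps one /post-comments/ request and returns the parsed JSON, and drain_with_checkpoint, state, and retry_wait are illustrative names.

```python
import time

def drain_with_checkpoint(fetch_page, state, page_size=50,
                          max_retries=3, retry_wait=1.0):
    """Page through comments, saving the cursor after every successful page.

    fetch_page(cursor) is a hypothetical callable returning one parsed page.
    state is a dict the caller can persist (file, database row) between runs;
    state["cursor"] always holds the next cursor to request.
    """
    comments = []
    cursor = state.get("cursor", 0)
    failures = 0
    while True:
        try:
            data = fetch_page(cursor)
        except Exception:
            failures += 1
            if failures > max_retries:
                raise  # give up; state["cursor"] still marks the resume point
            time.sleep(retry_wait)
            continue  # retry the same cursor, so no position is lost
        failures = 0
        comments.extend(data.get("comments", []))
        if not data.get("hasMore"):
            state["done"] = True
            break
        cursor = data.get("cursor", cursor + page_size)
        state["cursor"] = cursor  # checkpoint after each successful page
    return comments
```

Persisting state between runs (for example as a row keyed by video URL) would let a crashed job restart mid-video instead of from cursor 0.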
Fetching every reply on every comment is wasteful and expensive. A typical viral video has 80 percent of comments with zero replies. Filter to comments where reply_total >= 3 or digg_count >= 10. Those are the threads where real conversation happened.
def fetch_replies(video_id, comment_id, page_size=50):
    """Page through /post-comment-replies/ for one parent comment."""
    replies, cursor = [], 0
    while True:
        r = requests.get(
            f"{API_BASE}/post-comment-replies/",
            headers=HEADERS,
            params={
                "video_id": video_id,
                "comment_id": comment_id,
                "count": page_size,
                "cursor": cursor,
            },
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        replies.extend(data.get("comments", []))
        if not data.get("hasMore"):
            break
        cursor = data.get("cursor", cursor + page_size)
        time.sleep(0.35)
    return replies

def enrich_with_replies(video_id, top_comments):
    """Attach replies only to threads that cleared the engagement filter."""
    for c in top_comments:
        if c.get("reply_total", 0) >= 3 or c.get("digg_count", 0) >= 10:
            c["replies"] = fetch_replies(video_id, c["id"])
        else:
            c["replies"] = []
    return top_comments
TikTok is global. A single product video gets comments in English, Spanish, Portuguese, Indonesian, Arabic, and emoji-only. If you pipe Indonesian text into an English-only sentiment model you will get garbage scored as neutral. Detect first, then route.
from langdetect import detect, DetectorFactory

DetectorFactory.seed = 42  # deterministic results

def detect_lang(text):
    text = (text or "").strip()
    if len(text) < 3:
        return "und"  # undetermined
    try:
        return detect(text)
    except Exception:
        return "und"
For higher accuracy on short social text, swap langdetect for Facebook's fasttext lid.176 model. It handles 176 languages and is dramatically better on three-word comments.
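To make "detect first, then route" concrete, here is a small sketch that buckets comments by detected language so each bucket can be sent to a suitable model. The detector is passed in as a parameter, so it should work with detect_lang above or with a fasttext wrapper; group_by_language is an illustrative name, not something from the API or any library.

```python
from collections import defaultdict

def group_by_language(comments, detector):
    """Bucket comment dicts by the language code the detector returns.

    detector is any callable mapping text -> language code, for example the
    detect_lang function above. Anything it cannot classify lands under "und".
    """
    buckets = defaultdict(list)
    for c in comments:
        lang = detector(c.get("text", "")) or "und"
        buckets[lang].append(c)
    return dict(buckets)
```

Each bucket can then be scored in one batch, and the "und" bucket (emoji-only or very short comments) can be skipped or handled separately.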
You have three real options. Pick based on volume, budget, and quality target.
Rule-based (VADER). Free, fast, English-biased. Great for a prototype, terrible for anything multilingual or sarcastic. Useful as a baseline you compare other models against.
Transformer pipeline. The workhorse. cardiffnlp/twitter-roberta-base-sentiment-latest on Hugging Face is trained on ~124M tweets and handles the kind of informal social text TikTok commenters actually write. For multilingual coverage use cardiffnlp/twitter-xlm-roberta-base-sentiment, which is fine-tuned for sentiment in 8 languages.
LLM. Claude or GPT-4 give the best nuance, catch sarcasm, and can return structured aspect tags in one call. The cost is roughly 50 to 200 times the transformer approach. Reserve LLMs for the 5 percent of comments your transformer flags as low confidence.
Here is the transformer path, batched and GPU-aware.
from transformers import pipeline
import torch

device = 0 if torch.cuda.is_available() else -1  # first GPU if present, else CPU

sentiment = pipeline(
    "sentiment-analysis",
    model="cardiffnlp/twitter-xlm-roberta-base-sentiment",
    device=device,
    truncation=True,
    max_length=256,
)

# Accept either capitalization of the model's label strings.
LABEL_MAP = {"Positive": 1, "Neutral": 0, "Negative": -1,
             "positive": 1, "neutral": 0, "negative": -1}

def score_batch(texts, batch_size=64):
    results = sentiment(texts, batch_size=batch_size)
    out = []
    for r in results:
        label = r["label"]
        score = float(r["score"])
        out.append({
            "label": label.lower(),
            "polarity": LABEL_MAP.get(label, 0),
            "confidence": score,
        })
    return out
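The idea of reserving an LLM for low-confidence comments can be sketched as a simple split on the confidence field. This is an illustration only: split_by_confidence is a hypothetical helper, and the 0.6 threshold is an assumption you would tune until roughly the share of comments you can afford to re-score falls below it.

```python
def split_by_confidence(comments, scores, threshold=0.6):
    """Separate confidently scored comments from ones worth a second opinion.

    comments and scores are parallel lists; each score is a dict with a
    "confidence" key, in the shape score_batch returns. The default
    threshold is an illustrative assumption, not a recommended value.
    """
    confident, review = [], []
    for comment, score in zip(comments, scores):
        target = confident if score["confidence"] >= threshold else review
        target.append((comment, score))
    return confident, review
```

The review list is what would go to the LLM; the confident list can be written to the database as-is.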
"This is great" is useful but vague. "The cream feels amazing but shipping took three weeks" carries two opposite signals about two different aspects. Aspect-based sentiment splits a comment into themes and scores each.
The cheap way is keyword bucketing. Define your aspects up front (they should map to internal team responsibilities) and tag any comment that mentions a keyword in that bucket.
ASPECTS = {
    "product_quality": ["quality", "smell", "texture", "formula",
                        "ingredients", "feels", "works", "results"],
    "shipping": ["shipping", "delivery", "arrived", "package",
                 "tracking", "fedex", "ups", "late", "fast"],
    "customer_service": ["support", "service", "refund", "reply",
                         "responded", "rude", "helpful", "email"],
    "price": ["price", "expensive", "cheap", "worth", "value",
              "overpriced", "deal", "sale"],
}

def tag_aspects(text):
    # Substring match: an aspect is tagged if any keyword occurs in the text.
    lower = (text or "").lower()
    return [a for a, kws in ASPECTS.items() if any(k in lower for k in kws)]
For each tagged aspect, store the comment-level polarity against that aspect. A comment tagged with both product_quality and shipping creates two rows. For nuance, run the LLM only on the ambiguous cases (comments where multiple aspects appear with opposite likely sentiments).
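Two details of keyword bucketing are worth sketching. First, plain substring matching tags "chocolate" as shipping (it contains "late") and "ideal" as price (it contains "deal"); anchoring keywords to word boundaries avoids that. Second, the "one row per aspect" rule is a short list comprehension. The example below uses a trimmed-down, hypothetical EXAMPLE_ASPECTS dict to stay self-contained; tag_aspects_strict and aspect_rows are illustrative names.

```python
import re

# Hypothetical, trimmed-down buckets for illustration only.
EXAMPLE_ASPECTS = {
    "shipping": ["shipping", "delivery", "late", "ups"],
    "price": ["price", "expensive", "deal", "sale"],
}

# One compiled pattern per aspect. \b anchors each keyword to word
# boundaries, so "late" does not match inside "chocolate".
ASPECT_PATTERNS = {
    aspect: re.compile(r"\b(?:" + "|".join(map(re.escape, kws)) + r")\b")
    for aspect, kws in EXAMPLE_ASPECTS.items()
}

def tag_aspects_strict(text):
    """Return aspects whose keywords appear as whole words in the text."""
    lower = (text or "").lower()
    return [a for a, pat in ASPECT_PATTERNS.items() if pat.search(lower)]

def aspect_rows(comment_id, text, polarity):
    """Return one (comment_id, aspect, polarity) tuple per tagged aspect."""
    return [(comment_id, a, polarity) for a in tag_aspects_strict(text)]
```

The trade-off is that inflected forms such as "prices" no longer match unless you list them, so the keyword lists need a little more care.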
Postgres handles this workload comfortably. Three tables: raw comments, sentiment scores, and aspect rows. Keep them normalized so you can rescore later when a new model ships without re-fetching from the API.
CREATE TABLE comments (
    id             BIGINT PRIMARY KEY,
    video_id       BIGINT NOT NULL,
    parent_id      BIGINT,
    text           TEXT NOT NULL,
    language       VARCHAR(8),
    digg_count     INTEGER NOT NULL DEFAULT 0,
    reply_total    INTEGER NOT NULL DEFAULT 0,
    user_unique_id VARCHAR(64),
    user_region    VARCHAR(8),
    create_time    TIMESTAMPTZ NOT NULL,
    fetched_at     TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TABLE sentiments (
    comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
    model      VARCHAR(64) NOT NULL,
    label      VARCHAR(16) NOT NULL,
    polarity   SMALLINT NOT NULL,
    confidence NUMERIC(4,3) NOT NULL,
    scored_at  TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    PRIMARY KEY (comment_id, model)
);

CREATE TABLE aspects (
    comment_id BIGINT REFERENCES comments(id) ON DELETE CASCADE,
    aspect     VARCHAR(64) NOT NULL,
    polarity   SMALLINT NOT NULL,
    PRIMARY KEY (comment_id, aspect)
);

CREATE INDEX ON comments(video_id, create_time);
CREATE INDEX ON sentiments(model, polarity);
Notice parent_id on the comments table - that is how you wire replies to their parent thread. Set it to NULL for top-level comments and to the parent id for replies fetched in Step 3.
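A sketch of how API objects might be mapped onto the comments table, including the parent_id wiring. It assumes reply objects carry the same text, create_time, and user fields as top-level comments (the source only confirms the shared id field and the missing reply_total), and that create_time is a Unix epoch in seconds, as the sample response suggests. to_comment_row and flatten_thread are illustrative names.

```python
from datetime import datetime, timezone

def to_comment_row(obj, video_id, parent_id=None, language=None):
    """Map one API comment or reply object to a tuple in column order.

    Columns: id, video_id, parent_id, text, language, digg_count,
    reply_total, user_unique_id, user_region, create_time.
    Reply objects carry no reply_total, so it defaults to 0.
    """
    user = obj.get("user") or {}
    return (
        int(obj["id"]),          # API ids are strings; the column is BIGINT
        int(video_id),
        parent_id,               # None becomes NULL for top-level comments
        obj.get("text", ""),
        language,
        obj.get("digg_count", 0),
        obj.get("reply_total", 0),
        user.get("uniqueId"),
        user.get("region"),
        datetime.fromtimestamp(obj["create_time"], tz=timezone.utc),
    )

def flatten_thread(comment, video_id):
    """Yield the row for a top-level comment, then rows for its replies."""
    yield to_comment_row(comment, video_id)
    for reply in comment.get("replies", []):
        yield to_comment_row(reply, video_id, parent_id=int(comment["id"]))
```

The resulting tuples can be handed to whichever Postgres driver you use for a batched insert; inserting the parent row before its replies keeps the ordering intuitive even though parent_id has no foreign key in the schema above.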
Once the data is in Postgres the dashboard writes itself. The widgets your stakeholders actually use:
digg_count + 1. A single negative comment with 5,000 likes outweighs 200 neutral comments with zero likes, and your dashboard should reflect that reality.

A skincare brand seeded 12 TikTok creator videos around a new serum launch. The pipeline pulled comments daily from the launch URL plus the 12 collab URLs. Total comments after 7 days: 38,400.
Day 1 sentiment skewed 71 percent positive, 22 percent neutral, 7 percent negative. By day 3 the negative share had jumped to 19 percent, driven almost entirely by a new shipping aspect cluster ("ordered Monday still nothing", "tracking says preparing label since Tuesday"). The dashboard's top complaint widget surfaced this on day 3 morning. Ops escalated to the fulfillment vendor the same afternoon. By day 5 the shipping aspect had dropped back to 4 percent of negative volume.
That is the entire value proposition. Without the pipeline, this signal would have surfaced as a Trustpilot review three weeks later, after the launch window had closed.
Comments are public, but treating them as a free-for-all dataset will create problems. Three rules.
Do not store comment text long-term without a documented retention policy. Twelve months is a defensible default for analytics; anything longer needs a stated purpose. Drop the raw text and keep only aggregates after the retention window.
Anonymize user identifiers before sharing. Your internal team can see user_unique_id for QA. Anything that leaves the analytics environment (board decks, vendor reports) should drop it or hash it.
Honor deletion. If a TikTok user deletes their comment, your daily re-fetch will show it missing. Build a reconciliation job that soft-deletes any comment id that disappears from the API for two consecutive fetches.
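The reconciliation rule for deletions can be sketched as a per-comment miss counter. This is one possible shape, not a prescribed implementation: reconcile is an illustrative name, and the soft-delete itself (for example setting a hypothetical deleted_at column, which the schema above does not include) is left to the caller.

```python
def reconcile(miss_counts, known_ids, fetched_ids, max_misses=2):
    """Update per-comment miss counters and return ids to soft-delete.

    miss_counts: dict of comment id -> consecutive fetches where it was absent.
    known_ids:   ids currently stored and not yet soft-deleted for the video.
    fetched_ids: ids returned by the latest complete fetch of that video.
    """
    fetched = set(fetched_ids)
    to_soft_delete = []
    for cid in known_ids:
        if cid in fetched:
            miss_counts.pop(cid, None)  # seen again, reset the counter
        else:
            miss_counts[cid] = miss_counts.get(cid, 0) + 1
            if miss_counts[cid] >= max_misses:
                to_soft_delete.append(cid)
    return to_soft_delete
```

Only run this after a fetch that drained the video completely; a fetch that stopped early would otherwise count every unseen comment as a miss.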
TikLiveAPI does not store TikTok content on its servers - the endpoints return live data on demand. The downstream storage is your responsibility, and so is the policy attached to it.
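For the anonymization rule above, one option is a keyed hash, so the same user maps to the same pseudonym across reports without exposing the handle. A minimal sketch using only the standard library; pseudonymize is an illustrative name, and how you store and rotate the secret key is an assumption left to your environment.

```python
import hashlib
import hmac

def pseudonymize(user_unique_id, secret_key):
    """Return a stable keyed hash of a user identifier.

    secret_key is bytes kept inside the analytics environment. HMAC-SHA256
    is used rather than a bare hash so that handles cannot be recovered by
    hashing a list of known usernames without the key.
    """
    digest = hmac.new(secret_key, user_unique_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # truncated for readability in reports
```

Truncating to 16 hex characters keeps report columns compact; keep the full digest if collisions across a very large user base are a concern.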
The full pipeline from a cron-triggered fetcher to a populated dashboard fits in roughly 400 lines of Python plus the SQL schema above. The cost is dominated by the API credit count: one credit per page of /post-comments/ and one credit per page of /post-comment-replies/. A video with 5,000 comments and average reply depth of 1.4 will cost you about 110 credits to fully drain.
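The top-level share of that estimate follows directly from per-page pricing, as the sketch below shows. The page size of 50 matches the fetcher's default; the number of reply pages depends on how many threads pass your engagement filter, so the reply_pages value used here is purely an illustrative assumption, as is the helper name estimate_credits.

```python
import math

def estimate_credits(n_comments, page_size=50, reply_pages=0):
    """Estimate credits: one per top-level page plus one per reply page."""
    top_level_pages = math.ceil(n_comments / page_size)
    return top_level_pages + reply_pages

# 5,000 top-level comments at 50 per page is 100 pages, so 100 credits.
top_only = estimate_credits(5000)

# Assuming, for illustration, about 10 reply pages across filtered threads.
with_replies = estimate_credits(5000, reply_pages=10)
```

Smaller page sizes raise the credit count proportionally, so it is usually worth requesting the largest count the endpoint accepts.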
If you want to validate the data shape before writing code, paste the endpoints into the playground with a real video URL. Register at /register/ to get 100 free credits which is enough to fully drain three medium videos, score them, and prove the dashboard out before you commit to a credit pack on the pricing page. Reach support if you need a higher rate limit than the default 200 requests per minute. The full endpoint reference lives in the documentation.
On English comments, twitter-roberta-base-sentiment-latest hits around 88 to 92 percent agreement with human raters in our internal tests. Multilingual XLM-R drops 5 to 8 points. Sarcasm and Gen Z slang are the main failure modes. Route low-confidence cases to an LLM for a quality bump.
Cost and signal quality. Most replies are short reactions ("same", "lol", emojis) that add noise without changing the aspect picture. Filtering to high-engagement parents keeps you on the threads that actually changed minds and saves 60 to 80 percent of your API credits.
Yes. Resolve the competitor's userid via /userid/, page their videos via /user-posts/, and run the same fetcher across every video. The aspect buckets stay the same. You will spot category-level patterns - if everyone in your vertical is getting complaints about shipping, that is a market opportunity, not a competitive disadvantage.
The API returns an error for that URL. Wrap the per-video fetcher in a try block, log the failure, and move on. Mark the video as deleted in your videos table so the daily job skips it next run. Do not retry-storm.
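A sketch of that per-video isolation, with the fetcher passed in so the example stays self-contained. fetch_many is an illustrative name; in the real pipeline the fetcher would be something like fetch_all_comments, and you would likely narrow the except clause to the HTTP error type your client raises.

```python
import logging

logger = logging.getLogger("comment_pipeline")

def fetch_many(video_urls, fetcher):
    """Run fetcher(url) for each video, isolating failures per video.

    fetcher is any callable that returns a list of comments or raises.
    Failed URLs are returned so the caller can mark them as deleted and
    skip them on the next run.
    """
    results, failed = {}, []
    for url in video_urls:
        try:
            results[url] = fetcher(url)
        except Exception as exc:  # narrow this in real code
            logger.warning("fetch failed for %s: %s", url, exc)
            failed.append(url)  # no immediate retry, so no retry storm
    return results, failed
```

One bad URL then costs a single failed request and a log line rather than aborting the whole daily run.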
Hourly for active launch windows, daily for ongoing brand monitoring. Comment velocity on viral TikToks drops 90 percent after 72 hours, so you do not gain much from sub-hourly polling beyond the first three days. After day 7 most videos can move to a weekly refresh.
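That cadence can be written down as a small scheduling helper. This is one reading of the guidance above, not a fixed rule: refresh_interval and the in_launch_window flag are illustrative, and how you decide that a video belongs to an active launch is up to you.

```python
from datetime import timedelta

def refresh_interval(video_age, in_launch_window=False):
    """Pick a polling interval from a video's age (a timedelta).

    Weekly after day 7, hourly during an active launch window,
    daily otherwise.
    """
    if video_age > timedelta(days=7):
        return timedelta(weeks=1)
    if in_launch_window:
        return timedelta(hours=1)
    return timedelta(days=1)
```

A scheduler can call this once per video per run and skip any video whose last fetch is more recent than the returned interval.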