TikTok data is messy. The same field is camelCase in one endpoint and snake_case in the next, pagination uses cursors in some places and unix timestamps in others, and a "video" object can be missing entire branches because the post was deleted between your scrape and your write. If you build an ETL on top of TikLiveAPI without a test suite, you will not notice the rot for weeks. You will notice it the day your finance team asks why follower counts went flat for half your tenants.
This guide is for data engineers and analytics engineers running TikTok ingestion in production. We will walk through the five test types every TikTok ETL needs, then cover record-and-replay fixtures, shadow deployments, canary rollouts, CI wiring, and a small reusable harness. The examples use Python, pytest, and the responses library, but the patterns translate to any stack.
Run dbt tests or Great Expectations against the landed tables. Assert not-null on primary keys (aweme_id, id for users), accepted ranges on counters (follower_count >= 0), and uniqueness on the natural keys. These tests catch the case where a transform silently drops a column.
For each tenant table, assert max(loaded_at) > now() - interval '26 hours'. Twenty-six hours, not twenty-four, so a slow Sunday cron does not page anyone. Freshness is the single highest-ROI test you can write because silent staleness is the most common ETL failure mode.
The base URL is https://api.tikliveapi.com and every call carries X-Api-Key. Record a real response for each endpoint you use, store it as JSON, and write a contract test that asserts the top-level keys match what your parser expects. When the upstream shape changes, this test screams before bad data reaches the warehouse. The full endpoint list lives in the documentation.
Build a fixture directory mirroring the lib/example-responses/ pattern from the TikLiveAPI codebase: one JSON file per endpoint. Your tests load these and feed them through the real parser. No network. No flakes.
Generate edge cases the API will eventually hand you: empty videos[], unicode and emoji in nickname, hasMore: false on page one, comments where reply_total is zero but the replies endpoint still returns rows. Hypothesis will find shapes you did not write down.
You have two options. VCR.py records HTTP traffic into "cassettes" the first time a test runs and replays them after. Or you can keep things explicit and commit hand-curated JSON fixtures. For TikTok data I prefer the explicit JSON approach because the responses are large and you want code review to see when a fixture changes.
fixtures/
user_info_by_username.json # nested user{} + stats{} camelCase
user_posts.json # videos[] + cursor + hasMore
user_followers.json # followers[] + total + time + hasMore
user_following.json # followings[] (note the trailing s)
post_detail.json # flat snake_case, play/wmplay/hdplay
post_comments.json # comments[] with field 'id'
search_video.json # videos[] + cursor + hasMore
A loader keeps tests terse:
import json, pathlib
FIX = pathlib.Path(__file__).parent / "fixtures"
def load(name):
return json.loads((FIX / f"{name}.json").read_text())
Two schools. Mock at the HTTP layer with responses (or httpx_mock) so the test exercises your real client including retry and header logic. Or stub the whole client class with a fake that returns dicts. The first catches more bugs; the second runs faster. Use HTTP-level mocks for integration tests and full doubles for unit tests of pure transform code.
import responses
from etl.client import TikLiveClient
@responses.activate
def test_user_info_parses_nested_stats():
responses.add(
responses.GET,
"https://api.tikliveapi.com/userinfo-by-username/",
json=load("user_info_by_username"),
status=200,
)
client = TikLiveClient(api_key="test")
result = client.user_info("charlidamelio")
assert result["user"]["uniqueId"] == "charlidamelio"
assert result["stats"]["followerCount"] >= 0
Note the camelCase. If your transform code assumes follower_count here, it is broken. The user info endpoint is nested and camelCase; the post detail endpoint is flat and snake_case. Mixing these up is the most common bug I see.
EXPECTED_KEYS = {
"/userinfo-by-username/": {"user", "stats"},
"/user-posts/": {"videos", "cursor", "hasMore"},
"/user-followers/": {"followers", "total", "time", "hasMore"},
"/user-following/": {"followings", "total", "time", "hasMore"},
"/post-detail/": {"aweme_id", "play", "wmplay", "hdplay", "author"},
"/post-comments/": {"comments", "total", "cursor", "hasMore"},
}
def test_followers_uses_time_pagination():
payload = load("user_followers")
assert "time" in payload
assert "cursor" not in payload # followers pages by 'time', not cursor
def test_following_key_has_trailing_s():
payload = load("user_following")
assert "followings" in payload # NOT 'following'
def test_post_detail_has_hd_variant():
payload = load("post_detail")
for key in ("play", "wmplay", "hdplay"):
assert payload[key].startswith("http")
def test_comment_uses_id_field():
payload = load("post_comments")
first = payload["comments"][0]
assert "id" in first
assert "comment_id" not in first # field is 'id', not 'comment_id'
These four tests would have caught every TikTok-shape bug I have shipped in the last year. Run them in CI on every PR.
from hypothesis import given, strategies as st
@given(nickname=st.text(min_size=0, max_size=80))
def test_nickname_survives_unicode(nickname):
row = {"author": {"nickname": nickname, "unique_id": "u", "id": "1"}}
out = transform_video(row)
assert out["author_nickname"] == nickname # no encoding mangling
@given(videos=st.lists(st.dictionaries(st.text(), st.integers()), max_size=0))
def test_empty_videos_array_is_safe(videos):
payload = {"videos": videos, "cursor": "0", "hasMore": False}
assert list(iter_videos(payload)) == []
When you rewrite a transform, run the new version in parallel for a week. Write both outputs to staging tables (fct_videos__current, fct_videos__shadow) and run a daily diff query. If the shadow diverges by more than 0.1 percent of rows, the new code is not ready. Shadow mode catches the bugs your tests missed because nobody thought to write the test.
Pick one low-volume tenant. Enable the new pipeline for them only. Watch their dashboard for 48 hours. If the freshness test passes and the row counts look right, ramp to 10 percent of tenants, then 50, then all. Each ramp is a separate deploy, gated on a feature flag column in your tenants table. This pattern survived two TikTok response-shape changes in 2025 without a single customer-visible incident on a team I worked with.
name: tests
on: [push, pull_request]
jobs:
pytest:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.12"
cache: pip
- run: pip install -r requirements-dev.txt
- name: Run suite (no network)
env:
TIKLIVE_API_KEY: dummy-for-tests
TIKLIVE_BASE_URL: https://api.tikliveapi.com
run: pytest -q --maxfail=3 --durations=10
The mocked API key matters. Your client should refuse to start without one, and CI should never hit the real api.tikliveapi.com. If a test makes a real call, fail the build. The responses library does this by default in @responses.activate mode.
Every outbound request should carry a correlation ID you generate, log, and stash on the row in your warehouse. When a customer asks "why is this video missing", you grep the correlation ID and see exactly which TikLiveAPI call fed (or failed to feed) that row.
import uuid, logging, structlog
log = structlog.get_logger()
def fetch_user_posts(client, userid, cursor=0):
cid = str(uuid.uuid4())
log.info("tiktok.fetch.start", endpoint="/user-posts/",
userid=userid, cursor=cursor, correlation_id=cid)
resp = client.get("/user-posts/",
params={"userid": userid, "count": 30, "cursor": cursor},
headers={"X-Correlation-Id": cid})
log.info("tiktok.fetch.done", correlation_id=cid,
count=len(resp["videos"]), has_more=resp["hasMore"])
return resp, cid
Persist correlation_id as a column on the landing table. It is two extra bytes per row and it saves hours of debugging.
# tests/harness.py
import json, pathlib, responses
from contextlib import contextmanager
BASE = "https://api.tikliveapi.com"
FIX = pathlib.Path(__file__).parent / "fixtures"
def fixture(name):
return json.loads((FIX / f"{name}.json").read_text())
@contextmanager
def mock_endpoint(path, fixture_name, status=200):
with responses.RequestsMock() as rsps:
rsps.add(responses.GET, f"{BASE}{path}",
json=fixture(fixture_name), status=status)
yield rsps
def assert_keys(payload, *required):
missing = [k for k in required if k not in payload]
assert not missing, f"missing keys: {missing}"
Tests then read like prose:
def test_followers_page():
with mock_endpoint("/user-followers/", "user_followers"):
page = client.followers(userid="107955", count=30, time=0)
assert_keys(page, "followers", "total", "time", "hasMore")
The full suite should run in under five minutes on CI. If it crosses five minutes, engineers stop running it locally and bugs leak through. Tactics: parallelize with pytest-xdist, scope fixtures to module level when safe, and quarantine slow integration tests into a nightly job. Track suite duration as a metric and alert on regressions.
On a recent project we ingested 12 of the 37 TikLiveAPI endpoints into Snowflake. Final numbers after two sprints of test work: 84 percent line coverage, 71 unit tests, 23 contract tests, 14 property-based tests, full suite at 3 minutes 41 seconds. Two shape changes from upstream were caught by contract tests before they reached production. Zero data incidents in the following quarter. The fixture directory had 31 JSON files totaling 412 KB. The harness file was 38 lines. Total test-code-to-production-code ratio was 0.7, which is high but appropriate for ETL where bad data is expensive.
Do I need to test every one of the 37 endpoints? Test the ones you ingest. For each, write one contract test, one happy-path integration test, and at least one edge-case property test. Skip endpoints you do not call.
Should fixtures be committed or generated? Committed. Generated fixtures hide upstream changes inside a noisy diff. Commit them, review them, and update them deliberately with a small scripts/refresh_fixtures.py that hits the playground or the real API behind a flag.
How do I test pagination without hitting the network? Mock two consecutive responses: the first with hasMore: true and a cursor, the second with hasMore: false. For followers and following, mock the time field instead, since those endpoints page by timestamp rather than cursor.
What about rate limiting? Credit deduction and rate limiting happen on the API server, not in your ETL. Your tests should mock 429 responses and assert your client backs off correctly. Pricing details live on the pricing page.
Where do I file bugs in upstream shapes? Open a support ticket with the correlation ID, the request parameters, and the unexpected response. Also check the blog for shape-change announcements and review your profile for usage anomalies.
Tested ETLs are not slower to build. They are faster, because every shape change ships behind a red test instead of behind an angry customer email.
Ready to put what you read into code? Try our endpoints live or grab the full reference.