If you only watch TikTok or only watch Instagram, you are reading a half-finished book. Creators publish on both. Audiences scroll on both. Campaign budgets land on both. Yet most analytics stacks still bolt a TikTok scraper to one dashboard and a Meta Graph integration to another, never reconciling them. The result is double-counted reach, mismatched creator names, and attribution that quietly favors whichever platform reports first.
This post is for marketing ops engineers, growth marketers, and agency platform leads building the joined view themselves. We will be honest up front: TikLiveAPI only covers TikTok. It does not call Instagram, Meta, or any other network. The interesting work is gluing it to Meta's official APIs so the two sides feed one normalized creator dimension table. That is the pattern this post describes end-to-end.
Single-platform thinking distorts three decisions that marketing ops makes every week.
A creator with 500K TikTok followers and 300K Instagram followers does not have 800K reachable people. Audience overlap on lifestyle and beauty creators routinely sits between 20% and 45%. Without a joined identity, briefs over-promise impressions and CPMs look artificially cheap.
A creator who posts daily Reels but only weekly TikToks looks "inactive" in a TikTok-only dashboard. Their actual content velocity is high, just weighted toward Meta. Cross-platform last_post_time fixes this.
Meta Pixel and TikTok Pixel both claim the same conversion. Without a shared creator dimension and a deterministic dedupe rule, your monthly report becomes a negotiation.
Four signals consistently pay for the engineering effort.
- Pull the TikTok bio link via /userinfo-by-username/; pull the Instagram bio "website" via Meta's Instagram Graph API for Business. When either points at the other handle, you have a high-confidence match.
- Comparing create_time from TikLiveAPI's /user-posts/ against the Instagram media timestamp tells you whether they treat the platforms as twins or as separate creative tracks.
Here is the text architecture diagram. The whole point of the design is that nothing downstream of the normalization layer knows which platform a row came from.
+-------------------------+        +----------------------------------+
| TikTok side             |        | Instagram side                   |
|                         |        |                                  |
| TikLiveAPI              |        | Meta Graph API (Business)        |
| X-Api-Key auth          |        | OAuth2 access tokens             |
|                         |        |                                  |
| GET /userinfo-by-       |        | /{ig-user-id}?fields=...         |
|     username/           |        | /{ig-user-id}/media              |
| GET /userinfo-by-id/    |        |                                  |
| GET /user-posts/        |        | Instagram Basic Display API      |
| GET /user-followers/    |        | (creator-owned, lightweight)     |
| GET /search-user/       |        |                                  |
+-----------+-------------+        +------------------+---------------+
            |                                         |
            v                                         v
   +--------+-------------+              +------------+--------+
   | tiktok_raw (JSON)    |              | ig_raw (JSON)       |
   | partitioned by day   |              | partitioned by day  |
   +--------+-------------+              +------------+--------+
            |                                         |
            +--------------------+--------------------+
                                 v
                  +--------------+--------------+
                  | match_creators job          |
                  | (bio_link + username +      |
                  |  manual review queue)       |
                  +--------------+--------------+
                                 v
                  +--------------+--------------+
                  | cross_platform_creators     |
                  | canonical creator dimension |
                  +--------------+--------------+
                                 v
                  +--------------+--------------+
                  | downstream: briefs, CRM,    |
                  | BI, ads attribution         |
                  +-----------------------------+
The shape matters: raw tables stay platform-specific (so you never lie about the provenance of a field), and a single match job is the only place that decides "this TikTok account and this Instagram account are the same human."
This is the hardest engineering problem in the pipeline. Three matching rules, applied in order, get you most of the way.
From TikLiveAPI's /userinfo-by-username/ the user.bioLink field carries the creator's external URL. If it contains instagram.com/{handle}, you have a one-way attestation from the creator themselves. Do the symmetric check from Instagram's website field. Both directions matching is the gold standard.
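The two-way bio link check can be sketched as below. This is a minimal illustration, not TikLiveAPI or Meta code: the function names, the list of reserved Instagram path segments, and the "both" / "one_way" / "none" labels are all assumptions made for the example. It only handles plain profile URLs; link-in-bio services and short links would still fall through to the later rules.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# ASSUMPTION: path segments that are Instagram features, not profiles.
_IG_RESERVED = {"p", "reel", "reels", "explore", "stories", "tv"}


def _parse(url: str):
    # Bio links are often stored without a scheme; add one so the host is parsed.
    return urlparse(url if "://" in url else "https://" + url)


def instagram_handle_from_url(url: Optional[str]) -> Optional[str]:
    """Return the lowercase Instagram handle a profile URL points at, or None."""
    if not url:
        return None
    parsed = _parse(url)
    if (parsed.hostname or "").lower() not in ("instagram.com", "www.instagram.com"):
        return None
    segment = parsed.path.strip("/").split("/")[0]
    if segment.lower() in _IG_RESERVED:
        return None
    if re.fullmatch(r"[A-Za-z0-9._]{1,30}", segment):
        return segment.lower()
    return None


def tiktok_handle_from_url(url: Optional[str]) -> Optional[str]:
    """Return the lowercase TikTok handle from a tiktok.com/@handle URL, or None."""
    if not url:
        return None
    parsed = _parse(url)
    if (parsed.hostname or "").lower() not in ("tiktok.com", "www.tiktok.com"):
        return None
    segment = parsed.path.strip("/").split("/")[0]
    if segment.startswith("@") and len(segment) > 1:
        return segment[1:].lower()
    return None


def bio_link_match(tt_bio_link, tt_username, ig_website, ig_username) -> str:
    """Classify the attestation: 'both', 'one_way', or 'none'."""
    forward = instagram_handle_from_url(tt_bio_link) == ig_username.lower()
    backward = tiktok_handle_from_url(ig_website) == tt_username.lower()
    if forward and backward:
        return "both"
    return "one_way" if (forward or backward) else "none"
```

Returning a label rather than a boolean keeps the one-way and two-way cases separate, so the match job can rank them differently.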
Roughly 60% of mid-tier creators use the same handle on both platforms. A normalized lowercase comparison of user.uniqueId against the Instagram username catches them. Flag, do not auto-merge - "marie" on TikTok and "marie" on Instagram are not necessarily the same person.
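A minimal sketch of the handle comparison, under the assumption that candidates are passed around as plain dictionaries. The function names and the needs_review key are hypothetical. The point it illustrates is that an equal handle produces a flagged candidate, never a merged row.

```python
from typing import Optional


def normalize_handle(handle: str) -> str:
    """Trim whitespace, drop a leading @, and lowercase."""
    return handle.strip().lstrip("@").lower()


def username_candidate(tiktok_unique_id: str, instagram_username: str) -> Optional[dict]:
    """Return a flagged candidate when the normalized handles are equal, else None."""
    if normalize_handle(tiktok_unique_id) != normalize_handle(instagram_username):
        return None
    return {
        "tiktok_username": normalize_handle(tiktok_unique_id),
        "instagram_handle": normalize_handle(instagram_username),
        "match_method": "username",
        # Equal handles are only a hint; a human or a bio link must confirm.
        "needs_review": True,
    }
```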
For everything else, route to a human. The queue should show: avatar, follower count, last 3 post thumbnails, bio text. A trained reviewer can resolve 80 to 120 candidates per hour. Store the decision with a reviewer id and timestamp so you can audit later.
Pull the TikTok side of a candidate match with one call to /userinfo-by-username/ (documented at /documentation/users/user-info/) using the canonical username, or to /userinfo-by-id/ when you already have the id. For discovery from a search term, /search-user/ (documented at /documentation/users/search/) is the right entry point.
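As a rough sketch, the user info call could look like this with only the Python standard library. The base URL and the X-Api-Key header come from this post; the query parameter name username is an assumption, so check the endpoint documentation before relying on it. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.tikliveapi.com"


def build_userinfo_request(api_key: str, username: str) -> urllib.request.Request:
    """Build the GET request for /userinfo-by-username/ without sending it."""
    # ASSUMPTION: the query parameter is named "username".
    query = urllib.parse.urlencode({"username": username})
    return urllib.request.Request(
        f"{BASE_URL}/userinfo-by-username/?{query}",
        headers={"X-Api-Key": api_key},
        method="GET",
    )


def fetch_userinfo(api_key: str, username: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body."""
    request = build_userinfo_request(api_key, username)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Splitting request construction from sending makes the URL and header logic testable without network access.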
Keep the canonical table small. Heavy fields live in raw tables; the dimension is just the spine.
CREATE TABLE cross_platform_creators (
  creator_id BIGINT PRIMARY KEY AUTO_INCREMENT,
  display_name VARCHAR(255) NOT NULL,
  tiktok_username VARCHAR(255) NULL,
  tiktok_user_id VARCHAR(64) NULL,
  tiktok_sec_uid VARCHAR(128) NULL,
  instagram_handle VARCHAR(255) NULL,
  instagram_ig_id VARCHAR(64) NULL,
  tiktok_followers INT NULL,
  instagram_followers INT NULL,
  tiktok_last_post DATETIME NULL,
  instagram_last_post DATETIME NULL,
  match_method ENUM('bio_link','username','manual') NOT NULL,
  match_confidence DECIMAL(3,2) NOT NULL,
  reviewed_by INT NULL,
  reviewed_at DATETIME NULL,
  created_at DATETIME NOT NULL,
  updated_at DATETIME NOT NULL,
  UNIQUE KEY uq_tt (tiktok_user_id),
  UNIQUE KEY uq_ig (instagram_ig_id),
  KEY idx_handles (tiktok_username, instagram_handle)
);
Notes on the shape:
- tiktok_user_id and instagram_ig_id are the durable keys. Handles change; ids do not.
- Store tiktok_sec_uid because some TikLiveAPI endpoints prefer it.
- match_confidence lets you filter "only show bio_link-verified creators" in briefs without changing schema.
With the dimension in place, combined reach is one query. Honest combined reach subtracts an overlap estimate.
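SELECT
  c.creator_id,
  c.display_name,
  c.tiktok_followers,
  c.instagram_followers,
  -- Follower columns are nullable, so default a missing side to 0.
  COALESCE(c.tiktok_followers, 0) + COALESCE(c.instagram_followers, 0) AS naive_reach,
  -- The LEFT JOIN leaves overlap_estimate NULL for creators without an estimate;
  -- treat that as zero overlap so deduped_reach is never NULL.
  ROUND(
    (COALESCE(c.tiktok_followers, 0) + COALESCE(c.instagram_followers, 0))
      * (1 - COALESCE(o.overlap_estimate, 0)),
    0
  ) AS deduped_reach
FROM cross_platform_creators c
LEFT JOIN audience_overlap_estimates o
  ON o.creator_id = c.creator_id
WHERE c.match_confidence >= 0.80;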
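To make the arithmetic concrete, here is the same formula as a small Python function, applied to the 500K / 300K creator from earlier in the post. The function name is hypothetical, and treating a missing estimate as zero overlap is an assumption of this sketch rather than something the pipeline requires.

```python
from typing import Optional


def deduped_reach(
    tiktok_followers: int,
    instagram_followers: int,
    overlap_estimate: Optional[float],
) -> int:
    """Apply (tiktok + instagram) * (1 - overlap), as in the query above."""
    naive = tiktok_followers + instagram_followers
    # ASSUMPTION: no estimate yet means no deduction.
    overlap = 0.0 if overlap_estimate is None else overlap_estimate
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap_estimate must be between 0 and 1")
    return round(naive * (1 - overlap))


low = deduped_reach(500_000, 300_000, 0.20)   # 640000
high = deduped_reach(500_000, 300_000, 0.45)  # 440000
```

At the 20% and 45% ends of the overlap range, the naive 800K shrinks to 640K and 440K, which is the gap a brief built on summed follower counts would over-promise.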
For overlap estimation, the most defensible cheap method is:
- Sample the creator's TikTok followers via /user-followers/ (cursor through ~1,000 entries).
This will not give you Nielsen-grade numbers. It will give you a defensible directional estimate, which is what briefs actually need.
An agency planning a product launch wants 5M deduped impressions across 12 creators. Without the joined view, they pick creators based on TikTok alone and over-deliver on Instagram by 30% (wasted budget) or under-deliver (missed KPI). With the joined view, the brief is built from deduped_reach directly.
Map TikTok to top-of-funnel (entertainment, discovery) and Instagram Stories/Reels to mid-funnel (consideration). The cross_platform_creators table lets a planner say "this creator does both well, give them the full-funnel package" versus "TikTok only, pair with a different IG creator."
Tracking a competitor brand's TikTok via TikLiveAPI's /user-posts/ tells you half the story. Pairing it with their Instagram /media endpoint gives you the full publishing calendar. Plot post times together and you can often see their content team's working hours.
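One way to put both calendars on a single axis is to bucket posts by UTC hour. The sketch below assumes TikTok create_time values arrive as Unix epoch seconds and Instagram media timestamp values as ISO 8601 strings with a numeric offset such as +0000; verify both against real payloads. The function name is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable


def posting_hours_utc(
    tiktok_create_times: Iterable[int],
    instagram_timestamps: Iterable[str],
) -> Counter:
    """Count posts per UTC hour of day across both platforms."""
    hours: Counter = Counter()
    for epoch_seconds in tiktok_create_times:
        # ASSUMPTION: create_time is Unix epoch seconds.
        hours[datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).hour] += 1
    for stamp in instagram_timestamps:
        # ASSUMPTION: timestamp looks like 2023-11-14T14:45:00+0000.
        parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S%z")
        hours[parsed.astimezone(timezone.utc).hour] += 1
    return hours
```

Converting everything to UTC before bucketing avoids a phantom shift between platforms; convert to the competitor's local time zone only at the plotting step.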
Run two independent worker pools, one per platform. They share nothing except writes to raw tables. This isolates failure: a Meta rate-limit pause should never block TikTok ingestion. Use a job queue with per-platform concurrency limits.
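The isolation idea can be sketched with two thread pools standing in for a real job queue. The worker counts are placeholder values and the function names are hypothetical. What matters is that each platform has its own concurrency limit and that an exception on one side is collected instead of stopping the other.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable, List, Tuple

Job = Callable[[], Any]


def _collect(futures: List[Future]) -> Tuple[list, list]:
    """Wait for every future; keep results and errors in separate lists."""
    results, errors = [], []
    for future in futures:
        try:
            results.append(future.result())
        except Exception as exc:  # one failed job must not abort the batch
            errors.append(exc)
    return results, errors


def run_platform_pools(
    tiktok_jobs: Iterable[Job],
    instagram_jobs: Iterable[Job],
    tiktok_workers: int = 4,      # placeholder limit
    instagram_workers: int = 2,   # placeholder limit
) -> Dict[str, Tuple[list, list]]:
    """Run each platform's jobs in its own pool and report per platform."""
    with ThreadPoolExecutor(max_workers=tiktok_workers) as tt_pool:
        with ThreadPoolExecutor(max_workers=instagram_workers) as ig_pool:
            # Submit both sides before waiting so neither blocks the other.
            tt_futures = [tt_pool.submit(job) for job in tiktok_jobs]
            ig_futures = [ig_pool.submit(job) for job in instagram_jobs]
            return {
                "tiktok": _collect(tt_futures),
                "instagram": _collect(ig_futures),
            }
```

In production the same shape usually maps to two named queues with separate worker processes, so a Meta backoff sleeps only the Instagram workers.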
Every downstream consumer joins on creator_id, never on handle. This is the single rule that prevents the "marie" collision problem from leaking into reports.
Refresh follower counts and last_post_time once or twice a day per creator. Reports read from the snapshot. This keeps API costs predictable and reports fast.
When you discover a new creator, backfill their posts in pages using the cursor + hasMore contract from /user-posts/. Key each row by aweme_id so reruns are safe.
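A minimal backfill loop, assuming each page is a dictionary with cursor and hasMore fields as described above. The videos key for the post list is an assumption, as are the function names; fetch_page is a hypothetical callable that wraps the actual /user-posts/ request. The in-memory dictionary stands in for a table with a unique key on aweme_id.

```python
from typing import Callable, Dict, Optional


def backfill_posts(
    fetch_page: Callable[[Optional[str]], dict],
    store: Dict[str, dict],
    max_pages: int = 50,
) -> int:
    """Page through a creator's posts and upsert them by aweme_id.

    Returns the number of pages fetched.
    """
    cursor: Optional[str] = None
    pages = 0
    while pages < max_pages:  # hard cap so a bad cursor cannot loop forever
        page = fetch_page(cursor)
        pages += 1
        # ASSUMPTION: posts are listed under the "videos" key.
        for post in page.get("videos", []):
            store[post["aweme_id"]] = post  # same key on rerun, so no duplicates
        if not page.get("hasMore"):
            break
        cursor = page.get("cursor")
    return pages
```

Because writes are keyed by aweme_id, rerunning the backfill after a crash overwrites rows instead of duplicating them.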
This is the slide your legal team will ask for.
For an agency tracking 200 creators across both platforms with a daily refresh:
- 200 creators × (1 /userinfo-by-username/ call + 1 /user-posts/ page) per day = 400 calls/day = ~12,000 calls/month. Plus weekly follower sampling at 5 cursor pages each = ~4,000 more. Roughly 16K credits/month. Pricing on /pricing/.
Want the full endpoint catalog before you commit? Browse /documentation/ or read the rest of the /blog/ for adjacent patterns.
No. TikLiveAPI is TikTok-only. Every endpoint listed in the docs returns TikTok data. Instagram data must come from Meta's own APIs. This post is about combining the two cleanly, not about a single provider doing both.
For business and agency use, the Instagram Graph API for Business inside Meta Graph API is the right answer. The Instagram Basic Display API is creator-token-scoped and only useful for accounts whose owner authorizes you directly.
Directional. It is good enough to compare creators against each other and to prevent obvious double-counting in briefs. It is not good enough to publish as a hard percentage to a brand.
No. Bio-link cross-reference is the closest you get to certainty. Always keep a manual review path for ambiguous cases, and store who reviewed what.
The numeric id returned by /userid/ and present inside user.id on /userinfo-by-username/. Store secUid as well, since some endpoints prefer it.
Send your API key in the X-Api-Key header against https://api.tikliveapi.com. All endpoints are GET.
Reach the team via /contact/ with the request URL, timestamp, and the masked first few characters of your API key.