If your weekly creator intelligence report still has a TikTok tab and a Reddit tab that nobody joins together, you are not doing cross-platform analysis. You are doing two siloed reports stapled into one PDF. The creators that matter to your brand do not live on one platform. They workshop ideas in a niche subreddit at 9pm, post the polished cut to TikTok at 7am, and read the comments from both audiences over coffee. If your tooling cannot see that loop, your campaign briefs are guessing.
This post is about how to combine TikLiveAPI (for the TikTok side) with the Reddit API and the Pushshift archive (for the Reddit side) into a single normalized creator graph that growth teams and agencies can actually query. Before we go further, a clear honesty note: TikLiveAPI does not call Reddit. It is a focused TikTok provider. Anyone telling you a single vendor will cleanly cover both surfaces is selling you wishful thinking. The architecture below treats each platform as a separate provider behind a shared schema, which is the only pattern that survives contact with production.
Marketing ops engineers tend to learn this the hard way. A creator hits 2 million followers on TikTok and the dashboard says they are a slam-dunk for a sponsored post. You buy the slot, the video underperforms, and a junior analyst eventually figures out that the same creator has been roasted in three relevant subreddits for two months. The TikTok number was real. The audience signal was not.
The opposite is also common. A subreddit identifies a creator as a credible voice in, say, indie skincare. Reddit-side sentiment is glowing. But pull their TikTok and the follower count is 4,000 with stagnant growth, meaning their actual distribution potential is much smaller than the Reddit chatter suggests. Either view alone gives you a wrong campaign decision. Joined, they give you a defensible one.
Concretely, here are the cross-platform signals worth tracking once both sides are wired up:
Here is the topology, drawn in text so you can paste it into a design doc without an image dependency:
+----------------------+ +----------------------+
|  TikLiveAPI (GET)    | |  Reddit API (OAuth)  |
|  api.tikliveapi.com  | |  oauth.reddit.com    |
+----------+-----------+ +-----------+----------+
           |                         |
           | X-Api-Key               | Bearer token
           v                         v
+----------+-----------+ +-----------+----------+
|  tiktok_ingest.py    | |  reddit_ingest.py    |
|  (parallel polling)  | |  (parallel polling)  |
+----------+-----------+ +-----------+----------+
           |                         |
           | raw JSON                | raw JSON
           v                         v
 +---------------------------------------------+
 |          normalize + dedupe layer           |
 |    (canonical_creator_id, last_seen_at)     |
 +-----------------------+---------------------+
                         |
                         v
 +---------------------------------------------+
 |       cross_platform_creators (MySQL)       |
 |   + pushshift archive for historic Reddit   |
 +---------------------------------------------+
                         |
                         v
 +---------------------------------------------+
 |    BI / brief generator / alerting layer    |
 +---------------------------------------------+
Two things are worth calling out. First, TikLiveAPI is purely GET with an X-Api-Key header, so its ingest worker is trivially parallelizable and idempotent. Second, Reddit's API requires OAuth and enforces per-app, per-minute rate limits, so its worker needs a token cache and a queue. The two workers are not symmetric, and treating them as if they were is the most common implementation mistake.
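To make the asymmetry concrete, here is a minimal sketch of the two request paths. It only builds URLs and headers and never touches the network. The `https` scheme, the `username` query parameter name, and the `fetch_token` callback are assumptions for illustration, not confirmed details of either API, and the Reddit queue is left out entirely.

```python
import time
from typing import Callable, Dict, Optional, Tuple
from urllib.parse import urlencode

TIKTOK_BASE = "https://api.tikliveapi.com"  # host from the diagram; scheme assumed


def tiktok_request(path: str, params: Dict[str, str], api_key: str) -> Tuple[str, Dict[str, str]]:
    """Build (url, headers) for a stateless GET. No shared state, so calls parallelize freely."""
    return f"{TIKTOK_BASE}{path}?{urlencode(params)}", {"X-Api-Key": api_key}


class RedditTokenCache:
    """Holds one OAuth bearer token and refreshes it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],  # hypothetical: returns (token, lifetime_s)
        clock: Callable[[], float] = time.monotonic,
        safety_margin_s: float = 60.0,
    ) -> None:
        self._fetch_token = fetch_token
        self._clock = clock
        self._margin = safety_margin_s
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        # Refresh when there is no token yet or it is inside the safety margin.
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch_token()
            self._token = token
            self._expires_at = now + lifetime
        return self._token

    def headers(self) -> Dict[str, str]:
        return {"Authorization": f"Bearer {self.get()}"}
```

Injecting the clock and the token fetcher keeps the cache testable without real credentials; the TikTok side needs no equivalent because every request carries the same static key.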
This is the hard part. Reddit and TikTok have no shared identity layer, so you are stitching identities from breadcrumbs. In production I run a three-stage match with decreasing confidence:
If u/skintok_anna on Reddit matches @skintok_anna on TikTok and both bios reference each other, you have a near-certain match. Run this first because it is cheap and catches the obvious cases.
/userinfo-by-username/ returns a user.bioLink field for the external link in a creator's profile. If that link is a Linktree, Beacons, or personal site, scrape it and look for a Reddit URL. Conversely, if a Reddit user's profile or sidebar links to a TikTok handle, you have the back-edge. This stage catches the cases where handles differ but the creator publicly cross-references.
For TikTok-side enrichment, the two endpoints you will lean on hardest are user search (to resolve handles by keyword when Reddit gives you a partial reference) and user info by username (to fetch the canonical id, bio link, and follower stats). Both are GET, both take X-Api-Key, and both are covered in the full documentation.
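The username and bio-link stages described above can be sketched roughly as follows. The function names and the regular expression are illustrative assumptions. Handle equality on its own only yields a candidate; the mutual bio check that makes it near-certain is not shown here.

```python
import re
from typing import Optional

# Assumed pattern: Reddit profile URLs of the form reddit.com/user/<name> or
# reddit.com/u/<name>, with names made of letters, digits, "_" and "-".
_REDDIT_PROFILE = re.compile(
    r"(?:https?://)?(?:www\.|old\.)?reddit\.com/(?:user|u)/([A-Za-z0-9_-]{3,20})",
    re.IGNORECASE,
)


def normalize_handle(handle: str) -> str:
    """Drop a leading 'u/', '/u/' or '@' and lowercase the rest."""
    stripped = re.sub(r"^(?:/?u/|@)", "", handle.strip(), flags=re.IGNORECASE)
    return stripped.lower()


def usernames_match(reddit_handle: str, tiktok_handle: str) -> bool:
    """Stage 1: cheap handle comparison after normalization."""
    return normalize_handle(reddit_handle) == normalize_handle(tiktok_handle)


def find_reddit_handle(page_text: str) -> Optional[str]:
    """Stage 2: look for a Reddit profile URL in scraped bio-link page text."""
    match = _REDDIT_PROFILE.search(page_text)
    return match.group(1) if match else None
```

For example, `usernames_match("u/skintok_anna", "@skintok_anna")` is true, while `find_reddit_handle` returns `None` for a page that carries no Reddit profile link, which is the signal to fall through to the next stage.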
The whole joined view sits on top of a single normalized table. Here is the schema I have iterated to over a couple of years of running this kind of pipeline:
CREATE TABLE cross_platform_creators (
    canonical_creator_id    BIGINT UNSIGNED PRIMARY KEY AUTO_INCREMENT,

    -- TikTok side (from TikLiveAPI)
    tiktok_username         VARCHAR(64),
    tiktok_user_id          VARCHAR(32),
    tiktok_followers        INT UNSIGNED,
    tiktok_heart_count      BIGINT UNSIGNED,
    tiktok_verified         TINYINT(1),
    tiktok_bio_link         VARCHAR(512),
    tiktok_last_post_at     DATETIME,

    -- Reddit side (from Reddit API + Pushshift)
    reddit_handle           VARCHAR(64),
    reddit_account_created  DATETIME,
    reddit_link_karma       INT,
    reddit_comment_karma    INT,
    reddit_top_subreddits   JSON,
    reddit_last_post_at     DATETIME,

    -- Join metadata
    match_method            ENUM('username','bio_link','manual') NOT NULL,
    match_confidence        DECIMAL(3,2) NOT NULL,
    match_reviewed_by       VARCHAR(64),
    first_seen_at           DATETIME NOT NULL,
    last_refreshed_at       DATETIME NOT NULL,

    UNIQUE KEY uniq_tt (tiktok_user_id),
    UNIQUE KEY uniq_rd (reddit_handle),
    KEY idx_refresh (last_refreshed_at)
);
A few design choices that are not obvious. tiktok_user_id rather than tiktok_username is the unique key because usernames are mutable on TikTok but the numeric id is stable. reddit_top_subreddits is JSON because a flat top-5 list saves a thousand JOINs in the BI layer. match_confidence is decimal so you can filter strict (>= 0.9) for paid campaign decisions and loose (>= 0.6) for exploratory discovery work.
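As a quick illustration of the strict versus loose confidence filter, here is a sketch that uses an in-memory SQLite table as a stand-in for the MySQL table above. Only a handful of columns are kept, the column types are simplified, and the three sample rows are invented for the example.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE cross_platform_creators (
        canonical_creator_id INTEGER PRIMARY KEY,
        tiktok_user_id       TEXT UNIQUE,
        reddit_handle        TEXT UNIQUE,
        match_method         TEXT NOT NULL,
        match_confidence     REAL NOT NULL
    )
    """
)
# Made-up rows, one per match method.
conn.executemany(
    "INSERT INTO cross_platform_creators "
    "(tiktok_user_id, reddit_handle, match_method, match_confidence) "
    "VALUES (?, ?, ?, ?)",
    [
        ("1001", "skintok_anna", "username", 0.95),
        ("1002", "derm_dan", "bio_link", 0.70),
        ("1003", "maybe_mia", "manual", 0.40),
    ],
)


def creators_at_or_above(threshold: float) -> List[str]:
    """Return Reddit handles whose match_confidence meets the threshold."""
    rows = conn.execute(
        "SELECT reddit_handle FROM cross_platform_creators "
        "WHERE match_confidence >= ? ORDER BY canonical_creator_id",
        (threshold,),
    )
    return [row[0] for row in rows]


strict = creators_at_or_above(0.9)  # paid campaign decisions
loose = creators_at_or_above(0.6)   # exploratory discovery
```

With these sample rows, `strict` contains only `skintok_anna`, while `loose` also picks up `derm_dan`.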
Once the table is populated, the headline numbers your CMO will ask for are combined reach and audience overlap. Combined reach is easy and dangerous. Overlap is hard and honest.
For combined reach you simply pull the TikTok followers from TikLiveAPI's user-info response (the stats.followerCount field) and add a reasonable Reddit reach proxy. Reddit does not expose follower counts as a primary metric, so the cleanest proxy is the median upvote count of the creator's last 30 posts in their primary subreddits. It is not followers, but it does correlate with "people who will actually see a post" far better than Reddit's profile follower count, which is mostly noise.
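A minimal sketch of that calculation, assuming the upvote counts arrive newest first and that the caller has already extracted `stats.followerCount` from the user-info response. The function names are illustrative.

```python
from statistics import median
from typing import Sequence


def reddit_reach_proxy(post_upvotes: Sequence[int], window: int = 30) -> float:
    """Median upvotes over the most recent `window` posts (input is newest first)."""
    recent = list(post_upvotes)[:window]
    return float(median(recent)) if recent else 0.0


def combined_reach(tiktok_follower_count: int, post_upvotes: Sequence[int]) -> float:
    """TikTok followers plus the Reddit median-upvote proxy. The units differ, so read with care."""
    return tiktok_follower_count + reddit_reach_proxy(post_upvotes)
```

The median rather than the mean keeps one viral post from inflating the proxy, which is part of why the combined number should still be presented as a rough indicator rather than an audience count.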
Audience overlap cannot be measured precisely without first-party data, and you should tell your stakeholders that. What you can do is build a defensible estimate. Pull the creator's top 200 TikTok commenters via the user-posts endpoint plus per-video comments, pull the creator's top 200 Reddit commenters in their flagship subreddits, and check for handle overlap. The overlap rate scaled to total audience gives you a usable lower-bound estimate. Anyone who tells you they have an exact overlap number across these two platforms is either lying or has bought a panel-based estimate from a third party.
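One way the handle-overlap estimate might be computed is sketched below. Two choices here are assumptions rather than anything the method prescribes: the overlap rate is taken relative to the smaller of the two commenter samples, and it is scaled against the TikTok follower count.

```python
import re
from typing import Iterable, Set


def _normalize(handles: Iterable[str]) -> Set[str]:
    """Lowercase handles and drop 'u/', '/u/' or '@' prefixes so both sides compare equal."""
    cleaned = (re.sub(r"^(?:/?u/|@)", "", h.strip(), flags=re.IGNORECASE).lower() for h in handles)
    return {h for h in cleaned if h}


def overlap_rate(tiktok_commenters: Iterable[str], reddit_commenters: Iterable[str]) -> float:
    """Share of the smaller commenter sample whose handle also appears on the other platform."""
    tt, rd = _normalize(tiktok_commenters), _normalize(reddit_commenters)
    smaller = min(len(tt), len(rd))
    return len(tt & rd) / smaller if smaller else 0.0


def estimated_shared_audience(rate: float, tiktok_followers: int) -> int:
    """Scale the sample overlap rate to the follower count for a rough lower bound."""
    return int(rate * tiktok_followers)
```

Because people who use different handles on each platform are invisible to this check, the result undercounts, which is why it is only defensible as a lower bound.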
Three concrete things this joined table unlocks that neither side could do alone:
Instead of a brief that says "TikTok creators with 500k+ followers in skincare", the brief becomes "creators whose TikTok followers exceed 500k AND who are active commenters in r/SkincareAddiction with positive karma trends in the last 90 days". That second filter cuts your shortlist by 80% and the remaining 20% convert at materially higher rates because they are vouched for by the community your campaign is trying to reach.
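As a rough illustration, the tighter brief could be expressed as a filter over joined rows. The `CreatorRow` class is hypothetical: the karma snapshot from 90 days ago is not a column in the schema above, and "active commenter" is approximated here by the subreddit appearing in `reddit_top_subreddits`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CreatorRow:
    tiktok_username: str
    tiktok_followers: int
    reddit_top_subreddits: List[str] = field(default_factory=list)
    comment_karma_now: int = 0
    comment_karma_90d_ago: int = 0  # hypothetical snapshot, not in the schema


def matches_brief(
    row: CreatorRow,
    min_followers: int = 500_000,
    subreddit: str = "SkincareAddiction",
) -> bool:
    """True when the creator clears the follower bar, is present in the subreddit, and has rising karma."""
    in_subreddit = subreddit.lower() in (s.lower() for s in row.reddit_top_subreddits)
    karma_rising = row.comment_karma_now > row.comment_karma_90d_ago
    return row.tiktok_followers > min_followers and in_subreddit and karma_rising
```

A row with 600,000 followers, `SkincareAddiction` among its top subreddits, and karma that rose over the window passes; failing any one of the three conditions drops it from the shortlist.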
Reddit is famously good at top-of-funnel education and TikTok is famously good at mid-funnel awareness. If you can identify creators who already operate on both surfaces with credibility, you can run a single campaign where the same person seeds a long-form post on Reddit and a short-form video on TikTok within the same 72-hour window. The joined table lets you find these dual-presence creators in seconds instead of weeks of manual sleuthing.
Tracking a competitor's TikTok presence misses half the picture if their growth playbook is built on Reddit seeding. By polling both sides daily and watching for new creator partnerships (a creator whose tiktok_bio_link suddenly points at a competitor's promo URL, or whose Reddit posts start referencing the competitor's product), you get a 7 to 14 day lead on competitive campaigns that would otherwise surface only when the TikTok ad library updates.
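The bio-link half of that signal can be sketched as a comparison between two daily snapshots. The function names and the competitor domain list are illustrative assumptions, and the Reddit-mention half is not covered.

```python
from typing import Iterable, Optional
from urllib.parse import urlparse


def _host(url: Optional[str]) -> str:
    """Lowercased hostname without a leading 'www.'; empty string when there is no URL."""
    if not url:
        return ""
    parsed = urlparse(url if "://" in url else f"https://{url}")
    host = (parsed.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def points_at(url: Optional[str], domains: Iterable[str]) -> bool:
    """True when the URL's host is one of the domains or a subdomain of one."""
    host = _host(url)
    return any(host == d or host.endswith("." + d) for d in (x.lower() for x in domains))


def new_competitor_link(
    previous_bio_link: Optional[str],
    current_bio_link: Optional[str],
    competitor_domains: Iterable[str],
) -> bool:
    """True when today's bio link points at a competitor and yesterday's did not."""
    domains = tuple(competitor_domains)
    return points_at(current_bio_link, domains) and not points_at(previous_bio_link, domains)
```

Matching on the exact host or a dot-prefixed suffix avoids false positives from unrelated domains that merely end in the same characters.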
A few things I have learned the expensive way about running this in production:
If you join on tiktok_user_id, you lose the creators you have only matched on the Reddit side so far. Always join on the canonical id.
Re-verify rows with match_confidence < 0.85 and re-check the bio_link both ways. Creators add and remove cross-links constantly and your match confidence should reflect that.
This part is non-optional. Read it before you ship.
TikLiveAPI. Use of the API is governed by the TikLiveAPI terms. The data returned is what is publicly visible on TikTok profiles and posts. You are responsible for honoring takedown and deletion semantics in your own store. If a creator deletes a video on TikTok, you should treat your cached copy as stale and refresh it. Do not redistribute raw API responses to third parties; build derived analytics products instead.
Reddit API. Reddit's developer terms have tightened materially in the last two years. Free-tier rate limits are tight, commercial use requires a paid agreement, and user content is subject to user deletion requests that you must honor in any downstream store. If a Reddit user deletes their account, your reddit_handle row should be tombstoned, not silently kept.
Pushshift. Pushshift's access model has changed several times. As of writing, full access is restricted to verified moderators and researchers in many regions. Confirm your access tier before you build dependencies on it, and have a plan for the day the archive endpoint you rely on disappears. It has happened before.
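For the Reddit deletion case above, tombstoning might look something like the sketch below. It assumes a `reddit_tombstoned_at` column that does not exist in the schema shown earlier, and it assumes that clearing the handle along with the Reddit-derived fields is what your agreement requires, so check that against the terms you actually operate under.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Reddit-derived columns from the schema that should not outlive the account.
REDDIT_DATA_COLUMNS = (
    "reddit_account_created",
    "reddit_link_karma",
    "reddit_comment_karma",
    "reddit_top_subreddits",
    "reddit_last_post_at",
)


def tombstone_reddit_side(row: Dict[str, Any], now: Optional[datetime] = None) -> Dict[str, Any]:
    """Return a copy of the row with Reddit data cleared and a tombstone timestamp set."""
    stamped = dict(row)  # leave the caller's row untouched
    stamped["reddit_handle"] = None
    for column in REDDIT_DATA_COLUMNS:
        stamped[column] = None
    # Hypothetical column: records that the Reddit side was removed on purpose.
    stamped["reddit_tombstoned_at"] = now or datetime.now(timezone.utc)
    return stamped
```

Keeping the row with an explicit marker, rather than deleting it, preserves the TikTok side and the canonical id while making it obvious downstream that the Reddit data was removed deliberately.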
For a mid-size agency tracking around 5,000 dual-presence creators with daily refresh:
No. TikLiveAPI is a focused TikTok provider. This post is about combining it cleanly with a separate Reddit provider. Anyone who needs both surfaces should plan for two integrations behind a shared schema, which is the approach described above.
You can, and several exist. The tradeoff is that they tend to charge a steep premium for breadth and they update individual platforms more slowly than focused providers. If TikTok freshness matters to you, a focused TikTok provider plus a focused Reddit integration almost always beats a generalist vendor on both data quality and cost.
Lean on the bio_link match path and the manual review queue. Many creators deliberately use different handles per platform for audience segmentation, and the only reliable way to stitch them is a public cross-reference somewhere (their personal site, a Linktree, a podcast feed). If no public cross-reference exists, treat them as two separate creators. Inventing a join you cannot defend is worse than missing one.
No. Anyone claiming exact cross-platform audience overlap without first-party login data is selling you a model output. Build a defensible estimate using shared commenter handles, present it with a confidence interval, and move on.
Build your ingest layer so that swapping the Reddit provider is a single-file change. The whole point of the normalized creator dimension table is that the rest of your stack does not care where the Reddit data came from. This is the same discipline that makes the TikLiveAPI integration resilient to your own internal refactors.
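One possible shape for that seam is a small provider interface that the rest of the pipeline depends on. Everything here (`RedditProvider`, `RedditProfile`, `InMemoryRedditProvider`, `refresh_reddit_side`) is a hypothetical name for illustration; a real provider would wrap the Reddit API or an archive behind the same method.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Optional, Protocol


@dataclass
class RedditProfile:
    handle: str
    link_karma: int
    comment_karma: int
    top_subreddits: List[str]


class RedditProvider(Protocol):
    """The only surface the rest of the pipeline is allowed to depend on."""

    def fetch_profile(self, handle: str) -> Optional[RedditProfile]: ...


class InMemoryRedditProvider:
    """Stand-in provider for tests; a production one would live in its own file."""

    def __init__(self, profiles: Iterable[RedditProfile]) -> None:
        self._profiles = {p.handle.lower(): p for p in profiles}

    def fetch_profile(self, handle: str) -> Optional[RedditProfile]:
        return self._profiles.get(handle.lower())


def refresh_reddit_side(provider: RedditProvider, handle: str) -> Dict[str, Any]:
    """Map a provider result onto Reddit-side column names; empty dict if not found."""
    profile = provider.fetch_profile(handle)
    if profile is None:
        return {}
    return {
        "reddit_handle": profile.handle,
        "reddit_link_karma": profile.link_karma,
        "reddit_comment_karma": profile.comment_karma,
        "reddit_top_subreddits": profile.top_subreddits,
    }
```

Because `refresh_reddit_side` only sees the protocol, replacing the data source means writing one new class with a `fetch_profile` method and changing which instance is passed in.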
Start with the TikTok side because it is the simpler integration. Pull a sample of 100 creators through /userinfo-by-username/ and /search-user/, get the canonical id and bio_link into your store, then layer the Reddit side on top. Read the full documentation for the endpoint catalog, browse other patterns in the blog, and if you want to discuss a specific agency or in-house build, the contact page is the fastest path to a real conversation.