How to Detect Fake TikTok Followers Programmatically

By TikLiveAPI Team · Published on May 29, 2026

Influencer marketing is projected to top $30B in 2026, yet industry studies suggest 15-30% of followers on top-tier creator accounts are bots, inactive shells, or purchased growth. For agencies vetting rosters and brand managers signing six-figure deals, that gap between "followers" and "humans who could plausibly buy something" is where ROI quietly bleeds out. A creator with 500K followers and a 0.3% engagement rate is not a deal - it is a refund request waiting to be written.

The good news: most fake-follower patterns are mechanically detectable. You do not need a forensic team or a private TikTok data partnership. You need a paginated follower endpoint, a few statistical thresholds, and a scoring function. This guide walks through a production-ready pipeline using TikLiveAPI - pulling follower lists, computing audience "alive rate", cross-checking engagement against the displayed reach, and combining everything into a single 0-100 audience-quality score you can drop into your vetting spreadsheet.

The signals that give bots away

Before writing code, it helps to know what we are hunting for. Inauthentic followers fall into four buckets, and each leaves a measurable fingerprint:

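Engagement-to-reach gap. Genuine accounts engage in proportion to their size: by the tier benchmarks in Step 4, even a 1M+ follower account averages 1-3% engagement on recent posts. When likes-plus-comments per post divided by followers falls well below the tier's range (under ~0.5% for a 1M+ account, under ~1% for a mid-tier one) on a non-niche creator, the most common explanation is that the follower count is inflated.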
Follower-to-following ratio anomalies. Real humans tend to keep the two numbers within the same order of magnitude, even when they follow more accounts than follow them back. Bot farms push the ratio to an extreme: shell accounts follow thousands of creators (paid placements) and are themselves followed by almost no one. A follower whose own follower_count is in the single digits while following_count is in the thousands is a structural giveaway.
Audience activity patterns. Real audiences post, comment, and react. Bot followers usually have aweme_count (videos posted) of 0, no avatar, and a default-looking username (letters plus a long numeric tail).
Account characteristics. Empty signature (bio), missing avatar, no linked accounts (ins_id, youtube_channel_id, twitter_id all empty), and zero total_favorited. None of these is damning on its own, but together they form a high-confidence "shell account" signal.

Our job is to turn these signals into numbers, sample them across the follower list, and produce a defensible score.
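
As a minimal sketch of how the "account characteristics" bucket could be turned into a number, the snippet below counts how many shell-account traits a single follower record shows. The field names come from the follower response described in this guide; the username regex (letters followed by six or more digits), the helper names, and the cutoff of four traits are illustrative assumptions, not calibrated values.

```python
import re

# Assumption: "letters plus a long numeric tail" is read as letters then 6+ digits.
DEFAULT_NAME = re.compile(r"^[a-z_.]+\d{6,}$", re.IGNORECASE)
AVATAR_FIELDS = ("avatar", "avatar_thumb", "avatar_medium")
LINK_FIELDS = ("ins_id", "youtube_channel_id", "twitter_id")


def shell_traits(f: dict) -> list[str]:
    """Return the names of the shell-account traits present on one follower."""
    traits = []
    if not (f.get("signature") or "").strip():
        traits.append("empty_bio")
    if not any(f.get(k) for k in AVATAR_FIELDS):
        traits.append("no_avatar")
    if not any(f.get(k) for k in LINK_FIELDS):
        traits.append("no_linked_accounts")
    if (f.get("total_favorited") or 0) == 0:
        traits.append("zero_likes_received")
    if DEFAULT_NAME.match(f.get("unique_id") or ""):
        traits.append("default_username")
    return traits


def looks_like_shell(f: dict, min_traits: int = 4) -> bool:
    # No single trait is damning, so several must co-occur before flagging.
    return len(shell_traits(f)) >= min_traits


# Two hand-made records for illustration only.
bot = {"unique_id": "user8273645190", "signature": "", "total_favorited": 0}
fan = {"unique_id": "maria.knits", "signature": "yarn + cats", "avatar": "a.jpg",
       "ins_id": "maria.knits", "total_favorited": 1280}
print(shell_traits(bot))
print(looks_like_shell(bot), looks_like_shell(fan))
```

Returning the trait names rather than a bare boolean keeps the flag explainable when a creator asks why part of their audience was marked as suspicious.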

Step 1: Pull the follower list

Everything downstream depends on a representative sample of followers. The user-followers endpoint returns up to 200 followers per page and paginates with a timestamp cursor passed as the time param (not the cursor param other endpoints use - this is a quirk worth flagging in your codebase).

The response shape is a flat followers array. Each entry uses snake_case keys: sec_uid, unique_id, aweme_count, following_count, follower_count, favoriting_count, total_favorited, ins_id, youtube_channel_id, twitter_id. No user/stats nesting here - it is one flat object per follower, which makes batch analysis easier than working with the userinfo response shape.

Before sampling, you need the target's user ID, which comes from the user-id endpoint (username in, single id field out).

import os
import time
import requests

BASE_URL = "https://api.tikliveapi.com"
HEADERS = {"X-Api-Key": os.environ["TIKLIVE_API_KEY"]}

def resolve_user_id(username: str) -> str:
    r = requests.get(
        f"{BASE_URL}/userid/",
        params={"username": username.lstrip("@")},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["id"]

def fetch_followers(userid: str, sample_size: int = 1000) -> list[dict]:
    followers, cursor = [], None
    while len(followers) < sample_size:
        params = {"userid": userid, "count": 200}
        if cursor is not None:
            params["time"] = cursor
        r = requests.get(
            f"{BASE_URL}/user-followers/",
            params=params,
            headers=HEADERS,
            timeout=30,
        )
        r.raise_for_status()
        data = r.json()
        page = data.get("followers", [])
        if not page:
            break
        followers.extend(page)
        # Pagination: the response's 'time' value (the follow timestamp of the
        # last returned entry) becomes the next request's 'time' cursor.
        cursor = data.get("time")
        if cursor is None:
            break
        time.sleep(0.3)  # stay under the 200 req/min rate limit
    return followers[:sample_size]

A 1,000-follower sample is enough for stable percentages on accounts up to a few million followers. For mega-accounts (10M+), 2,000-3,000 gives tighter confidence intervals at the cost of more credits - and since 1 request equals 1 credit on TikLiveAPI, you can budget exactly: a 2,000-follower sample is 10 credits at 200 per page.

Step 2: Compute audience "alive rate"

The "alive rate" is the share of sampled followers who look like real, active users. We define an alive follower as one who has posted at least one video, has set an avatar, and has a non-empty signature (bio). All three conditions check independent signals - posting indicates active participation, avatar indicates effort to personalize, signature indicates the account exists for self-expression rather than just inflating someone else's count.

def is_alive(f: dict) -> bool:
    posted = (f.get("aweme_count") or 0) > 0
    has_avatar = bool(f.get("avatar") or f.get("avatar_thumb") or f.get("avatar_medium"))
    has_bio = bool((f.get("signature") or "").strip())
    return posted and has_avatar and has_bio

def alive_rate(followers: list[dict]) -> float:
    if not followers:
        return 0.0
    return sum(1 for f in followers if is_alive(f)) / len(followers)

Healthy benchmarks from manual audits across organic creators: alive rate above 0.70 is normal, 0.55-0.70 is acceptable for niche or new accounts, below 0.45 is a strong inauthenticity signal. Beauty and fashion creators tend to skew higher (lots of active fans), while gaming and finance skew slightly lower because many of their followers are lurker accounts that consume but rarely post.
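
The benchmark bands can be captured in a small lookup so that reports use the same wording every time. This is a sketch only: the function name is hypothetical, and the 0.45-0.55 range, which the benchmarks do not name, is labeled "borderline" here as an assumption.

```python
def alive_band(rate: float) -> str:
    """Map an alive rate (0.0-1.0) to the benchmark bands quoted above."""
    if rate > 0.70:
        return "normal"
    if rate >= 0.55:
        return "acceptable for niche or new accounts"
    if rate >= 0.45:
        # Assumption: this range is treated as a gray zone that needs a closer look.
        return "borderline"
    return "strong inauthenticity signal"


for r in (0.82, 0.60, 0.50, 0.31):
    print(r, "->", alive_band(r))
```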

Step 3: Follower-to-following ratio per follower

For each sampled follower, compute their own follower_count divided by following_count. Real users typically score between 0.2 and 5. Bots score below 0.05 because they follow thousands and are followed by no one.

def follower_ratio(f: dict) -> float:
    following = max(f.get("following_count") or 0, 1)
    return (f.get("follower_count") or 0) / following

def suspicious_ratio_share(followers: list[dict], threshold: float = 0.05) -> float:
    if not followers:
        return 0.0
    bots = sum(1 for f in followers if follower_ratio(f) < threshold
               and (f.get("following_count") or 0) > 50)
    return bots / len(followers)

The extra guard following_count > 50 avoids flagging brand-new real users (who might have 0 followers and 8 followings, a ratio of 0) as bots. We only count accounts that have aggressively followed others - the classic bot-farm pattern.
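
To see the guard at work, the sketch below applies the same test to three hand-made follower records. follower_ratio is repeated so the snippet runs on its own; is_suspicious is a hypothetical single-follower version of the check inside suspicious_ratio_share, and the sample numbers are invented for illustration.

```python
def follower_ratio(f: dict) -> float:
    following = max(f.get("following_count") or 0, 1)
    return (f.get("follower_count") or 0) / following


def is_suspicious(f: dict, threshold: float = 0.05, min_following: int = 50) -> bool:
    # Low ratio AND aggressive following: both conditions must hold.
    return (follower_ratio(f) < threshold
            and (f.get("following_count") or 0) > min_following)


samples = {
    "new_user": {"follower_count": 0, "following_count": 8},        # ratio 0, spared by the guard
    "casual_fan": {"follower_count": 120, "following_count": 400},  # ratio 0.3
    "farm_shell": {"follower_count": 3, "following_count": 4200},   # ratio far below 0.05
}
for name, f in samples.items():
    print(name, round(follower_ratio(f), 4), is_suspicious(f))
```

Only farm_shell is flagged: new_user has a ratio of zero but follows too few accounts to match the bot-farm pattern.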

Step 4: Engagement rate cross-check

Audience-level signals are powerful, but a sophisticated buyer can purchase higher-quality "aged" bots that pass the alive check. The independent cross-check is the creator's own engagement rate. Pull recent posts via the user-posts endpoint, sum likes and comments, average them per post, and divide by the follower count.

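The video response uses snake_case: digg_count (likes), comment_count, share_count, play_count. Pinned videos and sponsored posts (is_ad true) distort the average, so both should stay out of the engagement baseline. The code below filters on is_ad only; drop pinned videos the same way if the response exposes a flag for them.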

def fetch_recent_posts(userid: str, count: int = 30) -> list[dict]:
    r = requests.get(
        f"{BASE_URL}/user-posts/",
        params={"userid": userid, "count": count},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    return r.json().get("videos", [])

def engagement_rate(userid: str, follower_count: int) -> float:
    posts = [p for p in fetch_recent_posts(userid, 30) if not p.get("is_ad")]
    if not posts or follower_count <= 0:
        return 0.0
    totals = sum((p.get("digg_count") or 0) + (p.get("comment_count") or 0)
                 for p in posts)
    return totals / (len(posts) * follower_count)

Industry benchmarks vary by niche and follower tier. As a starting point: nano (under 10K) creators average 7-12%, micro (10K-100K) average 4-7%, mid-tier (100K-1M) average 2-5%, mega (1M+) average 1-3%. Anything below 0.5% on a mega account or below 1% on a mid-tier is a red flag worth pairing with the audience-side signals. For a deeper treatment, see our earlier post on calculating the real influencer engagement score.
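
The tier benchmarks above can be kept in one table so the red-flag check stays consistent across a roster. The sketch below is an assumption-laden illustration: the names TIERS, tier_for, and er_red_flag are hypothetical, and the "below half the tier's lower bound" rule generalizes the two cutoffs quoted above (0.5% for mega, 1% for mid-tier) to the smaller tiers.

```python
# (exclusive upper follower bound, tier name, low ER, high ER)
TIERS = [
    (10_000, "nano", 0.07, 0.12),
    (100_000, "micro", 0.04, 0.07),
    (1_000_000, "mid-tier", 0.02, 0.05),
    (float("inf"), "mega", 0.01, 0.03),
]


def tier_for(follower_count: int) -> tuple[str, float, float]:
    """Return (tier name, low ER, high ER) for a follower count."""
    for upper, name, low, high in TIERS:
        if follower_count < upper:
            return name, low, high
    return TIERS[-1][1:]  # not reached: the last bound is infinite


def er_red_flag(follower_count: int, er: float) -> bool:
    # Assumption: flag anything below half the tier's lower bound.
    _, low, _ = tier_for(follower_count)
    return er < low / 2


print(tier_for(50_000))
print(er_red_flag(2_000_000, 0.004))  # mega account at 0.4%
print(er_red_flag(500_000, 0.015))    # mid-tier account at 1.5%
```

Treat the table as a starting point and tune the bounds per vertical, since niche effects can shift an entire tier up or down.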

Step 5: Combine into an audience quality score

None of the four signals is reliable on its own. A creator might score poorly on alive rate because their audience is older and just consumes, or score poorly on engagement because TikTok throttled them this week. Combined, they triangulate. The scoring function below weights alive rate at 35%, suspicious-ratio share (inverted) at 25%, engagement rate against a niche benchmark at 30%, and signature/bio diversity at 10%.

def audience_quality_score(
    followers: list[dict],
    er: float,
    niche_er_benchmark: float = 0.04,
) -> dict:
    alive = alive_rate(followers)
    bot_share = suspicious_ratio_share(followers)
    er_score = min(er / niche_er_benchmark, 1.5) / 1.5  # 0-1 scale; full credit at 1.5x the benchmark
    # Signature diversity: share of unique non-empty bios in the sample.
    bios = [(f.get("signature") or "").strip() for f in followers]
    nonempty = [b for b in bios if b]
    diversity = (len(set(nonempty)) / len(nonempty)) if nonempty else 0.0

    score = (
        alive * 35
        + (1 - bot_share) * 25
        + er_score * 30
        + diversity * 10
    )
    return {
        "score": round(score, 1),
        "alive_rate": round(alive, 3),
        "bot_follower_share": round(bot_share, 3),
        "engagement_rate": round(er, 4),
        "bio_diversity": round(diversity, 3),
    }
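
A worked example makes the weighting easier to sanity-check. The sketch below mirrors the arithmetic of audience_quality_score but takes precomputed inputs; weighted_score is a hypothetical helper and the input values are invented for illustration.

```python
def weighted_score(alive: float, bot_share: float, er: float,
                   diversity: float, niche_er_benchmark: float = 0.04) -> float:
    # Same weights as audience_quality_score: 35 / 25 / 30 / 10.
    er_score = min(er / niche_er_benchmark, 1.5) / 1.5
    total = alive * 35 + (1 - bot_share) * 25 + er_score * 30 + diversity * 10
    return round(total, 1)


# Engagement exactly at the benchmark: 25.2 + 23.0 + 20.0 + 9.0
at_benchmark = weighted_score(alive=0.72, bot_share=0.08, er=0.04, diversity=0.90)
# Engagement at 1.5x the benchmark: the engagement term reaches its full 30 points
above_benchmark = weighted_score(alive=0.72, bot_share=0.08, er=0.06, diversity=0.90)
print(at_benchmark, above_benchmark)
```

Note the design consequence: a creator who merely meets the niche benchmark earns 20 of the 30 engagement points, so a perfect 100 requires engagement at least 1.5 times the benchmark.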

The "bio diversity" signal catches a specific bot pattern: large batches of fake accounts often share identical or near-identical signatures because they were created from the same template. If 400 followers in your 1,000-sample share the bio "Living my best life", that is not coincidence. A diversity score below 0.6 on the non-empty bio subset is suspicious.
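
The templated-bio scenario can be checked with a few lines. The sketch below uses a synthetic sample (400 identical bios plus 600 distinct ones, all invented) and a hypothetical helper, bio_diversity, that also reports the most common bio.

```python
from collections import Counter


def bio_diversity(bios: list[str]) -> tuple[float, str, int]:
    """Return (diversity, most common bio, its count) over the non-empty bios."""
    nonempty = [b.strip() for b in bios if b and b.strip()]
    if not nonempty:
        return 0.0, "", 0
    top_bio, top_count = Counter(nonempty).most_common(1)[0]
    return len(set(nonempty)) / len(nonempty), top_bio, top_count


# Synthetic sample: one templated bio repeated 400 times, 600 unique bios.
bios = ["Living my best life"] * 400 + [f"bio #{i}" for i in range(600)]
diversity, top_bio, top_count = bio_diversity(bios)
print(round(diversity, 3), repr(top_bio), top_count)
```

In this synthetic case the diversity comes out at 0.601, just above the 0.6 cutoff, because every remaining bio is unique. Reporting the count of the most common bio alongside the ratio is one way to catch such borderline samples.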

The full pipeline ties together with one driver function:

def audit_creator(username: str, niche_er: float = 0.04) -> dict:
    userid = resolve_user_id(username)
    followers = fetch_followers(userid, sample_size=1000)
    # The follower_count of the target itself comes from /userinfo-by-username/.
    r = requests.get(
        f"{BASE_URL}/userinfo-by-username/",
        params={"username": username.lstrip("@")},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    body = r.json()
    target_followers = body.get("stats", {}).get("followerCount", 0)
    er = engagement_rate(userid, target_followers)
    return {
        "username": username,
        "followers_declared": target_followers,
        **audience_quality_score(followers, er, niche_er),
    }

Note the mixed-casing quirk: the userinfo-by-username endpoint returns stats.followerCount in camelCase, while the followers endpoint returns follower_count in snake_case on each entry. Do not assume uniform casing - this is the single most common bug we see in scraper code reviews. Test in the playground before committing.
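
One way to contain the casing quirk is to read the count through a single accessor. The sketch below assumes only the two shapes described in this guide (stats.followerCount on the profile response, follower_count on each follower entry); the helper name is hypothetical.

```python
def follower_count_of(record: dict) -> int:
    """Read a follower count from either response shape, defaulting to 0."""
    stats = record.get("stats") or {}
    for value in (stats.get("followerCount"), record.get("follower_count")):
        if value is not None:
            return int(value)
    return 0


profile = {"stats": {"followerCount": 512000}}            # userinfo-style shape
follower = {"unique_id": "abc", "follower_count": 42}     # follower-entry shape
print(follower_count_of(profile), follower_count_of(follower), follower_count_of({}))
```

Keeping the lookup in one place means a future casing change touches one function instead of every call site.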

If you want to push the scoring further, our guide to building a composite TikTok audience quality score extends this weighting approach with additional signals.

Step 6: Scoring a roster of 20 influencers

Here is how the audit runs across a sample fashion-vertical roster (the distribution described afterwards is fabricated for illustration but reflective of real-world distributions):

roster = ["@creator_a", "@creator_b", "@creator_t"]  # placeholders; list all 20 handles here
results = sorted(
    (audit_creator(u, niche_er=0.05) for u in roster),
    key=lambda r: r["score"],
    reverse=True,
)
for r in results:
    print(f"{r['username']:<20} score={r['score']:>5}  "
          f"alive={r['alive_rate']}  bots={r['bot_follower_share']}  "
          f"ER={r['engagement_rate']}")

Typical distribution: 4-5 creators score above 75 (clear approve), 8-10 score 50-75 (negotiate or request a media kit), 5-7 score below 50 (decline or request fresh audience data). The creators who collapse hardest are usually those who bought a one-shot growth boost 18 months ago and now carry a permanent ballast of inactive accounts. The score will not recover unless TikTok itself purges those followers, which it does infrequently. And if you are still assembling the roster, our micro-influencer discovery workflow shows how to surface 10K-100K creators worth auditing in the first place.
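
The three bands translate directly into a triage helper. The sketch below uses invented scores in the shape returned by audit_creator; the function name and the bucket labels are hypothetical.

```python
from collections import defaultdict


def decision(score: float) -> str:
    """Map an audience-quality score to the three bands described above."""
    if score > 75:
        return "approve"
    if score >= 50:
        return "negotiate"
    return "decline"


# Hypothetical audit results, for illustration only.
results = [
    {"username": "@creator_a", "score": 81.4},
    {"username": "@creator_b", "score": 63.0},
    {"username": "@creator_c", "score": 41.7},
]
buckets = defaultdict(list)
for r in results:
    buckets[decision(r["score"])].append(r["username"])
print(dict(buckets))
```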

Limitations and ethical considerations

A few honest caveats before you bake this into procurement:

Sampling bias. The followers endpoint returns followers in reverse-chronological order. If a creator bought followers two years ago and has grown organically since, a sample drawn only from the most recent followers may look clean. Mitigate by paginating deeper (5-10 pages) or sampling at random timestamps via the time param.
Niche skew. A meditation creator may have low engagement because their audience watches silently. A finance creator may have low alive rate because professionals lurk. Tune niche_er_benchmark per vertical rather than applying one global threshold.
Geographic skew. Some regions have lower bio-completion rates culturally. If most of an audience is in a single non-English market, recalibrate the alive thresholds.
Ethical use. This pipeline reads only publicly visible data (TikLiveAPI never requires the target's password or login). Still, treat scores as one input alongside content audits, sales conversation history, and pilot campaigns. Do not publish raw scores naming individual creators without giving them a chance to respond - the false-positive rate on a single 1,000-sample is not zero.
Data freshness. Each request returns live data, not cached snapshots. Re-run audits monthly for active rosters; the bot landscape evolves and so should your benchmarks.
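
For the sampling-bias mitigation, random starting points could be generated along these lines. This is a sketch under a loud assumption: that the time param accepts Unix timestamps in seconds, which should be confirmed in the playground first. The helper name and the date range are hypothetical.

```python
import random
from datetime import datetime, timezone
from typing import Optional


def random_time_cursors(start: datetime, end: datetime, n: int,
                        seed: Optional[int] = None) -> list[int]:
    """Pick n distinct Unix timestamps (seconds) uniformly between start and end."""
    rng = random.Random(seed)
    lo, hi = int(start.timestamp()), int(end.timestamp())
    # Newest first, matching the reverse-chronological order of the follower list.
    return sorted(rng.sample(range(lo, hi), n), reverse=True)


window_start = datetime(2024, 1, 1, tzinfo=timezone.utc)
window_end = datetime(2026, 1, 1, tzinfo=timezone.utc)
cursors = random_time_cursors(window_start, window_end, n=5, seed=7)
print(cursors)
```

Each timestamp would then seed one page request, so the sample is spread across the account's history rather than concentrated in the latest followers.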

Ready to start? Grab an API key on the pricing page (1 request equals 1 credit, credits never expire), test the followers endpoint live in the playground, and skim the full endpoint reference in the documentation. Questions about volume pricing for vetting agencies? Reach the team via the contact page or check your usage dashboard on your profile page.

FAQ

How large a follower sample do I really need?

For accounts under 1M followers, a 1,000-follower sample yields percentages with a margin of error of roughly +/-3 percentage points at 95% confidence. For 1M-10M accounts, push to 2,000. For accounts above 10M, 3,000 is enough; beyond that, returns diminish because the sampling distribution has stabilized. At 200 followers per page, that is 5-15 credits for the follower pages of each audit (plus three more for the user-id, userinfo, and posts calls), which is trivial compared to a misallocated campaign budget.
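
The figures above follow from the standard normal-approximation formula for a sampled proportion. The sketch below computes the worst-case margin (p = 0.5) for the three sample sizes; it assumes a simple random sample, which reverse-chronological pagination only approximates.

```python
import math


def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Half-width of the 95% normal-approximation interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


for n in (1000, 2000, 3000):
    print(n, f"+/-{margin_of_error(n) * 100:.1f} percentage points")
```

Tripling the sample from 1,000 to 3,000 narrows the margin from about 3.1 to about 1.8 points, which is why larger samples quickly stop paying for themselves.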

Will my audit get the target creator's account banned or flagged?

No. The TikLiveAPI requests run against your API key, not the target's account. The target sees nothing - you are reading the same publicly visible follower list that anyone visiting the profile page would see, just paginated efficiently.

Why does the followers endpoint use a "time" cursor instead of "cursor" like other endpoints?

This is a TikTok-side quirk that the API surfaces honestly. Comment, post, and search endpoints paginate by an opaque cursor; the followers list paginates by the follow-event timestamp of the last returned entry. When porting code from one endpoint to another, watch for this. The response is still a single flat followers array regardless.

What is a realistic engagement-rate threshold by follower tier?

Rough rule of thumb: under 10K followers expect 7-12%, 10K-100K expect 4-7%, 100K-1M expect 2-5%, above 1M expect 1-3%. Below half the lower bound for the tier is a red flag worth investigating. Niche matters too - dance and beauty over-index, finance and B2B under-index.

Can I run this in production without hitting rate limits?

The standard rate limit is 200 requests per minute and can be raised on request. For a 1,000-follower audit plus engagement check, you are spending roughly 8 requests per creator (5 follower pages + user-id lookup + userinfo + posts). That means up to ~25 audits per minute without queuing. Larger agencies running nightly cohorts should batch into off-peak windows and add a queue, but no special infrastructure is required to get started.
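
For capacity planning, the request budget can be computed rather than estimated. The sketch below counts every call audit_creator makes (the follower pages plus three fixed calls: the user-id lookup, userinfo-by-username, and user-posts); the helper names are hypothetical and the defaults mirror the numbers in this guide.

```python
import math


def requests_per_audit(sample_size: int = 1000, page_size: int = 200,
                       fixed_calls: int = 3) -> int:
    # fixed_calls: user-id lookup + userinfo-by-username + user-posts.
    return math.ceil(sample_size / page_size) + fixed_calls


def audits_per_minute(rate_limit: int = 200, sample_size: int = 1000) -> int:
    return rate_limit // requests_per_audit(sample_size)


for size in (1000, 2000, 3000):
    print(size, requests_per_audit(size), audits_per_minute(sample_size=size))
```

These are ceilings for parallel workloads; a single sequential script that sleeps between follower pages will run below them.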