How to Scrape TikTok User Data with Ruby in 2026

Published on May 29, 2026

Ruby has always been the language developers reach for when they want to get something done fast without sacrificing readability. Whether you are bolting a data feature onto a Rails app, writing a Sinatra microservice, or just hacking together a one-off script, Ruby's expressive syntax makes data work feel almost conversational. That makes it a fantastic fit for TikTok data: you can pull profiles, posts, and follower counts in a few lines of code, then pipe the results straight into ActiveRecord, a CSV, or a Sidekiq job.

This tutorial walks through every step of scraping TikTok user data with Ruby using the TikLiveAPI REST endpoints. We will fetch profile info, resolve usernames to IDs, paginate through posts, handle errors gracefully, run requests concurrently, and finish with a daily follower tracker that writes to CSV and (optionally) saves to a Rails model.

Why Ruby for TikTok scraping

Ruby's strengths line up perfectly with what TikTok scraping demands:

  • Rails integration: drop the client into app/services/, schedule it with whenever or Sidekiq, persist to ActiveRecord. No glue code required.
  • Blocks and Enumerable: paginating a cursor-based API takes a few lines with Enumerator.new, and transforming responses with map, select, group_by, and each_slice reads like English.
  • Standard library: net/http and json ship with Ruby, and csv is one require away (from Ruby 3.4 it is a bundled gem, so add gem "csv" to your Gemfile when you run under Bundler). You do not need a heavy framework just to talk to a JSON API.
  • Sweet syntax for munging: symbol-to-proc (&:foo), safe navigation (&.), and pattern matching in Ruby 3 make response parsing painless.
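
To make the last bullet concrete, here is a minimal sketch of pattern matching and safe navigation on a response-shaped hash. The JSON body is invented for illustration; only the key names (user, stats, uniqueId, followerCount) follow the response described later in this guide. One thing to keep in mind: hash patterns only match symbol keys, so the sketch parses with symbolize_names: true.

```ruby
require "json"

# Invented sample body; the key names mirror the profile response in Step 2.
body = '{"user":{"uniqueId":"example_user","nickname":null},"stats":{"followerCount":1250}}'
response = JSON.parse(body, symbolize_names: true)

# Pattern matching destructures the nested hash and checks types in one step.
def summarize(response)
  case response
  in { user: { uniqueId: String => handle }, stats: { followerCount: Integer => followers } }
    "#{handle}: #{followers} followers"
  else
    "unrecognized response"
  end
end

puts summarize(response)
puts summarize({ error: "not found" })

# Safe navigation: nickname is null here, so &.strip returns nil instead of raising.
nickname = response.dig(:user, :nickname)&.strip
display_name = nickname || "(no nickname)"
puts display_name
```

The rest of this guide sticks to string keys and Hash#dig, which is what JSON.parse returns by default.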

We will use the official TikLiveAPI service so you never have to reverse-engineer TikTok's private endpoints. One request equals one credit, and new accounts get 100 free credits at signup. See pricing for the credit packages.

Prerequisites

  • Ruby 3.0 or newer (pattern matching is stable from 3.0; Data.define needs Ruby 3.2 or newer if you want it for value objects)
  • A TikLiveAPI account and key. Sign up, verify your email, then grab the key from your profile page.
  • Optional gems: httparty (nicer ergonomics than net/http), async (concurrency), and sidekiq (background jobs).

This guide uses pure net/http for the core examples so it runs anywhere, then shows thread-based and Async variants for batch work.

Step 1: Read the API key from ENV

Never hardcode credentials. Export the key in your shell and read it with ENV.fetch, which raises a clear error if the variable is missing:

# In your shell or .env file:
# export TIKLIVE_API_KEY="your_api_key_here"

API_KEY = ENV.fetch("TIKLIVE_API_KEY")
BASE_URL = "https://api.tikliveapi.com"

For Rails, store the key in config/credentials.yml.enc and read it with Rails.application.credentials, or keep the ENV approach and load the variable with the dotenv-rails gem in development. The X-Api-Key header authenticates every request.

Step 2: Fetch user info with net/http

Let's start by calling /userinfo-by-username/ to fetch a public TikTok profile. The response contains a user object and a stats object, both using camelCase keys (uniqueId, followerCount, heartCount, etc).

require "net/http"
require "json"
require "uri"

def tiklive_get(path, params = {})
  uri = URI("#{BASE_URL}#{path}")
  uri.query = URI.encode_www_form(params) unless params.empty?

  req = Net::HTTP::Get.new(uri)
  req["X-Api-Key"] = API_KEY
  req["Accept"] = "application/json"

  res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.read_timeout = 30
    http.request(req)
  end

  raise "HTTP #{res.code}: #{res.body}" unless res.is_a?(Net::HTTPSuccess)
  JSON.parse(res.body)
end

data = tiklive_get("/userinfo-by-username/", username: "tiktok")
puts "Nickname: #{data.dig("user", "nickname")}"
puts "Followers: #{data.dig("stats", "followerCount")}"
puts "Total likes: #{data.dig("stats", "heartCount")}"
puts "Videos: #{data.dig("stats", "videoCount")}"

Hash#dig is your friend here: it returns nil instead of raising when a key is missing, which keeps your code resilient to private accounts and edge cases.
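
A small sketch of why dig matters, assuming a response where the stats object is missing entirely (the sample hashes are invented). It also maps the response into a Struct so the rest of your code does not have to repeat the camelCase keys; Profile and build_profile are hypothetical names, and on Ruby 3.2 or newer Data.define would give you an immutable equivalent.

```ruby
# Hypothetical value object for a profile; field names are our own choice.
Profile = Struct.new(:username, :nickname, :followers, :likes, :videos, keyword_init: true)

# Missing keys become nil rather than raising, so "unknown" stays distinct from 0.
def build_profile(data)
  Profile.new(
    username:  data.dig("user", "uniqueId"),
    nickname:  data.dig("user", "nickname"),
    followers: data.dig("stats", "followerCount"),
    likes:     data.dig("stats", "heartCount"),
    videos:    data.dig("stats", "videoCount")
  )
end

# Invented samples: one complete response, one without a stats object.
full    = { "user" => { "uniqueId" => "example_user", "nickname" => "Example" },
            "stats" => { "followerCount" => 1_250, "heartCount" => 9_800, "videoCount" => 42 } }
partial = { "user" => { "uniqueId" => "private_user" } }

puts build_profile(full).followers.inspect
puts build_profile(partial).followers.inspect

# Chained [] on the same data raises, because partial["stats"] is nil.
begin
  partial["stats"]["followerCount"]
rescue NoMethodError => e
  puts "chained [] raised #{e.class}"
end
```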

Step 3: Resolve a username to a numeric userid

Most "list" endpoints (posts, followers, following) need a numeric userid, not a username. The /userid/ endpoint returns a single key id with the numeric TikTok UID as a string:

def resolve_userid(username)
  response = tiklive_get("/userid/", username: username)
  response.fetch("id")
end

userid = resolve_userid("tiktok")
puts "Resolved userid: #{userid}"

Cache this value. Userids are permanent, so storing them in your database avoids paying one credit every time you want to refresh a profile.
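
One way to cache the value outside a database is a small JSON file. The sketch below is an assumption-laden illustration, not part of the API: UseridCache is a hypothetical class, and the resolver block stands in for whatever does the real lookup (for example resolve_userid from this step).

```ruby
require "json"

# Hypothetical file-backed cache: username => userid.
class UseridCache
  def initialize(path, &resolver)
    @path = path
    @resolver = resolver
    @ids = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  end

  # Returns the cached id, or calls the resolver once and persists the result.
  def fetch(username)
    @ids.fetch(username) do
      id = @resolver.call(username)
      @ids[username] = id
      File.write(@path, JSON.pretty_generate(@ids))
      id
    end
  end
end

# Usage (not executed here because it would hit the network):
#   cache = UseridCache.new("userids.json") { |name| resolve_userid(name) }
#   cache.fetch("tiktok") # one credit the first time, free afterwards
```

The class is not safe for concurrent writers; in a Rails app a database column on the account record is the simpler choice.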

Step 4: Get user posts

The /user-posts/ endpoint returns up to 35 videos per request. The top level uses camelCase (hasMore) while each video item uses snake_case (aweme_id, play_count, digg_count, create_time, etc). Watch out for that mixed casing - it trips up everyone the first time.

def fetch_user_posts(userid, count: 35, cursor: 0)
  tiklive_get("/user-posts/",
    userid: userid,
    count: count,
    cursor: cursor
  )
end

result = fetch_user_posts(userid)
videos = result["videos"] || []

videos.each do |v|
  puts "#{v["title"]&.slice(0, 60)} | views=#{v["play_count"]} likes=#{v["digg_count"]}"
end

puts "More pages? #{result["hasMore"]}"
puts "Next cursor: #{result["cursor"]}"

Each video also includes video_id, cover, duration, play (no-watermark URL), wmplay (watermarked), comment_count, share_count, download_count, and an author object.
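
With those fields you can rank videos by engagement. The sketch below uses one possible definition (likes plus comments plus shares, divided by views); that formula is our assumption, not something the API defines, and the sample rows are invented.

```ruby
# Engagement rate as (likes + comments + shares) / views.
# to_i turns missing counters (nil) into 0 so incomplete items do not raise.
def engagement_rate(video)
  views = video["play_count"].to_i
  return 0.0 if views.zero?

  interactions = %w[digg_count comment_count share_count].sum { |key| video[key].to_i }
  interactions.fdiv(views)
end

# Invented sample items using the snake_case video fields.
sample_videos = [
  { "video_id" => "1", "play_count" => 10_000, "digg_count" => 900, "comment_count" => 80, "share_count" => 20 },
  { "video_id" => "2", "play_count" => 0 },
  { "video_id" => "3", "play_count" => 2_000, "digg_count" => 50, "comment_count" => 5, "share_count" => 5 }
]

ranked = sample_videos.sort_by { |v| -engagement_rate(v) }
ranked.each do |v|
  puts format("video %s: %.1f%%", v["video_id"], engagement_rate(v) * 100)
end
```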

Step 5: Idiomatic pagination with Enumerator

Manual cursor loops are ugly. Ruby's Enumerator.new gives you a lazy, pull-based iterator that paginates on demand. Callers can use .first(100), .lazy.select { ... }, or .each_slice(50) without ever touching cursors directly.

def all_user_posts(userid)
  Enumerator.new do |yielder|
    cursor = 0
    loop do
      page = fetch_user_posts(userid, count: 35, cursor: cursor)
      (page["videos"] || []).each { |v| yielder << v }
      break unless page["hasMore"]
      cursor = page["cursor"]
      sleep 0.3 # be polite, stay under rate limits
    end
  end
end

# Grab the 200 most recent videos:
recent = all_user_posts(userid).first(200)
puts "Got #{recent.size} videos"

# Or use Enumerable methods - this stops as soon as it finds 10 viral hits.
# to_i guards against items where play_count is missing (nil):
viral = all_user_posts(userid).lazy.select { |v| v["play_count"].to_i > 1_000_000 }.first(10)

This is one of Ruby's most underrated patterns. The same shape works for /user-liked/, /user-collections/, and /user-playlists/ too.
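
To see the laziness without spending credits, here is a sketch of the same pattern as a generic helper, driven by an in-memory fake. The paginate helper and FAKE_PAGES are hypothetical; the page shape (videos, hasMore, cursor) follows Step 4.

```ruby
# Generic cursor paginator. The block receives a cursor and must return a
# page hash with the items under items_key plus "hasMore" and "cursor".
def paginate(items_key: "videos", start: 0, &fetch_page)
  Enumerator.new do |yielder|
    cursor = start
    loop do
      page = fetch_page.call(cursor)
      (page[items_key] || []).each { |item| yielder << item }
      break unless page["hasMore"]
      cursor = page["cursor"]
    end
  end
end

# Invented pages standing in for three API responses.
FAKE_PAGES = {
  0 => { "videos" => [{ "aweme_id" => "v1" }, { "aweme_id" => "v2" }], "hasMore" => true,  "cursor" => 1 },
  1 => { "videos" => [{ "aweme_id" => "v3" }, { "aweme_id" => "v4" }], "hasMore" => true,  "cursor" => 2 },
  2 => { "videos" => [{ "aweme_id" => "v5" }],                         "hasMore" => false, "cursor" => 3 }
}.freeze

requested_cursors = []
posts = paginate do |cursor|
  requested_cursors << cursor # record which pages were actually fetched
  FAKE_PAGES.fetch(cursor)
end

first_three = posts.first(3).map { |v| v["aweme_id"] }
puts first_three.inspect       # => ["v1", "v2", "v3"]
puts requested_cursors.inspect # => [0, 1]  (the third page is never requested)
```

Note that every new enumeration starts again from the first page, so if you need the same items twice, keep the array returned by first or to_a instead of iterating the enumerator again.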

Followers and following: a different cursor

The follower and following endpoints paginate with a timestamp parameter named time instead of cursor, and /user-following/ returns a top-level key called followings (note the trailing s, not following). Easy to miss:

def all_followers(userid)
  Enumerator.new do |yielder|
    time = 0
    loop do
      page = tiklive_get("/user-followers/",
        userid: userid, count: 200, time: time)
      (page["followers"] || []).each { |f| yielder << f }
      break unless page["hasMore"]
      time = page["time"]
      sleep 0.3
    end
  end
end

# Top 50 followers by total likes received:
top = all_followers(userid)
  .first(1000)
  .sort_by { |f| -f["total_favorited"].to_i }
  .first(50)

Each follower item has snake_case fields: id, unique_id, nickname, signature, avatar, verified, follower_count, following_count, aweme_count, and total_favorited.
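
As an illustration of working with those fields, the sketch below buckets followers by audience size with group_by and pulls out the verified handles. The sample rows are invented and the tier boundaries are arbitrary assumptions.

```ruby
# Invented sample rows using the follower fields listed above.
sample_followers = [
  { "unique_id" => "alpha", "verified" => true,  "follower_count" => 250_000, "total_favorited" => 9_000_000 },
  { "unique_id" => "bravo", "verified" => false, "follower_count" => 800,     "total_favorited" => 1_200 },
  { "unique_id" => "carol", "verified" => false, "follower_count" => 15_000,  "total_favorited" => 300_000 },
  { "unique_id" => "delta", "verified" => false, "follower_count" => nil,     "total_favorited" => nil }
]

# Tier boundaries are arbitrary and only for illustration.
# nil counts fall into :small because nil.to_i is 0.
def audience_tier(follower)
  case follower["follower_count"].to_i
  when 0...1_000       then :small
  when 1_000...100_000 then :mid
  else                      :large
  end
end

tier_counts = sample_followers.group_by { |f| audience_tier(f) }.transform_values(&:size)
verified_handles = sample_followers.select { |f| f["verified"] }.map { |f| f["unique_id"] }

puts tier_counts.inspect
puts verified_handles.inspect
```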

Step 6: Error handling and retries

Networks fail. APIs blip. A well-built scraper retries transient errors with exponential backoff but gives up fast on permanent ones (invalid key, missing user). Here is a small helper using rescue and retry the Ruby way:

class TikLiveError < StandardError; end
class TikLiveClientError < TikLiveError; end # 4xx
class TikLiveServerError < TikLiveError; end # 5xx

def with_retry(max_attempts: 4, base_delay: 0.5)
  attempt = 0
  begin
    attempt += 1
    yield
  rescue TikLiveServerError, Net::OpenTimeout, Net::ReadTimeout,
         Errno::ECONNRESET => e
    raise e if attempt >= max_attempts
    delay = base_delay * (2 ** (attempt - 1))
    warn "Attempt #{attempt} failed (#{e.class}); retrying in #{delay}s"
    sleep delay
    retry
  end
end

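def tiklive_get(path, params = {})
  # Build the request exactly as in Step 2.
  uri = URI("#{BASE_URL}#{path}")
  uri.query = URI.encode_www_form(params) unless params.empty?
  req = Net::HTTP::Get.new(uri)
  req["X-Api-Key"] = API_KEY
  req["Accept"] = "application/json"

  res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true, read_timeout: 30) { |h| h.request(req) }

  # 429 is checked before the generic 4xx branch and mapped to the retryable
  # error class so that with_retry backs off and tries again.
  case res
  when Net::HTTPSuccess         then JSON.parse(res.body)
  when Net::HTTPTooManyRequests then raise TikLiveServerError, "Rate limited"
  when Net::HTTPServerError     then raise TikLiveServerError, "HTTP #{res.code}"
  when Net::HTTPClientError     then raise TikLiveClientError, "HTTP #{res.code}: #{res.body}"
  else raise TikLiveError, "Unexpected #{res.code}"
  end
end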

# Usage:
data = with_retry { tiklive_get("/userinfo-by-username/", username: "tiktok") }

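The standard rate limit is 200 requests per minute, which works out to one request every 0.3 seconds (60 / 200). A 0.3 second sleep between sequential calls therefore keeps you at or below the cap once request latency is added on top. If you need more, contact support to raise the cap.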
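
If you would rather not scatter sleep calls through your code, the spacing can live in one place. The sketch below is a hypothetical Throttle class for sequential loops (it is not thread-safe); the clock and sleeper arguments exist only so the timing can be tested without actually waiting.

```ruby
# Spaces calls at least 60 / per_minute seconds apart.
class Throttle
  def initialize(per_minute:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = 60.0 / per_minute
    @clock = clock
    @sleeper = sleeper
    @next_at = nil
  end

  # Waits until the next free slot, runs the block, and returns its value.
  def call
    now = @clock.call
    if @next_at && now < @next_at
      @sleeper.call(@next_at - now)
      now = @clock.call # re-read the clock in case the sleep ran long
    end
    @next_at = now + @interval
    yield
  end
end

# Usage (not executed here because it would hit the network):
#   throttle = Throttle.new(per_minute: 200)
#   usernames.each { |u| throttle.call { tiklive_get("/userinfo-by-username/", username: u) } }
```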

Step 7: Concurrency with threads or Async

For batch jobs - say, refreshing 500 profiles - sequential requests are slow. CRuby (MRI) has a global VM lock (GVL), but threads release it while they wait on I/O, so threads work well for HTTP. A fixed number of worker threads draining a shared queue caps concurrency:

require "thread"

def fetch_many(usernames, concurrency: 8)
  queue = Queue.new
  usernames.each { |u| queue << u }
  results = {}
  mutex = Mutex.new

  workers = concurrency.times.map do
    Thread.new do
      loop do
        # Non-blocking pop raises ThreadError once the queue is drained.
        username = begin
          queue.pop(true)
        rescue ThreadError
          break
        end

        # rescue needs an explicit begin/end inside a loop body.
        begin
          data = with_retry { tiklive_get("/userinfo-by-username/", username: username) }
          mutex.synchronize { results[username] = data }
        rescue StandardError => e
          mutex.synchronize { results[username] = { error: e.message } }
        end
      end
    end
  end

  workers.each(&:join)
  results
end

profiles = fetch_many(%w[tiktok khaby.lame charlidamelio bellapoarch], concurrency: 4)

For higher fan-out, swap threads for the async gem. It uses fibers, which are cheaper than threads and let you run hundreds of concurrent requests without OS thread overhead. The 200 requests per minute limit still applies, so size your batches accordingly:

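require "async"
require "async/http/internet"
require "json"
require "uri"

usernames = %w[tiktok khaby.lame charlidamelio]

# The outer task returns the array of parsed responses via .wait.
results = Async do
  internet = Async::HTTP::Internet.new
  tasks = usernames.map do |u|
    Async do
      query = URI.encode_www_form(username: u) # escape the username
      response = internet.get("#{BASE_URL}/userinfo-by-username/?#{query}",
        { "X-Api-Key" => API_KEY })
      JSON.parse(response.read)
    ensure
      response&.close
    end
  end
  tasks.map(&:wait)
ensure
  internet&.close
end.wait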

Step 8: Real-world example - daily follower tracker

Let's tie everything together. We will track follower counts for a list of accounts every day and append the results to a CSV. This is the kind of script you would schedule via cron or the whenever gem.

require "csv"
require "date"

ACCOUNTS = %w[tiktok khaby.lame charlidamelio mrbeast]
OUTPUT = "follower_log.csv"

def snapshot(username)
  data = with_retry { tiklive_get("/userinfo-by-username/", username: username) }
  {
    date: Date.today.iso8601,
    username: data.dig("user", "uniqueId"),
    followers: data.dig("stats", "followerCount"),
    following: data.dig("stats", "followingCount"),
    likes: data.dig("stats", "heartCount"),
    videos: data.dig("stats", "videoCount")
  }
end

write_header = !File.exist?(OUTPUT)

CSV.open(OUTPUT, "a", write_headers: write_header,
         headers: %i[date username followers following likes videos]) do |csv|
  ACCOUNTS.each_slice(4) do |batch|
    batch.each do |username|
      row = snapshot(username)
      csv << row.values
      puts "Logged #{username}: #{row[:followers]} followers"
    end
    sleep 1
  end
end

each_slice batches the work so you can pause between groups, which is handy if you grow the list to hundreds of accounts. Open the CSV in Excel or pipe it into a chart - you now have a daily follower history.
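
Once a few days of rows exist, the log can be read back to compute day-over-day changes. The sketch below assumes the column layout written by the script above; follower_deltas is a hypothetical helper and the sample rows are invented.

```ruby
require "csv"

# Returns { username => [{ date:, change: }, ...] } from the follower log.
# ISO 8601 dates sort correctly as plain strings.
def follower_deltas(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  rows.group_by { |row| row["username"] }.transform_values do |entries|
    entries.sort_by { |row| row["date"] }.each_cons(2).map do |previous, current|
      { date: current["date"],
        change: current["followers"].to_i - previous["followers"].to_i }
    end
  end
end

# Invented sample in the same format as follower_log.csv.
sample_log = <<~LOG
  date,username,followers,following,likes,videos
  2026-05-27,example_user,1000,10,5000,40
  2026-05-28,example_user,1040,10,5100,41
  2026-05-29,example_user,1025,10,5200,41
LOG

follower_deltas(sample_log).each do |username, deltas|
  deltas.each { |d| puts "#{username} #{d[:date]}: #{format('%+d', d[:change])}" }
end
```

For the real file, pass File.read("follower_log.csv") instead of the sample string.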

Rails ActiveRecord variant

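In a Rails app, swap the CSV writer for a model. Create a FollowerSnapshot model with username, followers, videos, and captured_on columns. TikLive.user_info below stands for a thin wrapper around tiklive_get that you define yourself (for example in app/services/); it is not shown in this guide. Then: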

# app/services/tiktok_tracker.rb
class TiktokTracker
  def self.snapshot_all(usernames)
    usernames.each do |u|
      data = TikLive.user_info(u)
      FollowerSnapshot.create!(
        username: data.dig("user", "uniqueId"),
        followers: data.dig("stats", "followerCount"),
        videos: data.dig("stats", "videoCount"),
        captured_on: Date.today
      )
    end
  end
end
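
The TikLive.user_info call above is not defined elsewhere in this guide. A minimal sketch of such a wrapper follows; the module layout and the transport setter are our assumptions, chosen so the HTTP layer can be swapped for a fake in tests.

```ruby
# Hypothetical wrapper module. The transport is any callable that takes
# (path, params) and returns a parsed response hash.
module TikLive
  class << self
    attr_writer :transport

    def transport
      @transport || raise(ArgumentError, "TikLive.transport is not configured")
    end

    # Profile lookup, as used by TiktokTracker and the Sidekiq worker.
    def user_info(username)
      transport.call("/userinfo-by-username/", { username: username })
    end

    # Username to numeric userid (returned as a string under the "id" key).
    def userid(username)
      transport.call("/userid/", { username: username }).fetch("id")
    end
  end
end

# In a Rails initializer you might wire it to the helpers from Step 6:
#   TikLive.transport = ->(path, params) { with_retry { tiklive_get(path, params) } }
```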

Sidekiq integration for background jobs

Hitting the API inline during a web request is a recipe for slow controllers. Push the work to Sidekiq:

# app/workers/tiktok_snapshot_worker.rb
class TiktokSnapshotWorker
  include Sidekiq::Worker
  sidekiq_options retry: 5, queue: :scraping

  def perform(username)
    data = TikLive.user_info(username)
    FollowerSnapshot.create!(
      username: data.dig("user", "uniqueId"),
      followers: data.dig("stats", "followerCount"),
      captured_on: Date.today
    )
  end
end

# Enqueue one job per account, fan out across workers:
ACCOUNTS.each { |u| TiktokSnapshotWorker.perform_async(u) }

# Or schedule daily via sidekiq-cron / whenever:
# every 1.day, at: '3:00 am' do
#   runner "ACCOUNTS.each { |u| TiktokSnapshotWorker.perform_async(u) }"
# end

Sidekiq's built-in retry handles transient failures, and a dedicated :scraping queue lets you run a separate Sidekiq process with its own concurrency setting, so scraping load is capped independently of the rest of your app.

Beyond user data

The same client patterns work for the full 37-endpoint surface. A few popular extensions:

  • Post detail and comments - call /post-detail/ with a video URL to get the no-watermark play URL, then /post-comments/ to pull comment threads (the comment id field is id, not cid).
  • Search - /search-user/, /search-video/, and /search-challenge/ all use a keyword param plus cursor pagination.
  • Video download - /download-video/ returns no-watermark URLs (video for standard quality, video_hd for HD) ready to pipe into open-uri or save to S3 via the aws-sdk-s3 gem.
  • Music endpoints - /music-info/ and /music-posts/ let you track trending sounds.

Want to experiment before writing code? The interactive playground lets you fire requests in your browser and see the exact JSON shape.

FAQ

Do I need a TikTok account or password?

No. TikLiveAPI works against public TikTok data using a per-account API key. You never share TikTok credentials, and the dashboard never asks for them.

Which HTTP library should I use - net/http, HTTParty, or Faraday?

All three work. net/http is built in and has zero dependencies, which is ideal for small scripts and gems. HTTParty gives you nicer ergonomics (HTTParty.get(url, headers: ...)) and built-in JSON parsing. Faraday shines when you need middleware (logging, instrumentation, retries) and is the standard for larger Rails apps.

How do I handle the rate limit gracefully?

The default cap is 200 requests per minute. Sleep 0.3 seconds between calls in a sequential loop, or cap thread/Async fiber concurrency to around 8 workers. Catch HTTP 429 responses and back off exponentially via the with_retry helper. If you consistently hit the ceiling, message support for a raise.

Can I use this from a Rails controller directly?

You can, but it ties your request latency to TikTok's response time. The better pattern is to enqueue a Sidekiq job, render a "loading" state, and update the UI via Turbo Streams or polling once the job completes. Cache resolved userids in your database so you do not pay one credit per page view.

How much does it cost to run a daily snapshot of 100 accounts?

One /userinfo-by-username/ call equals one credit. Snapshotting 100 accounts per day costs 100 credits, or roughly 3000 per month. The starter package on the pricing page starts at 9.90 USD, and credits never expire - if you skip a few days, you keep your balance.
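
The arithmetic generalizes to other schedules. A tiny sketch, assuming one credit per request as stated above (monthly_credits and its parameter names are hypothetical):

```ruby
# Credits used per month: accounts x runs per day x days x calls per account.
def monthly_credits(accounts:, runs_per_day: 1, days: 30, calls_per_account: 1)
  accounts * runs_per_day * days * calls_per_account
end

puts monthly_credits(accounts: 100)                       # 3000
puts monthly_credits(accounts: 100, calls_per_account: 2) # 6000, e.g. profile plus one page of posts
```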

Wrap-up

Ruby plus TikLiveAPI is a great combo: a few lines of net/http get you authenticated requests, Enumerator.new turns cursor pagination into clean lazy iteration, threads and Async give you concurrency without ceremony, and Sidekiq slots in for production-grade background work. Whether you are building a Rails analytics dashboard, a Sinatra microservice, or just logging follower counts to a CSV, you have the building blocks now.

Ready to start? Grab 100 free credits and try the snippets above against your own account. Then explore the full users endpoint reference and the live playground for everything else.
