Ruby has always been the language developers reach for when they want to get something done fast without sacrificing readability. Whether you are bolting a data feature onto a Rails app, writing a Sinatra microservice, or just hacking together a one-off script, Ruby's expressive syntax makes data work feel almost conversational. That makes it a fantastic fit for TikTok data: you can pull profiles, posts, and follower counts in a few lines of code, then pipe the results straight into ActiveRecord, a CSV, or a Sidekiq job.
This tutorial walks through every step of scraping TikTok user data with Ruby using the TikLiveAPI REST endpoints. We will fetch profile info, resolve usernames to IDs, paginate through posts, handle errors gracefully, run requests concurrently, and finish with a daily follower tracker that writes to CSV and (optionally) saves to a Rails model.
Ruby's strengths line up perfectly with what TikTok scraping demands:
- Rails-friendly: put the client in app/services/, schedule it with whenever or Sidekiq, persist to ActiveRecord. No glue code required.
- Enumerable everywhere: pagination wraps neatly in Enumerator.new, and transforming responses with map, select, group_by, and each_slice reads like English.
- Batteries included: net/http, json, and csv are built in. You do not need a heavy framework just to talk to a JSON API.
- Concise syntax: symbol-to-proc (&:foo), safe navigation (&.), and pattern matching in Ruby 3 make response parsing painless.

We will use the official TikLiveAPI service so you never have to reverse-engineer TikTok's private endpoints. One request equals one credit, and new accounts get 100 free credits at signup. See pricing for the credit packages.
- Ruby 3.x (3.2 or newer if you want Data.define)
- Optional gems: httparty (nicer ergonomics than net/http), async (concurrency), and sidekiq (background jobs).

This guide uses pure net/http for the core examples so it runs anywhere, then shows an Async variant for higher fan-out and compares HTTParty and Faraday near the end.
Never hardcode credentials. Export the key in your shell and read it with ENV.fetch, which raises a clear error if the variable is missing:
# In your shell or .env file:
# export TIKLIVE_API_KEY="your_api_key_here"
API_KEY = ENV.fetch("TIKLIVE_API_KEY")
BASE_URL = "https://api.tikliveapi.com"
For Rails, add the variable to config/credentials.yml.enc or use the dotenv-rails gem in development. The X-Api-Key header authenticates every request.
Let's start by calling /userinfo-by-username/ to fetch a public TikTok profile. The response contains a user object and a stats object, both using camelCase keys (uniqueId, followerCount, heartCount, etc).
require "net/http"
require "json"
require "uri"

def tiklive_get(path, params = {})
  uri = URI("#{BASE_URL}#{path}")
  uri.query = URI.encode_www_form(params) unless params.empty?

  req = Net::HTTP::Get.new(uri)
  req["X-Api-Key"] = API_KEY
  req["Accept"] = "application/json"

  res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.read_timeout = 30
    http.request(req)
  end

  raise "HTTP #{res.code}: #{res.body}" unless res.is_a?(Net::HTTPSuccess)
  JSON.parse(res.body)
end

data = tiklive_get("/userinfo-by-username/", username: "tiktok")
puts "Nickname: #{data.dig("user", "nickname")}"
puts "Followers: #{data.dig("stats", "followerCount")}"
puts "Total likes: #{data.dig("stats", "heartCount")}"
puts "Videos: #{data.dig("stats", "videoCount")}"
Hash#dig is your friend here: it returns nil instead of raising when a key is missing, which keeps your code resilient to private accounts and edge cases.
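To see that nil-safety in action without spending a credit, here is a minimal sketch that runs against hand-written hashes. The values are made up; only the key names (user, stats, uniqueId, followerCount, heartCount) come from the description above, and profile_summary is a hypothetical helper, not part of any library.

```ruby
# Hypothetical helper: flattens a profile response into a small hash.
# Keys that are absent come back as nil instead of raising.
def profile_summary(data)
  {
    username:  data.dig("user", "uniqueId"),
    followers: data.dig("stats", "followerCount"),
    likes:     data.dig("stats", "heartCount")
  }
end

# Made-up sample payloads (not real API output).
full    = { "user"  => { "uniqueId" => "example" },
            "stats" => { "followerCount" => 1200, "heartCount" => 5000 } }
partial = { "user"  => { "uniqueId" => "example" } } # no "stats" object at all

p profile_summary(full)
p profile_summary(partial) # followers and likes are nil, no exception
```

Downstream code can then decide what a nil means (skip the row, log it, retry later) instead of crashing halfway through a batch.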
Most "list" endpoints (posts, followers, following) need a numeric userid, not a username. The /userid/ endpoint returns a single key id with the numeric TikTok UID as a string:
def resolve_userid(username)
  response = tiklive_get("/userid/", username: username)
  response.fetch("id")
end

userid = resolve_userid("tiktok")
puts "Resolved userid: #{userid}"
Cache this value. Userids are permanent, so storing them in your database avoids paying one credit every time you want to refresh a profile.
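As a rough illustration of the caching idea, the sketch below keeps resolved ids in memory and only calls the resolver on a miss. UseridCache is a hypothetical class name, and the lambda is a stand-in for resolve_userid so the example runs offline; a real app would more likely store the id in a database column or Rails.cache.

```ruby
# In-memory cache sketch (hypothetical class, not part of any library).
class UseridCache
  def initialize
    @ids = {}
  end

  # Returns the cached id, or runs the block once to resolve and store it.
  def fetch(username)
    @ids.fetch(username) { @ids[username] = yield(username) }
  end
end

cache = UseridCache.new
lookups = 0
# Stand-in for resolve_userid; counts how often it is actually invoked.
resolver = ->(name) { lookups += 1; "id-for-#{name}" }

cache.fetch("tiktok", &resolver)
cache.fetch("tiktok", &resolver) # served from the cache
puts "Lookups performed: #{lookups}"
```

The second call never reaches the resolver, so in the real client it would not cost a credit.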
The /user-posts/ endpoint returns up to 35 videos per request. The top level uses camelCase (hasMore) while each video item uses snake_case (aweme_id, play_count, digg_count, create_time, etc). Watch out for that mixed casing - it trips up everyone the first time.
def fetch_user_posts(userid, count: 35, cursor: 0)
  tiklive_get("/user-posts/",
    userid: userid,
    count: count,
    cursor: cursor
  )
end

result = fetch_user_posts(userid)
videos = result["videos"] || []
videos.each do |v|
  puts "#{v["title"]&.slice(0, 60)} | views=#{v["play_count"]} likes=#{v["digg_count"]}"
end
puts "More pages? #{result["hasMore"]}"
puts "Next cursor: #{result["cursor"]}"
Each video also includes video_id, cover, duration, play (no-watermark URL), wmplay (watermarked), comment_count, share_count, download_count, and an author object.
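Those counters are enough to derive simple metrics. The sketch below computes an engagement rate per video; the formula (likes plus comments plus shares, divided by views) is one common convention chosen for illustration, not something the API defines, and the sample numbers are invented.

```ruby
# Interactions per view, as a percentage rounded to two decimals.
# Returns nil when the video has no recorded views.
def engagement_rate(video)
  views = video["play_count"].to_i
  return nil if views.zero?

  interactions = %w[digg_count comment_count share_count].sum { |key| video[key].to_i }
  (interactions * 100.0 / views).round(2)
end

# Made-up sample item using the snake_case field names listed above.
sample = { "play_count" => 20_000, "digg_count" => 1_500,
           "comment_count" => 300, "share_count" => 200 }
puts engagement_rate(sample) # (1500 + 300 + 200) / 20000 = 10.0 percent
```

Calling .to_i on every field keeps the helper safe when a counter is missing or arrives as a string.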
Manual cursor loops are ugly. Ruby's Enumerator.new gives you a lazy, pull-based iterator that paginates on demand. Callers can use .first(100), .lazy.select { ... }, or .each_slice(50) without ever touching cursors directly.
def all_user_posts(userid)
  Enumerator.new do |yielder|
    cursor = 0
    loop do
      page = fetch_user_posts(userid, count: 35, cursor: cursor)
      (page["videos"] || []).each { |v| yielder << v }
      break unless page["hasMore"]
      cursor = page["cursor"]
      sleep 0.3 # be polite, stay under rate limits
    end
  end
end

# Grab the 200 most recent videos:
recent = all_user_posts(userid).first(200)
puts "Got #{recent.size} videos"

# Or use Enumerable methods - this stops as soon as it finds 10 viral hits.
# to_i guards against a missing play_count:
viral = all_user_posts(userid).lazy.select { |v| v["play_count"].to_i > 1_000_000 }.first(10)
This is one of Ruby's most underrated patterns. The same shape works for /user-liked/, /user-collections/, and /user-playlists/ too.
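Since the loop is identical each time, it can be factored into one helper. The sketch below is a hypothetical paginate method that takes the page-fetching logic as a block; the key names are parameters because the response shapes of endpoints not described in this tutorial are assumptions. It is exercised here with fake pages so it runs without network access.

```ruby
# Generic lazy paginator (hypothetical helper). The block receives the
# current cursor and must return one parsed page as a Hash.
def paginate(items_key:, cursor_key: "cursor", start: 0, &fetch_page)
  Enumerator.new do |yielder|
    cursor = start
    loop do
      page = fetch_page.call(cursor)
      (page[items_key] || []).each { |item| yielder << item }
      break unless page["hasMore"]
      cursor = page[cursor_key]
    end
  end
end

# Fake two-page response keyed by cursor (made-up data).
FAKE_PAGES = {
  0 => { "videos" => [{ "aweme_id" => "a" }, { "aweme_id" => "b" }],
         "hasMore" => true, "cursor" => 2 },
  2 => { "videos" => [{ "aweme_id" => "c" }], "hasMore" => false }
}.freeze

ids = paginate(items_key: "videos") { |cursor| FAKE_PAGES.fetch(cursor) }
        .map { |v| v["aweme_id"] }
p ids
```

With the real client, the block would wrap a call such as fetch_user_posts(userid, cursor: cursor), plus a short sleep to respect the rate limit. Because the enumerator is lazy, asking for .first(1) only ever fetches the first page.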
The follower and following endpoints paginate with a timestamp parameter named time instead of cursor, and /user-following/ returns its list under a top-level key called followings (note the trailing s, not following). Both are easy to miss:
def all_followers(userid)
  Enumerator.new do |yielder|
    time = 0
    loop do
      page = tiklive_get("/user-followers/",
                         userid: userid, count: 200, time: time)
      (page["followers"] || []).each { |f| yielder << f }
      break unless page["hasMore"]
      time = page["time"]
      sleep 0.3
    end
  end
end

# Top 50 followers by total likes received:
top = all_followers(userid)
  .first(1000)
  .sort_by { |f| -f["total_favorited"].to_i }
  .first(50)
Each follower item has snake_case fields: id, unique_id, nickname, signature, avatar, verified, follower_count, following_count, aweme_count, and total_favorited.
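Once the followers are in memory, group_by makes quick audience breakdowns easy. The sketch below buckets followers by their own follower_count; the thresholds and bucket names are arbitrary choices for illustration, and the sample data is invented.

```ruby
# Counts followers per audience-size bucket (thresholds are arbitrary).
def bucket_followers(followers)
  followers.group_by do |f|
    case f["follower_count"].to_i
    when 0...1_000       then :small
    when 1_000...100_000 then :mid
    else                      :large
    end
  end.transform_values(&:size)
end

# Made-up follower items using the snake_case fields listed above.
sample = [
  { "unique_id" => "a", "follower_count" => 50 },
  { "unique_id" => "b", "follower_count" => 2_500 },
  { "unique_id" => "c", "follower_count" => 250_000 },
  { "unique_id" => "d", "follower_count" => 999 }
]
p bucket_followers(sample)
```

Feeding it all_followers(userid).first(1000) would give a rough profile of who follows an account.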
Networks fail. APIs blip. A well-built scraper retries transient errors with exponential backoff but gives up fast on permanent ones (invalid key, missing user). Here is a small helper using rescue and retry the Ruby way:
class TikLiveError < StandardError; end
class TikLiveClientError < TikLiveError; end # 4xx
class TikLiveServerError < TikLiveError; end # 5xx

def with_retry(max_attempts: 4, base_delay: 0.5)
  attempt = 0
  begin
    attempt += 1
    yield
  rescue TikLiveServerError, Net::OpenTimeout, Net::ReadTimeout,
         Errno::ECONNRESET => e
    raise e if attempt >= max_attempts
    delay = base_delay * (2 ** (attempt - 1))
    warn "Attempt #{attempt} failed (#{e.class}); retrying in #{delay}s"
    sleep delay
    retry
  end
end
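
# Replaces the earlier tiklive_get: same request, but typed errors.
def tiklive_get(path, params = {})
  uri = URI("#{BASE_URL}#{path}")
  uri.query = URI.encode_www_form(params) unless params.empty?

  req = Net::HTTP::Get.new(uri)
  req["X-Api-Key"] = API_KEY
  req["Accept"] = "application/json"

  res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.read_timeout = 30
    http.request(req)
  end

  case res
  when Net::HTTPSuccess then JSON.parse(res.body)
  # 429 is a 4xx status, but it is transient, so raise the retryable error.
  when Net::HTTPTooManyRequests then raise TikLiveServerError, "Rate limited"
  when Net::HTTPServerError then raise TikLiveServerError, "HTTP #{res.code}"
  when Net::HTTPClientError then raise TikLiveClientError, "HTTP #{res.code}: #{res.body}"
  else raise TikLiveError, "Unexpected #{res.code}"
  end
end

# Usage:
data = with_retry { tiklive_get("/userinfo-by-username/", username: "tiktok") }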
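To make the backoff schedule concrete, this small sketch reproduces the delay formula from with_retry and prints the waits for the default settings. backoff_delays is a hypothetical helper written for this illustration only.

```ruby
# Delays (in seconds) that with_retry sleeps between attempts.
# There is one delay fewer than max_attempts, because the final
# failure is re-raised instead of retried.
def backoff_delays(max_attempts: 4, base_delay: 0.5)
  (1...max_attempts).map { |attempt| base_delay * (2**(attempt - 1)) }
end

delays = backoff_delays
puts "Delays: #{delays.inspect}, worst-case wait: #{delays.sum}s"
# With the defaults: 0.5s, 1.0s, 2.0s, so 3.5s of waiting before giving up.
```

If many workers fail at the same moment, adding a small random jitter to each delay is a common refinement so they do not all retry in lockstep.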
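The standard rate limit is 200 requests per minute. A 0.3 second sleep between sequential calls allows at most 60 / 0.3 = 200 calls per minute, and request latency pushes the real rate below that, so a sequential loop stays under the cap. If you need more, contact support to raise the cap.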
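A fixed sleep only works for a single sequential loop. When several threads share one API key, a shared throttle is one way to keep the combined rate under the limit. The class below is a minimal sketch, not a production rate limiter: Throttle is a hypothetical name, and the clock and sleeper arguments exist only so the behaviour can be checked without actually waiting.

```ruby
# Minimal thread-safe throttle: hands out time slots spaced
# 60 / per_minute seconds apart and sleeps until the caller's slot.
class Throttle
  attr_reader :interval

  def initialize(per_minute:,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @interval = 60.0 / per_minute
    @clock = clock
    @sleeper = sleeper
    @next_slot = nil
    @mutex = Mutex.new
  end

  # Waits for the next free slot, then runs the block and returns its value.
  def call
    wait = @mutex.synchronize do
      now = @clock.call
      slot = [@next_slot || now, now].max
      @next_slot = slot + @interval
      slot - now
    end
    @sleeper.call(wait) if wait.positive?
    yield
  end
end

throttle = Throttle.new(per_minute: 200)
puts "Seconds between requests: #{throttle.interval}"
```

Each worker would then wrap its request, for example throttle.call { tiklive_get(path, params) }, so the pacing is enforced in one place regardless of how many threads are running.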
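For batch jobs (say, refreshing 500 profiles), sequential requests are slow. MRI has a global interpreter lock (GIL), but threads release it while they are blocked on I/O, so they work well for HTTP. Load the work into a queue and drain it with a fixed number of worker threads to cap concurrency: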
def fetch_many(usernames, concurrency: 8)
  queue = Queue.new
  usernames.each { |u| queue << u }
  results = {}
  mutex = Mutex.new

  workers = concurrency.times.map do
    Thread.new do
      until queue.empty?
        # Non-blocking pop: another worker may have emptied the queue first.
        username = queue.pop(true) rescue nil
        break unless username

        # rescue is not allowed directly inside until, so wrap the work in begin/end.
        begin
          data = with_retry { tiklive_get("/userinfo-by-username/", username: username) }
          mutex.synchronize { results[username] = data }
        rescue StandardError => e
          mutex.synchronize { results[username] = { error: e.message } }
        end
      end
    end
  end

  workers.each(&:join)
  results
end

profiles = fetch_many(%w[tiktok khaby.lame charlidamelio bellapoarch], concurrency: 4)
For higher fan-out, swap threads for the async gem. It uses fibers, which are cheaper than threads and let you run hundreds of concurrent requests without OS thread overhead:
require "async"
require "async/http/internet"
require "json"
require "uri"

# usernames: an array of handles, like the one passed to fetch_many.
Async do
  internet = Async::HTTP::Internet.new
  tasks = usernames.map do |u|
    Async do
      query = URI.encode_www_form(username: u) # escape the handle for the URL
      response = internet.get("#{BASE_URL}/userinfo-by-username/?#{query}",
                              { "X-Api-Key" => API_KEY })
      JSON.parse(response.read)
    ensure
      response&.close
    end
  end
  results = tasks.map(&:wait)
ensure
  internet&.close
end
Let's tie everything together. We will track follower counts for a list of accounts every day and append the results to a CSV. This is the kind of script you would schedule via cron or the whenever gem.
require "csv"
require "date"

ACCOUNTS = %w[tiktok khaby.lame charlidamelio mrbeast]
OUTPUT = "follower_log.csv"

def snapshot(username)
  data = with_retry { tiklive_get("/userinfo-by-username/", username: username) }
  {
    date: Date.today.iso8601,
    username: data.dig("user", "uniqueId"),
    followers: data.dig("stats", "followerCount"),
    following: data.dig("stats", "followingCount"),
    likes: data.dig("stats", "heartCount"),
    videos: data.dig("stats", "videoCount")
  }
end

# Only write the header row when the file is being created.
write_header = !File.exist?(OUTPUT)

CSV.open(OUTPUT, "a", write_headers: write_header,
         headers: %i[date username followers following likes videos]) do |csv|
  ACCOUNTS.each_slice(4) do |batch|
    batch.each do |username|
      row = snapshot(username)
      csv << row.values
      puts "Logged #{username}: #{row[:followers]} followers"
    end
    sleep 1
  end
end
each_slice batches the work so you can pause between groups, which is handy if you grow the list to hundreds of accounts. Open the CSV in Excel or pipe it into a chart - you now have a daily follower history.
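Once a few days of rows have accumulated, the log can be turned into day-over-day changes. The sketch below parses CSV text in the same column layout the tracker writes and reports the follower delta between consecutive rows per account. follower_deltas is a hypothetical helper and the sample rows are invented.

```ruby
require "csv"

# Returns { username => [[date, change_since_previous_row], ...] }.
def follower_deltas(csv_text)
  rows = CSV.parse(csv_text, headers: true)
  rows.group_by { |row| row["username"] }.transform_values do |user_rows|
    user_rows.sort_by { |row| row["date"] }.each_cons(2).map do |prev, curr|
      [curr["date"], curr["followers"].to_i - prev["followers"].to_i]
    end
  end
end

# Made-up log in the same column order as follower_log.csv.
sample_log = <<~LOG
  date,username,followers,following,likes,videos
  2024-01-01,example,1000,10,5000,20
  2024-01-02,example,1040,10,5100,21
  2024-01-03,example,1025,10,5150,21
LOG

p follower_deltas(sample_log)
```

Against the real file you would pass File.read("follower_log.csv"). ISO 8601 dates sort correctly as plain strings, which is why sort_by on the date column is enough here.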
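In a Rails app, swap the CSV writer for a model. Create a FollowerSnapshot model with columns such as username, followers, videos, and captured_on, then write to it from a service object. The code below calls TikLive.user_info, which this tutorial never defines; treat it as a thin wrapper around the tiklive_get and with_retry helpers: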
# app/services/tiktok_tracker.rb
class TiktokTracker
  def self.snapshot_all(usernames)
    usernames.each do |u|
      data = TikLive.user_info(u)
      FollowerSnapshot.create!(
        username: data.dig("user", "uniqueId"),
        followers: data.dig("stats", "followerCount"),
        videos: data.dig("stats", "videoCount"),
        captured_on: Date.today
      )
    end
  end
end
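The service above relies on TikLive.user_info. One possible shape for that wrapper is sketched below; the module layout and the transport accessor are assumptions made for this example. The transport is swappable so the wrapper can be stubbed in tests, and its default simply delegates to the with_retry and tiklive_get helpers from earlier sections.

```ruby
# Hypothetical wrapper module so app code can call TikLive.user_info(username).
module TikLive
  class << self
    # Allows tests to replace the HTTP layer with a stub.
    attr_writer :transport

    # Default transport: assumes with_retry and tiklive_get are loaded.
    def transport
      @transport ||= ->(path, params) { with_retry { tiklive_get(path, params) } }
    end

    def user_info(username)
      transport.call("/userinfo-by-username/", { username: username })
    end
  end
end

# Stubbed transport with made-up data, so this snippet runs offline.
TikLive.transport = lambda { |_path, params|
  { "user" => { "uniqueId" => params[:username] },
    "stats" => { "followerCount" => 42 } }
}
puts TikLive.user_info("example").dig("stats", "followerCount")
```

In a Rails app this would typically live under app/services/ or lib/, and the stub assignment would only appear in the test suite.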
Hitting the API inline during a web request is a recipe for slow controllers. Push the work to Sidekiq:
# app/workers/tiktok_snapshot_worker.rb
class TiktokSnapshotWorker
  include Sidekiq::Worker # newer Sidekiq versions also offer the Sidekiq::Job alias
  sidekiq_options retry: 5, queue: :scraping

  def perform(username)
    data = TikLive.user_info(username)
    FollowerSnapshot.create!(
      username: data.dig("user", "uniqueId"),
      followers: data.dig("stats", "followerCount"),
      captured_on: Date.today
    )
  end
end

# Enqueue one job per account, fan out across workers:
ACCOUNTS.each { |u| TiktokSnapshotWorker.perform_async(u) }

# Or schedule it daily. With whenever, in config/schedule.rb:
# every 1.day, at: '3:00 am' do
#   runner "ACCOUNTS.each { |u| TiktokSnapshotWorker.perform_async(u) }"
# end
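Sidekiq's built-in retry handles transient failures, and a dedicated :scraping queue lets you cap concurrency separately from the rest of your app, for example by running a separate Sidekiq process with its own concurrency setting for that queue.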
The same client patterns work for the full 37-endpoint surface. A few popular extensions:
- Call /post-detail/ with a video URL to get the no-watermark play URL, then /post-comments/ to pull comment threads (the comment id field is id, not cid).
- /search-user/, /search-video/, and /search-challenge/ all use a keyword param plus cursor pagination.
- /download-video/ returns no-watermark URLs (video for standard quality, video_hd for HD) ready to pipe into open-uri or save to S3 via the aws-sdk-s3 gem.
- /music-info/ and /music-posts/ let you track trending sounds.

Want to experiment before writing code? The interactive playground lets you fire requests in your browser and see the exact JSON shape.
You do not need a TikTok account or login. TikLiveAPI works against public TikTok data using a per-account API key. You never share TikTok credentials, and the dashboard never asks for them.
net/http, HTTParty, and Faraday all work. net/http is built in and has zero dependencies, which is ideal for small scripts and gems. HTTParty gives you nicer ergonomics (HTTParty.get(url, headers: ...)) and built-in JSON parsing. Faraday shines when you need middleware (logging, instrumentation, retries) and is the standard for larger Rails apps.
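The default cap is 200 requests per minute. Sleep 0.3 seconds between calls in a sequential loop, or cap thread/Async fiber concurrency to around 8 workers. Keep in mind that a worker cap limits requests in flight, not requests per minute, so fast responses can still exceed the limit; pace the workers as well if that happens. Catch HTTP 429 responses and back off exponentially via the with_retry helper. If you consistently hit the ceiling, ask support to raise the cap.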
You can call the API directly from a Rails controller, but that ties your request latency to TikTok's response time. The better pattern is to enqueue a Sidekiq job, render a "loading" state, and update the UI via Turbo Streams or polling once the job completes. Cache resolved userids in your database so you do not pay one credit per page view.
One /userinfo-by-username/ call equals one credit. Snapshotting 100 accounts per day costs 100 credits, or roughly 3000 per month. The starter package on the pricing page starts at 9.90 USD, and credits never expire - if you skip a few days, you keep your balance.
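The same arithmetic is easy to script when you want to budget for a different account list or a tighter schedule. monthly_credits is a hypothetical helper, and it assumes one request (one credit) per account per run, as described above.

```ruby
# Credits consumed by a snapshot job: one request per account per run.
def monthly_credits(accounts:, runs_per_day: 1, days: 30)
  accounts * runs_per_day * days
end

puts monthly_credits(accounts: 100)                  # 100 * 1 * 30 = 3000
puts monthly_credits(accounts: 100, runs_per_day: 4) # 100 * 4 * 30 = 12000
```

Retries and any extra calls (such as resolving userids) are not included, so treat the result as a lower bound.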
Ruby plus TikLiveAPI is a great combo: a few lines of net/http get you authenticated requests, Enumerator.new turns cursor pagination into clean lazy iteration, threads and Async give you concurrency without ceremony, and Sidekiq slots in for production-grade background work. Whether you are building a Rails analytics dashboard, a Sinatra microservice, or just logging follower counts to a CSV, you have the building blocks now.
Ready to start? Grab 100 free credits and try the snippets above against your own account. Then explore the full users endpoint reference and the live playground for everything else.