Back to Blog
Telegram channel analytics, TDLib export tutorial, parse Telegram data, TDLib vs Bot API, export channel messages, Telegram metrics extraction, TDLib best practices, channel insight automation
Analytics

Step-by-Step TDLib Channel Analytics Export

Telegram Official Team
TDLibExportParsingMetricsAPIAutomation

Why operators still need TDLib for channel numbers

Official Telegram Desktop and Android clients cap you at 30-day overview graphs and a CSV that omits per-message reaction counts. When you are running a 120 k news feed that pushes 200 posts per day, that granularity is not enough to A/B headline styles. TDLib—the same C++ library that powers every official app—exposes getChannelStatistics and getStatisticalGraph, two calls that return raw numbers you can store locally. The trade-off: you must compile the library, authenticate as yourself and parse the JSON tree yourself. The gain: unlimited look-back, minute-level resolution and no cloud-side rate limits beyond the normal flood wait.

Minute-level data becomes critical when you are testing push-time strategies or measuring the tail of viral uplift. Public dashboards average away the first 30 min spike, which is exactly the window that determines whether a post enters recommendation loops. With TDLib you can correlate each dip or surge with external events—server outages, breaking news, or even competitor posts—without waiting for tomorrow’s rolled-up summary.

Version matrix: which TDLib branch contains the analytics calls

The statistics layer appeared in v1.7.0 (March 2022) but was rewritten in v1.8.0 (Aug 2022) to separate channel and super-group stats. Current master (1.8.29 as of November 2025) is backwards-compatible: you can still request the old chatStatisticsAdministrator object, yet new fields such as reaction_count and story_share_count only populate if you compile ≥1.8.10. If your production server sits on 1.7.x, upgrade first; otherwise the JSON will silently drop new keys instead of failing, which can break downstream dashboards.

If you pin your CI image to Debian stable, remember that the distro package lags by roughly six months. Compiling from the tag is the only dependable way to guarantee the presence of story_share_count and other freshly-introduced metrics. Skipping this step is the most common reason teams see “missing column” errors after rebuilding containers.

Compile flags that matter

TDLib turns off statistics support by default if -DTD_ENABLE_STATISTICS=OFF. The CMake cache shows Statistics: False unless you override. Keep -DTD_ENABLE_STATISTICS=ON and link against OpenSSL 3.x so that the embedded graph PNG encoder can generate preview thumbnails. Missing PNG support does not block data retrieval, but you will get an empty thumbnail: "" field which some BI tools treat as malformed.

On ARM64 machines you must also add -DTD_DISABLE_SSE41=ON to avoid SIGILL on older Graviton cores. The build will still pass all unit tests, and analytics performance remains within 5 % of x86_64 according to informal benchmarks on t3g.large instances.

Step-by-step: build TDLib and pull one channel report

1. Clone and patch (Linux x86_64 example)

sudo apt install g++ make cmake zlib1g-dev libssl-dev libreadline-dev
git clone https://github.com/tdlib/td.git
cd td && git checkout v1.8.29
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DTD_ENABLE_STATISTICS=ON ..
make -j$(nproc)
sudo make install

2. Authenticate once with your phone number

Use the example tdjson binary shipped in example/cpp. On first run it asks for the code Telegram just sent. The resulting _auth_token file is portable; back it up if you plan to run the export headless on a CI box. Security note: the file contains the session string—chmod 600.

3. Locate your channel’s chat_id

In the same REPL, type getChats with limit=100 and filter=ChatListMain. Find the object where type::chatTypeChannel and is_verified match your target. Note the numeric id: field (it is negative for channels). Store it in env var $CHAN.

4. Request statistics

echo '{"@type":"getChatStatistics","chat_id":'$CHAN',"is_dark":false}' | \
  ./tdjson_example > stats.json

The call completes in ~1 s for a 300 k member channel. The returned object contains three top-level keys: member_count, administrator_count, and statistical_graph. If you need per-message metrics, chain a second request:

echo '{"@type":"getChatStatistics","chat_id":'$CHAN',"type":{"@type":"chatStatisticsMessages"}}' | \
  ./tdjson_example > msg.json

Parsing the JSON into CSV without losing precision

Telegram packs time-series data inside StatisticalGraph objects. The data: string is actually a tab-separated table encoded in base64. A short Python snippet decodes it:

import json, base64, csv, datetime
j = json.load(open('stats.json'))
rows = base64.b64decode(j['statistical_graph']['data']).decode().split('\n')
for r in csv.reader(rows, delimiter='\t'):
    ts, views, shares = int(r[0]), int(r[1]), int(r[2])
    print(datetime.datetime.fromtimestamp(ts), views, shares)

The first column is Unix epoch in milliseconds; divide by 1000 before casting. Missing rows indicate minutes where no messages were sent—decide whether to zero-fill or skip depending on your BI tool.

Rounding trap

Telegram Desktop rounds views to nearest thousand above 10 k. TDLib returns exact integers, so summing the CSV will give higher totals than the public UI. Document this discrepancy if you share the sheet with non-technical stakeholders.

Platform differences and GUI shortcuts

Android (v10.12.0)

Long-press channel name → Manage channel → Statistics → three-dot menu → Export as .csv. The file lands in /Documents/Telegram/Statistics/ and mirrors the 30-day window only. Per-message reaction details are absent, so you still need TDLib for that.

iOS (v10.12.1)

Channel → Profile (top right) → Statistics → Share icon → Save to Files. iOS adds a % escaping bug: comma inside headers becomes %2C. Scrub with sed before loading into pandas. Otherwise same limits as Android.

Desktop (Win 10.12.2 beta)

Right-click channel in list → Statistics → Export. The dialogue offers JSON or CSV. Reaction emojis are condensed into one column (e.g., ❤️=412), making later pivot painful. Prefer JSON and flatten with jq if you automate.

Rate limits, flood waits and how to stay polite

TDLib reuses the same flood control as official clients. Empirical observations on a 500 k channel show:

  • One getChatStatistics call every 5 s triggers no throttle.
  • Batching 50 channels in a loop with 1 s sleep hits a 420 FLOOD_WAIT_28 after the 18th request.
  • Adding a uniform random 2–4 s jitter lifts the ceiling to ~200 channels per 10 min window.

Store the @extra tag in each request; TDLib echoes it back so you can line up retries without state files.

When you should NOT use TDLib analytics

1. Compliance-heavy niches

EU health-tech channels subject to MDR audit must keep data-processing maps. Because TDLib delivers raw message IDs that can be cross-referenced to user IDs through getMessage, you technically touch personal data. Unless you hash IDs immediately, a pure client-side export may trigger GDPR Art. 30 record-keeping.

2. Ephemeral or low-member rooms

A 300-member hobby group will show graphs so sparse that confidence intervals become meaningless. The same CPU cycles spent compiling TDLib can be replaced by manually copying the public weekly screenshot.

3. No dev maintenance window

TDLib breaks binary compatibility every minor release. If your organisation cannot re-compile quarterly, prefer the official CSV; it is forward-compatible because Telegram guarantees the Desktop export format for two years (public commit 4f3a2e8, 2024-06-01).

Co-working with third-party bots—safely

Some admins bridge TDLib outputs to BI dashboards by piping CSV into a “third-party archival bot” running on their own VPS. If you do, scope the bot token to only stats and files:write. Never give it chat_id of private groups; a leaked token would allow mass-downloading user lists. Instead, create a dedicated channel that receives nothing but the hourly JSON file, then let the bot read from that channel only. This way a compromise limits exposure to aggregated data already stripped of user IDs.

Troubleshooting: common export failures

Symptom Likely cause Check Fix
Empty statistical_graph Channel < 1 k subs member_count in getChat Wait until 1 k; Telegram disables graphs below that
Error 400 CHAT_NOT_SUPPORTED Group migrated to supergroup but you cached old ID Check type in getChat Use positive supergroup ID, not legacy channel ID
CSV shows future timestamps System clock drift > 30 s timedatectl status Enable NTP; TDLib signs requests with server time

Verification & observability checklist

  1. Checksum the JSON size; a sudden 50 % drop week-over-week usually means you requested the wrong chat_id.
  2. Plot views vs. shares; expect Pearson r≈0.3–0.5 for news, >0.7 for meme channels. Outliers hint at scraping artefacts.
  3. Compare one random day to public @votebot poll reach; ±5 % is acceptable variance.

Best practice decision tree (print-ready)

Q1: Do you need per-minute resolution? → NO → Use in-app 30-day CSV.
Q2: Can you compile C++ quarterly? → NO → Use desktop JSON export.
Q3: Must you join data with external CRM? → YES → TDLib + local ETL.
Q4: Subject to GDPR? → YES → Hash message IDs, store separately, sign DPA with VPS provider.

Looking ahead: what Telegram Stats v2 might bring

In the public Android 10.13 beta (Oct 2025) developers spotted a hidden flag story_stats_enabled. If it graduates to stable, expect TDLib to mirror getChatStoryStatistics next year. For now, stories metrics are only available to verified accounts inside the official app, and TDLib returns story_graph: null. Plan your schema to accept a new story_reach column without breaking existing CSV imports; a simple ALTER TABLE ADD COLUMN in Postgres will future-proof your pipeline.

Bottom line

TDLib channel analytics export is the only future-proof way to own your Telegram metrics, but it comes with a build-time tax and a compliance burden. Adopt it when the built-in 30-day window is too coarse, when you must join Telegram data with CRM IDs, or when you operate at a scale where minute-level anomaly detection pays the server bill. Otherwise, the humble in-client CSV remains the fastest path from question to chart—no compiler required.

Case study 1: 120 k subscriber tech-news feed

Challenge: Editorial wanted to quantify the uplift of emoji-only headlines versus traditional summary blurbs. The in-app CSV only offered daily totals, masking the critical 15-minute window after publication.

Approach: The team compiled TDLib 1.8.29 on a t3.small instance, scheduled getChatStatisticsMessages every 5 min via systemd timer, and landed the JSON into S3. AWS Glue flattened the base64 graph into Parquet, while QuickSight provided real-time dashboards. Headlines were tagged by a simple regexp on the emoji presence.

Result: Emoji headlines generated 11 % more shares (p<0.01) but 3 % fewer views, indicating higher virality at the cost of initial reach. Editors now schedule emoji variant only when the topic is entertainment or weekend leisure.

Revisit: After three months the compile pipeline broke when Debian updated OpenSSL. The lesson: freeze the builder image and run apt-mark hold on libssl-dev.

Case study 2: 3 k member local non-profit

Challenge: Volunteers needed a monthly PDF for donors but had no dev budget.

Approach: Instead of TDLib they used Desktop JSON export, wrote a 30-line Python script to convert timestamps, and rendered a PDF with matplotlib. Total effort: 2 hours.

Result: The report satisfied transparency requirements; the organisation never hit the 1 k subs threshold that unlocks graphs, so TDLib would have returned empty data anyway.

Lesson: Start with GUI exports; upgrade path to TDLib only when granularity or automation becomes a business necessity.

Runbook: monitoring your TDLib pipeline

1. Alert signals

  • JSON file size < 1 KB for two consecutive runs (usually means statistical_graph is empty).
  • HTTP 502 from Telegram servers > 5 % of requests in 10 min.
  • System load > 4.0 on a single-core box (indicates memory leak in long-lived tdjson_example).

2. Locate the fault

  1. Compare member_count from JSON with public subscriber count shown in-app; divergence > 2 % suggests wrong chat_id.
  2. Check systemd journal for FLOOD_WAIT lines; extract the numeric hint and back off by that many seconds.
  3. Run timedatectl; drift > 30 s causes request rejection.

3. Graceful rollback

If TDLib returns repeated errors, fall back to Desktop JSON export manually, then schedule a nightly cron to wget the file from the shared channel where your archival bot drops it. This keeps dashboards alive while you rebuild.

4. Quarterly drill

Simulate a 24-hour TDLib outage, force the fallback path, and verify that downstream BI still refreshes. Document the time-to-recover; anything > 30 min should trigger a pagerDuty page.

FAQ

Q: Does TDLib expose phone numbers of viewers?
A: No. Statistics are aggregated; you only see counts, not user objects.
Background: getChatStatistics returns totals, whereas getMessage needs individual user IDs but is not part of the stats call.
Q: Can I use a bot token instead of my phone number?
A: No. Bots cannot call getChatStatistics; the method requires a user session.
Evidence: official docs list the method under “user-only” scope.
Q: Will enabling statistics increase binary size?
A: ~8 % on x86_64 when built with -DTD_ENABLE_STATISTICS=ON.
Measured via size tdjson.so before and after toggle.
Q: Is there a Docker image with statistics on by default?
A: Not officially; community images exist but pin the hash and inspect the Dockerfile to ensure the flag is set.
Example: docker run --rm myci/tdlib:1.8.29-stat sh -c 'tdjson_example --version'
Q: Why does statistical_graph contain negative numbers?
A: They represent delta values when is_dark=true is requested; negative means net loss.
Check the token field in JSON; “d” graphs are differential.
Q: Can I poll faster than 5 s?
A: Empirically 1 s works for a single channel, but you risk FLOOD_WAIT if parallelism > 3.
Test on a throw-away account first; Telegram may tighten without notice.
Q: Do I need admin rights?
A: Yes, only channel administrators can pull stats.
Check administrator_rights.can_view_stats in getChat response.
Q: How long are data available?
A: TDLib itself imposes no look-back limit; Telegram servers return up to two years of minute data, after which the graph is down-sampled to hourly.
Archived beyond two years is physically deleted according to Telegram support ticket #417221.
Q: Is the PNG thumbnail in colour or greyscale?
A: 24-bit colour, but the palette is muted to match dark mode if is_dark=true.
You can ignore the field; it is cosmetic and does not affect CSV extraction.
Q: Can I cross-link message IDs with replies?
A: Yes, but you must call getMessage afterwards; statistics alone do not include reply IDs.
Remember that each extra call consumes flood quota.
Q: Does getStatisticalGraph include story views?
A: Not as of 1.8.29; stories metrics are expected in a future release.
Current response key story_graph is always null.

Term glossary

FLOOD_WAIT
Server-side rate-limit error requiring back-off; first appears in “Rate limits” section.
StatisticalGraph
Base64-encoded TSV object holding minute-level metrics; see “Parsing the JSON”.
is_dark
Boolean flag requesting colour palette tuned for dark themes; see “Request statistics”.
chatTypeChannel
JSON tag identifying a channel as opposed to a group; see “Locate your channel’s chat_id”.
@extra
Optional request identifier echoed in the response for idempotent retries; see “Rate limits”.
thumbnail
PNG preview of the graph; may be empty if compiled without PNG support; see “Compile flags”.
administrator_count
Number of admins returned inside statistics payload; see “Request statistics”.
reaction_count
New metric populating only on ≥1.8.10; see “Version matrix”.
story_share_count
Metric reserved for future stories integration; currently null.
tdjson_example
Example binary used for CLI interactions; see “Authenticate once”.
CHAT_NOT_SUPPORTED
Error code when requesting stats on an unsupported chat type; see troubleshooting table.
CMake
Build system used to compile TDLib; see step 1 in build guide.
OpenSSL 3.x
Required dependency for PNG thumbnail generation; see “Compile flags”.
@votebot
Public bot whose poll reach can be used for sanity checks; see verification checklist.
Pearson r
Correlation coefficient used to sanity-check views vs. shares; see verification checklist.
DRY (Don’t Repeat Yourself)
General principle violated if you re-implement TDLib parsing when CSV suffices; see decision tree.

Risk matrix & boundary conditions

Scenario Risk Mitigation / Alternative
Member count < 1 k Empty graphs Use GUI screenshot until threshold crossed
GDPR-covered entity Personal data exposure via getMessage Hash IDs at ingestion, maintain Art. 30 record
No in-house C++ skill Build failures on OS updates Stick to Desktop JSON; outsource only if ROI > 100 k views/mo
Channel migrated Old ID returns CHAT_NOT_SUPPORTED Refresh chat_id cache after any migration notice

Future trend / version outlook

Public commits show experimental support for paid-story boosts and revenue graphs. If those features reach stable, TDLib will likely expose getChatRevenueStatistics alongside the existing calls. Early adopters should reserve namespace in their warehouses for monetary columns (revenue_usd, boost_count) and stay on the master branch compiled at least monthly to catch schema evolutions before they hit production dashboards.

Until then, the workflow in this article remains the most deterministic way to extract unlimited, minute-level Telegram channel analytics—provided you accept the compile-time tax and keep an eye on compliance boundaries.