Are We Decentralized Yet?

fetch-nodeinfo-bot

fetch-nodeinfo-bot crawls data reported via the nodeinfo standard by a variety of federated services. This data commonly includes the type and version of software in use, the number of users, the number of posts created locally on that site, and information about which federation protocols the site supports.
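As an illustration of the fields mentioned above, here is a minimal sketch of reading them from a nodeinfo 2.0 document. The sample document and the `summarize_nodeinfo` helper are hypothetical and written for this example; they are not taken from the bot's source code, and real servers may omit fields or report them differently.

```python
import json

# Hypothetical sample document; the values are invented for illustration.
SAMPLE_NODEINFO = """
{
  "version": "2.0",
  "software": {"name": "mastodon", "version": "4.2.0"},
  "protocols": ["activitypub"],
  "usage": {"users": {"total": 120, "activeMonth": 45}, "localPosts": 9876},
  "openRegistrations": false
}
"""


def summarize_nodeinfo(raw: str) -> dict:
    """Pull out the commonly reported fields, tolerating missing keys."""
    doc = json.loads(raw)
    software = doc.get("software", {})
    usage = doc.get("usage", {})
    return {
        "software": software.get("name"),
        "version": software.get("version"),
        "users": usage.get("users", {}).get("total"),
        "local_posts": usage.get("localPosts"),
        "protocols": doc.get("protocols", []),
    }


summary = summarize_nodeinfo(SAMPLE_NODEINFO)
print(summary)
```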

It presents User-Agent: fetch-nodeinfo-bot (+https://arewedecentralizedyet.online/) and can be blocked via robots.txt.

Crawling Practices

fetch-nodeinfo-bot gets its list of hosts to crawl from nodes.fediverse.party. The methodology for node discovery and listing is described on that site, and this extensive documentation is a large part of why I chose it as my node list. fetch-nodeinfo-bot does not discover other nodes itself.

fetch-nodeinfo-bot respects robots.txt, and will not fetch nodeinfo from sites that restrict crawlers in general, or its User-Agent specifically, from /.well-known/nodeinfo. Thus, if you do not want your site to be included, a robots.txt rule that covers this path will block this bot.
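To check how a given robots.txt would be interpreted, the sketch below evaluates a few example rule sets with Python's standard `urllib.robotparser`. This is only an approximation: the bot may use a different parser with slightly different matching rules, and the `is_allowed` helper and the example rule sets are assumptions made for this illustration.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "fetch-nodeinfo-bot"
NODEINFO_PATH = "/.well-known/nodeinfo"


def is_allowed(robots_txt: str, user_agent: str = USER_AGENT) -> bool:
    """Return True if these rules let the user agent fetch the nodeinfo path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, NODEINFO_PATH)


# Blocks this bot specifically from the whole site.
BLOCK_THIS_BOT = """\
User-agent: fetch-nodeinfo-bot
Disallow: /
"""

# Blocks every crawler from the nodeinfo discovery path.
BLOCK_ALL_FROM_NODEINFO = """\
User-agent: *
Disallow: /.well-known/nodeinfo
"""

# An empty Disallow line permits everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

for name, rules in [
    ("block this bot", BLOCK_THIS_BOT),
    ("block all from nodeinfo", BLOCK_ALL_FROM_NODEINFO),
    ("allow all", ALLOW_ALL),
]:
    print(name, "->", "allowed" if is_allowed(rules) else "blocked")
```

Either of the first two rule sets is enough to keep a site out of the crawl under this interpretation.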

fetch-nodeinfo-bot is currently set to fetch each server's nodeinfo approximately once per day, with separate TTLs for re-fetching robots.txt and re-trying in case of failures. It uses a rate-limit per IP address block to avoid overloading shared infrastructure (such as multiple sites on the same physical or virtual server), and lowers this limit in response to receiving HTTP 429 (rate limit exceeded) responses.
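The per-block rate limit described above can be sketched roughly as follows. This is not the bot's actual implementation: the `BlockRateLimiter` class, the /24 and /48 block sizes, the one-second base delay, and the doubling on HTTP 429 are all assumptions chosen for illustration. Lowering the rate limit is modelled here as increasing the minimum delay between requests to the same block.

```python
import ipaddress


class BlockRateLimiter:
    """Enforce a minimum delay between requests to the same IP address block."""

    def __init__(self, base_delay=1.0, max_delay=60.0, v4_prefix=24, v6_prefix=48):
        # All defaults are hypothetical values, not the bot's real settings.
        self.base_delay = base_delay
        self.max_delay = max_delay
        self.v4_prefix = v4_prefix
        self.v6_prefix = v6_prefix
        self.delays = {}        # block -> current delay in seconds
        self.next_allowed = {}  # block -> earliest time of the next request

    def block_for(self, ip: str):
        """Map an address to the network block it shares with its neighbours."""
        addr = ipaddress.ip_address(ip)
        prefix = self.v4_prefix if addr.version == 4 else self.v6_prefix
        return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)

    def wait_time(self, ip: str, now: float) -> float:
        """Seconds to wait before the next request to this address's block."""
        block = self.block_for(ip)
        return max(0.0, self.next_allowed.get(block, 0.0) - now)

    def record_request(self, ip: str, now: float, status: int) -> None:
        """Record a completed request; slow the block down after an HTTP 429."""
        block = self.block_for(ip)
        delay = self.delays.get(block, self.base_delay)
        if status == 429:
            delay = min(delay * 2, self.max_delay)
        self.delays[block] = delay
        self.next_allowed[block] = now + delay


limiter = BlockRateLimiter(base_delay=1.0)
limiter.record_request("192.0.2.10", now=0.0, status=200)
wait_same_block = limiter.wait_time("192.0.2.99", now=0.5)

limiter.record_request("192.0.2.99", now=1.0, status=429)
wait_after_429 = limiter.wait_time("192.0.2.10", now=1.0)
wait_other_block = limiter.wait_time("198.51.100.1", now=1.0)
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a real crawler would more likely read them from a monotonic clock. Keying the state on the address block rather than the hostname is what lets several sites on one shared server draw from the same budget.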

Data Use

Data gathered by this bot is used to create the website arewedecentralizedyet.online, which compares how centralized or decentralized social networks and other Web services are in practice. Historical data and raw nodeinfo snapshots are available in the Data section of the site.

If you would like data about your site to be removed from this dataset, contact the author.

Author and Code

This bot is written and operated by Robert Ricci, who can be reached at rob [at] ricci [dot] io. The source code for this bot is on Codeberg.
