Bluesky Research
We are a team of researchers performing measurements on Bluesky.
Why are you doing this?
The pitfalls of centralized social networks, such as Facebook and Twitter/X, have led to concerns about control, transparency, and accountability. Decentralized social networks have emerged in response, with the goal of empowering users. In contrast to alternative approaches (e.g., Mastodon), Bluesky decomposes the key functions of the platform into subcomponents and opens them up so that they can be provided by third-party stakeholders.
We investigate this novel architecture, measure the network, describe its components, and examine how all of this affects users.
Who are you?
If you want, contact us on Bluesky:
- Leonhard Balduf: leobalduf.bsky.social
- Saidu Sokoto: bibo7086.bsky.social
- Dr Michał Król: harnen.bsky.social
- Onur Ascigil: asonur.bsky.social
- Gareth Tyson: garethtyson.bsky.social
- Ignacio Castro: ignactro.bsky.social
- Andrea Baronchelli: baronca.bsky.social
We are associated with the Communication Networks Lab at TU Darmstadt, Germany; the School of Science and Technology at City, University of London, UK; the School of Computing and Communications at Lancaster University, UK; the School of Electronic Engineering and Computer Science at Queen Mary, University of London, UK; the Hong Kong University of Science and Technology (GZ), China; the University of Grenoble, Alpes, Ensimag, France; and the former Trust in Distributed Systems research group at the Weizenbaum Institute, Germany.
Papers
IMC ’24: Looking AT the Blue Skies of Bluesky
In our IMC ’24 paper we look at the overall structure, datasets, and growth of Bluesky. We conduct the first large-scale analysis of this novel microblogging platform. We collect a comprehensive dataset spanning all the key elements of Bluesky up to April 2024: about 5.5M users, 225M posts, 40k Feed Generators, and 62 Labelers. We study the uptake of the functionalities that Bluesky opens to third parties. Our findings show substantial uptake of functionalities related to content curation.
Please cite as such:
@inproceedings{balduf2024bluesky,
author = {Balduf, Leonhard and Sokoto, Saidu and Ascigil, Onur and Tyson, Gareth and Scheuermann, Bj\"{o}rn and Korczy\'{n}ski, Maciej and Castro, Ignacio and Kr\'{o}l, Micha{\l}},
title = {Looking AT the Blue Skies of Bluesky},
year = {2024},
url = {https://doi.org/10.1145/3646547.3688407},
doi = {10.1145/3646547.3688407},
booktitle = {Proceedings of the 2024 ACM on Internet Measurement Conference},
}
Replication
To replicate this work, you need to obtain Firehose updates, Labelers, Feed Generators, and the DID documents of every user. We make most of the tooling for this public; please see below.
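As a rough illustration of the DID document step, the sketch below maps a user's DID to the URL where its document is usually served. It is not taken from our tooling: the function name `did_document_url` and the `plc_directory` parameter are hypothetical, and it assumes only the two DID methods used on Bluesky, `did:plc` (looked up via the public PLC directory) and `did:web` (looked up under `/.well-known/did.json` on a plain hostname).

```python
def did_document_url(did, plc_directory="https://plc.directory"):
    """Return the URL at which the DID document for `did` can be fetched.

    Hypothetical helper for illustration; assumes did:plc and hostname-only
    did:web identifiers.
    """
    parts = did.split(":", 2)
    if len(parts) != 3 or parts[0] != "did" or not parts[2]:
        raise ValueError(f"not a DID: {did!r}")
    method, identifier = parts[1], parts[2]

    if method == "plc":
        # The PLC directory serves the document under the full DID.
        return f"{plc_directory}/{did}"
    if method == "web":
        # did:web documents live at a well-known path on the named host.
        return f"https://{identifier}/.well-known/did.json"
    raise ValueError(f"unsupported DID method: {method!r}")
```

Fetching the returned URL with any HTTP client then yields the JSON document. A full crawl would additionally need rate limiting and retries, which are left out here.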
ICWSM ’25: Bootstrapping Social Networks: Lessons from Bluesky Starter Packs
In our ICWSM ’25 paper we look at Starter Packs and their impact. We curate a complete dataset up to the end of 2024, with 25M users and 335k Starter Packs with 1.7M members. We identify follows resulting from starter packs and confirm that starter packs help users bootstrap their social network.
Please cite as such:
@inproceedings{balduf2025bootstrapping,
title={Bootstrapping Social Networks: Lessons from Bluesky Starter Packs},
author={Balduf, Leonhard and Sokoto, Saidu and Baronchelli, Andrea and Castro, Ignacio and Kr{\'o}l, Micha{\l} and Tyson, Gareth and Pavlou, George and Scheuermann, Bj{\"o}rn and Ascigil, Onur},
volume={19},
pages={178--192},
year={2025},
url={https://doi.org/10.1609/icwsm.v19i1.35810},
doi={10.1609/icwsm.v19i1.35810},
booktitle={Proceedings of the International AAAI Conference on Web and Social Media}
}
Replication
To replicate this work, you need snapshots of the entire network, in addition to everything from the previous work. The tools for this are open source; see below.
To match multi-follow operations extracted from the Firehose to starter packs, you would need:
- The state of every starter pack at every point in time, which can be realized through Firehose updates.
- Multi-follow operations extracted from the Firehose.
- A tool to intersect them with high performance, which we open source here.
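The matching idea in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our high-performance tool: the names `PackVersion`, `members_at`, and `match_multi_follow`, the integer timestamps, and the `min_overlap` threshold of 0.8 are all hypothetical choices made for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class PackVersion:
    timestamp: int       # time at which this membership became current
    members: frozenset   # DIDs contained in the starter pack from then on


def members_at(history, when):
    """Return the membership valid at time `when` (history sorted by timestamp)."""
    times = [version.timestamp for version in history]
    idx = bisect_right(times, when) - 1
    return history[idx].members if idx >= 0 else frozenset()


def match_multi_follow(followed, when, packs, min_overlap=0.8):
    """Return (pack_id, overlap) pairs for packs that plausibly explain a multi-follow.

    `overlap` is the share of followed DIDs that were pack members at `when`.
    """
    followed = frozenset(followed)
    if not followed:
        return []
    matches = []
    for pack_id, history in packs.items():
        members = members_at(history, when)
        if not members:
            continue  # pack did not exist yet, or was empty
        overlap = len(followed & members) / len(followed)
        if overlap >= min_overlap:
            matches.append((pack_id, overlap))
    # Best-matching packs first.
    return sorted(matches, key=lambda match: match[1], reverse=True)
```

Looking up the pack state at the time of the follow operation matters because pack membership changes over time. The linear scan over all packs is the simplest possible approach and would be too slow at the scale of the full network, which is why a dedicated tool is needed in practice.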
WWW ’26: Open or Blocked Skies? Community Moderation Practices in Bluesky
In our WWW ’26 paper we look at community-driven moderation on Bluesky. We collect a comprehensive dataset covering moderation activities across 34M users, including both individual blocking actions and community-maintained blocklists. We study the characteristics of blocked users, the different behaviors targeted by blocklists, and the effects of blocking on user visibility, activity, popularity, and social connections. Our findings show that community blocking is widespread, operating at a scale orders of magnitude larger than official moderation actions and affecting the visibility of more than 90% of Bluesky content. We further find that blocked accounts are among the most active, popular, toxic, and politically engaged users, while blocking has only limited effects on their subsequent activity and position in the social graph.
Please cite as such:
@inproceedings{10.1145/3774904.3792106,
author = {Sokoto, Saidu and Balduf, Leonhard and Ascigil, Onur and Tyson, Gareth and Castro, Ignacio and Scheuermann, Bj\"{o}rn and Baronchelli, Andrea and Kr\'{o}l, Micha\l{}},
title = {Open or Blocked Skies? Community Moderation Practices in Bluesky},
year = {2026},
url = {https://doi.org/10.1145/3774904.3792106},
doi = {10.1145/3774904.3792106},
booktitle = {Proceedings of the ACM Web Conference 2026},
series = {WWW '26}
}
Replication
The source code for this paper is publicly available at https://doi.org/10.5281/zenodo.18351518.
Datasets
We collect a number of datasets for our research, some of which are available publicly in anonymized form. Please see Bluesky Datasets.
Please see our privacy policy for contact information and details about the data collected.
Services
Please see Bluesky Services.
Privacy Policy
Since our work includes collecting data of potentially real humans, we wrote ourselves a privacy policy.