Bluesky Research

We are a team of researchers performing measurements on Bluesky.

Why are you doing this?

The pitfalls of centralized social networks, such as Facebook and Twitter/X, have raised concerns about control, transparency, and accountability. Decentralized social networks have emerged in response, with the goal of empowering users. In contrast to alternative approaches (e.g., Mastodon), Bluesky decomposes the key functions of the platform into subcomponents and opens them up so that they can be provided by third-party stakeholders.

We investigate Bluesky's novel architecture: we measure the network, describe its components, and study how this design affects users.

Who are you?

If you want, contact us on Bluesky:

We are associated with the Communication Networks Lab at TU Darmstadt, Germany; the School of Science and Technology at City, University of London, UK; the School of Computing and Communications at Lancaster University, UK; the School of Electronic Engineering and Computer Science at Queen Mary, University of London, UK; the Hong Kong University of Science and Technology (GZ), China; the University of Grenoble, Alpes, Ensimag, France; and the former Trust in Distributed Systems research group at the Weizenbaum Institute, Germany.

Papers

IMC ’24: Looking AT the Blue Skies of Bluesky

In our IMC ’24 paper, we examine the overall structure, datasets, and growth of Bluesky in the first large-scale analysis of this novel microblogging platform. We collect a comprehensive dataset of all the key elements of Bluesky up to April 2024, covering about 5.5M users, 225M posts, 40k Feed Generators, and 62 Labelers. We study the uptake of the functionalities that Bluesky opens to third parties and find substantial uptake of content-curation functionalities.

Please cite as such:

@inproceedings{balduf2024bluesky,
    author = {Balduf, Leonhard and Sokoto, Saidu and Ascigil, Onur and Tyson, Gareth and Scheuermann, Bj\"{o}rn and Korczy\'{n}ski, Maciej and Castro, Ignacio and Kr\'{o}l, Micha{\l}},
    title = {Looking AT the Blue Skies of Bluesky},
    year = {2024},
    url = {https://doi.org/10.1145/3646547.3688407},
    doi = {10.1145/3646547.3688407},
    booktitle = {Proceedings of the 2024 ACM on Internet Measurement Conference},
}

Replication

To replicate this work, you need to obtain Firehose updates, Labelers, Feed Generators, and the DID documents of every user. We have made most of the tooling for this public; please see below.

ICWSM ’25: Bootstrapping Social Networks: Lessons from Bluesky Starter Packs

In our ICWSM ’25 paper we look at Starter Packs and their impact. We curate a complete dataset up to the end of 2024, with 25M users and 335k Starter Packs with 1.7M members. We identify follows resulting from starter packs and confirm that starter packs help users bootstrap their social network.

Please cite as such:

@misc{balduf2025bootstrappingsocialnetworks,
    title={Bootstrapping Social Networks: Lessons from Bluesky Starter Packs},
    author={Leonhard Balduf and Saidu Sokoto and Onur Ascigil and Gareth Tyson and Ignacio Castro and Andrea Baronchelli and George Pavlou and Björn Scheuermann and Michał Król},
    year={2025},
    eprint={2501.11605},
    archivePrefix={arXiv},
    primaryClass={cs.SI},
    url={https://arxiv.org/abs/2501.11605},
}

Replication

To replicate this work, you need snapshots of the entire network, in addition to everything required for the previous work. The tools for this are open source; see below.

To match multi-follow operations extracted from the Firehose to starter packs, you need:

  • The state of every starter pack at every point in time, which can be reconstructed from Firehose updates.
  • Multi-follow operations extracted from the Firehose.
  • A tool to intersect the two with high performance, which we have open-sourced here.
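The matching step can be pictured with a minimal Python sketch. This is not our open-source tool and makes no attempt at high performance; it only illustrates the idea under stated assumptions. The names `MultiFollow`, `members_at`, and `match_multi_follow`, the snapshot-list representation of a pack's history, and the 0.8 overlap threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiFollow:
    """A burst of follows by one user (hypothetical representation)."""
    actor: str          # DID of the user who issued the follows
    timestamp: int      # time of the burst, e.g. Unix seconds
    targets: frozenset  # DIDs followed in the burst


def members_at(history, timestamp):
    """Return a pack's members at `timestamp`.

    `history` is assumed to be a list of (changed_at, members) snapshots,
    sorted by `changed_at` ascending, as rebuilt from Firehose updates.
    """
    members = frozenset()
    for changed_at, snapshot in history:
        if changed_at > timestamp:
            break
        members = snapshot
    return members


def match_multi_follow(op, pack_histories, min_overlap=0.8):
    """Return (pack_id, overlap) pairs, best match first.

    Overlap is assumed to be the share of followed DIDs that were
    members of the pack when the follows happened.
    """
    if not op.targets:
        return []
    matches = []
    for pack_id, history in pack_histories.items():
        members = members_at(history, op.timestamp)
        overlap = len(op.targets & members) / len(op.targets)
        if overlap >= min_overlap:
            matches.append((pack_id, overlap))
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Example data with made-up identifiers.
packs = {
    "pack-a": [
        (10, frozenset({"did:example:a", "did:example:b", "did:example:c"})),
        (20, frozenset({"did:example:a", "did:example:b",
                        "did:example:c", "did:example:d"})),
    ],
    "pack-b": [(5, frozenset({"did:example:x", "did:example:y"}))],
}
op = MultiFollow(
    actor="did:example:me",
    timestamp=15,
    targets=frozenset({"did:example:a", "did:example:b", "did:example:c"}),
)
print(match_multi_follow(op, packs))  # [('pack-a', 1.0)]
```

The linear scan over every pack and every snapshot is the part a real tool would need to replace, for example with an index from member DID to candidate packs, since it does not scale to hundreds of thousands of starter packs.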

Datasets

We collect a number of datasets for our research, some of which are available publicly in anonymized form. Please see Bluesky Datasets.
Please see our privacy policy for contact information and details about the data collected.

Services

Please see Bluesky Services.

Privacy Policy

Because our work involves collecting data about potentially real people, we have written a privacy policy.