CAA2026 – The Jolly Roger showed the way: Decentralised archaeological data sharing
At CAA2026 in Vienna, I will present a talk titled:
“The Jolly Roger showed the way: A decentralised data repository using torrent and peer-to-peer technology.”
The idea behind the talk is simple: archaeological research increasingly depends on digital datasets, but the infrastructures that store those datasets are often fragile.
Many repositories depend on long-term funding, institutional support, and continuous technical maintenance. When any of those disappear, the data may become difficult to access, move, or reuse.
In this talk I explore a different possibility:
what if archaeological datasets could be distributed directly by the research community itself, using peer-to-peer technologies?
A different way of sharing data
Peer-to-peer systems such as BitTorrent distribute files across many computers rather than relying on a single server.
This approach has several interesting properties:
- large files can be distributed efficiently
- file integrity is verified automatically through cryptographic hashes
- datasets can be replicated across many participants
- availability no longer depends on a single host
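As a rough illustration of the integrity check mentioned above: in the original BitTorrent protocol (v1), a file is split into fixed-size pieces and each piece gets a SHA-1 digest, so a downloader can reject any piece that does not match. The sketch below uses only the Python standard library. The tiny piece size, the sample bytes, and the function names are made up for the example and are not taken from any real client.

```python
import hashlib

# Deliberately tiny so the example produces several pieces;
# real torrents use much larger pieces.
PIECE_LENGTH = 16


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    ]


def verify_piece(piece: bytes, index: int, expected: list[bytes]) -> bool:
    """Return True if a received piece matches the published digest."""
    return hashlib.sha1(piece).digest() == expected[index]


# Invented sample content standing in for a dataset file.
original = b"trench_id,context,find_count\nT1,1001,14\nT1,1002,3\n"
expected = piece_hashes(original)

good_piece = original[0:PIECE_LENGTH]
bad_piece = b"X" + good_piece[1:]  # same length, one byte altered

print(verify_piece(good_piece, 0, expected))
print(verify_piece(bad_piece, 0, expected))
```

Because every participant checks pieces against the same published digests, a corrupted or tampered piece from one peer can be discarded and fetched again from another.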
These technologies are already widely used for distributing large open datasets and software, including Linux distributions and scientific data collections.
The question I wanted to explore was whether a similar approach could be used in archaeology.
A small proof of concept
For the experiment I created a simple workflow that packages a dataset together with metadata and distributes it through a torrent file.
Two datasets were used:
- a small experimental CSV dataset from bone surface modification research
- a larger dataset consisting of photogrammetrically generated 3D trench models
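To give a sense of what packaging a dataset together with metadata can look like, here is a minimal sketch that bundles a data file and a JSON metadata file into a multi-file BitTorrent v1 metainfo structure, using only the Python standard library. It is not the workflow used for the talk. The file names, the metadata fields, the sample content, and the helper names `bencode` and `build_info` are all assumptions for illustration, and a real torrent would normally also carry tracker or DHT information, which is omitted here.

```python
import hashlib
import json

PIECE_LENGTH = 32 * 1024  # 32 KiB, an arbitrary choice for this sketch


def bencode(value) -> bytes:
    """Encode ints, strings, bytes, lists and dicts in bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must be byte strings in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def build_info(name: str, files: dict[str, bytes],
               piece_length: int = PIECE_LENGTH) -> dict:
    """Build a multi-file 'info' dictionary from in-memory file contents."""
    ordered = sorted(files.items())
    # Pieces are hashed over the files concatenated in listing order.
    payload = b"".join(content for _, content in ordered)
    pieces = b"".join(
        hashlib.sha1(payload[i:i + piece_length]).digest()
        for i in range(0, len(payload), piece_length)
    )
    return {
        "name": name,
        "piece length": piece_length,
        "pieces": pieces,
        "files": [{"length": len(content), "path": [path]}
                  for path, content in ordered],
    }


# Invented example content; a real run would read files from disk.
dataset = b"specimen,mark_type,length_mm\nB01,cut,4.2\n"
metadata = json.dumps(
    {"title": "Example dataset", "license": "CC-BY-4.0"}, indent=2
).encode("utf-8")

info = build_info("example-dataset",
                  {"data.csv": dataset, "metadata.json": metadata})
torrent_bytes = bencode({"info": info})

# The SHA-1 of the bencoded info dictionary identifies the torrent (v1).
info_hash = hashlib.sha1(bencode(info)).hexdigest()
print(len(torrent_bytes), info_hash)
```

Keeping the metadata file inside the bundle means that it is covered by the same piece hashes as the data, so the description travels with the dataset and cannot silently drift from it.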
The torrent links were shared publicly in several online communities on Reddit, including archaeology and data-sharing forums.
The goal was not to build a full repository, but simply to observe one thing:
Could a dataset swarm emerge without dedicated infrastructure?
What happened
Within the first 24 hours, dozens of volunteers had already downloaded and seeded the datasets.
This meant that:
- multiple independent copies of the data were created
- the original uploader was no longer required to keep the data available
- the datasets could continue to circulate through the network
In other words, a single researcher with a normal internet connection was able to initiate a distributed dataset network.
Challenges and open questions
Of course, this approach is not a replacement for established repositories.
There are still many open questions:
- how long swarms remain active
- how metadata and indexing should be organised
- how ethical and legal restrictions should be handled
- how decentralised infrastructures should be governed
Rather than replacing existing archives, peer-to-peer distribution could complement them, especially for large datasets or community-driven projects.
Why the “Jolly Roger”?
The title of the talk references the pirate flag associated with file-sharing culture.
The point is not piracy, but rather the idea that decentralised technologies can redistribute power over data infrastructures, allowing communities to share and preserve knowledge collaboratively.
In archaeology, where datasets are often scattered across projects and institutions, that idea may be worth exploring.
Repository, code and Reddit posts
The proof-of-concept workflow used in the talk is available here.
Reddit posts sharing the datasets can be found at the following links:
If you are attending CAA2026, feel free to come say hello and discuss ideas about open infrastructure, reproducible research, and data sharing in archaeology. I look forward to connecting with others who are passionate about computing in archaeology and sharing these projects with the community!