This blog is written by Fergal Reid and Martin Harrigan. We are researchers with the Clique Research Cluster at University College Dublin. The results in this blog are based on a paper we wrote that considers anonymity in the Bitcoin system. A preprint of the paper is available on arXiv.

Update (January 1, 2013): We received many requests for an up-to-date, human-readable copy of the block chain, which can be difficult to extract using existing tools. One of the authors, Martin Harrigan, has released QuantaBytes to this end. It provides up-to-date copies of the block chain along with tools for analysis and visualization. Check it out!

Friday, September 30, 2011

Bitcoin is not Anonymous


TL;DR

Bitcoin is not inherently anonymous. It may be possible to conduct transactions in such a way so as to obscure your identity, but, in many cases, users and their transactions can be identified. We have performed an analysis of anonymity in the Bitcoin system and published our results in a preprint on arXiv.

The Full Story

Anonymity is not a prominent design goal of Bitcoin. However, Bitcoin is often referred to as being anonymous. We have performed a passive analysis of anonymity in the Bitcoin system using publicly available data and tools from network analysis. The results show that the actions of many users are far from anonymous. We note that several centralized services, e.g. exchanges, mixers and wallet services, have access to even more information should they wish to piece together users' activity. We also point out that an active analysis, using, say, marked Bitcoins and collaborating users, could reveal even more details. The technical details are contained in a preprint on arXiv. We welcome any feedback or corrections regarding the paper.

Case Study: The Bitcoin Theft

To illustrate our findings, we have chosen a case study involving a user who has many reasons to stay anonymous. He is the alleged thief of 25,000 Bitcoins. This is a summary of the victim's postings to the Bitcoin forums and an analysis of the relevant transactions.

Summary

The victim woke up on the morning of 13/06/2011 to find a large portion of his Bitcoins sent to 1KPTdMb6p7H3YCwsyFqrEmKGmsHqe1Q3jg. The alleged theft occurred on 13/06/2011 at 16:52:23 UTC shortly after somebody broke into the victim's Slush pool account and changed the payout address to 15iUDqk6nLmav3B1xUHPQivDpfMruVsu9f. The Bitcoins rightfully belong to 1J18yk7D353z3gRVcdbS7PV5Q8h5w6oWWG.

An Egocentric Analysis

Fig. 1: The egocentric user network of the thief.


We consider the user network of the thief. Each vertex represents a user and each directed edge between a source and a target represents a flow of Bitcoins from a public-key belonging to the user corresponding to the source to a public-key belonging to the user corresponding to the target. Each directed edge is colored by its source vertex. The network is imperfect in the sense that there is, at the moment, a one-to-one mapping between users and public-keys. We restrict ourselves to the egocentric network surrounding the thief: we include every vertex that is reachable by a path of length at most two ignoring directionality and all edges induced by these vertices. We also remove all loops, multiple edges and edges that are not contained in some biconnected component to avoid clutter. In Fig. 1, the red vertex represents the thief and the green vertex represents the victim. The theft is the green edge joining the victim and the thief. There are in fact two green edges located nearby in Fig. 1 but only one directly connects the victim to the thief.
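The construction described above can be sketched in a few lines of Python. This is only an illustration, not the code behind Fig. 1: the function name, the edge-list input format and the vertex labels in the usage example are assumptions, and "edges that are not contained in some biconnected component" is read here as "bridges", i.e. edges that lie on no cycle.

```python
from collections import defaultdict, deque

def egocentric_network(edges, ego, radius=2):
    """Return the undirected edges of a decluttered egocentric network.

    `edges` is an iterable of (source, target) pairs between users
    (a hypothetical input format chosen for this sketch).
    """
    # Ignore direction, drop loops, and collapse multiple edges.
    neighbours = defaultdict(set)
    for source, target in edges:
        if source != target:
            neighbours[source].add(target)
            neighbours[target].add(source)

    # Breadth-first search: keep vertices within `radius` hops of the ego.
    distance = {ego: 0}
    queue = deque([ego])
    while queue:
        vertex = queue.popleft()
        if distance[vertex] == radius:
            continue
        for other in neighbours[vertex]:
            if other not in distance:
                distance[other] = distance[vertex] + 1
                queue.append(other)

    # Subgraph induced by the nearby vertices.
    induced = {v: {w for w in neighbours[v] if w in distance} for v in distance}

    # Find bridges with a depth-first search over low-link values.
    # (Recursive, so only suitable for small networks.)
    order, low, bridges = {}, {}, set()

    def visit(vertex, parent):
        order[vertex] = low[vertex] = len(order)
        for other in induced[vertex]:
            if other == parent:
                continue
            if other in order:
                low[vertex] = min(low[vertex], order[other])
            else:
                visit(other, vertex)
                low[vertex] = min(low[vertex], low[other])
                if low[other] > order[vertex]:
                    bridges.add(frozenset((vertex, other)))

    visit(ego, None)

    # Keep every induced edge that lies on at least one cycle.
    return {
        frozenset((v, w))
        for v in induced
        for w in induced[v]
        if frozenset((v, w)) not in bridges
    }
```

For example, with a made-up edge list in which the victim, the thief and a third user "a" form a triangle, and a chain "c", "d", "e" hangs off the thief, only the triangle survives: the chain consists of bridges and "e" lies three hops away.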



Fig. 2: An interesting sub-network induced by the thief, the victim and three other vertices.

Interestingly, the victim and the thief are joined by paths (ignoring directionality) other than the green edge representing the theft. For example, consider the sub-network shown in Fig. 2 induced by the red, green, purple, yellow and orange vertices. This sub-network is a cycle. We contract all vertices whose corresponding public-keys belong to the same user. This allows us to attach values in Bitcoins and timestamps to the directed edges. Firstly, we note that the theft of 25,000 BTC was preceded by a smaller theft of 1 BTC. This was later reported by the victim in the Bitcoin forums. Secondly, using off-network data, we have identified some of the other colored vertices: the purple vertex represents the main Slush pool account and the orange vertex represents the computer hacker group LulzSec (see, for example, their Twitter stream). We note that there has been at least one attempt to associate the thief with LulzSec. This was a fake; it was created after the theft. However, the identification of the orange vertex with LulzSec is genuine and was established before the theft. We observe that the thief sent 0.31337 BTC to LulzSec shortly after the theft but we cannot otherwise associate him with the group. The main Slush pool account sent a total of 441.83 BTC to the victim over a 70-day period. It also sent a total of 0.2 BTC to the yellow vertex over a 2-day period. One day before the theft, the yellow vertex also sent 0.120607 BTC to LulzSec. The yellow vertex represents a user who is the owner of at least five public-keys:
Like the victim, he is a member of the Slush pool, and like the thief, he is a one-time donator to LulzSec. This donation, the day before the theft, is his last known activity using these public-keys.
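The contraction step can be illustrated with a small union-find sketch. The post does not spell out the rule used to decide that two public-keys belong to the same user, so the rule below is an assumption made for illustration: public-keys that appear together as inputs to one transaction are merged. The function names and the placeholder keys K1 to K4 are hypothetical.

```python
def find(parent, key):
    """Return the representative of the group containing `key`."""
    parent.setdefault(key, key)
    while parent[key] != key:
        parent[key] = parent[parent[key]]  # path halving
        key = parent[key]
    return key

def contract_keys(transactions):
    """Group public-keys that co-occur as inputs to a transaction.

    `transactions` is an iterable of non-empty lists of input public-keys
    (an assumed input format). Returns a list of sets, one per user.
    """
    parent = {}
    for input_keys in transactions:
        first = find(parent, input_keys[0])
        for key in input_keys[1:]:
            # Merge the group of `key` into the group of the first input.
            parent[find(parent, key)] = first

    groups = {}
    for key in parent:
        groups.setdefault(find(parent, key), set()).add(key)
    return list(groups.values())
```

With the inputs `[["K1", "K2"], ["K2", "K3"], ["K4"]]`, the keys K1, K2 and K3 end up in one group because K2 links the first two transactions, while K4 stays on its own.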

A Flow and Temporal Analysis

In addition to visualizing the egocentric network of the thief with a fixed radius, we can follow significant flows of value through the network over time. If a vertex representing a user receives a large volume of Bitcoins relative to their estimated balance, and, shortly after, transfers a significant proportion of those Bitcoins to another user, we deem this interesting. We built a special purpose tool that, starting with a chosen vertex or set of vertices, traces significant flows of Bitcoins over time. In practice we have found this tool to be quite revealing when analyzing the user network.
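A minimal sketch of this kind of tracing is given below. It is not our tool: the function name, the tuple format, the two thresholds and the time window are all assumptions picked for illustration, and balances are estimated only from the transfers supplied. Amounts are assumed to be positive.

```python
from collections import defaultdict

def trace_significant_flows(transfers, start, inflow_share=0.5,
                            outflow_share=0.5, window=86400):
    """Follow significant flows forward in time from the user `start`.

    `transfers` is a list of (timestamp, source, target, amount) tuples,
    with timestamps in seconds. All thresholds are illustrative guesses.
    """
    balance = defaultdict(float)  # running balance estimate per user
    watched = {}                  # user -> (time, amount) of last traced receipt
    traced = []
    for timestamp, source, target, amount in sorted(transfers):
        if source == start:
            on_flow = True
        elif source in watched:
            received_at, received = watched[source]
            # The source must pass on a large part of what it received,
            # and do so shortly after receiving it.
            on_flow = (timestamp - received_at <= window
                       and amount >= outflow_share * received)
        else:
            on_flow = False

        # Fraction of the target's balance, after this transfer,
        # that the transfer itself accounts for.
        share = amount / (max(balance[target], 0.0) + amount)
        balance[source] -= amount
        balance[target] += amount

        if on_flow and share >= inflow_share:
            traced.append((timestamp, source, target, amount))
            watched[target] = (timestamp, amount)
    return traced
```

Starting from the thief, such a trace keeps a transfer only if it continues a flow already being followed and is large relative to the recipient's estimated balance. Small side payments and transfers that happen long after the receipt are dropped.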

Fig. 3: A visualization of Bitcoin flow from the theft. The size of a vertex corresponds to its degree in the entire network. The color denotes the volume of Bitcoins: warmer colors have larger volumes flowing through them. We also provide an SVG which contains hyperlinks to the relevant Block Explorer pages.


Fig. 4: An annotated version of Fig. 3.

In the left inset, we can see that the Bitcoins are shuffled between a small number of accounts and then transferred back to the initial account. After this shuffling step, we have identified four significant outflows of Bitcoins that began at 19:49, 20:01, 20:13 and 20:55. Of particular interest are the outflows that began at 20:55 (labeled as 1 in both insets) and 20:13 (labeled as 2 in both insets). These outflows pass through several subsequent accounts over a period of several hours. Flow 1 splits at the vertex labeled A in the right inset at 04:05 the day after the theft. Some of its Bitcoins rejoin Flow 2 at the vertex labeled B. This new combined flow is labeled as 3 in the right inset. The remaining Bitcoins from Flow 1 pass through several additional vertices in the next two days. This flow is labeled as 4 in the right inset.

A surprising event occurs on 16/06/2011 at approximately 13:37. A small number of Bitcoins are transferred from Flow 3 to a heretofore unseen public-key 1FKFiCYJSFqxT3zkZntHjfU47SvAzauZXN. Approximately seven minutes later, a small number of Bitcoins are transferred from Flow 3 to another heretofore unseen public-key 1FhYawPhWDvkZCJVBrDfQoo2qC3EuKtb94. Finally, there are two simultaneous transfers from Flow 4 to two more heretofore unseen public-keys: 1MJZZmmSrQZ9NzeQt3hYP76oFC5dWAf2nD and 12dJo17jcR78Uk1Ak5wfgyXtciU62MzcEc. We have determined that these four public-keys, which receive Bitcoins from two separate flows that split from each other two days previously, are all contracted to the same user in our ancillary network. This user is represented as C.


There are several other examples of interesting flow. The flow labeled as Y involves the movement of Bitcoins through thirty unique public-keys in a very short period of time. At each step, a small number of Bitcoins (typically 30 BTC which had a market value of approximately US$500 at the time of the transactions) are siphoned off. The public-keys that receive the small number of Bitcoins are typically represented by small blue vertices due to their low volume and degree. On 20/06/2011 at 12:35, each of these public-keys makes a transfer to a public-key operated by the MyBitcoin service. Curiously, this public-key was previously involved in another separate Bitcoin theft.
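The pattern in flow Y, where a balance hops from key to key and a small amount is siphoned off at each hop, can be followed mechanically. The sketch below assumes that every hop pays exactly two recipients, a small siphon and a large remainder that continues the chain. That structure, the function name, the input format and the amounts in the usage example are assumptions for illustration rather than data from the block chain.

```python
def follow_peeling_chain(outputs_by_key, start_key, max_steps=100):
    """Walk a chain in which each key forwards most of its coins onward.

    `outputs_by_key` maps a public-key to the list of (recipient, amount)
    payments it made (an assumed input format). Returns one
    (key, siphon_recipient, siphon_amount) tuple per hop.
    """
    steps = []
    seen = set()
    key = start_key
    while key in outputs_by_key and key not in seen and len(steps) < max_steps:
        seen.add(key)
        outputs = sorted(outputs_by_key[key], key=lambda output: output[1])
        if len(outputs) != 2:
            break  # not the simple two-way split this sketch looks for
        (siphon_to, siphon_amount), (next_key, _) = outputs
        steps.append((key, siphon_to, siphon_amount))
        key = next_key  # the larger payment continues the chain
    return steps
```

Given a hypothetical chain such as `{"Y0": [("S0", 30.0), ("Y1", 970.0)], "Y1": [("Y2", 940.0), ("S1", 30.0)]}`, the function reports the siphon recipients S0 and S1 and stops at Y2, which has no recorded payments.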

WikiLeaks

WikiLeaks recently advised its Twitter followers that it now accepts anonymous donations via Bitcoin. They also state that "Bitcoin is a secure and anonymous digital currency. Bitcoins cannot be easily tracked back to you, and are a [sic] safer and faster alternative to other donation methods." They proceed to describe a more secure method of donating Bitcoins that involves the generation of a one-time public-key but the implications for those who donate using the tweeted public-key are unclear. Is it possible to associate a donation with other Bitcoin transactions performed by the same user or perhaps identify them using external information?


Fig. 5: A visualization of the egocentric user network of WikiLeaks. We can identify many of the users in this visualization.

Our tools resolve several of the users with identifying information gathered from the Bitcoin Forums, the Bitcoin Faucet, Twitter streams, etc. These users can be linked either directly or indirectly to their donations. The presence of a Bitcoin mining pool (a large red vertex) and a number of public-keys between it and WikiLeaks' public-key is interesting. Our point is that, by default, a donation to WikiLeaks' 'public' public-key may not be anonymous.

Conclusion
This is a straightforward passive analysis of public data that allows us to de-anonymize considerable portions of the Bitcoin network. We can use tools from network analysis to visualize egocentric networks and to follow the flow of Bitcoins. This can help us identify several centralized services that may have even more details about interesting users. We can also apply techniques such as community finding, block modeling, network flow algorithms, etc. to better understand the network.

Feedback
We are excited about the Bitcoin project and consider it a remarkable milestone in the evolution of electronic currencies. Our motivation for this work has not been to de-anonymize any individual users; rather it is to illustrate the limits of anonymity in the Bitcoin system. It is important that users do not have a false expectation of anonymity. We welcome any feedback or comments regarding the preprint on arXiv or the details in this post.

Follow on:
We have written a follow-on blog post, http://anonymity-in-bitcoin.blogspot.com/2011/09/code-datasets-and-spsn11.html, where we release some of the data we extracted in order to allow other researchers to replicate our work or perform follow-on analysis.

46 comments:

  1. Haha. What taliesin said... Did you reveal his identity? No. Anonymous.

  2. > On 20/06/2011 at 12:35, each of these public-keys makes a transfer to a public-key operated by the MyBitcoin service. Curiously, this public-key was previously involved in another separate Bitcoin theft.

    ...And then? Once the money entered the coinmix, then what? Or do the passive observations break down completely at MyBitcoin or MtGox?

  3. taliesin, Unknown:
    We don't set out to de-anonymise the thief - we are researchers, not law enforcement, and we are just using that as an example to show it's possible to trace the flow of Bitcoins around the network.

    It is possible to use Bitcoin in a way that is almost certainly anonymous, in the same way it is possible to get almost certain anonymity on the Internet, by using encryption, onion routing, and never associating your identity with your actions.

    Our point is that you don't get this anonymity automatically, and that most casual users of Bitcoin may not be anonymous, even though many of them may believe they are.

    The system looks more anonymous than it is.


    gwern:
    The passive observations, as we've described them here, break down.

    However, at that point, the users are relying on centralised organisations for their anonymity.
    This is a very different scenario from having anonymity due to the nature of the system.

    Centralised organisations could suffer security problems, or may be accessible to governments in repressive regimes, for example.

  4. > We don't set out to de-anonymise the thief - we are researchers, not law enforcement

    Ok, then how about this. If law enforcement were to perform the analysis you performed, is that enough to a.) identify the thief and b.) persuade a jury beyond a reasonable doubt?

  5. Fergal,
    "we are researchers, not law enforcement"

    I don't think that 'stealing bitcoins' is illegal in any sense, then again, I don't know how the law deals with the theft of digital tokens that aren't actually representative of legally recognised currency.

  6. >and we are just using that as an example to show it's possible to trace the flow of Bitcoins around the network.

    To know that you can trace the flow of bitcoins around the network you don't need this blog post and the fancy graphics. You only need to go to the Bitcoin wiki and read how it works.

    This "mystery" you have solved has been stated clearly by the Bitcoin community. It's no secret.

    I understand you want to get an "exciting" title to get visits but it feels cheap.

  7. Hi Errores, I'm not sure I agree. I have read the Bitcoin wiki and we acknowledge the fact that the pseudo-anonymity of Bitcoin is known within the community. However, we don't believe it is known well enough outside the core community. Yes, anyone can in theory follow the flow of Bitcoins --- we all have the data -- but it's not easy to do with current tools. The majority of the observations above have not been pointed out in the Bitcoin Forums and it is very difficult to establish them using something like blockexplorer.com alone. I believe the fancy graphics are necessary.

  8. Errores:


    You need some sophistication in your flow analysis to trace the transactions - it's not *trivial* to do.

    You can look at allinvain's lists of public keys to which the money went, if you want to see an example of a more naive approach - there are over 30K keys on those lists, and it's very hard to interpret.

    See here: http://forum.bitcoin.org/index.php?topic=16457.0

    and specifically these lists of addresses to where the BTC went: http://folk.uio.no/vegardno/allinvain-addresses.txt

    (This is not to disparage allinvain's work in any way - we just happen to be network analysis researchers, and have a good toolbox for this sort of thing).


    Before doing this work, I would have expected things to pan out a little more like that - flows getting lost in the noise - rather than showing up intelligibly on visualizations.

  9. Derp:
    That's a legal question - way outside our area of expertise.

    I would say, though, that the value of the items stolen is debatable.

    It was generally reported as about $0.5M. That was the market value, at the exchanges at the time, so it's hard to come up with a better valuation. Bitcoin markets don't have huge depth though - so it's hard to accurately assess its real worth.

  10. I know you put 'alleged' into your article once, but I'd like to make it clear that allinvain is completely at fault, that is, *IF* the bitcoins were even his to begin with. I can point to any transaction in the blockchain and say they were mine and 'stolen' from me.

    Nice charts, they provide a bunch of public keys which can't even be used to track anything with any certainty. How do you even know that the keys listed even belong to the same wallet?

    Bitcoin may not be truly anonymous, but it sure isn't as easy as you suggest to implicate a transfer belonging to a single client.

  11. One of traditional PKI's strengths is its non-repudiation. BitCoin uses public keys, and each public key that a BTC (or fraction thereof) has been owned by is necessarily in its transaction history. So, it makes sense that a user wishing to remain anonymous would need to take steps to
    1. disassociate their real identity with their anonymous public key(s),
    2. not let their private key(s) be compromised or exposed publicly, and
    3. separate their anonymous public key(s) from traceable transactions as recorded in the coin's transaction history.

    Good that someone is clarifying this for people just trying to use or capitalize on the BitCoin phenomenon, but obvious to those who know how it works and the principles it is built on.

    It's been mentioned that BitCoin laundry services will eventually be introduced that separate individual transactions from each other for just such reasons. Eg. I put 100 BTC in this service, I get back 98 BTC, likely not the same ones, with the ability to truthfully deny any knowledge of previous transactions in their history. This could easily be an integral part of an online wallet service.

  12. The identity can easily change in just one transaction. Say, the thief passes $1000 to another wallet, transfers it out again to another wallet, then transfers the money out of mtgox: even if the mtgox user is identified, he can claim he is not the thief, and that he received it from someone WHO was the one receiving the stolen money, he didn't even receive the "dirty" money directly; you cannot claim anyone is the thief, even though he might be.

    So even with just two wallet transfers, anyone can claim he is just one end-point not directly associated with the thief, or even one and say he is not the thief, just a seller.

    ALSO, how can we even BE TOTALLY SURE THE VICTIM IS A VICTIM? How can we know that he did not just buy a car, paid the seller, and THEN claim it was stolen? Why then criminalize the rest of the receivers, when we are not even sure there was a crime in the first place?

    Even the thief admitting the crime is not enough, he MUST be convicted to be considered "guilty", and then he must be the one to be responsible of where the bitcoins are, not the bitcoin network...

    So the whole anonymous/identity problem is moot except of course when the transfer is a direct one.

  13. Martin Harregal & Ferral, I am not saying the article is bad. I read it all and enjoyed the graphs. Considering most of what I find I stop reading half way, it's a good article.

    What I don't like about it is the cheap shots, as if the article is discovering something new and unknown. What the article is describing is a well known characteristic of Bitcoin, and anyone that has a basic understanding of how it works will know. Whether that knowledge is known or not by a majority of users is very debatable and any guess is good.

    Also, I would have liked the article to touch on methods to avoid this detection like bitcoin scramblers and such.

    Bitcoin is not necessarily anonymous, depending on how you use it, it can be very public. But if you take the right approach it can be very anonymous. It's up to the user really.

    It's not a bad article, I just think the tone was not adequate at all, and more like trying to grab attention with cheap tricks. Hope that clarifies my comment.

  14. As the real identity of the Thief was not revealed in any way... even after all this analysis and pretty graphics... then it seems the article in fact is evidence that Bitcoin is quite anonymous. Of course transactions are tracked, but -identities- are not known, and anonymity is an issue of identity knowledge.

    Not to mention - Bitcoins can be swapped without any transaction occurring on the Block Chain. It's very easy. It's called Bitbills =) Because of these physical transfers, the Block Chain history tends toward inadequacy, does it not?

  15. Erik: Bitbills is just another coinmix, one with an offline physical aspect. It has the exact same vulnerabilities/problems that Fergal points out with MyBitcoin/Mt.Gox - you're trusting the third party in order to avoid the blockchain recording *everything*. From the Bitbill FAQ http://bitbills.com/faq.html :

    > Do you keep a copy of the cards' private keys?
    >
    > After each card has been produced and proven functional, we delete all records of the private key. This means that once the card leaves our hands, we can no longer access the associated bitcoins (be aware, this means we also can't help if you lose or destroy your card).

  16. Gwern - Bitbills is not just another coinmix, because the "mix" happens without any record on the blockchain. In other words, the block chain record is rendered unreliable, because Bitcoins can transfer ownership on physical medium just like cash and the block chain never knows about it.

    It is true that you need to trust Bitbills for the money to be on those cards, but that's a separate issue.

  17. You seem to be missing an EXTREMELY crucial point - the difference between anonymous and untraceable.

    You did not connect the theft to an actual identity, thus it is traceable, but still anonymous! That you can follow the traces doesn't matter - it didn't reveal an identity. I have yet to see the first person that can successfully trace this to an actual snippet of useful data (IP, name, whatever).

  18. errores:

    You wrote:
    "
    Bitcoin is not necessarily anonymous, depending on how you use it, it can be very public. But if you take the right approach it can be very anonymous. It's up to the user really.
    "

    This is exactly our point: you don't get anonymity automatically from the system.
    A lot of people out there think you do.

    Until we did this work, we thought that, in *practice*, if not in theory, your transactions would get lost in the network very fast.

    We know that Bitcoin experts didn't think it to be theoretically anonymous - one of the developers even said that it could be vulnerable to the techniques of network analysis - but we think our contribution is that we actually did this, and showed that it's not inherently anonymous in practice.

    Thanks for your feedback - it's good to hear all opinions.

  19. Erik:

    We don't know the real identity of the thief.
    It wasn't our intention to track the thief down.
    It was our intention to use the theft as a case study to show that flows can be tracked.
    We think the fact that the flows split off, and then reunite later on, is evidence that our flow tracking is working properly.

    Our point is that the alleged thief's transactions didn't get lost in the system.

    It is reasonable to suspect that the thief took extra precautions, such as accessing everything using TOR, being very careful about what they did, etc, and as such is still anonymous.

    But it's also possible that they assumed the network would hide their transactions, and left enough information to be caught.

    There are co-incidences, and leads, that could be examined; services could be subpoena'd - maybe that'd reveal an identity in the end, and maybe it wouldn't - but as we've said, that's outside the scope of our work, which is about letting users know that the system doesn't make them anonymous.

  20. joepie91:

    Thanks for your comment.

    That's certainly one way of putting it.

    We aren't saying that we know the identities of everyone on Bitcoin, just because they use Bitcoin.
    Like any Internet based service, users can take steps to make sure the network never knows their identity (using TOR etc).

    This will limit the usefulness of Bitcoin - users that want to be anonymous would have to be very careful to never make a mistake, and buy something, or spend BTC, in a way that can identify them. And, frankly, such usage is outside the technical sophistication of most users.

    So, if people start using Bitcoin as a broadly adopted currency, for buying their online shopping, say, then, using current clients, the vast majority of users are going to be very unanonymous.

    If someone got their hands on the records of a large exchange today, they could probably follow the actions of many casual users.


    So, what we are saying is that Bitcoin doesn't hide your identity, just because it's Bitcoin.
    It's possible to see a lot of what goes on, on the Bitcoin system.


    In this specific case, the thief may still be anonymous, depending on how they operated.
    But it's not Bitcoin that makes them anonymous - it's the extra steps they took outside of it.

    We haven't followed the thief all the way.
    We just used that as a case study, and an example, to show that flows could be tracked; our goal has been to investigate anonymity generally, and then to warn users that they don't get anonymity just because they use Bitcoin, which many of them think they do.


    We saw some surprising things, that we haven't mentioned, that would make us think users have had their anonymity compromised in some way that they may not have expected (but obviously, we don't know what each individual user's expectations are).

    We could debate what most users' expectations are - if you were a Bitcoin user, would you expect your Bitcoin address to be linked to those of any organizations? - but based on the feedback we've gotten already, I think it's clear many users had a higher expectation of anonymity than was justified.


    Our point is that users have to be careful - especially if they are living in a repressive regime, or something like that - they might not be as anonymous as they think.

  22. Guys, beautiful work on this!

    With the uncertainties that currently exist with our global economy (i.e. a default on US debt looming!), the hope that we will one day have a decentralised means of trading is awesome.

    This report brings us one step closer and clears up some of the fear that surrounds it.

  23. Fergal - I understand your intention was merely to track "network flows," but I take issue with the title of your piece.

    The title was clearly intended to grab attention, and it states in no uncertain terms that "bitcoin is not anonymous." I think that statement is as misleading as saying that "Bitcoin IS anonymous."

    Clearly, Bitcoin permits anonymity, and that is a huge advantage of the system. The fact that such anonymity is not "automatic" doesn't dismiss the advantage.

    And still... the fact that you didn't reveal any identifying information about the PERSON behind the account number really undermines your title, if not your thesis.

    I do appreciate that you're trying to make sure users are aware that anonymity is not easy and automatic. This type of education is valuable. And damn do those graphics look nice :)

  24. Erik:

    The issue you are taking with the title seems to be primarily about what semantics are attached to something being 'not anonymous'.
    We aren't trying to say 'Bitcoin is never anonymous' or 'Bitcoin cannot be used anonymously' - just that anonymity is not a property of the system.

    I'd also like to point out that, even if you don't agree with these semantics, our summary at the top (the first few lines of this blog post) addresses your concerns - so there's clearly no attempt to mislead anyone here. I think we've been pretty thorough on this point - there's a 13 page paper on arxiv, which hopefully sets out our findings pretty clearly.


    To come at this from another angle, you write:
    "Clearly, Bitcoin permits anonymity, and that is a huge advantage of the system. The fact that such anonymity is not 'automatic' doesn't dismiss the advantage."

    But, a huge advantage of the system, compared to what?

    You can make a parallel argument about every other sort of payment system - you could say that credit cards permit anonymity; because, so long as the users go to great lengths to acquire a credit card that isn't associated with them, they can use it anonymously.
    But I wouldn't then say 'the credit card system permits anonymity, which is a huge advantage of the system'.
    And, in fact, I'd be happy to say 'credit card payments are not anonymous' - because, while they can be used anonymously, it's very hard, and it'd be very hard not to leak your identity at some point.

    Similarly, with Bitcoins, you've got to buy your Bitcoins from somewhere (Ok, some users mine them, but fast forward a few years, and the mining will be all done) and, as we've shown, in practice, it's hard for users to avoid binding together different addresses into a single identity.

    I think that if Bitcoin had some form of sophisticated mixing built into it, at a protocol level, or if it was just practically impossible to follow transaction flows, or make associations between addresses, then it'd be reasonable to say 'Bitcoin is anonymous'.

    But it doesn't; users shouldn't think it does; these attacks are practical, as we've shown; so I'm happy to describe it as 'not anonymous', as it currently stands.

  25. "And still... the fact that you didn't reveal any identifying information about the PERSON behind the account number really undermines your title, if not your thesis."

    Even if we had such identifying information, we'd have to think long and hard about the ethics of revealing it, and outing someone publicly.
    As others have pointed out, we only have one forum user's word - widely reported in the media, but still - that a theft actually occurred.


    Again, the first line of this post states that: "It may be possible to conduct transactions in such a way so as to obscure your identity, but, in many cases, users and their transactions can be identified."
    Our point, all the way through, is that it's possible to make sense of what's going on in the network, and see interesting patterns. As such, users shouldn't assume the system is providing anonymity - because it's not.

    There are a number of anonymity pitfalls - the address linking is very real in practice, and we see a lot of identities (forum posters, faucet receiving users, organizations) linked, that probably don't realize they are so easily linked.


    Finally, we should point out that we aren't in the business of identifying individual people, in the system. (Other parties may be.)
    We are just analyzing the level of anonymity provided by the system, and pointing out that it's possible to associate addresses, and track flows.

    So, if we were a large exchange or other service, that had a lot of individual identities, perhaps due to incoming payments - or a law enforcement agency with the power to subpoena such an exchange, or computer criminals that cracked its database - then the architecture of the Bitcoin system would provide very little practical anonymity.

    We look at the identities from the Bitcoin forum as a proxy to this, and we can see at all sorts of relationships between those users. This gives us a sense of the level to which an exchange would be able to track individual people. (Equally, unless those users are accessing the Bitcoin forum using TOR, if it is keeping logs, then they are already mapped to real individuals - though we don't have the mapping, and wouldn't particularly want it, in any event.)


    All this is without considering more sophisticated attacks, or active attacks (such as running a mixing service, flooding another mixing service with coins from accounts you control, etc.).



    Thanks for your comment on the graphics - if you are curious, the one on the top is created using Gephi, and the ones on the bottom are a combination of Graphviz, for layout, and some visualization-generating code we wrote. We used the excellent Python library NetworkX, with Graphviz, for the custom visualizations. We also used bitcointools and R.

  26. Thanks for your work guys, it was an interesting read :)

  27. Thanks for a fascinating article - you've made what could easily be a baffling subject very clear... almost.

    Might I make a constructive criticism?

    I'm colour-blind, and that made the diagrams very difficult to follow until I realised I could save them, select individual colours without knowing what they are and highlight them.
    Some of the coloured text embedded in the body of the work simply doesn't show at all for me unless I 'select all' to mask the colours.

    In print, this option won't be available. Accessibility is important...

  28. Nice work! The core bitcoin developers have been saying that "bitcoin is not anonymous unless you know how it works and you work pretty darn hard to make it anonymous" for months.

    If I recall correctly, the bitcoin.org home page used to refer to bitcoin transactions as being anonymous, which was an unfortunate mistake that probably started the whole "bitcoin is for anonymous transactions" meme. I hope your work will help stop that meme from spreading any further.

  29. Thanks Gavin - great to get the positive feedback.

    We understand that people such as yourself have been saying for a while that there was no anonymity built into the system.

    There's often a gap between theory and practice, and we weren't sure how hard it would be to decipher what was happening in practice; we hadn't seen any other public studies on this.
    We were actually quite surprised at how well our attempts to make sense of the block chain actually worked in practice.

    As you say, we are hoping that this blog, and the graphics showing transaction activity, will give users a better understanding of the lack of practical anonymity, and dispel the anonymity meme.


    On a related note, did you see the part of our preprint (in the paper, but not discussed on the blog) about using the IP->bitcoin_address mapping that the faucet gives publicly, to deanonymise some users?

    I think this is a fairly significant source of identifying information; coupled with address linking, we find it sometimes associates a timestamped IP with a significant amount of prior Bitcoin activity.

    We also came across a few instances where linked accounts had received several faucet allocations between them - presumably people trying to game the faucet.

    I presume you don't have much sympathy for the latter category, but would you consider modifying the faucet so as to not make the IP addresses public?
    Or even to print the list of IP addresses separate from the transactions they were for?
    I don't see any real reason to show the IP address->transaction mapping - I would imagine a shuffled list of IPs would be just as effective at cutting down faucet fraud, while identifying a lot fewer users?
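
    To make the suggestion concrete, here is a minimal Python sketch of publishing the two lists separately. It assumes the faucet's records are available as (IP address, transaction ID) pairs; the function name and the sample data are hypothetical and are not taken from the faucet's actual code.

```python
import random


def publish_separately(allocations, rng=None):
    """Return (transactions, shuffled_ips) with the pairing removed.

    `allocations` is a hypothetical list of (ip_address, transaction_id)
    pairs; this is an assumed data layout, not the faucet's real one.
    """
    rng = rng or random.SystemRandom()
    transactions = [tx for _, tx in allocations]
    ips = [ip for ip, _ in allocations]
    # Shuffle the IPs in place so that list position no longer
    # says which transaction an address belongs to.
    rng.shuffle(ips)
    return transactions, ips


# Example with documentation-range IP addresses and made-up transaction IDs.
allocations = [
    ("192.0.2.1", "tx_a"),
    ("192.0.2.2", "tx_b"),
    ("192.0.2.3", "tx_c"),
]
transactions, ips = publish_separately(allocations)
```

    Anyone checking for repeat claims can still see every IP address, but the published lists no longer say which address received which payment. This only helps if the lists are not also published with matching timestamps.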

    Regardless, thanks again for your comment, it's very encouraging to get good feedback from someone that is as knowledgeable about Bitcoin as yourself.

  30. RE: the Faucet and IP addresses:

    I haven't read the full paper yet. I'm torn between doing something to anonymize the IP addresses and using the Faucet as a way of educating people about the issues of bitcoin pseudo-anonymity.

    I suppose I could do both... but frankly I'm not very motivated to spend more time working on the Faucet (I've got much higher priority work on my TODO list).

  31. Gavin:
    How about working on an encrypted P2P distributed file system to store everyone's wallets?
    If that is even remotely possible, it should be attempted IMHO...

  32. Richard_N: Thanks for the comment. We will look at using colorblind-safe colors in future revisions of the paper.

  33. Just to be clear then, if traffic analysis on bitcoin can provide an ip-to-bitcoin-address mapping, then is trading done using bitcoin through tor completely anonymous?

    It sounds like if bitcoin is used exclusively through tor, that anonymity is guaranteed, but can your identity be revealed if you perform one transaction through tor, and subsequent ones not through tor? Even if you use a different bitcoin address for each transaction?

    I hope these aren't silly questions, as I haven't read the paper yet myself. (I plan to soon though!)

  34. jhtrde54e: Tor ensures anonymity at the TCP/IP level. If you use Tor exclusively then mappings between Bitcoin addresses and IP addresses are essentially useless. However, in many cases, it is still possible to establish that two or more public-keys (identities within the Bitcoin system) are actually controlled by a single user. So, yes, even if you use several different Bitcoin public-keys and use Tor intermittently, it may be possible to link transactions performed while using Tor to transactions performed while not using Tor.

  35. Would it be possible to build into the client a tool that would recognize bitcoin that was involved in a dispute before accepting it? As we try to grow this system, we do need ways to deter fraud. Of course, there would have to be some sort of arbitration panel to resolve disputes, but a system like that might help reduce the need for escrow in everyday transactions as well.
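
    One way to picture such a check is a backward walk over the transaction history, asking whether a payment descends from a flagged transaction. The Python sketch below is only an illustration under stated assumptions: the history is given as a hypothetical dictionary mapping each transaction ID to the IDs of the transactions it spends from, and the set of disputed transactions is assumed to come from some outside list. Neither is a feature of the Bitcoin client.

```python
from collections import deque


def descends_from_disputed(txid, inputs_of, disputed):
    """Return True if `txid` or any ancestor transaction is in `disputed`.

    `inputs_of` maps a transaction ID to the IDs of the transactions whose
    outputs it spends (an assumed, simplified view of the block chain).
    """
    seen = {txid}
    queue = deque([txid])
    while queue:
        current = queue.popleft()
        if current in disputed:
            return True
        # Step backwards to every transaction that funded this one.
        for parent in inputs_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False


# Made-up history: tx4 spends from tx2 and tx3, and tx2 spends from tx1.
inputs_of = {"tx4": ["tx2", "tx3"], "tx2": ["tx1"], "tx5": ["tx3"]}
disputed = {"tx1"}

flagged = descends_from_disputed("tx4", inputs_of, disputed)
clean = descends_from_disputed("tx5", inputs_of, disputed)
```

    Here `flagged` is True because tx4 traces back to tx1, and `clean` is False because tx5 does not. The sketch treats any descendant as affected, however small the share, so deciding how much of a payment counts as disputed would still be up to whoever arbitrates.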

  36. Hi! I've been confused by this in the past, and now trying to understand it again, I think I see part of why. You've swapped "A" and "B" in the following sentence, haven't you?

    "Flow 1 splits at the vertex labeled A in the right inset at 04:05 the day after the theft. Some of its Bitcoins rejoin Flow 2 at the vertex labeled B."

  37. Oh, no I was confused when I thought that was why I was confused. Actually it is hard for me to see some of the arrowheads so I had one of the links backwards.

  38. "We have determined that these four public-keys — which receive Bitcoins from two separate flows that split from each other two days previously — are all contracted to the same user in our ancillary network. "

    What does this mean? What does it mean for a public key to be "contracted to" a user, and how did you determine that these four had that relationship to the same user?

  39. @Zooko, in the paper we describe an ancillary network that maps multiple public-keys to individual users. We do this using a property of transactions with multiple inputs. Basically, if you control, say, three separate public-keys, PK_A, PK_B and PK_C, and (inadvertently?) use all three to sign a transaction in order to send Bitcoins to another public-key, you have revealed that one user controls all three. This information can be applied retrospectively -- a transaction you perform in the future may reveal information about transactions you performed in the past.
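
    A minimal Python sketch of this linking step, assuming each transaction is given simply as a list of the public-keys that signed its inputs. The class and function names are illustrative and are not taken from the paper's code, and the grouping rests on the assumption that all inputs of one transaction are controlled by one user.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure over hashable items (here, public-key labels)."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        # Register unseen items as their own root.
        self.parent.setdefault(item, item)
        # Walk up to the root, then point every node on the path at it.
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_public_keys(transactions):
    """Group public-keys that appear together as inputs of any transaction."""
    uf = UnionFind()
    for input_keys in transactions:
        for key in input_keys:
            uf.find(key)  # register keys from single-input transactions too
        for key in input_keys[1:]:
            uf.union(input_keys[0], key)

    clusters = defaultdict(set)
    for key in list(uf.parent):
        clusters[uf.find(key)].add(key)
    return sorted(sorted(cluster) for cluster in clusters.values())


# The second transaction retrospectively links PK_D to PK_A, PK_B and PK_C.
transactions = [["PK_A", "PK_B", "PK_C"], ["PK_C", "PK_D"], ["PK_E"]]
users = cluster_public_keys(transactions)
# users == [["PK_A", "PK_B", "PK_C", "PK_D"], ["PK_E"]]
```

    The second transaction shows the retrospective effect: PK_D is never used alongside PK_A or PK_B, yet it ends up in the same group because it shares an input set with PK_C.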

  40. @Martin that is unsound reasoning. What if my wife and I have separate bitcoin accounts but we use both together to make a joint purchase? What if I inappropriately get ahold of her private key and make a purchase in tandem with my own? What if we both hold both accounts?

    The combination of accounts on a transaction is completely uninformative about who or how many people control the accounts, or whether the account was used legitimately or illegitimately. You cannot draw the conclusions you wish to from the data we have.

  41. There is a good chance that this analysis will help _save_ bitcoin [B¢] from paranoid future regulation.

    If someone invented plain cash, nowadays, it would quickly get banned before it gained traction. That B¢ is traceable feeds enough spy fantasies to resist banning. It might take time and effort to track someone down, but the thrill of this would probably make B¢ _more_ appealing to law enforcement.

    Sometimes a leaky bucket is better than a tight one.

  42. That’s a very smart analysis. The Bitcoin system may not be totally anonymous, but a person who creates a transaction can, to some extent, make it anonymous by using tools or software that prevent it from being traced.

  43. Impressive. You can certainly track the coins hither and yon in a relatively easy-to-show manner, which is certainly a good first step in IDing a specific user.

    Sadly, unless that user has - somewhere along the line - identified themselves (ex. buys a pizza and has it delivered to their home or some other event)...They're still unidentifiable unless they pooch it in the future.

    Simply washing the BTC little by little through real-world traders (physical coins or not) would scupper this method of tracking as suddenly you have branch offs and coins becoming 'clean' again over time, or at least impractical to backtrace.

    A fine piece of design, analysis and e-tracking but still derail-able. :[

    Bitcoins may not be anonymous in themselves, yet if you don't expose yourself (which is pretty much the first law for nefarious activities, LulzSec notwithstanding) you're still anonymous for all practical purposes!
