Using Tahoe-LAFS with an anonymizing network: Tor, I2P
- Use cases
- Software Dependencies
- Connection configuration
- Anonymity configuration
- Performance and security issues
Tor is an anonymizing network used to help hide the identity of internet clients and servers. Please see the Tor Project’s website for more information: https://www.torproject.org/
I2P is a decentralized anonymizing network that focuses on end-to-end anonymity between clients and servers. Please see the I2P website for more information: https://geti2p.net/
There are three potential use-cases for Tahoe-LAFS on the client side:
- User wishes to always use an anonymizing network (Tor, I2P) to protect their anonymity when connecting to Tahoe-LAFS storage grids (whether or not the storage servers are anonymous).
- User does not care to protect their anonymity but they wish to connect to Tahoe-LAFS storage servers which are accessible only via Tor Hidden Services or I2P.
  - Tor is only used if a server connection hint uses tor:. These hints generally have a .onion address.
  - I2P is only used if a server connection hint uses i2p:. These hints generally have a .i2p address.
- User does not care to protect their anonymity or to connect to anonymous storage servers. This document is not useful to you... so stop reading.
For Tahoe-LAFS storage servers there are three use-cases:
- The operator wishes to protect their anonymity by making their Tahoe server accessible only over I2P, via Tor Hidden Services, or both.
- The operator does not require anonymity for the storage server, but they want it to be available over both publicly routed TCP/IP and through an anonymizing network (I2P, Tor Hidden Services). One possible reason to do this is that being reachable through an anonymizing network is a convenient way to bypass a NAT or firewall that prevents publicly routed TCP/IP connections to your server (for clients capable of connecting to such servers). Another is that making your storage server reachable through an anonymizing network can provide better protection for your clients who themselves use that anonymizing network to protect their anonymity.
- The storage server operator does not care to protect their own anonymity nor to help the clients protect theirs. Stop reading this document and run your Tahoe-LAFS storage server using publicly routed TCP/IP.
See this Tor Project page for more information about Tor Hidden Services: https://www.torproject.org/docs/hidden-services.html.en
See this I2P Project page for more information about I2P: https://geti2p.net/en/about/intro
Clients who wish to connect to Tor-based servers must install the following.
Tor (tor) must be installed. See here: https://www.torproject.org/docs/installguide.html.en . On Debian/Ubuntu, use apt-get install tor. You can also install and run the Tor Browser Bundle.
Tahoe-LAFS must be installed with the [tor] “extra” enabled. This will install txtorcon:
pip install tahoe-lafs[tor]
Manually-configured Tor-based servers must install Tor, but do not need txtorcon or the [tor] extra. Automatic configuration, when implemented, will need these, just like clients.
Clients who wish to connect to I2P-based servers must install the following. As with Tor, manually-configured I2P-based servers need the I2P daemon, but no special Tahoe-side supporting libraries.
I2P must be installed. See here: https://geti2p.net/en/download
The SAM API must be enabled.
- Start I2P.
- Visit http://127.0.0.1:7657/configclients in your browser.
- Under “Client Configuration”, check the “Run at Startup?” box for “SAM application bridge”.
- Click “Save Client Configuration”.
- Click the “Start” control for “SAM application bridge”, or restart I2P.
Tahoe-LAFS must be installed with the [i2p] extra enabled, to get the Tahoe-side I2P supporting library:
pip install tahoe-lafs[i2p]
Both Tor and I2P
Clients who wish to connect to both Tor- and I2P-based servers must install all of the above. In particular, Tahoe-LAFS must be installed with both extras enabled:
pip install tahoe-lafs[tor,i2p]
See Connection Management for a description of the [tor] and [i2p] sections of tahoe.cfg. These control how the Tahoe client will connect to a Tor/I2P daemon, and thus make connections to Tor/I2P-based servers.
The [tor] and [i2p] sections only need to be modified to use unusual configurations, or to enable automatic server setup.
The default configuration will attempt to contact a local Tor/I2P daemon listening on the usual ports (9050/9150 for Tor, 7656 for I2P). As long as there is a daemon running on the local host, and the necessary support libraries were installed, clients will be able to use Tor-based servers without any special configuration.
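Before relying on the defaults, it can help to confirm that a daemon is actually listening. The following is a minimal sketch, not part of Tahoe-LAFS: the helper names `port_is_open` and `reachable_daemons` are hypothetical, and the check only tests whether the port accepts TCP connections on the local host (it does not verify that the listener is really Tor or the I2P SAM bridge). The port numbers are the defaults quoted above.

```python
import socket

# Default local ports named in the text above: Tor SOCKS on 9050
# (system daemon) or 9150 (Tor Browser Bundle), I2P SAM on 7656.
DEFAULT_PORTS = {"tor": (9050, 9150), "i2p": (7656,)}


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def reachable_daemons(ports=DEFAULT_PORTS):
    """Map each daemon name to the subset of its ports that accept connections."""
    return {
        name: [p for p in candidates if port_is_open(p)]
        for name, candidates in ports.items()
    }


if __name__ == "__main__":
    print(reachable_daemons())
```

An empty list for "tor" or "i2p" suggests the corresponding daemon is not running, or is listening somewhere other than the usual ports.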
However, note that this default configuration does not improve the client’s anonymity: normal TCP connections will still be made to any server that offers a regular address (it fulfills the second client use case above, not the first). To protect their anonymity, users must configure the [connections] section as follows:
[connections]
tcp = tor
With this in place, the client will use Tor (instead of an IP-address-revealing direct connection) to reach TCP-based servers.
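The routing rule described so far can be summarised in a few lines. This is a simplified model for illustration only, not the code Tahoe-LAFS or Foolscap actually runs: the function name `transport_for_hint` is hypothetical, and only the two values of the tcp setting discussed in this section (the default direct TCP, and tor) are modelled.

```python
def transport_for_hint(hint, tcp_setting="tcp"):
    """Return which transport would carry a connection hint.

    hint        -- a connection hint such as "tor:xyz.onion:3000"
    tcp_setting -- value of "tcp" in the [connections] section
                   ("tcp" models the default direct connection)

    Returns "tor", "i2p", "tcp", or None if the hint is not handled
    by this simplified model.
    """
    kind = hint.split(":", 1)[0]
    if kind == "tor":
        return "tor"        # tor: hints always go through Tor
    if kind == "i2p":
        return "i2p"        # i2p: hints always go through I2P
    if kind == "tcp":
        # Plain TCP hints are direct by default, and are only sent
        # through Tor when [connections] tcp = tor is configured.
        return {"tcp": "tcp", "tor": "tor"}.get(tcp_setting)
    return None


if __name__ == "__main__":
    for setting in ("tcp", "tor"):
        print(setting, transport_for_hint("tcp:example.org:3456", setting))
```

With the default setting the plain TCP hint is reached directly, which is what reveals the client's IP address; with tcp = tor the same hint is reached through Tor.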
Tahoe-LAFS provides a configuration “safety flag” for explicitly stating whether or not IP-address privacy is required for a node:
[node]
reveal-IP-address = (boolean, optional)
When reveal-IP-address = False, Tahoe-LAFS will refuse to start if any of the configuration options in tahoe.cfg would reveal the node’s network location:
- [connections] tcp = tor is required: otherwise the client would make direct connections to the Introducer, or any TCP-based servers it learns from the Introducer, revealing its IP address to those servers and a network eavesdropper. With this in place, Tahoe-LAFS will only make outgoing connections through a supported anonymizing network.
- tub.location must either be disabled, or contain safe values. This value is advertised to other nodes via the Introducer: it is how a server advertises its location so clients can connect to it. In private mode, it is an error to include a tcp: hint in tub.location. Private mode rejects the default value of tub.location (when the key is missing entirely), which is AUTO, which uses ifconfig to guess the node’s external IP address, which would reveal it to the server and other clients.
This option is critical to preserving the client’s anonymity (client use-case 1 from Use cases, above). It is also necessary to preserve a server’s anonymity (server use-case 1).
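The two rules above can be approximated with Python's standard configparser. This is a sketch of the checks as this document describes them, not the validation code Tahoe-LAFS itself runs; the function name `privacy_problems` is hypothetical, and hints without an explicit prefix are not examined.

```python
import configparser


def privacy_problems(cfg_text):
    """Return a list of reasons why cfg_text would reveal the node's location.

    An empty list means no problem was found, or that privacy was not
    requested (reveal-IP-address defaults to True).
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)

    if cfg.getboolean("node", "reveal-IP-address", fallback=True):
        return []  # the safety flag is off, nothing to enforce

    problems = []

    # Rule 1: outgoing TCP connections must be routed through Tor.
    tcp = cfg.get("connections", "tcp", fallback="tcp").strip().lower()
    if tcp != "tor":
        problems.append("[connections] tcp must be set to tor")

    # Rule 2: tub.location must be disabled or contain only safe hints.
    # A missing key means the default, AUTO, which is rejected.
    location = cfg.get("node", "tub.location", fallback="AUTO").strip()
    if location != "disabled":
        for hint in (h.strip() for h in location.split(",")):
            if hint == "AUTO" or hint.startswith("tcp:"):
                problems.append("unsafe tub.location hint: " + hint)

    return problems


if __name__ == "__main__":
    example = "[node]\nreveal-IP-address = False\n"
    for problem in privacy_problems(example):
        print(problem)
```

Run against a file that sets only reveal-IP-address = False, the sketch reports both the missing tcp = tor setting and the implicit AUTO location.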
This flag can be set (to False) by providing the --hide-ip argument when creating the node.
Note that the default value of
reveal-IP-address is True, because
unfortunately hiding the node’s IP address requires additional software to be
installed (as described above), and reduces performance.
To configure a client node for anonymity, tahoe.cfg must contain the following configuration flags:
[node]
reveal-IP-address = False
tub.port = disabled
tub.location = disabled
Once the Tahoe-LAFS node has been restarted, it can be used anonymously (client use-case 1).
Server anonymity, manual configuration
To configure a server node to listen on an anonymizing network, we must first configure Tor to run an “Onion Service”, and route inbound connections to the local Tahoe port. Then we configure Tahoe to advertise the .onion address to clients. We also configure Tahoe to not make direct TCP connections.
- Decide on a local listening port number, named PORT. This can be any unused port from about 1024 up to 65535 (depending upon the host’s kernel/network config). We will tell Tahoe to listen on this port, and we’ll tell Tor to route inbound connections to it.
- Decide on an external port number, named VIRTPORT. This will be used in the advertised location, and revealed to clients. It can be any number from 1 to 65535. It can be the same as PORT, if you like.
- Decide on a “hidden service directory”, usually in /var/lib/tor/NAME. We’ll be asking Tor to save the onion-service state here, and Tor will write the .onion address here after it is generated.
Then, do the following:
- Create the Tahoe server node (with tahoe create-node), but do not launch it yet.
- Edit the Tor config file (typically in /etc/tor/torrc). We need to add a section to define the hidden service. If our PORT is 2000, VIRTPORT is 3000, and we’re using /var/lib/tor/tahoe as the hidden service directory, the section should look like:
  HiddenServiceDir /var/lib/tor/tahoe
  HiddenServicePort 3000 127.0.0.1:2000
- Restart Tor, with systemctl restart tor. Wait a few seconds.
- Read the hostname file in the hidden service directory (e.g. /var/lib/tor/tahoe/hostname). This will be a .onion address, like u33m4y7klhz3b.onion. Call this ONION.
- Edit tahoe.cfg to set tub.port to tcp:PORT:interface=127.0.0.1 and tub.location to tor:ONION.onion:VIRTPORT. Using the examples above, this would be:
  [node]
  reveal-IP-address = false
  tub.port = tcp:2000:interface=127.0.0.1
  tub.location = tor:u33m4y7klhz3b.onion:3000
  [connections]
  tcp = tor
- Launch the Tahoe server with tahoe start $NODEDIR
The tub.port section will cause the Tahoe server to listen on PORT, but bind the listening socket to the loopback interface, which is not reachable from the outside world (but is reachable by the local Tor daemon). Then the tcp = tor section causes Tahoe to use Tor when connecting to the Introducer, hiding its IP address. The node will then announce itself to all clients using tub.location, so clients will know that they must use Tor to reach this server (and the announcement does not reveal its IP address). When clients connect to the onion address, their packets will flow through the anonymizing network and eventually land on the local Tor daemon, which will then make a connection to PORT on localhost, which is where Tahoe is listening for connections.
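To see how PORT, VIRTPORT, ONION and the hidden service directory fit together, the two configuration fragments from the manual procedure can be generated from those four values. This is an illustrative sketch only: `render_onion_server_config` is a hypothetical helper, not a Tahoe-LAFS or Tor tool, and it simply reproduces the fragments shown in this section as strings without writing any files.

```python
def render_onion_server_config(port, virtport, onion, hs_dir):
    """Return (torrc_fragment, tahoe_cfg_fragment) for a Tor-only server.

    port     -- local port Tahoe listens on (PORT)
    virtport -- port advertised to clients (VIRTPORT)
    onion    -- contents of the hostname file, e.g. "u33m4y7klhz3b.onion"
    hs_dir   -- hidden service directory, e.g. "/var/lib/tor/tahoe"
    """
    # Tor accepts connections on VIRTPORT and forwards them to PORT on loopback.
    torrc = (
        f"HiddenServiceDir {hs_dir}\n"
        f"HiddenServicePort {virtport} 127.0.0.1:{port}\n"
    )
    # Tahoe listens on loopback only, advertises the onion address,
    # and uses Tor for its own outgoing connections.
    tahoe_cfg = (
        "[node]\n"
        "reveal-IP-address = false\n"
        f"tub.port = tcp:{port}:interface=127.0.0.1\n"
        f"tub.location = tor:{onion}:{virtport}\n"
        "\n"
        "[connections]\n"
        "tcp = tor\n"
    )
    return torrc, tahoe_cfg


if __name__ == "__main__":
    torrc, tahoe_cfg = render_onion_server_config(
        2000, 3000, "u33m4y7klhz3b.onion", "/var/lib/tor/tahoe"
    )
    print(torrc)
    print(tahoe_cfg)
```

Note that PORT appears only on the local side (torrc target and tub.port), while VIRTPORT appears only on the advertised side (HiddenServicePort and tub.location).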
Follow a similar process to build a Tahoe server that listens on I2P. The same process can be used to listen on both Tor and I2P (tor:ONION.onion:VIRTPORT,i2p:ADDR.i2p). It can also listen on both Tor and plain TCP (use-case 2), with tub.port = tcp:PORT and anonymous = false (and omit the tcp = tor setting, as the address is already being broadcast through the location announcement).
Server anonymity, automatic configuration
To configure a server node to listen on an anonymizing network, create the node with the --listen=tor option. This requires a Tor configuration that either launches a new Tor daemon, or has access to the Tor control port (and enough authority to create a new onion service). On Debian/Ubuntu systems, do apt install tor, add yourself (YOURUSERNAME) to the debian-tor control group, and then log out and log back in: if debian-tor then appears among your groups, you should have permission to use the unix-domain control port.
This option will set reveal-IP-address = False and [connections] tcp = tor. It will allocate the necessary ports, instruct Tor to create the onion service (saving the private key somewhere inside NODEDIR/private/), obtain the .onion address, and populate tub.port and tub.location accordingly.
Performance and security issues
If you are running a server which does not itself need to be anonymous, should you make it reachable via an anonymizing network or not? Or should you make it reachable both via an anonymizing network and as a publicly traceable TCP/IP server?
There are several trade-offs effected by this decision.
Making a server be reachable via Tor or I2P makes it reachable (by Tor/I2P-capable clients) even if there are NATs or firewalls preventing direct TCP/IP connections to the server.
Making a Tahoe-LAFS server accessible only via Tor or I2P can be used to
guarantee that the Tahoe-LAFS clients use Tor or I2P to connect
(specifically, the server should only advertise Tor/I2P addresses in the
tub.location config key). This prevents misconfigured clients from
accidentally de-anonymizing themselves by connecting to your server through
the traceable Internet.
Clearly, a server which is available as both a Tor/I2P service and a regular TCP address is not itself anonymous: the .onion address and the real IP address of the server are easily linkable.
Also, interaction, through Tor, with a Tor Hidden Service may be more protected from network traffic analysis than interaction, through Tor, with a publicly traceable TCP/IP server.
XXX is there a document maintained by Tor developers which substantiates or refutes this belief? If so we need to link to it. If not, then maybe we should explain more here why we think this?
As of 1.12.0, the node uses a single persistent Tub key for outbound connections to the Introducer, and inbound connections to the Storage Server (and Helper). For clients, a new Tub key is created for each storage server we learn about, and these keys are not persisted (so they will change each time the client reboots).
Clients traversing directories (from rootcap to subdirectory to filecap) are likely to request the same storage-indices (SIs) in the same order each time. A client connected to multiple servers will ask them all for the same SI at about the same time. And two clients which are sharing files or directories will visit the same SIs (at various times).
As a result, the following things are linkable, even with reveal-IP-address = false:
- Storage servers can recognize multiple connections from the same not-yet-rebooted client. (Note that the upcoming Accounting feature may cause clients to present a persistent client-side public key when connecting, which will be a much stronger linkage).
- Storage servers can probably deduce which client is accessing data, by looking at the SIs being requested. Multiple servers can collude to determine that the same client is talking to all of them, even though the TubIDs are different for each connection.
- Storage servers can deduce when two different clients are sharing data.
- The Introducer could deliver different server information to each subscribed client, to partition clients into distinct sets according to which server connections they eventually make. For client+server nodes, it can also correlate the server announcement with the deduced client identity.
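The storage-index linkage can be made concrete with a toy model. The sketch below is purely illustrative and rests on assumptions that are not in the text: the log format (server, connection id, storage index) and the function name `link_connections` are hypothetical, and connections are linked only when they requested exactly the same set of SIs. Real traffic analysis would also use timing and partial overlap.

```python
from collections import defaultdict


def link_connections(request_log):
    """Group connections that asked for exactly the same storage indices.

    request_log -- iterable of (server, connection_id, storage_index)
    Returns a list of groups; each group is a sorted list of
    (server, connection_id) pairs that share one SI set.
    """
    # Collect the set of SIs requested over each connection.
    requested = defaultdict(set)
    for server, conn, si in request_log:
        requested[(server, conn)].add(si)

    # Connections with identical SI sets are presumed to be the same client,
    # even though each connection used a different TubID.
    groups = defaultdict(list)
    for key, sis in requested.items():
        groups[frozenset(sis)].append(key)

    return [sorted(conns) for conns in groups.values() if len(conns) > 1]


if __name__ == "__main__":
    log = [
        ("server1", "conn-a", "SI-root"),
        ("server1", "conn-a", "SI-subdir"),
        ("server2", "conn-b", "SI-root"),
        ("server2", "conn-b", "SI-subdir"),
        ("server2", "conn-c", "SI-other"),
    ]
    print(link_connections(log))
```

In this example two colluding servers would conclude that conn-a and conn-b belong to the same client, while conn-c remains unlinked.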
A client connecting to a publicly traceable Tahoe-LAFS server through Tor incurs substantially higher latency and sometimes worse throughput than the same client connecting to the same server over a normal traceable TCP/IP connection. When the server is on a Tor Hidden Service, it incurs even more latency, and possibly even worse throughput.
Connecting to Tahoe-LAFS servers which are I2P servers incurs higher latency and worse throughput too.
Positive and negative effects on other Tor users
Sending your Tahoe-LAFS traffic over Tor adds cover traffic for other Tor users who are also transmitting bulk data. So that is good for them – increasing their anonymity.
However, it makes the performance of other Tor users’ interactive sessions – e.g. ssh sessions – much worse. This is because Tor doesn’t currently have any prioritization or quality-of-service features, so someone else’s ssh keystrokes may have to wait in line while your bulk file contents get transmitted. The added delay might make other people’s interactive sessions unusable.
Both of these effects are doubled if you upload or download files to a Tor Hidden Service, as compared to if you upload or download files over Tor to a publicly traceable TCP/IP server.
Positive and negative effects on other I2P users
Sending your Tahoe-LAFS traffic over I2P adds cover traffic for other I2P users who are also transmitting data. So that is good for them – increasing their anonymity. It will not directly impair the performance of other I2P users’ interactive sessions, because the I2P network has several congestion control and quality-of-service features, such as prioritizing smaller packets.
However, if many users are sending Tahoe-LAFS traffic over I2P, and do not have their I2P routers configured to participate in much traffic, then the I2P network as a whole will suffer degradation. Each Tahoe-LAFS node using I2P has its own anonymizing tunnels that its data is sent through. On average, one Tahoe-LAFS node requires 12 other I2P routers to participate in its tunnels.
It is therefore important that your I2P router is sharing bandwidth with other routers, so that you can give back as you use I2P. This will never impair the performance of your Tahoe-LAFS node, because your I2P router will always prioritize your own traffic.