[applause] (Brian Warner)
Hey everybody. Thank you so much
for coming out today.
My name is Brian Warner. I'm going to be talking about
a file transfer project I've been working on
called Magic Wormhole. If you go to magic-wormhole.io,
that will take you to the Github page. There's a copy of the slides
if you want to follow along. You can take a look
at all the code. The obligatory biography:
I've been working in Python for about 15 years or so.
I created Buildbot a long, long time ago,
in case you've heard of that one. The most recent thing I've put together
that is getting some more usage is called Versioneer,
so a little package to update your version
every time you do a git commit. At a startup called AllMyData,
I helped build a system called Tahoe-LAFS, a distributed
encrypted data storage system that's getting
a little bit of traction. At Mozilla I worked on Persona.
I worked on the add-on SDK. I worked on the original version
of Firefox Sync, and fans of the J-PAKE process there
will probably recognize a few things in this project. And these days
I'm working on Magic Wormhole. You can follow me on Twitter,
Lotharrr with a triple R, or GitHub, warner. So Magic Wormhole
is a file transfer program. This is about moving a file
or a directory or some little bit of string
from one machine to another. I bet everybody in this room
has done this at least a dozen times
in the last 24 hours. It's something that we do
all the time. You might have used SSH to copy it
from one place to another. You might have attached it
in an email message. You might have put it
into a shared directory where your co-workers
could get to it. The use case for this
is any of those -- you know, if you've got a cat video
on your computer and you really want to give it
to the person next to you, it should be easy to get it there
in a safe, secure, simple fashion. I'm hoping to convince you
that this tool is both easier and safer than all of the other tools --
all the other techniques that are typically used
for this kind of thing, especially when there's
not already a relationship between those two machines. So, you know, you two
are sitting next to each other. The humans know each other
but the computers don't yet know about each other. That's a use case
that is usually solved with tools that I think are harder
and less secure than this one. So what does it look like? First off, you do
pip install magic-wormhole. Please do that into a virtualenv.
You saw Glyph's talk yesterday about why doing this
into your root is a bad idea. Then on the sender side,
there's a command-line tool, and you run 'wormhole send'
and you give it the file name. That gives you a wormhole code,
the string that prints out, the '7-guitarist-revenge'. You transcribe that.
You get that to your recipient. So as one human to another,
you give them this string. The other person
runs 'wormhole receive'. They type in that string. Given that, the two programs
can find each other. They can negotiate
a shared, strong session key. They can encrypt the file,
figure out how to get bytes from one side to the other,
transfer it over, check the hash, acknowledge it. The kind of fanciful way
I like to think about this is two wizards who are standing
on hilltops on far sides of the planet and they speak the same magic spell
into the air at the same time and someone goes,
"Seven guitarist revenge!" And that causes this wormhole,
this portal to pop open between them. And then they can kind of
throw files and stuff through it. Things pop out of the other side. So think of that
and you'll understand Magic Wormhole. So the wormhole code
is kind of the key to this. It's a string that goes
from the sending computer up to the sending human. The sending human
reads that off the screen. The sending human
transcribes that over to the receiving human
via some channel. They can speak it, they can IM it,
they can send it by email, whatever. And then the receiving human
types that into the computer. So this is, at its heart, a way of getting
two computers to meet each other via these two humans
that have already met each other. So the human's job in this
is to get that code from one machine
off to the other. Now, you might think that file transfer
is a solved problem. I mean, FTP, the, you know -- everybody here has probably used FTP
at one point or another. Or you're too young
to have been born when the FTP RFC
was actually published in 1980. You know, this has been around
for so, so long. What's wrong with that?
What's wrong with the tools that we're currently using for this? So, you can characterize
these different tools by the amount of data that one human
has to transcribe to the other. That's kind of the limiting
factor in a UX thing. And I argue
that all the common tools we have actually take more work and are
less secure than this approach. So if -- the most obvious thing
I might reach for is to include this file
as an email attachment to you. In order for me to send you
a file that way, I have to start by getting you
to tell me your email address. So this email address has to travel
in the opposite direction, and you have a transcription problem:
I have to hear what you're saying and figure out
how to type it accurately. And in the best case,
you have an email address that's easy to spell
and easy to understand and it's at some popular
short-named provider. But in the worst case -- you know,
most email addresses tend to be big and long and random
because you're fighting for this contentious namespace
at one of these providers. Another approach
is to upload it to a server. If I have a server ready to go and I can think of a unique path
on that server and I upload it, then I can dictate to you
an even longer string. I have to tell you the domain name
of my server and the path that I've allocated on that. You can upload it
to something like Dropbox, which is a nice easy way of getting
a file into a particular URL. But the URLs there are even longer
because they have an even more difficult
contention problem. You have a longer, more random string
that's not human-selected. And the problem
with all of those cases is that there's all these
intermediaries that get to have access to the file that really
don't need any access to the file. So the server that you put it on, the network links in between you
and your recipient, if it's protected by TLS, great,
but you're still vulnerable to the certificate authorities
in the PKI world. You could transfer this over
on a USB drive if you happen to be sitting
right next to each other. But USB drives
are kind of hard to erase. Flash memory, you know,
doesn't forget quite as well
as you would like it to. So now in terms of the security
aspect, you have to think about what you're going to do
with that USB drive next. And in addition,
it's a little bit sketchy to take other people's USB drives
and put them into your computer. There are enough bugs
in the kernel drivers these days. You don't know what's going to happen
to your system when you do that. Using SSH takes care
of the security problems but transforms the UX problem
into getting your public key onto that other person's machine. You know, if you've got a cat video
and you wanted to give it to him, they have to give you
an entire account on their machine. And then you have to get
that entire public key over to their machine, and that's
far too long to actually dictate. So nobody does that
and you end up using some funky password thing
for that bootstrapping part. In contrast to all of those,
in Magic Wormhole, you send a very short string
that's carefully designed to be transcribable and dictatable
from the sender to the recipient. And that one string
is the entire security -- the entire accessibility path. You know, anybody in the world,
any wizard on that hilltop that says that same Magic Wormhole code
at the same time gets that one file,
and nobody else. So if I've convinced you now
that this is the easiest and safest way to move a file
from one machine to another, then you probably
want to know how it works. There are a couple of different
phases to the protocol. It starts with a message exchange
so the two nodes can just find each other
and start exchanging messages at all. Then there's a key agreement phase
where they use an algorithm called PAKE that I'll be telling you about
in a little bit that lets them agree
on a session key. Then they try and exchange
their IP addresses. They try and use
different techniques to get a direct connection
between the two of them. Once they have that connection,
they can run an encrypted data transfer
protocol over that, move the file over,
get the acknowledgement back. So, the clients
exchange their initial messages through a rendezvous server.
This is a simple relay that accepts messages
from one side, hands them out to somebody
who asked for them on the other side. This is something that I run
for this particular program. The rendezvous server doesn't
participate in this conversation. It's just facilitating it. And it's necessary
because when this starts, the two clients don't know anything
about each other other than there's somebody else out there
using the same wormhole code. So the first part of that
wormhole code is the channel ID -- in this example, channel #1,
but the numbers are just short integers.
The server allocates those. The channel ID tells the server which --
two clients coming together using the same channel ID
get messages from each other. The wormhole codes in general
can be short because they cheat. You know, the URL for this service
is baked into the client. Any two applications that are using
the same wormhole mechanism will use the same server. And so you don't have to include it
in the wormhole code. So in a lot of those other examples
like uploading it to an FTP server, uploading it to a Dropbox thing,
you have to include your domain name there
to provide that context. The clients make a WebSocket
connection to this machine and they exchange three or four,
maybe five messages by the time that they're done. Now here's the really fun crypto part. So the first message they send
across this connection that's being facilitated
by the relay server is using a protocol called PAKE. PAKE stands for
password authenticated key exchange. It is actually
a family of protocols. There are a bunch of different ones
that have been developed. The first one came out
in the early '90s. SRP is probably the best known one,
but the security proof on that is not really very good. It was kind of before we knew
how to design protocols well. There's one called J-PAKE that we used
in the first version of Firefox Sync. But these days, the one that is
preferred by the community is called SPAKE2
which was made in 2005. These protocols allow two parties
to turn a weak shared secret into a strong shared session key. So it's basically
a box on one side and you put something
like a password into it and it gets to exchange messages
with its peer on the other side. And then at the end of the day,
what pops out is a session key. And the rule is that if both sides
put the same password in, then both sides
get the same key out. And if the passwords are different,
then they get keys out that are random and completely
unrelated to each other. So this is a protocol
that lets you take any short string and expand it
into something larger that you can use
for subsequent messages. So, SPAKE2, I'm not going to go
into a great amount of detail on the math here. Feel free to talk to me later
if you want some more details. But if you're familiar with
Diffie-Hellman key exchange, key agreement,
two parties agree upon a shared key, where you have one side
sending g to the x, and the other side -- er, g to the a,
and the other side sends g to the b and then you compute g to the a to the b
or g to the b to the a, and because of commutativity,
those two things are the same. It's like that, except
that the message that you send gets blinded by a factor derived
from the password before you send it. So that u to the pw is the blinding
factor that Alice puts on, and then Bob divides by u to the pw to take that blinding factor off
before he uses it. So if the two sides are using
the same blinding factors, then this behaves
just like regular Diffie-Hellman, but you know that you could only come
to the same key if the other person had exactly the same
blinding factor. And because of the way
that exponents work inside this kind of modular field,
this is a random mapping and there's no way of knowing
what that g to the a was unless you happen to have
that same blinding factor. So this is super cool.
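
The blinded exchange described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the implementation from the talk: it works modulo a small prime instead of on the Ed25519 curve, the constants `P`, `G`, `M`, and `N` are arbitrary illustrative values, and the function names are made up for this example. It uses one blinding base per side (`M` and `N`) where the talk speaks of a single "u". The published SPAKE2 protocol also mixes more material into the final key than this sketch does.

```python
import hashlib
import secrets

# Toy parameters, chosen only so the arithmetic is easy to follow.
# They are assumptions for illustration and are NOT secure.
P = 2**127 - 1        # a Mersenne prime used as the modulus
G = 3                 # base for the Diffie-Hellman part
M, N = 5, 7           # public blinding bases, one per side


def password_to_scalar(password: bytes) -> int:
    """Map the shared password (the wormhole code) to an exponent."""
    digest = hashlib.sha256(password).digest()
    return int.from_bytes(digest, "big") % (P - 1)


def start(password: bytes, my_base: int):
    """Pick a secret exponent and return it with the blinded message."""
    secret = secrets.randbelow(P - 2) + 1
    blind = pow(my_base, password_to_scalar(password), P)
    return secret, (pow(G, secret, P) * blind) % P


def finish(password: bytes, secret: int, peer_msg: int, peer_base: int) -> str:
    """Strip the peer's blinding factor, then finish as in Diffie-Hellman."""
    # A negative exponent with a modulus needs Python 3.8 or later.
    unblind = pow(peer_base, -password_to_scalar(password), P)
    shared = pow((peer_msg * unblind) % P, secret, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).hexdigest()


def run_exchange(alice_pw: bytes, bob_pw: bytes):
    """Run both sides locally and return the two derived keys."""
    a_secret, a_msg = start(alice_pw, M)
    b_secret, b_msg = start(bob_pw, N)
    key_a = finish(alice_pw, a_secret, b_msg, N)
    key_b = finish(bob_pw, b_secret, a_msg, M)
    return key_a, key_b
```

Calling `run_exchange` with the same password on both sides yields two identical keys. With different passwords, the blinding factors do not cancel and the two keys come out unrelated.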
I really love this algorithm. I wrote a pure Python
implementation of this. It's using the Ed25519 curve
so that it's pretty fast even though it's in pure Python. It has nice short keys
and is super secure. And this one takes
about 14 milliseconds to run. If we used the C implementation of this,
you could probably improve that by a factor of ten,
maybe a hundred. But it's good enough
for an application like this where most of the time
you're blocked waiting on the user
to type in the thing. So PAKE is pretty awesome. It means that with a weak secret
and some interaction, you can get to a strong secret. And it's basically trading off --
it's adding -- it's spending interaction
in order to prevent offline attacks. A passive eavesdropper
who's watching your messages go by gets zero information
about what your session key is. An active man in the middle,
the best they can do is try to pretend to be Alice, try to pretend
to be your conversation partner and try to guess what that key is. And if they succeed,
then they can get in the middle, they can see your messages,
but if they fail, you know that they failed. And they only get one guess. So this is great
because even a short string -- you can afford to allow somebody
a one-in-a-million chance of guessing something
if they only get one guess at it. And that's not true for offline things
like encryption keys or passwords. The wormhole codes
are the input to PAKE. So the wormhole code is generated by taking the channel ID
that you allocated from the server. There's a word list.
I'm using the PGP word list that I think was used
for PGPfone or Zfone. And it's a list of words designed
to be phonetically distinct so you can recognize them when they're
coming over an audio channel. There are 256 words in there;
os.urandom, choose one, choose one. So the default configuration
is using two words and then this whole thing
is fed into PAKE. That configuration gives us
a 16-bit secret. You can change that. There's an option
to use more words if you feel like it. But that means that there are
65,000 codes. These are single-use. So each time you send a file,
you get a new code. Every time -- if you want to send
a new file, you get a new code. You establish that new connection. And it has this really nice
property here where the user -- each time an attacker
tries to guess this and fails, the user sees the failure screen.
So they see a message like this, where: "I've tried to send your file.
Something didn't go right. "Either you or your correspondent
mistyped the code, "or somebody tried to attack you
and they lost. "They guessed
and they guessed wrong. "You can try again.
That's going to give both you "and the attacker
a second chance at this." So, because the size
of the key space is large enough
to keep the attacker from getting a good chance
of getting through, the user has to be
really, really, really patient and run this over and over
and over again to give the attacker a better chance
of going through. So you'd have to rerun this protocol
and re-transcribe this string to your friend 650 times
before the attacker has even a 1% chance
of breaking through. And it's this nice
and relatively rare situation where the laziness of the user
improves the security. Their unwillingness to continue
in the face of constant errors is a good thing. It's like, I had to reload
the foot gun multiple times. OK, so that's enough about PAKE. Once the session key
is established and they both have
the same shared session key, then they can encrypt
all of their subsequent messages. So everything else in this protocol
gets encrypted from a key derived from that first one
that comes out of PAKE. And this is kind of where
the network plumbing aspect of the project comes -- takes over. So, after that phase is done,
then the two sides use ifconfig to find out all their local IP addresses.
And they listen on a TCP port. And then they send a message
encrypted with a session key to say, "Hey, I'm listening
here and here and here." And Bob says, "I'm listening
here and here and here." And they try to connect
to all of those. And the first connection
that goes through and successfully answers
an encrypted handshake, also derived from that PAKE key,
is the winner and it stops using
all of the other ones. This works out okay as long as one of the two sides
has a public IP address -- so like, one is a server
that you're connecting to -- or both sides are sitting
on the same local network. So, two people sitting next to each other
that want to exchange cat videos. They're both on the PyCon network.
Their local addresses, their 10. or 192.168,
will be able to see each other. And if you're moving something
to a server, they can see that one. But if you have two people --
say I'm calling my dad on the phone to say, "I want to send you this file,
the code is" blah blah blah. He and I are both at home,
both behind our NAT boxes. Those machines can't get
to each other directly. So in this case,
the client falls back to a relay server
that I also run here. If you're familiar
with STUN and TURN, this is the moral equivalent
of a TURN server. It's again just a data relay
that accepts two TCP connections that want to talk to each other
and it glues them together. It's called the transit relay.
This is not ideal. I'm working on adding techniques
to do NAT hole punching to increase the frequency
of connections -- the probability that a connection will be able
to make a direct connection instead of going through this relay. But for the kind of traffic volume
I'm expecting here, I don't expect
this will be a big problem. And I just run this on a VPS
somewhere off in the cloud. Once that handshake is complete, then the actual file data gets sent
through an encrypted record pipe. So this is using the NaCl SecretBox. This is using libsodium,
the Python bindings to that. Each record gets encrypted
by a unique nonce, gets sent. There are different keys
in each direction. All these keys are derived from
that same original PAKE master key. And as the data gets sent over,
it gets hashed. The receiving end
is hashing all of those bytes. And then the acknowledgement
message says, "Here's the hash of everything
that I have received from you." So that's a good strong way of making
sure a record didn't get dropped or some funky network error
caused something to get dropped. The SecretBox function
is an authenticated encryption cipher. So flipping bits -- an attacker
trying to flip the bits along the way, that gets detected
as soon as a single record goes by. So that's how the application works. This is built on top of an API that you can use
in your own applications. You -- each application
that wants to use this has an application identifier
that scopes the channel IDs so that two different applications
using different identifiers aren't fighting with each other
for the short, easy-to-transcribe codes. There's a relay URL that you pass
into the application and the one that I run that people
are free to use is in a file. It's just a constant definition
inside the package. You then set the code. You can also ask the wormhole
to allocate for you a brand new code and that'll go to the server
and find a channel. You can actually have -- let's see,
there's another call called input code that does Tab completion
on the words as you type them in. So when I'm typing in
"7-guitarist-revenge," I only actually have to hit Tab
and it goes to the server and says, "There's only one
channel allocated right now "and that's 7,
so let me fill that in for you." And then I type "GU" and I hit tab
and there's a couple of words that start with "G-U" in the list
and it shows me what they are. So it's actually really fast to go
and type these things in. And if you think about it
from an information theory point of view, you're only entering in
16 bits of data. So obviously,
you should be able to do that with a very small number
of keystrokes. But it makes it something
nice and easy to transcribe and easy to pronounce.
Then you send a binary. You send byte strings
and you get byte strings back. This works in both Python 2,
Python 3.3, 3.4, 3.5, maybe PyPy. I'm not sure. The library -- the API handles all the
network protocols, all the encryption. All that is taken care of
under the covers. And by the end of the sprints,
I will have both the blocking interface to this
and the Twisted interface to this. I broke the blocking interface
about a week ago when I was making
some major changes. So, the direction
that I want to take this, there -- I definitely want to have
a nice easy GUI application for this. This is calling out for something
better than a command-line tool. And you should be able to
drag and drop a file onto it. It shows you the code.
You should be able to click a button and type in the code and a file pops out
into your downloads folder. I think it'd be great to have
a browser extension for this, probably even a web-deployed
webpage form of this. The protocol is speaking WebSockets
to the relay rendezvous server specifically so that I can then
have a web-based form of this interoperate
with a command-line base. I'm looking to add to the range
of transports that it can do. Two browser-based implementations
could totally use WebRTC to connect to each other and avoid
having to use my relay server. Other people smarter than me
have already figured out all of the STUN/TURN,
STUN/ICE kinds of stuff, so we should be able
to take advantage of that. Running this over a Tor Onion service
is a great way of bypassing the NATs, the NAT barriers. I'm hoping to try and get
the SPAKE2 algorithm added to libsodium, and that will make it easier
to extend this to other languages, so that there are lots of --
there are Go bindings and Rust bindings
and JavaScript bindings to libsodium. And this would make it easier
to write a client for this in those other languages. And then I'm really interested in
seeing this used beyond file transfer. I want people
to think of this as a tool that they can use
in their applications anytime you need to deliver
a credential from one place to another. So, think about --
you have your mobile app, and to attach
that mobile app to your -- to an account on some server,
it needs to have some token. Or you have two client devices
and they want to go and talk to each other,
some kind of messaging thing. And you need a way of getting a key
from one of those machines to the other
so they know about each other. A lot of this comes down to
that same introduction task, where there are two humans
that know about each other and they have their devices. And the humans are introducing
the devices to each other. It's like, "Hi there, Bob,
this is my computer." And it's more like, "Hi there,
Bob's computer, this is my computer." And you want to connect those. So provisioning a client
in this sort of world would -- the old way, the typical way
this is done is to go to a website, create an account,
type in a password to that account, remember the password,
deal with the fact that people generate
really lousy passwords. It's just outstanding
how bad passwords are from a usability point of view. And then you install
the mobile app and you type the same password
into that. So, passwords are bad. Remembering passwords
is a hassle. Another approach
if you have a tool like this is that you go to the server
and you say, "I'd like to add a new device. "Tell me an invitation code for it.
Tell me a wormhole code for it." And then you type that
into your phone. That runs this protocol
and it gets a token dedicated to that one device,
and now you're done. You know, you could set it up
so that you ask the server to email you a code. A nice thing about these codes
that I forgot to mention before is that they're forward-secure. Once you finish that PAKE message,
once you exchange those two messages, you can reveal the code
and not lose any security. So it's a single use
and it's not secret outside that really narrow window
from the time you start that process to the time that the other side
sends their message. So the fact that this code
is still sitting around in email for a long period of time
doesn't hurt you. Same reason that you'd never
email password resets. You never send
the new password out in email. You send a token that can get used
a single time to go and reset it. In a messaging app,
something like Signal -- so the crypto in Signal is great. But the way that it works
is that when I want to talk to Bob, I ask the central Signal server
for Bob's public key. And then I start using that. There are some hidden options
in there to let you verify the key. If you meet Bob in person,
you can compare your keys. But fundamentally
you're depending upon that server to give you the right key.
And the model that I'd like to see is where I'm standing
next to somebody right now and I want to get our two machines
to know about each other. So I should be able to go to my device
and go to the address book and say, "Plus," you know,
"invite somebody." And it says, "Here's the code."
And they type in that code or, I don't know,
a QR code, something -- the two devices have
that physical proximity right now. They can leverage that
to transfer something small across, and then use this protocol
to ask each other for the full-blown public key
so that you're not vulnerable to anybody outside of this ecosystem
other than those two programs. So, my -- the real reason
that I'm building this tool is because PAKE is a really
interesting cryptographic tool that is a primitive
that we should be able to use and is not yet in our toolbox. The history of cryptography
and of techniques has this really long lag time. You know, symmetric encryption
is something that I'd say, you know, maybe half
the developers that I talk to are comfortable
using that in a product. Some of them think it's a little bit,
you know, out there. But it's been around since, what,
Julius Caesar days, right, like 800 BC, what? So there's a set of tools that are
slowly coming into general use. Hashes: most people are
comfortable with hashes at this point. Digital signatures: that's
showing up in a lot of applications. Public key cryptography:
it's been around for 50 years and it's still just becoming something
that people are comfortable using. PAKE has been around since the '90s. It enables a lot of
really interesting use cases and really a lot of
interesting operational modes. But I think to get it
into the hands of developers, you need to have
some clear examples, some clear -- the academic community needs to say,
"This algorithm is good. "Use this and it'll do
what you expect it to do." You need libraries
that are easy to use. You need applications that use it
so you can feel comfortable, you can say, "I'm not going this alone.
That person over there is doing it. "Everybody's doing it.
Maybe it's okay for me to do it too." So, file transfer -- it's a kind of
pedestrian application. I found it to be
really handy myself, actually, more convenient
than SSH in some cases, just moving files
between my own servers. But it's not exactly
an exciting use case. But it is kind of
a foot in the door. You know, being able to use PAKE,
being able to use a short secret to bootstrap to a larger secret
is the rest of the stuff that I want people to have access to. So of the stuff in your toolbox,
let's add PAKE to that list. So that's the project.
If you go to magic-wormhole.io, that will bounce you
over to the GitHub page where you can take a look
at a copy of these slides, the code. pip install magic-wormhole
is how you get it. That gets you the wormhole binary.
And you can send me email or contact me on Twitter,
Lotharrr with the triple r. Thank you very much. [applause] Oh! And I have stickers. I'll take questions
and I'll throw out stickers. And if you have questions,
please come to the microphones in the two aisles there. (audience member)
I have a question, sir. For that initial relay, right? (Brian Warner)
The initial relay? Yeah. (audience member)
I have an app ID which I imagine is not private information.
It's kind of public. It's stored. It's shared with your clients
and whatnot. Then I have a channel ID, which is meant to be private.
And then the two words, right? What stops a malicious third party
from just spamming your relay server with the same app ID
and random channel IDs? (Brian Warner)
Nothing. One problem with this approach,
with this protocol, is that it's kind of vulnerable
to denial of service attacks. Because there's so little information
somebody needs to participate in it,
there's not a very high barrier to prevent somebody
from participating in it. So my thoughts for that
are to add a layer on top where you have to get -- there's
some sort of rate-limiting thing, right? It's like, I have to go
to some central server to get a ticket that allows me
to submit a message to that rendezvous server. And the person who's trying to do
the denial of service attack runs up against
some rate-limiting boundary. Maybe you do a Captcha thing. Maybe you have to be signed into
some other service and they can count how many tickets
people are requesting and then terminate that
as an abuse thing. So it'll be something
in that category, I think. Yeah? (audience member)
Is it correct that the PAKE process to derive the session key requires real-time communication
between the sender and receiver? (Brian Warner)
Yes, it doesn't -- not strictly real time but it requires a couple of messages
to be exchanged. So PAKE emits one message
and then waits for one message coming from the other side
before you have the shared session key. Then there's a key confirmation message
to make sure that somebody really did get the right session key
and not something else. And then you send
all of your regular messages. I'm working on a mode of this
that's more persistent. So at the moment,
it makes that WebSocket connection. If that drops, then the server abandons
the channel and you have to start again. So these two programs
have to be running concurrently for at least a couple of seconds
in order to make this work. The persistent mode that I'm working on
would be more "store and forward "and come back again later,"
so that in a longer-running, like a messaging application,
it could be "start the process, "shut down, turn on again tomorrow,
get the receive -- "get the receipt, send the next
message, do it again there." (audience member)
So my question comes from the Tab complete. So, say there's only one channel, 7,
and you put in the G. You can now Tab complete
and see the things. Doesn't that make it easier to guess?
Or -- because maybe there's only five keywords
that are currently in use. (Brian Warner)
Yeah, but you're not revealing to the attacker what you hit
before you hit Tab. So it's your local agent
that knows the fixed list of words that the codes were chosen from. And then it's just an easier affordance
to get to that word faster. The total amount of information
you're typing in is still 16 bits. And you can almost think of it
like "G, Tab" is like Huffman encoding when you have a fixed alphabet. One thing I'm looking to add to it
is a way to have alternate word lists. So when you get the channel,
when the client goes to the server and says, "Hey,
I want to use channel 7," it can say, "Oh, I got this hint
from the other side that we are using Dutch
for this particular word list. So switch to that
alternate word list." And then you're not losing
any more information than you would
if you didn't reveal that, because you were only using 256 words
in any given list anyway. (audience member)
Gotcha. This is a cool package. (Brian Warner)
Thank you. (audience member)
So, 16 bits of entropy, and it's time bound. (Brian Warner)
Yeah. And single-use bound. (audience member)
Single-use bound for one file; that's cool. However, you're providing a library. Developers are going to use this for all sorts of things
that you didn't expect. I'm imagining someone building,
like, a backup software. So I transfer the contents
of my whole hard drive to the server. Now we're talking
half a million files. That seems like
a much bigger attack surface than you are anticipating
in your example use case. Have you thought about that?
How do you want to mitigate it? (Brian Warner)
Yeah. The way to characterize the attack rate is going to be:
an attacker chooses to attack some percentage, some fraction
of all the connections that are going through there.
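A rough sketch of that attack-rate arithmetic, in Python. It assumes each code word is drawn from a 256-word list (the figure given earlier in this Q&A), that an attacker gets exactly one guess per connection, and that guesses are independent. The function names are made up for illustration and are not part of Magic Wormhole.

```python
def code_space(words: int, wordlist_size: int = 256) -> int:
    """Number of equally likely codes for a given number of words."""
    return wordlist_size ** words


def guess_success_probability(words: int, wordlist_size: int = 256) -> float:
    """Chance that a single guess at one connection's code is right."""
    return 1 / code_space(words, wordlist_size)


def expected_compromises(connections: int, attacked_fraction: float,
                         words: int, wordlist_size: int = 256) -> float:
    """Expected number of connections an attacker wins with one guess each."""
    attempts = connections * attacked_fraction
    return attempts * guess_success_probability(words, wordlist_size)


if __name__ == "__main__":
    # Two words of 256 choices each: 65536 codes, i.e. 16 bits.
    print(code_space(2))
    # Hypothetical backup tool: half a million transfers, every one attacked once.
    print(expected_compromises(500_000, 1.0, 2))
    # Same workload with four-word codes.
    print(expected_compromises(500_000, 1.0, 4))
```

Under these assumptions, an attacker who guesses once at every one of half a million two-word transfers would expect to win a handful of them, and each extra word divides that expectation by 256.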
They're going to be successful with some other fraction
of those attempts, depending on how long
that word length is. And then they'll be successful
with whatever -- you know, if somebody attacks
every single connection once -- the worst-case attack here
is probably where somebody is camped out on your server. They watch all the connections
going through. And for every single attack they do, they attack once, and if they fail,
then they don't touch that ever again. So any individual user
is protected by that laziness thing because nobody's ever
going to rerun this program enough times to be
seriously vulnerable to it. But if they only ever attack
any given person once, then maybe people will just be convinced
that this program is kind of buggy and you have to do it
a couple times to make it work. And then the attacker
will successfully manage to get, you know, one out of
every 65,000 of those people. The mitigations for that would be,
first you're using this to establish a relationship. I'm looking to add a mode
where once you have connected to -- once you've transferred
a file over there once, then it remembers the strong session key
and uses that for next time. So that reduces the rate
at which you're creating these things. You can also add
extra words to this. You can say 'wormhole send --words=4'
and get a bigger code space. And there are some post-connection
verification options. 'wormhole send --verify' will show you
a hash of the session key afterwards and you can verify those yourself
before you continue forward. (audience member)
OK. So it sounds like for -- you're not really protecting
against developers that are using this
to send many, many of these at one time
against a determined adversary? (Brian Warner)
I am expecting that this is the kind of thing
that is being run at human scale, like, there is a person
driving this each time. If you have a machine driving it,
then you can get a longer-term stronger key
and keep on using that. So yeah, I think you're right. (audience member)
Cool, thank you. (audience member)
I have a totally different kind of question. And that is, I was just wondering
who supports, you know, your work on this? Is that the Navy?
Is it the State Department? (Brian Warner)
This is me. This is just me because
I wanted to see this protocol and this technique
made available in more places. So I'm totally willing to spend
the $20 a month to run the VPS to -- (audience member)
Yeah, I mean, but your time is just -- (Brian Warner)
Well, I am recommending that if people use this in a serious application,
you may not want to depend upon me running this thing,
and so the server that backs this is part of the same distribution. (audience member)
But your development time is just -- (Brian Warner)
Yeah. This is totally a side project. (audience member)
Yeah, I mean like, as somebody who could potentially use this,
like this is the kind of basic research we as a community
need to be supporting. (Brian Warner)
Yeah. Last question? (audience member)
So the attacker scenario you're looking at is more like eavesdropping
and getting information. But what about just an attacker
trying to execute denial of service? (Brian Warner)
Yeah, yeah. Like I was saying earlier, it's kind of vulnerable to that.
Somebody can go through and just try to claim every single channel
and just block other people's use of it. And so the best I can think of
is a CAPTCHA of some sort to kind of slow that down. Thank you, everybody.
Come on up. I got some cool stickers. [applause]