Automate away your problems: combatting illegal abuses of ngrok

by Alan Shreve

ngrok is a tunneling reverse proxy that establishes secure tunnels from a public endpoint to a locally running network service while capturing all traffic for inspection and replay. It is an open-source project on GitHub.

-

At its core, ngrok is an open reverse proxy. That shudder you heard was the sound of millions of network administrators crying out in horror. Open proxies, sometimes called open relays or open resolvers, are excellent resources for spammers, cybercrime, DDoS attacks and more. “Open” in these cases means that the service is available without any authentication or payment.

Open reverse proxies like ngrok are traffic sinks, and thus not prone to those types of abuses. They won’t relay any requests out to the public internet. Instead, ngrok.com suffers from the opposite problem, the one that most hosting providers and Platforms-as-a-Service like Heroku, Azure and App Engine deal with: people like to host illegal content on us.

Unlike most platform providers, ngrok doesn’t even require you to create an account before you can begin hosting content over a tunnel. I don’t require accounts because of my fanatical devotion to first-time user experience. Minimizing the time to value of the service is very important to me. Unfortunately, this lower barrier to entry does put me at a disadvantage when it comes to dealing with illegal content.

Recently, some miscreants have taken to using ngrok for hosting phishing websites which attempt to collect login credentials for popular, high-value websites like Google Accounts. They’ve stuck to hosting these on the only type of tunnel you can create without signing up on ngrok.com: randomly chosen hexadecimal subdomains that look like 39ac91f.ngrok.com. And while most of us would immediately recognize a bank login page hosted on that domain as suspicious, there are plenty of humans without the internet-savvy to recognize the illegitimacy of such a site.

It’s hard to blame them, honestly. URLs are often opaque, magic-looking things with weird base64-encoded state passed around by the ColdFusion framework or ASP.NET web forms or something. Some mobile browsers have also taken to hiding them under certain circumstances. On top of all that, ngrok.com provides free TLS-secured tunnels, which means that an ngrok-hosted phishing site renders with that “secure lock” symbol we trained an entire cohort of internet users to believe meant a website was trustworthy.

Inevitably, when illegal content gets hosted on ngrok.com, it is brought to the attention of my hosting provider and, in turn, to me. Hosting a phishing site (or any illegal content) is against the terms of service of pretty much every hosting provider out there and mine takes it pretty seriously.

A motivating outage

During the first couple of weeks of encountering this phenomenon, I banned the sites manually as I was notified of them. But on Sunday, May 18th, ngrok suffered its first major outage (approximately five hours) in nearly nine months because I wasn’t fast enough. My hosting provider notified me of a phishing site and then subsequently powered down the ngrok servers because I failed to deal with it before the given deadline. The cause was as pedestrian and dull as they come: it was a Sunday morning and I was still asleep. After I got ngrok back online, I considered my next steps. What changes could I make to combat illegal content more effectively or reduce the incidence rate? I came up with a number of possible solutions.

Possible Solutions

Only provide TLS tunnels to paid accounts

Phishing sites are less attractive without TLS/HTTPS because our CA system has unfortunately conflated the fundamentally separate concepts of encryption and trust. In the end, other illegal content would be just as useful without HTTPS, and I’m a big proponent of encryption everywhere, so I decided against this.

Require signup before you can create any tunnels

This would severely compromise ngrok’s first-time user experience and ease of use, both of which are hugely important to ngrok’s popularity. Moreover, creating an ngrok account does not require anything besides an email address, which ngrok doesn’t bother to verify since obtaining a valid one is so easy anyway. This wouldn’t really help either.

Inspect tunnel traffic on the wire for illegal content patterns

One of the core tenets of the ngrok.com service is that it does not inspect your traffic at all beyond reading the header field necessary to perform the multiplexing. All traffic inspection and replay is done client side. While I’m sure that this would likely be effective, it’s hard to get right and it’s not a route I want to go down unless I have no other options. This is a last resort.
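
To make "reading the header field necessary to perform the multiplexing" concrete, here is a minimal sketch in Python of choosing a tunnel from a single header and nothing else. It assumes the field in question is the HTTP Host header, and it is not ngrok's actual code: `extract_host`, `route`, and the `tunnels` registry are hypothetical names invented for illustration.

```python
def extract_host(raw_request: bytes):
    """Return the lowercased Host header value from a raw HTTP/1.x request, or None."""
    head, _, _ = raw_request.partition(b"\r\n\r\n")  # the body is never examined
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, sep, value = line.partition(b":")
        if sep and name.strip().lower() == b"host":
            host = value.strip().decode("ascii", "replace").lower()
            return host.split(":")[0]  # drop an optional port
    return None


def route(raw_request: bytes, tunnels: dict):
    """Look up the tunnel registered for the requested hostname."""
    return tunnels.get(extract_host(raw_request))


# hypothetical registry mapping public hostnames to tunnel identifiers
tunnels = {"39ac91f.ngrok.com": "tunnel-1"}
request = b"GET /login HTTP/1.1\r\nHost: 39ac91f.ngrok.com\r\nAccept: */*\r\n\r\n"
print(route(request, tunnels))
```

Everything after the header block is passed along untouched, which is why content-based filtering would be a real departure from this design.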

Make ngrok a completely paid service

Even if I did this and there was no free tier (which would fundamentally change ngrok’s audience and global appeal), it’s likely that I would still need some sort of free trial. I could keep upping the barrier to entry (like requiring a credit card number), but it only hurts ease of use.

Automating the banhammer

In the end, I decided that what I really want to start with is just a more efficient way to respond to these illegal sites. Ideally, I’d like to give my hosting provider an administrative tool that they could use to shut down illegal tunnels without putting me in the critical path. Of course, it’s a little bit scary to hand a giant banhammer for your entire service over to someone else, but it’s a reasonable compromise so long as I get notifications and can asynchronously verify that it’s not being abused.

These types of administrative interfaces aren’t all that uncommon, and I tried suggesting one to my provider, but they weren’t responsive to any deviation from their procedure of opening tickets against me. But I really wanted a completely automated workflow, so in the end, I got out my proverbial programmatic roll of duct-tape and automated around them. Let’s talk about how it works.

The anatomy of a hack

The goal was to automate the entire process so that as soon as I get a notification that an illegal site is hosted on an ngrok.com subdomain, the following will all happen, without a human involved:

  1. Forward the email notification through to my personal email address
  2. Hellban the offending subdomain and the IP address of the tunneling client. If someone was silly enough to do this with an account, ban their user id as well
  3. Notify my hosting provider that the site was blocked through their ticketing system
  4. Notify my phone so that I can verify that the site was indeed illegal

All told, I ended up writing a small custom extension to the ngrok server and about a hundred or so lines of Python to handle all of the blacklisting.

Automated handling of email

I get notifications of new tickets against me in email. So far as I can tell, this is the only mechanism by which I can automate the handling of them. I run my email through postfix, and it turns out that automatically handling email is super easy with postfix (sidenote: not exactly, it took me a while to figure out that you can only do this with aliases, not with virtual aliases). Here’s how you set up your /etc/aliases file to invoke a script for every email to an address:

banhammer:        |/var/services/ngrok.blacklist/venv/bin/ngrok.ban

Now emails to banhammer on the local machine will invoke the given script (after editing /etc/aliases, run newaliases so that postfix picks up the change). If you’re using virtual aliases you’ll need to properly point an entry in your virtual aliases file as well:

banhammer@ngrok.com      banhammer@localhost

The script

This is a slightly pared-down version of the script that I wrote up to automate the whole process. It walks through each of the important steps in the process: reading out the email, parsing it, forwarding it to my personal address, communicating with the blacklist service, and then posting an update back to my provider. It is a testament to the power of the Python language, its standard library and its ecosystem of third-party modules that I can accomplish so much in ~60 lines of code:

import sys, email, smtplib, re
import requests, bs4
import config  # assumed: a local settings module (addresses, credentials, URLs)

# get_from, verify_authenticity and TICKET_ID_REGEX belong to the full script
# and are left out of this pared-down version

def main():
    # read in the message from stdin and parse it
    rawmsg = sys.stdin.read()
    msg = email.message_from_string(rawmsg)

    # send the mail through to my personal address
    smtp = smtplib.SMTP("localhost", 25)
    smtp.sendmail(get_from(msg.get("From")), config.my_email, msg.as_string())
    smtp.quit()

    # only process messages from the provider
    if not verify_authenticity(msg):
        return

    ticket_id = None
    domains = set()
    # look in each part of the email
    for part in msg.walk():
        try:
            payload = part.get_payload()
            # find all the ngrok.com subdomains to ban
            domains.update(re.findall(r"\w+\.ngrok\.com", payload))

            # look for the ticket id in the email
            m = re.search(TICKET_ID_REGEX, payload)
            if m:
                ticket_id = m.group(1)
        except TypeError:
            # skip parts whose payload is not a string (e.g. multipart containers)
            pass

    # blacklist each domain requested
    for d in domains:
        resp = requests.post(config.blacklist_service_url, data={"hostname": d})
        resp.raise_for_status()

    if domains and ticket_id is not None:
        s = requests.Session()
        resp = s.post("https://provider.com/login", data={
            "username": config.provider_username,
            "password": config.provider_password
        })
        resp.raise_for_status()

        # load the ticket page to pull out the CSRF token
        ticket_url = "https://provider.com/ticket/{}".format(ticket_id)
        resp = s.get(ticket_url)
        resp.raise_for_status()

        # find the csrf token (name the parser explicitly so results are consistent)
        soup = bs4.BeautifulSoup(resp.text, "html.parser")
        csrf_token = soup.select("#ticket input[name=csrf_token]")[0]["value"]

        # construct and post the response
        reply = "The site {} has been blocked"
        if len(domains) > 1:
            reply = "The sites {} have been blocked"
        resp = s.post(ticket_url, data={
            "message": reply.format(", ".join(sorted(domains))),
            "csrf_token": csrf_token
        })
        resp.raise_for_status()

if __name__ == "__main__":
    main()
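
The listing leaves out get_from, verify_authenticity and TICKET_ID_REGEX. What follows is one possible shape for them, not the versions from the real script: the provider address, the "Ticket #1234" format, and the decision to trust the From header are all assumptions made for illustration. A From header is easy to spoof, so a real check would want something stronger, such as the mail server's DKIM verdict. The `extract` helper is also hypothetical and simply repeats the parsing loop so that it can be tried on a sample message.

```python
import email
import re
from email.utils import parseaddr

# Hypothetical values: the real provider address and ticket format are not given.
PROVIDER_ADDRESS = "abuse@provider.com"
TICKET_ID_REGEX = re.compile(r"Ticket #(\d+)")


def get_from(header_value):
    """Extract the bare address from a From header such as 'Name <a@b.com>'."""
    return parseaddr(header_value or "")[1]


def verify_authenticity(msg):
    """Accept only messages whose From address matches the assumed provider address."""
    return get_from(msg.get("From")).lower() == PROVIDER_ADDRESS


def extract(msg):
    """Collect ngrok.com subdomains and the last ticket id found in any text payload."""
    domains, ticket_id = set(), None
    for part in msg.walk():
        payload = part.get_payload()
        if not isinstance(payload, str):
            continue  # multipart containers hold a list of sub-parts, not text
        domains.update(re.findall(r"\w+\.ngrok\.com", payload))
        m = TICKET_ID_REGEX.search(payload)
        if m:
            ticket_id = m.group(1)
    return domains, ticket_id


sample = email.message_from_string(
    "From: Abuse Desk <abuse@provider.com>\n"
    "Subject: phishing report\n"
    "\n"
    "Ticket #4821: phishing page at https://39ac91f.ngrok.com/login\n"
)
print(verify_authenticity(sample), extract(sample))
```

Feeding a saved copy of a real notification through `extract` is a cheap way to check the regexes before wiring the script into the mail alias.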

The blacklist service

The blacklist service is fairly simple. It just manages a database table of blacklist entries and pushes updates over HTTP into the ngrokd server to immediately update its blacklist maps. It’s also responsible for notifying my phone and doing some metrics collection. I’m not going to walk through this part of the code since it’s basically just a little CRUD service that talks to some database tables and external HTTP services.
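
For readers who want a feel for it anyway, here is a rough sketch of the shape such a service might take, using SQLite for the table and a plain callback where the push to ngrokd and the phone notification would go. The schema, the `Blacklist` class and the `on_ban` hook are all invented for illustration; the real service's storage, endpoints and notification mechanism are not described in this post.

```python
import sqlite3
import time


class Blacklist:
    """Stores banned hostnames and reports each new ban to a callback."""

    def __init__(self, db_path=":memory:", on_ban=None):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS blacklist ("
            "hostname TEXT PRIMARY KEY, banned_at REAL NOT NULL)"
        )
        # on_ban stands in for pushing to ngrokd and notifying a phone
        self.on_ban = on_ban or (lambda hostname: None)

    def ban(self, hostname):
        """Record a ban; return True if it is new, False if already banned."""
        hostname = hostname.lower()
        with self.db:  # commits on success, rolls back on error
            cur = self.db.execute(
                "INSERT OR IGNORE INTO blacklist (hostname, banned_at) VALUES (?, ?)",
                (hostname, time.time()),
            )
        if cur.rowcount == 1:
            self.on_ban(hostname)
            return True
        return False

    def is_banned(self, hostname):
        """Return True if the hostname has a row in the blacklist table."""
        row = self.db.execute(
            "SELECT 1 FROM blacklist WHERE hostname = ?", (hostname.lower(),)
        ).fetchone()
        return row is not None


pushed = []
bl = Blacklist(on_ban=pushed.append)
bl.ban("39ac91f.ngrok.com")
bl.ban("39AC91F.ngrok.com")  # duplicate after lowercasing, so no second push
print(bl.is_banned("39ac91f.ngrok.com"), pushed)
```

Making the ban idempotent matters here because the same subdomain can easily show up in several notification emails.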

It’s just the beginning

It’s my expectation that this problem will never go away so long as ngrok is a successful service and that I’m in for a long arms race and game of whack-a-mole. Automation will help, but to what extent it acts as a deterrent for those using ngrok to accomplish dishonorable ends remains to be seen. At the very least, I experience a brief moment of magical bliss whenever my phone lights up to notify me that my automation has done in a few seconds something which used to be a manual process that could have hours of latency before I could respond.

For those of you who work at hosting providers and services platforms: how do you deal with this problem? What automation do you have in place?