
Identifying Gophish Servers


Alain shares a methodology for discovering and identifying Gophish deployments in the wild. How easy is your Gophish installation to spot?

 

Gophish is an open source phishing framework created by Jordan Wright that is widely used by both internal security teams and security consultancies to perform phishing awareness exercises. Gophish is one of several phishing frameworks we use at Insomnia Security because it’s an excellent tool that is simple to deploy and easy to use.

When evaluating third party tools such as Gophish, a key consideration for us is determining the tool's susceptibility to external fingerprinting. This evaluation process is important because a tool with an obvious fingerprint could unnecessarily expose our attack infrastructure.

For the average phishing awareness campaign, it’s probably not a big deal if our phishing infrastructure is discovered by the blue team or security researchers. However, during an exercise where operational security is important we want to know how safe a tool is to use and what precautions we need to take while using it.

There are well-known examples, such as Empire and Cobalt Strike, where blindly running a third party tool with the default configuration resulted in exposure of attack infrastructure.

 

Fingerprinting Gophish

Gophish exposes a relatively small external attack surface by default. If we look at the registered route handlers in the source code we can see there are only a few paths that will return responses. Most of these paths result in 404 Not Found responses unless you know the special RID value that corresponds to a specific recipient in a phishing campaign. The Gophish documentation elaborates on this further:

Note: Landing pages are stored in the database. Gophish generates a unique ID (called the rid parameter) for each recipient in a campaign, and uses this ID to dynamically load the correct landing page. To preview what a landing page will look like, you will need to either use the HTML editor seen below, or launch a test campaign. Simply browsing directly to the Gophish listener without specifying an rid parameter will display a generic 404 page.

Requesting a landing page path without a valid RID value or requesting any unhandled path such as / results in a 404 Not Found response as shown below:

GET / HTTP/1.1
Host: 127.0.0.1
Accept: */*


HTTP/1.1 404 Not Found
Content-Type: text/plain; charset=utf-8
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Date: Thu, 08 Aug 2019 23:09:09 GMT
Content-Length: 19

404 page not found

 

At first glance this looks like a reasonably generic HTTP response that you could potentially see from any random web server. Let's go through the individual components of the response and consider how unique each value is:

  • The HTTP response code is 404 Not Found which is generic and uninteresting.
  • There's an HTTP body of 404 page not found that has a consistent length of 19 bytes. This might be useful but it seems like something a lot of web servers would return.
  • Some optional HTTP headers are specified, such as Content-Type: text/plain; charset=utf-8, X-Content-Type-Options: nosniff and Vary: Accept-Encoding. These might be useful if this particular combination of headers is unique, but these headers seem common enough.

 

If we look through the Gophish source code we can’t find any reference to the 404 page not found string or the optional HTTP headers. If we look upstream to the Go HTTP Package source code we can see the 404 page not found string, the X-Content-Type-Options header and the Content-Type headers are hardcoded values. This means these values could be useful for fingerprinting HTTP servers built using the Go HTTP Package but not for fingerprinting Gophish directly.

A quick review of the Go HTTP Package source code and documentation doesn’t reveal much else in the way of unique behaviour. One interesting feature is that the Go HTTP Package supports HTTP/2 by default unless it’s explicitly disabled. This might be useful as a fingerprint because HTTP/2 support is a recent technology that is only supported in newer versions of HTTP libraries and web servers. However, this may also be less useful as a fingerprint because HTTP/2 is not yet universally supported (e.g. if an HTTP/2 compatible server is sitting behind a CDN or reverse proxy that only supports HTTP/1.1).

The only Gophish path that returns any real content without a valid RID value is /robots.txt which has a hard-coded response in the source code. This results in a response that looks like the one below:

GET /robots.txt HTTP/1.1
Host: 127.0.0.1
Accept: */*


HTTP/1.1 200 OK
Vary: Accept-Encoding
Date: Thu, 08 Aug 2019 23:12:11 GMT
Content-Length: 26
Content-Type: text/plain; charset=utf-8

User-agent: *
Disallow: /

This seems like a better fingerprint because it’s something specific to Gophish rather than the Go HTTP Package. However, the response still looks generic and looks like it could be the robots.txt from any random website.

 

A Methodology For Hunting Gophish Servers

Overall the possible fingerprint values we have for Gophish aren’t looking too promising. We have the 404 page not found string and a couple of HTTP headers that could maybe be used to find web servers served by the Go HTTP Package. If you found a server that matches this fingerprint you could then possibly confirm it was a Gophish server by requesting /robots.txt and checking if you got the correct static response. If you are on a Blue Team looking at a single web server that you are already suspicious of then maybe this is enough to confirm it’s running Gophish. But what if you are looking at 1,000 web servers or all of the web servers on the Internet? This fingerprint seems like it would be too weak to hunt for Gophish servers when they are hidden in a larger pool of web servers.

To check how effective this fingerprint is, we first need a large database of HTTP responses to search through. There are many Internet search engine projects like Shodan or Censys that could be used for this purpose. My personal preference is Rapid7's Project Sonar because it's easy to grab a bulk dump of all of the data and search through it with my own tools offline.

We start our hunt by downloading the HTTP and HTTPS GET response data from Project Sonar. These datasets contain responses to an HTTP GET request for the / path across the entire IPv4 address space. The initial filtering process we run is to search for responses that:

  1. Have an HTTP status code of 404 Not Found
  2. Have an HTTP body content of 404 page not found and no other data by checking for a Content-Length: 19 header
  3. Specify the optional Content-Type: text/plain; charset=utf-8 and X-Content-Type-Options: nosniff headers

This returned just under 100K unique IP addresses in the Project Sonar dataset at the time of searching. Potentially this search could be further improved by ignoring responses that also include other HTTP headers that we know Gophish doesn’t send. For example, we didn’t observe any Server or Set-Cookie headers being sent by Gophish. However, there is a risk we might miss cases where a reverse proxy in front of a Gophish server is adding headers.
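The three filtering criteria above can be sketched as a small Python predicate. This is only an illustration: the function name, its arguments and the optional reject_unexpected flag are hypothetical, and it assumes each response has already been parsed into a status code, a header mapping and a body (decoding the Project Sonar records is left out).

```python
from typing import Mapping

# Headers Gophish was not observed to send. A reverse proxy in front of
# Gophish could still add them, so rejecting on them is optional.
UNEXPECTED_HEADERS = ("server", "set-cookie")


def looks_like_go_404(status: int, headers: Mapping[str, str], body: bytes,
                      reject_unexpected: bool = False) -> bool:
    """Return True if a response matches the default Go 404 fingerprint."""
    # Header names are case-insensitive, so normalise them before comparing.
    h = {name.lower(): value.strip() for name, value in headers.items()}
    if reject_unexpected and any(name in h for name in UNEXPECTED_HEADERS):
        return False
    return (
        status == 404
        and h.get("content-length") == "19"
        and body.strip() == b"404 page not found"
        and h.get("content-type") == "text/plain; charset=utf-8"
        and h.get("x-content-type-options") == "nosniff"
    )
```

Leaving reject_unexpected off by default mirrors the trade-off described above: a stricter filter produces fewer false positives but may miss Gophish servers sitting behind a reverse proxy.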

We now have a set of roughly 100K web servers with a fingerprint that is similar to the Go HTTP Package. Next we can request /robots.txt from all of these web servers and check if the response is the same as the hardcoded response in the Gophish source code. This process reduces the set of possible candidates to a much more manageable 1,181 IP addresses. A quick search through the results shows there are some web servers that return headers indicating they are unlikely to be Gophish servers, such as Server: Microsoft-IIS/8.5 and X-XSS-Protection: 1; mode=block. There are definitely refinements that could be made to this filtering process but 1,181 is a small enough number to not worry about some false positives.
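A minimal sketch of the /robots.txt comparison is shown below. The expected body is an assumption reconstructed from the example response earlier in this post: the two lines shown, each terminated by a newline, which is consistent with the Content-Length: 26 header. The constant and function names are hypothetical, and fetching the page is left to whatever HTTP client you prefer.

```python
# Assumed Gophish robots.txt body: two lines, each ending in "\n" (26 bytes).
GOPHISH_ROBOTS_BODY = b"User-agent: *\nDisallow: /\n"


def matches_gophish_robots(status: int, body: bytes) -> bool:
    """Return True if a /robots.txt response equals the assumed static body."""
    # An exact byte comparison keeps the check strict; other sites with
    # similar but not identical robots.txt files are rejected.
    return status == 200 and body == GOPHISH_ROBOTS_BODY
```

An exact match is deliberately strict here; as noted above, servers that pass both checks can still be false positives.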

Earlier we learned that Gophish won’t return any landing pages without a valid RID value. Without a valid RID value all that we can retrieve from the web server is the /robots.txt page or a 404 Not Found response. This means there is no direct way to confirm that any of these 1,181 web servers are running Gophish or hosting phishing pages. One way we can check if any of these web servers were likely used for phishing is to search passive DNS databases (e.g. Project Sonar’s Forward DNS Dataset) for the IP address. Phishing pages are largely served from registered domain names (rather than raw IP addresses) that are intended to impersonate legitimate domain names of target organisations or well-known brands. We can review known DNS records for the IP address and check the results for any domains that seem suspicious.

We now have an end-to-end hunting methodology to go from all web servers on the Internet to a small candidate list of possible Gophish servers:

  1. Gophish uses the Go HTTP Package which has a fingerprintable HTTP response – we can find IP addresses hosting web servers with a similar fingerprint.
  2. Phishing pages are usually served from a domain name – we can retrieve known DNS records for IP addresses we have found.
  3. The domain names used to host phishing pages often impersonate legitimate domain names – we can check the list of DNS names for suspicious domain names.

Results

After performing a hunt for Gophish using the above methodology we have a raw dataset of candidate Gophish servers consisting of IP addresses and DNS records. This dataset has been made available for download here.

We know this dataset contains some false positives, but a search for “gophish” across the DNS records confirms that it looks like we have found at least some real Gophish servers:

116.203.74.54, gophish.ortner-sec.at
128.233.203.150, gophish.usask.ca
13.239.155.148, gophish.spd2wig46t.ap-southeast-2.elasticbeanstalk.com
134.193.136.195, kc-issr-gophish.kc.umkc.edu
134.209.181.166, sochost-19a31bb4c98a5f2c.toplook.gophish.ga
165.227.130.233, fsd142f8761fdaf710fd.toplook.gophish.ga
165.227.130.233, gophish.ga
174.107.179.82, gophish.timmonstech.rocks
185.79.155.21, gophish.lt-innovators.nl
188.226.159.156, fsd142f8761fdaf711fd.toplook.gophish.ga
188.226.159.156, gophish.ga
198.236.66.240, gophish.mesd.k12.or.us
198.237.190.240, gophish.wesd.org
199.101.148.12, gophish.caretech.link
199.101.148.12, gophish.mailgateway.link
204.13.77.123, gophish.fitchburgstate.edu
206.74.86.236, gophish.intsys.net
206.78.117.3, gophish.maderacoe.org
213.221.128.75, gophish.simnetsa.ch
35.225.34.13, gophish.freemyip.com
35.226.101.230, gophish.cbakers.net
37.252.122.206, gophish.tilaa.cloud
40.118.151.79, admin.gophish.corp.nurx.net
40.118.151.79, gophish.corp.nurx.net
50.116.56.72, gophish.desales.edu
52.22.251.242, gophish01.barroncollier.com
52.45.15.206, gophish.flatiron.com
54.69.102.24, lb-gophish-safelinkchecks-1782701540.us-west-2.elb.amazonaws.com
54.70.240.140, lb-gophish-safelinkchecks-1782701540.us-west-2.elb.amazonaws.com
54.79.126.80, gophish01.hacklabs.com
66.11.18.140, gophish.aheliotech.com
66.154.140.240, gophish.nwresd.k12.or.us
68.183.116.163, gophish.atgfw.com
93.93.128.45, gophish.lancsac.uk
93.94.109.149, gophish.lab.nviso.be
94.103.101.161, gophish.seculabs.ch
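A keyword search like the one above could be reproduced with a few lines of Python. This sketch assumes the dataset is a text file of "IP, hostname" lines in the same layout as the listing above; the function name is hypothetical.

```python
from collections import defaultdict


def hosts_by_ip(lines, keyword):
    """Group hostnames containing `keyword` by the IP address they resolve to."""
    matches = defaultdict(list)
    for line in lines:
        # Each record is expected to look like "1.2.3.4, host.example.com".
        ip, sep, host = line.partition(",")
        host = host.strip().lower()
        if sep and keyword in host:
            matches[ip.strip()].append(host)
    return dict(matches)
```

Grouping by IP address makes it easier to spot a single server that is associated with several related hostnames.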

 

Impersonating Well-Known Brands

Most of the obvious phishing domain names in the dataset are impersonating well-known brands such as Microsoft, Amazon, Google, Apple and LinkedIn. Some examples are shown below:

157.230.185.119, login.microsoftonline.cloud
209.97.153.211, microsoftowa.com
52.28.11.189, 0365.exchange
104.248.172.55, offiec365.com
37.71.58.142, offlce-365.com
34.247.120.62, login.onmlcrosoft.co.uk
66.90.217.3, windowsazureoffice365.com
93.174.195.253, signin.aws.amazon.console.eu.com
13.82.94.206, amaz.services
104.153.190.170, amazon.shlpplng.com
174.78.230.108, mail-awsapps.com
18.188.215.90, googlesdoc.com
51.83.76.146, goog1e.eu
13.77.147.218, googlegdrive.com
104.131.106.4, google.oauth-secure-login.com
52.215.51.61, app1e.co
51.141.234.75, appleicare.email
37.247.42.163, apple.loginservice.app
104.248.160.241, apple-alerts.co.uk
37.247.42.163, linkedn.net
212.129.5.37, recovery-linkedin.com
3.9.137.40, llinkedin.co.uk
18.197.214.95, www.linkedln.cloud
18.213.6.190, linkedin-alerts.com
104.153.190.170, linkedlln.com
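The examples above were found by reviewing the DNS records manually. As a rough aid for that review, the sketch below flags hostname labels that closely resemble, but do not exactly match, a brand name. It is not the method used for this post: the brand list, the 0.8 threshold and the function name are all assumptions, and the similarity score comes from Python's difflib module.

```python
import difflib

# Hypothetical watch list based on the brands mentioned above.
BRANDS = ("microsoft", "amazon", "google", "apple", "linkedin")


def lookalike_brand(hostname, brands=BRANDS, threshold=0.8):
    """Return the brand that a hostname label resembles, or None."""
    for label in hostname.lower().split("."):
        for brand in brands:
            if label == brand:
                # An exact match is not a lookalike by this heuristic.
                continue
            ratio = difflib.SequenceMatcher(None, label, brand).ratio()
            if ratio >= threshold:
                return brand
    return None
```

A heuristic like this catches simple typo-style domains such as goog1e.eu but misses names that use the real brand as a sub-domain label (e.g. apple.loginservice.app), so it complements manual review rather than replacing it.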

 

Phishing Simulation Providers

We can find some examples of legitimate phishing simulation services who appear to be running Gophish. For example, EveryCloud are a provider of phishing simulations and have a list of domain names that they provide to clients to be whitelisted. We can see these domain names in our dataset which suggests that EveryCloud are probably users of Gophish:

104.248.60.174, a01.encrypted-network.com
104.248.60.174, a01simulation.everycloud.com
104.248.60.174, ctfd.jmphosts.info

 

Internal Phishing Awareness Programs

Several organisations can be identified that are using Gophish to run internal phishing awareness programs. For example, the IP address 40.117.211.76 is associated with several domain names related to United Nations agencies:

40.117.211.76, fraud.unessco.ga
40.117.211.76, hr.unessco.ga
40.117.211.76, login.ifad.ga
40.117.211.76, login.oms-who.ga
40.117.211.76, login.uhncr.org
40.117.211.76, login.unessco.ga
40.117.211.76, login.unicc-office365.cf
40.117.211.76, login.unjcef.org
40.117.211.76, mrsp.uhncr.org
40.117.211.76, msrp.uhncr.org
40.117.211.76, mx.uhncr.org
40.117.211.76, mx1.unjcef.org
40.117.211.76, o365.unessco.ga
40.117.211.76, phish-aware.uhncr.org
40.117.211.76, ubs.uhncr.org
40.117.211.76, uhncr.org
40.117.211.76, unhcr.eastus.cloudapp.azure.com
40.117.211.76, www.uhncr.org

 

Some of these domains host content advising visitors that a phishing simulation is being performed.


Security Consultancies

Several of the IP addresses and domain names in the dataset can be easily linked to security consultancies who are likely performing phishing simulations, penetration tests or red team exercises. As a professional courtesy this blog post avoids calling out any specific examples, but these are easy enough to find with some basic analysis. In most cases the lack of concern for operational security and attribution is probably intentional because it's not essential for all types of security exercises. Even so, I know I would have a bad day if someone dissected my phishing infrastructure in a blog post in the middle of an exercise.

Some of the giveaways that infrastructure might be operated by a security consultancy include:

  • Hosting phishing servers on IP addresses or domain names that are directly attributable to the company (e.g. DNS records pointing a sub-domain of your corporate domain name to the Gophish server).
  • Domain WHOIS information containing real contact details of your employees.
  • Targeting organisations that are highly unlikely to be targeted by a real threat actor.
  • Targeting multiple organisations in the same geographic area who are in vastly different industries.

Retrieving RID Values From VirusTotal

One way that we can attempt to retrieve Gophish RID values related to a domain is by searching for the domain on VirusTotal. You can search for a domain and see URLs hosted on the domain that have been previously scanned. In the case of Gophish phishing pages this can include the RID value in the URL parameters. For example, if we search VirusTotal for naturesb0unty.com we see there are several undetected URLs containing RID values. This means we can potentially discover and visit phishing pages hosted by Gophish if they are still live.
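Once a list of previously scanned URLs has been collected, pulling out the RID values is straightforward. The sketch below uses only the Python standard library; the function name is hypothetical, the URLs in the usage note are placeholders, and it assumes the value is carried in a query parameter named rid, as described in the Gophish documentation quoted earlier.

```python
from urllib.parse import parse_qs, urlparse


def extract_rids(urls):
    """Return the sorted, de-duplicated rid values found in a list of URLs."""
    rids = set()
    for url in urls:
        # parse_qs maps each query parameter name to a list of its values.
        query = parse_qs(urlparse(url).query)
        rids.update(query.get("rid", []))
    return sorted(rids)
```

For example, calling extract_rids(["http://phish.example.test/?rid=abc1234", "http://phish.example.test/login"]) returns ["abc1234"], since the second URL carries no rid parameter.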

Conclusions

The results demonstrate that it's possible for anyone to hunt for Gophish servers across the Internet. This is due to a combination of Gophish being built with the Go HTTP Package and the inherently suspicious nature of most phishing domains. Depending on your use case for Gophish this may or may not be something you need to be concerned about:

  • Penetration Testers and Red Teamers – This is a good reminder to always understand your tools and to be vigilant about operational security. If you are concerned about your Gophish server being identified then consider obscuring the fingerprint by placing a reverse proxy in front of your phishing server or making changes to the Gophish source code.
  • Internal Phishing Awareness Programs and Phishing Simulation Providers – You’re already in a position to whitelist your phishing servers and Blue Teams are often provided with exercise indicators in advance. This means it probably doesn’t matter if someone is able to discover your phishing server.
  • Gophish Developers – This isn’t a vulnerability or major issue with Gophish. You shouldn’t feel obligated to make changes to mitigate the operational security choices of your users.
