We present the first global study of connection tampering through a passive analysis of traffic received at a global CDN, Cloudflare. Our study shows that passive measurement can be a powerful complement to active measurement in understanding connection tampering and improving transparency.
In this paper, we measure and characterize the GFW’s new system for censoring fully encrypted traffic. We find that, instead of directly defining what fully encrypted traffic is, the censor applies crude but efficient heuristics to exempt traffic that is unlikely to be fully encrypted traffic; it then blocks the remaining non-exempted traffic. Our understanding of the GFW’s new censorship mechanism helps us derive several practical circumvention strategies. We responsibly disclosed our findings and suggestions to the developers of different anti-censorship tools, helping millions of users successfully evade this new form of blocking.
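The exempt-then-block structure described above can be sketched as follows. The abstract does not list the heuristics themselves, so the two exemption rules here (`starts_printable` and `looks_like_known_protocol`) are placeholders invented for illustration, not the GFW's actual rules.

```python
from typing import Callable, List

def starts_printable(payload: bytes, prefix_len: int = 6) -> bool:
    """Hypothetical rule: the first few bytes are all printable ASCII."""
    head = payload[:prefix_len]
    return len(head) == prefix_len and all(0x20 <= b <= 0x7E for b in head)

def looks_like_known_protocol(payload: bytes) -> bool:
    """Hypothetical rule: the payload starts like TLS or plaintext HTTP."""
    return payload.startswith((b"\x16\x03", b"GET ", b"POST "))

# Assumed rule set; a real censor's list is not given in the abstract.
EXEMPTIONS: List[Callable[[bytes], bool]] = [
    starts_printable,
    looks_like_known_protocol,
]

def is_blocked(first_payload: bytes) -> bool:
    """Block a flow only if no exemption rule matches its first payload."""
    return not any(rule(first_payload) for rule in EXEMPTIONS)
```

The point of the design is that the censor never has to define "fully encrypted"; anything that fails every cheap exemption check is blocked by default.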
Since 2006, Turkmenistan has been listed as one of the few Internet enemies by Reporters without Borders due to its extensively censored Internet and strictly regulated information control policies. The country is nonetheless difficult to study because of its small population and low Internet penetration rate. In this paper, we present the largest measurement study to date of Turkmenistan’s Web censorship. We apply our tool, TMC, to 15.5M domains; our results reveal that Turkmenistan censors more than 122K domains, using different blocklists for each protocol. We also reverse-engineer these censored domains, identifying 6K over-blocking rules that cause incidental filtering of more than 5.4M domains.
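To illustrate how a single over-blocking rule can filter unintended domains, here is a minimal sketch. The rule, the domain names, and the helper functions are all hypothetical; the abstract does not say what form the real rules take.

```python
import re
from typing import Iterable, List, Set

# Hypothetical rule: meant to block only "news.example", but it is
# unanchored, so it also matches any domain containing that string.
RULE = re.compile(r"news\.example")

def is_blocked(domain: str) -> bool:
    """Return True if the (hypothetical) rule matches the domain."""
    return RULE.search(domain) is not None

def overblocked(domains: Iterable[str], intended: Set[str]) -> List[str]:
    """Domains that the rule blocks although they were not the target."""
    return [d for d in domains if is_blocked(d) and d not in intended]

candidates = [
    "news.example",
    "goodnews.example",
    "news.example.org",
    "weather.example",
]
collateral = overblocked(candidates, intended={"news.example"})
```

Under these assumptions, `collateral` holds the two domains that were caught incidentally, which is the effect the study counts across its 15.5M tested domains.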
Persistent routing loops on the Internet are a common misconfiguration that can lead to packet loss and reliability issues, and can even exacerbate denial-of-service attacks. In this paper, we perform high-TTL traceroutes to the entire IPv4 Internet from one vantage point to enumerate routing loops, and we validate our results from a different vantage point. We find over 320k /24 subnets with at least one routing loop present.
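A routing loop shows up in a traceroute as a router address that reappears after at least one other hop. The sketch below checks a single trace for that pattern; the function name and input format are assumptions, and it does not establish that a loop is persistent, which would require repeated measurements.

```python
from typing import Dict, List, Optional, Sequence

def find_routing_loop(hops: Sequence[Optional[str]]) -> Optional[List[str]]:
    """Return the repeating cycle of router addresses, or None.

    `hops` is the ordered list of responding router IPs from one
    traceroute; None marks a hop that did not answer.
    """
    last_seen: Dict[str, int] = {}
    for index, hop in enumerate(hops):
        if hop is None:
            continue
        previous = last_seen.get(hop)
        # A router seen again after at least one other hop indicates a cycle.
        # An immediate repeat is treated as a duplicate reply, not a loop.
        if previous is not None and index - previous > 1:
            return [h for h in hops[previous:index] if h is not None]
        last_seen[hop] = index
    return None
```

For example, the trace `["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.2"]` yields the cycle `["10.0.0.2", "10.0.0.3"]`.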
In this paper, we present the first techniques to automate the discovery of new censorship evasion techniques purely in the application layer. We present a general solution and apply it specifically to HTTP and DNS censorship in China, India, and Kazakhstan. Our automated techniques discovered a total of 77 unique evasion strategies for HTTP and 9 for DNS, all of which require only application-layer modifications, making them easier to incorporate into apps and deploy.
In this paper, we present evidence that suggests the GFW has deployed a second HTTPS censorship middlebox that runs in parallel to the first. We present a detailed analysis of this secondary censorship middlebox—how it operates, the content it blocks, and how it interacts with the primary middlebox. We also present several packet-based evasion strategies for the secondary middlebox and demonstrate that the primary censorship middlebox can be defeated independently from the secondary.
In this paper, we present the first non-trivial TCP-based DDoS amplification attack by weaponizing censoring middleboxes. We develop a novel mechanism to discover these amplification attacks and perform Internet-wide measurements to measure the threat censoring middleboxes pose. We find hundreds of thousands of IP addresses that offer amplification factors greater than 100×. We also report on network phenomena that cause some of the TCP-based attacks to be so effective as to technically have an infinite amplification factor (after the attacker sends a constant number of bytes, the reflector generates traffic indefinitely).
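The amplification factor is the ratio of bytes the reflector sends toward the victim to bytes the attacker sends. The sketch below computes it and shows why a reflector that never stops responding has an unbounded factor. The function names and the steady response rate are illustrative assumptions, not measurements from the paper.

```python
from typing import List, Sequence

def amplification_factor(bytes_sent: int, bytes_received: int) -> float:
    """Ratio of reflected bytes to attacker-sent bytes."""
    if bytes_sent <= 0:
        raise ValueError("bytes_sent must be positive")
    return bytes_received / bytes_sent

def factor_over_time(bytes_sent: int, reflector_rate: float,
                     seconds: Sequence[float]) -> List[float]:
    """Factor observed after each duration, assuming the reflector keeps
    sending `reflector_rate` bytes per second (an assumed, constant rate)."""
    return [amplification_factor(bytes_sent, int(reflector_rate * t))
            for t in seconds]

finite = amplification_factor(100, 10_000)            # one-shot reflector
growing = factor_over_time(100, 50.0, [1, 10, 100])   # never-ending reflector
```

Because `bytes_sent` stays constant while the received total keeps growing, the factor in the second case increases without limit as the observation window lengthens.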
Censors pose an even greater threat to the Internet than previously understood. We demonstrate an off-path attack that exploits residual censorship, a feature by which a censor continues blocking traffic between two end-hosts for some time after a censorship event. Our attack sends spoofed packets with censored content, keeping two victim end-hosts separated by a censor from being able to communicate with one another. This attack allows anyone to weaponize censorship infrastructure to perform their own blocking.
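The residual censorship behavior behind this attack can be modeled as a per-host-pair timer. The sketch below is a toy model under stated assumptions: the class name, the keyword, the 90-second window, and keying on the (source, destination) pair are all hypothetical, and real censors differ in what they key on and whether the timer resets.

```python
from typing import Dict, Tuple

class ResidualCensor:
    """Toy censor that keeps blocking a host pair after a trigger."""

    def __init__(self, forbidden: bytes, residual_seconds: float) -> None:
        self.forbidden = forbidden
        self.residual_seconds = residual_seconds
        self.blocked_until: Dict[Tuple[str, str], float] = {}

    def forwards(self, src: str, dst: str, payload: bytes, now: float) -> bool:
        """Return True if the packet is allowed through at time `now`."""
        key = (src, dst)
        # Still inside the residual window: drop everything for this pair.
        if now < self.blocked_until.get(key, float("-inf")):
            return False
        # New censorship event: drop the packet and start the timer.
        if self.forbidden in payload:
            self.blocked_until[key] = now + self.residual_seconds
            return False
        return True

censor = ResidualCensor(forbidden=b"forbidden", residual_seconds=90)
before = censor.forwards("client", "server", b"hello", now=0)
# An off-path attacker spoofs the client's address; the censor only sees src.
spoofed = censor.forwards("client", "server", b"GET /forbidden", now=1)
during = censor.forwards("client", "server", b"hello", now=2)
after = censor.forwards("client", "server", b"hello", now=91)
```

In this model the victim's benign traffic is dropped at `now=2` even though the victim never sent censored content, and an attacker who repeats the spoofed packet after each window expires keeps the pair separated.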
Earlier this year, Iran deployed a protocol filter that permits only a small set of protocols (DNS, HTTP, and HTTPS) and censors connections using any other protocol. In this paper, we present the first detailed analysis of Iran’s protocol filter: how it works, its limitations, and how it can be defeated.
In this paper, we present the first purely server-side censorship evasion strategies—11 in total—enabling servers to subvert censorship on behalf of clients. We extend Geneva to automate the discovery and implementation of server-side strategies, and we apply it to four countries (China, India, Iran, and Kazakhstan) and five protocols (DNS-over-TCP, FTP, HTTP, HTTPS, and SMTP).
We present Geneva, a novel genetic algorithm that evolves packet-manipulation-based censorship evasion strategies against nation-state level censors. With experiments performed both in-lab and against several real censors (in China, India, and Kazakhstan), we demonstrate that Geneva is able to quickly and independently re-derive most strategies from prior work, and derive novel subspecies and altogether new species of packet manipulation strategies.
In this paper, we present the design, implementation, and initial run of King of the Hill (KotH), an active learning cybersecurity competition designed to give students experience performing and defending against penetration testing. KotH competitions involve a sophisticated network topology that students must pivot through in order to reach high-value targets. When teams take control of a machine, they also take on the responsibility of running its critical services and defending it against other teams.
This paper presents unCaptcha, an automated system that can solve reCaptcha’s most difficult auditory challenges with a high success rate. We evaluate unCaptcha using over 450 reCaptcha challenges from live websites, and show that it can solve them with 85.15% accuracy in 5.42 seconds, on average. unCaptcha combines free, public, online speech-to-text engines with a novel phonetic mapping technique, demonstrating that it requires minimal resources to mount a large-scale successful attack on the reCaptcha system.
Routine Activity Theory (RAT) is used by criminologists to explain the situational factors that influence crime in the physical world. RAT states that crime is most likely when a motivated offender, a vulnerable victim, and a lack of capable guardianship converge. We hypothesize that the time of cybercriminal actions will align with the principles of RAT. We analyzed data from over 20,000 intrusions on a large set of target computers over a period of four years. A statistically significant pattern is found in the time of intrusions in the local timezone of the victim hosts and native timezone of the attacker; intrusions geolocated to China demonstrate a stronger statistically significant pattern. The results suggest that RAT does apply to cyberspace, and further conclusions and policy implications are discussed.