Internet censorship and blocking circumvention: no time to relax
Disclaimer: almost nothing described in this article is fundamentally new or innovative. It has long been known, is used in various countries around the world, is implemented in code and described in scientific and technical publications, so I am not opening any Pandora's box here.
On Habr, in threads about the blocking of resources, you often see amusing statements like "I set up a TLS VPN, now I will watch whatever I want and the censors will not block my VPN" or "I use an SSH tunnel, so everything is fine, they will not ban all of SSH", and the like. Well, let's look at the experience of other countries and think about how it can actually play out.
0
So, let's say we bought a VPN from some service or, as savvy users, set up our own VPN server in a personal cloud or on a VPS. Let's say it is the popular WireGuard or OpenVPN. You know what? WireGuard is such a beautiful protocol that every one of its packets simply screams "Look, everyone, I'm a VPN". And this is, in principle, not surprising, because the authors state plainly on the project website that obfuscation never was and never will be among their goals.
Accordingly, DPI equipment (aka TSPU), given a little desire, detects and blocks WireGuard in no time. IPSec/L2TP is the same story. So is OpenVPN: it is probably the very first protocol that the Chinese learned to identify and ban on their Great Firewall (GFW). We are fucked.
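To get a feel for how little effort this takes, here is a minimal sketch of the kind of check a DPI box could run on a single UDP payload. The message types and fixed sizes come from the published WireGuard protocol description; the function name and the decision logic are my own illustration, not what any real TSPU or GFW device runs.

```python
# WireGuard message types and their fixed on-the-wire sizes in bytes.
FIXED_SIZE_MESSAGES = {
    (1, 148),  # handshake initiation
    (2, 92),   # handshake response
    (3, 64),   # cookie reply
}
TRANSPORT_DATA = 4
MIN_TRANSPORT_LEN = 32  # 16-byte header + 16-byte auth tag (empty keepalive)


def looks_like_wireguard(udp_payload: bytes) -> bool:
    """Heuristic: does this single UDP payload have the shape of a WireGuard message?"""
    if len(udp_payload) < 4:
        return False
    msg_type = udp_payload[0]
    # The type byte is always followed by three reserved zero bytes.
    if udp_payload[1:4] != b"\x00\x00\x00":
        return False
    size = len(udp_payload)
    if (msg_type, size) in FIXED_SIZE_MESSAGES:
        return True
    # Transport data: plaintext is padded to a multiple of 16 bytes,
    # so the whole message length is a multiple of 16 as well.
    return (
        msg_type == TRANSPORT_DATA
        and size >= MIN_TRANSPORT_LEN
        and size % 16 == 0
    )
```

A single packet can match by coincidence, so a real classifier would presumably look at several packets of a flow, but a 148-byte packet answered by a 92-byte one is already a strong hint.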
1
Okay, let's say we have drawn our conclusions and, instead of such blatantly conspicuous protocols, decided to use a TLS VPN such as SSTP, AnyConnect / OpenConnect or SoftEther. Their traffic goes inside TLS and the initial connection is established over HTTP, which should be completely indistinguishable from an ordinary connection to any ordinary site. Well, how to put it...
In the case of MS SSTP, censors who want to find out what you are up to will simply send your server a request for the URL /sra_{BA195980-CD49-458b-9E23-C84EE0ADCD75}/ with the HTTP method SSTP_DUPLEX_POST, as described in the protocol specification, and the server will happily confirm in its response that yes, it is indeed an MS SSTP VPN.
SoftEther VPN, in response to a GET request with the path /vpnsvc/connect.cgi, the type application/octet-stream and the payload 'VPNCONNECT', will return a 200 code and a predictable binary blob telling you exactly what it is.
AnyConnect/OpenConnect, when accessed at / or at /auth, responds with a very characteristic XML document. And there is no way to get rid of any of this: it is defined in the protocols, and VPN clients rely on exactly this logic. We are fucked.
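As an illustration of how cheap such active probing is, here is a sketch of the SSTP case. The URL and the method are the ones quoted above; the Content-Length and SSTPCORRELATIONID headers are written from my reading of the MS-SSTP specification and should be treated as assumptions, as should the "200 means SSTP" heuristic. The function names are hypothetical, and nothing here touches the network: the code only builds the request and classifies a response that someone else has already captured.

```python
import uuid

SSTP_PATH = "/sra_{BA195980-CD49-458b-9E23-C84EE0ADCD75}/"
ULONGLONG_MAX = 18446744073709551615  # assumed Content-Length value for SSTP


def build_sstp_probe(host: str) -> bytes:
    """Build the HTTP request an SSTP client opens its session with."""
    correlation_id = "{" + str(uuid.uuid4()).upper() + "}"
    lines = [
        f"SSTP_DUPLEX_POST {SSTP_PATH} HTTP/1.1",
        f"Host: {host}",
        f"Content-Length: {ULONGLONG_MAX}",
        f"SSTPCORRELATIONID: {correlation_id}",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def response_confirms_sstp(raw_response: bytes) -> bool:
    """Heuristic: an ordinary web server rejects the unknown method,
    while an SSTP server answers it with status 200."""
    status_line = raw_response.split(b"\r\n", 1)[0]
    parts = status_line.split()
    return (
        len(parts) >= 2
        and parts[0].startswith(b"HTTP/")
        and parts[1] == b"200"
    )
```

The same pattern (send the protocol's own well-known first request, look at the answer) applies to the SoftEther and AnyConnect cases; only the request and the expected response differ.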
2
Clearly, we need to be smarter. Since we have TLS anyway, let's put a reverse proxy (for example, haproxy) in front of the VPN server and route everything by SNI (Server Name Indication): connections with a specific domain in the request go to the VPN server, and all the others go to a harmless site with cats. You can even try hiding behind some CDN: surely they won't ban the entire CDN, and they won't be able to pick our traffic out of all the traffic going to that CDN, right?
However, there is one "but". In current versions of TLS the SNI field is not encrypted, so censors can easily read it and make a request with exactly the right domain name. You cannot count on the Encrypted Client Hello (ECH) extension, the successor of eSNI: first, it is still a draft, and nobody knows when it will be accepted and widely deployed; second, censors can simply block all TLS 1.3 connections that carry it, as China did with eSNI. The sheriff doesn't care about the Indians' problems. We are fucked.
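How easy is "easily"? Here is a sketch: it builds a real ClientHello in memory with Python's standard ssl module (no network involved) and then reads the server name straight out of the raw bytes, exactly as a passive observer could. The helper names are hypothetical, and the parser only handles the simple case of a ClientHello that fits into a single TLS record.

```python
import ssl


def make_client_hello(server_name: str) -> bytes:
    """Produce the first bytes a TLS client would send, without any socket."""
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_name)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for a ServerHello
    return outgoing.read()


def extract_sni(record: bytes):
    """Return the plaintext server name from a ClientHello record, or None."""
    try:
        if record[0] != 0x16 or record[5] != 0x01:  # handshake / ClientHello
            return None
        pos = 5 + 4            # record header + handshake header
        pos += 2 + 32          # client version + random
        pos += 1 + record[pos]                                   # session id
        pos += 2 + int.from_bytes(record[pos:pos + 2], "big")   # cipher suites
        pos += 1 + record[pos]                                   # compression
        end = pos + 2 + int.from_bytes(record[pos:pos + 2], "big")
        pos += 2
        while pos + 4 <= end:
            ext_type = int.from_bytes(record[pos:pos + 2], "big")
            ext_len = int.from_bytes(record[pos + 2:pos + 4], "big")
            pos += 4
            if ext_type == 0:  # server_name extension
                # list length (2), name type (1), name length (2), name
                name_len = int.from_bytes(record[pos + 3:pos + 5], "big")
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, UnicodeDecodeError):
        return None
    return None
```

Calling `extract_sni(make_client_hello("example.com"))` gives back the domain without any keys or decryption, and that domain is all a censor needs to replay the request against your reverse proxy.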
3
Joking aside, we are determined. For example, we patch the OpenConnect server so that it only accepts connections with a special word in the URL (fortunately, AnyConnect / OpenConnect clients allow this) and serves everyone else a plausible stub. Or we set up mandatory authentication with client certificates.
Or we bring in the heavy artillery from our Chinese comrades, who are old hands at bypassing blocks. Shadowsocks (Outline) is out, because its versions before 2022 are vulnerable to replay attacks and even active probing, but V2Ray / XRay with VMess or VLESS over WebSockets or gRPC, or Trojan-GFW, is just what we need. They work over TLS, they can share a port with an HTTPS web server, and without knowing the secret string, which cannot be eavesdropped from outside, it would seem impossible to detect the tunnel or connect to it. So everything is fine?
Let's think. When connecting, every TLS client passes a certain set of parameters to the server: supported TLS versions, supported cipher suites, supported extensions, elliptic curves and their formats. Each library has its own set, and these sets can be analyzed. This is called ClientHello fingerprinting. The fingerprint of the OpenSSL library differs from that of GnuTLS. The fingerprint of the Go TLS library differs from that of the Firefox browser.
And when frequent, long-lived connections to a certain site are recorded from your address, made by a client using the GnuTLS library (which no popular browser uses, but the OpenConnect VPN client does), or when a client written in Go (the language V2Ray is written in) connects from a mobile phone over a mobile operator's network, we are fucked. This kind of detection is carried out, for example, in China and Turkmenistan.
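To make the idea of a fingerprint concrete, here is a sketch of JA3, one publicly documented ClientHello fingerprinting scheme. The article does not say which scheme any particular censor uses, so this is only an example of the technique. JA3 concatenates five ClientHello fields into a string, drops the random GREASE placeholder values that browsers insert, and hashes the result with MD5. The numeric values in the usage note below are made up for illustration.

```python
import hashlib

# GREASE placeholders: 0x0A0A, 0x1A1A, ..., 0xFAFA. They are random per
# connection, so they are removed before fingerprinting.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the JA3 string: five comma-separated fields, values dash-joined."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join([
        str(version),
        join(ciphers),
        join(extensions),
        join(curves),
        join(point_formats),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 of the JA3 string, which is the value usually stored and compared."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

For example, `ja3_string(771, [0x1A1A, 4865, 4866], [0, 23], [29, 23], [0])` yields `"771,4865-4866,0-23,29-23,0"`. Two clients that offer the same cipher suites in a different order get different hashes, which is exactly why a Go or GnuTLS client stands out among browsers.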
4
OK. Let's say we rebuild our V2Ray client not with the standard TLS library but with uTLS, which can masquerade as popular browsers. Or we even take the source code of the most popular browser, rip out its entire network stack and write our own proxy client on top of it, so as to be completely indistinguishable from ordinary browser TLS. Or we decide to disguise ourselves as another protocol such as SSH, or take OpenVPN with the XOR patch. Or some KCP / Hysteria disguised as DTLS.
In short, let's say we have something rarer and less conspicuous. It would seem that everything is fine? Well, how to put it. Remember the "Yarovaya package"? The one that requires internet services to store all session metadata and internet providers to record traffic dumps of their subscribers? Even back then many people laughed: how stupid, what good are gigabytes of encrypted data that they cannot decrypt anyway? Well, here is what.
Say you use your tunnel to browse all sorts of prohibited sites. And then, click! You accidentally go through the tunnel to some domestic site or service known to cooperate with the state, say VK / Mail.ru / Yandex or something similar. Or some harmless site happens to embed a widget, banner or counter from them. Or someone in the comments drops a link to a honeypot site that looks like a news resource, and you click on it.
And here the most interesting part happens. Whether it is inside TLS, inside SSH or inside OpenVPN + XOR, the data is transmitted encrypted and cannot be decrypted. But the "outer shape" of the encrypted data (packet sizes and the intervals between them) is practically the same as that of the unencrypted data. The censors see traffic going back and forth between the subscriber and some unknown server, while the feed from a controlled service shows requests arriving from the same IP address as that "unknown server" and responses going back, and, interestingly, the packet sizes and timestamps match almost perfectly. Which says quite clearly that we have a proxy here, possibly a VPN. Saddle up, Andryukha!
And yes, if you are smarter and your server has two IP addresses, one for incoming and one for outgoing traffic, then matching your "input" and "output" by address will not work, but matching them by the "shape" of the transmitted data, although significantly harder, is still possible if desired. We are fucked again.
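A toy version of such shape matching fits in a few lines. The sketch below assumes the observer already has two packet logs as lists of (timestamp in seconds, size in bytes): one recorded at the subscriber's provider, one at the cooperating site. It buckets the bytes into time windows and computes the Pearson correlation of the two series. The function names, the window size and the idea that a single coefficient is enough are all simplifications of mine; real systems are certainly more elaborate.

```python
import math


def bytes_per_window(packets, window=0.5, duration=10.0):
    """Sum packet sizes into fixed time windows.

    packets: iterable of (timestamp_seconds, size_bytes) pairs.
    """
    n = int(duration / window)
    bins = [0] * n
    for ts, size in packets:
        i = int(ts / window)
        if 0 <= i < n:
            bins[i] += size
    return bins


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series carries no shape to compare
    return cov / math.sqrt(var_x * var_y)


def flows_match(flow_a, flow_b, threshold=0.9):
    """Hypothetical decision rule: strongly correlated shapes mean one flow."""
    a, b = bytes_per_window(flow_a), bytes_per_window(flow_b)
    return pearson(a, b) >= threshold
```

Note that neither the constant per-packet overhead added by the tunnel nor a few tens of milliseconds of delay changes the result much, because correlation only cares about when the bursts happen and how big they are relative to each other.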
5
It's not all that bad, though. We configure rule-based access for our tunnel: we go through it only where and when it is necessary, and in all other cases let the packets go straight out over the regular internet connection. True, adding a new resource to the list every time is yet another pain, especially when you maintain the proxy / VPN not only for yourself but also, say, for elderly parents living far away who want to read all sorts of "foreign agents". But these are really minor things, we can handle them.
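The core of such a rule set is just a suffix match on the host name. A minimal sketch, assuming the rules are a plain set of domains (the domains and the function name below are placeholders of mine; real clients such as V2Ray have their own much richer rule formats):

```python
# Hypothetical list of domains that should go through the tunnel.
TUNNELED_SUFFIXES = {"example.org", "blocked-news.example"}


def use_tunnel(hostname: str, suffixes=TUNNELED_SUFFIXES) -> bool:
    """True if the host or any of its parent domains is on the list."""
    labels = hostname.lower().rstrip(".").split(".")
    # Check "www.example.org", then "example.org", then "org".
    return any(".".join(labels[i:]) in suffixes for i in range(len(labels)))
```

Matching on whole labels matters: `use_tunnel("www.example.org")` is true, while `use_tunnel("notexample.org")` is not, so a look-alike domain does not drag domestic traffic into the tunnel.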
Let's say we are still using an SSH tunnel. True, it will most likely not work for long. Why? Because it again comes down to traffic patterns. And no, this time nothing needs to be recorded and painstakingly compared. The traffic patterns of ssh-as-console, ssh-as-ftp and ssh-as-proxy are very different and are easily told apart by a properly trained neural network. This is why the Chinese and Iranians have long been detecting such "improper" use of SSH and throttling the connection to a snail's pace, so that you can still work in a terminal but can hardly browse.
Or let's say you still use some whatever-over-TLS tunnel, taking into account everything said above. The problem is that the previous paragraph applies to it as well: TLS-inside-TLS is detected by an outside observer using heuristics and machine learning, which can additionally be trained on the most popular sites. We are still fucked.
6
OK. We add random padding to our secret tunnel: we append garbage of random length to the end of each packet to confuse observers. Or we deliberately split packets into small pieces (and get MTU problems, which we will then have to painstakingly sort out). Or, conversely, when there is a TLS connection to some server inside the tunnel, we start sending those packets as-is, without an additional layer of encryption, thus looking from the outside one hundred percent like regular TLS with no false bottom (although it still takes several iterations to polish the protocol and plug very subtle and very non-obvious implementation vulnerabilities). It would seem we have a happy ending and are not fucked anymore?
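The padding idea itself is simple. Here is a minimal sketch with a framing format I invented for illustration: a 4-byte header with the payload length and the padding length, then the payload, then random garbage. In a real tunnel this whole frame would sit inside the encrypted layer, otherwise the header would give the padding away.

```python
import os
import secrets
import struct

HEADER = struct.Struct("!HH")  # payload length, padding length (big-endian)


def pad(payload: bytes, max_pad: int = 255) -> bytes:
    """Append a random amount (0..max_pad bytes) of random garbage."""
    pad_len = secrets.randbelow(max_pad + 1)
    return HEADER.pack(len(payload), pad_len) + payload + os.urandom(pad_len)


def unpad(frame: bytes) -> bytes:
    """Strip the header and the garbage, returning the original payload."""
    payload_len, pad_len = HEADER.unpack(frame[:HEADER.size])
    if len(frame) != HEADER.size + payload_len + pad_len:
        raise ValueError("frame length does not match its header")
    return frame[HEADER.size:HEADER.size + payload_len]
```

The same 100-byte payload now leaves as a frame of anywhere between 104 and 359 bytes, which blurs per-packet sizes but, as the rest of this section shows, does not hide the overall pattern of the flow.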
And this is where the most interesting part begins. Sooner or later, in detecting and blocking tunnels, especially as circumvention technologies develop (we have not even touched on steganography and many other interesting things), something called collateral damage starts to grow: damage caused incidentally during an attack on the intended target. For example, as insiders say and reports from the field confirm, the Chinese have learned to detect the TLS-inside-TLS mentioned above, even with random padding, with approximately 40% accuracy. Clearly, with such accuracy false positives are possible too, but when did the Indians' problems ever bother the sheriff?
Protocols that look like nothing at all from the outside (for example, Shadowsocks, obfs4, etc.) can also be identified by... the statistics of zeros and ones in their bytes, because for encrypted traffic this ratio is very close to 1:1, although, of course, the innocent may suffer too. It is also possible to ban addresses when there are too many or too long connections to not-very-popular sites. There are quite a few such options, and if you think that false positives and the damage from blocking respectable sites will stop the censors, you are mistaken.
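The zeros-and-ones check is about as cheap as detection gets. A sketch, with a tolerance that I picked arbitrarily for the example (the thresholds real censors use may well differ):

```python
def ones_fraction(data: bytes) -> float:
    """Fraction of bits set to 1 in the given bytes."""
    if not data:
        return 0.0
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


def looks_fully_encrypted(data: bytes, tolerance: float = 0.02) -> bool:
    """Heuristic: random-looking data has almost exactly half of its bits set.

    The tolerance is an illustrative assumption, not a measured value.
    """
    return abs(ones_fraction(data) - 0.5) <= tolerance
```

For comparison, the plaintext line `GET / HTTP/1.1\r\n` has only about 37% of its bits set, while a few kilobytes from `os.urandom` land within a fraction of a percent of 50%. Anything that is neither a recognizable protocol nor visibly "non-random" is therefore suspicious by default, and so is every legitimate but unknown protocol that happens to look the same.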
When Roskomnadzor tried to block Telegram, it added entire subnets and hosting providers to the ban list, banning a bunch of innocent sites and services along the way, and faced no consequences for it. In Iran, because of the popularity of the aforementioned browser-based proxy client, censors bluntly banned connections with the Chrome TLS fingerprint to popular cloud services. In China, CDNs whose services are used by harmless, innocent sites and services get caught in the crossfire en masse. In Turkmenistan, almost a third (!) of all IP addresses and subnets in the world are blocked this way, because as soon as censors detect even one VPN or proxy, the whole range of addresses around it, or even the entire AS, gets banned.
You may ask: what about legal entities that use VPNs for work, or whose services might accidentally get caught? This issue is easily solved by whitelisting: if a legal entity needs a VPN server, or needs to protect some of its services from accidental blocking, it can be obliged to inform the relevant agencies of the necessary addresses and protocols in advance so that they are added to the right list. Roskomnadzor sent exactly such requests to banks through the Central Bank, evidently planning something unpleasant, and the mechanism for such lists already exists.
And, of course, the perfectly logical continuation of this will be "everything that is not permitted is prohibited". The Russian Federation already has a law banning VPNs and anonymizers used to bypass blocking. A ban on the use of non-certified encryption tools exists too. Tightening these laws, expanding their scope and repeatedly toughening the penalties for such "violations" is a simple matter. In China, polite people paid a visit to the developers of the well-known ShadowSocks and GoAgent and made them offers they could not refuse. In Iran, cases have been opened over the use of a VPN to access prohibited sites. The mechanism of informing the authorities about unreliable neighbors was perfected in the USSR in the last century. States have a monopoly on violence, remember. Are we fucked again?
4294967295
What is all this for?
As I said, most of what is described in the article is not fiction or pure theory - it has long been known, used in some countries of the world, implemented in code, and even described in scientific publications.
Bypassing blocks is a constant battle of sword and shield and, at the same time, a game of cat and mouse: sometimes you eat the bear, sometimes the bear eats you; sometimes you are being chased, sometimes you are doing the chasing.
If you have a proxy or VPN right now and it works, do not relax: you are only half a step ahead of your adversaries. You can, of course, sit back and think, "They are all fools and ham-fisted monkeys over there, incapable of building anything complicated that actually works", but, as they say, hope for the best and prepare for the worst. It always makes sense to study the experience of those same Chinese colleagues and to look more closely at developments that are harder to detect and more resistant to censorship. The more steps you stay ahead of the censors, the more time you will have to adapt when the situation changes. If you are a developer and understand network protocols and technologies, you can join one of the existing projects, help with development and come up with new ideas. People all over the world will thank you.
Interesting and useful resources in this regard are Net4People, No Thought is a Crime, the discussions in the XTLS project (most of them are in Chinese, but automatic translation into English does a good job) and GFW Report. If you know other good resources and communities on this topic, write in the comments.
And do not forget that sooner or later, unable to counter such love of freedom technically, the state may start countering it administratively (that same monopoly on violence), and in such a way that it may no longer be possible for you, in turn, to resist technically. But that is a completely different story, which requires a separate article and, most likely, a different venue.
When I was writing this post, I wanted to illustrate it with stills from some dark cyberpunk movie in which, as a result of the development of surveillance and censorship technologies and the impossibility of resisting them, people came under the total control of states and lost all rights to freedom of thought and privacy. But I hope it doesn't come to that. It is all in our hands.