Study on Encrypted Traffic Classification published in IEEE ACCESS

The paper “Early Traffic Classification With Encrypted ClientHello: A Multi-Country Study,” written by Wireless Networks Lab members Danil Shamsimukhametov, Anton Kurapov, Mikhail Liubogoshchev, and Evgeny Khorov, has been published in the prestigious open-access journal IEEE ACCESS.

The paper considers the problem of early traffic classification, i.e., determining the type of transmitted data (e.g., video, web traffic, or telephony) at the very start of a connection so that modern networks can provide high-quality service. Currently, more than 97% of global traffic is encrypted using the Transport Layer Security (TLS) protocol and does not explicitly indicate the type of transmitted data. Still, several parameters that remain exposed, including the domain name of the server with which the client establishes a connection, allow traffic to be classified quickly and accurately. However, Encrypted ClientHello (ECH), a new TLS extension, will hide these parameters and significantly complicate real-time traffic classification.
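To illustrate why the visible server name matters, here is a minimal sketch of name-based classification. The suffix table, the function name, and the example domains are hypothetical and are not taken from the paper. The point is only that once ECH replaces the real server name with a shared public name, a lookup of this kind has nothing left to work with.

```python
# Hypothetical suffix table; a real deployment would need a much larger, curated list.
SUFFIX_TO_CLASS = {
    "video.example": "video",
    "voip.example": "telephony",
    "news.example": "web",
}


def classify_by_server_name(server_name):
    """Return a traffic class for the visible server name, or "unknown"."""
    labels = server_name.lower().rstrip(".").split(".")
    # Try the full name first, then progressively shorter suffixes.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        if suffix in SUFFIX_TO_CLASS:
            return SUFFIX_TO_CLASS[suffix]
    return "unknown"


# Without ECH, the real server name is readable in the ClientHello.
plain = classify_by_server_name("cdn7.video.example")

# With ECH, only a shared public name is readable
# (assumed here to be "ech-front.example" for illustration).
hidden = classify_by_server_name("ech-front.example")
```

In this sketch the first lookup resolves to a known class, while the second falls through to "unknown", which is the situation the paper sets out to address.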

Nonetheless, some TLS service parameters remain unencrypted and still carry some information about the type of transmitted data. The published paper proposes a new traffic classification algorithm, the hybrid Random Forest Traffic Classifier (hRFTC), which uses not only these remaining unencrypted TLS ECH parameters but also statistical features, such as packet sizes and inter-arrival times. To evaluate its effectiveness, the authors collected a dataset of encrypted traffic from six countries in North America, Europe, and Asia. The results show that relying solely on unencrypted TLS ECH parameters yields an F-score of just 38.4%. Adding statistical features raises the F-score to a record 94.6% on the collected dataset.
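The idea of a hybrid feature vector can be sketched roughly as follows. This is an illustration under stated assumptions, not the hRFTC implementation: the three TLS fields, the chosen statistics, and all function names are hypothetical, and the actual feature set is defined in the paper. A vector of this kind could then be passed to any random forest implementation.

```python
from statistics import mean, pstdev


def flow_statistics(packet_sizes, timestamps):
    """Summarize the first packets of a flow by size and inter-arrival time."""
    # Inter-arrival times are the gaps between consecutive packet timestamps.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return [
        min(packet_sizes),
        max(packet_sizes),
        mean(packet_sizes),
        pstdev(packet_sizes),
        mean(gaps) if gaps else 0.0,
        max(gaps) if gaps else 0.0,
    ]


def hybrid_features(tls_fields, packet_sizes, timestamps):
    """Concatenate unencrypted TLS fields with flow statistics."""
    # Hypothetical fields assumed to stay readable when ECH is in use.
    tls_part = [
        tls_fields["client_hello_length"],
        tls_fields["num_cipher_suites"],
        tls_fields["num_extensions"],
    ]
    return tls_part + flow_statistics(packet_sizes, timestamps)


# Made-up values for one flow: sizes in bytes, timestamps in seconds.
features = hybrid_features(
    {"client_hello_length": 517, "num_cipher_suites": 16, "num_extensions": 12},
    packet_sizes=[517, 1400, 1400, 120],
    timestamps=[0.0, 0.03, 0.031, 0.05],
)
```

Keeping the TLS part and the statistical part in one vector lets a single classifier weigh both sources of information, which mirrors the combination described above.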

“Our hRFTC algorithm outperforms the best existing classifiers and can be used at intermediate network nodes to improve the quality of service in networks. From a different perspective, we identified remaining privacy leaks of encrypted traffic that need to be eliminated in future versions of transport layer security protocols,”

comments Anton Kurapov, a student who conducted the research as part of his master’s thesis.

The paper has been published in IEEE ACCESS, a first-quartile (Q1) journal. It is the second paper that the Wireless Networks Laboratory staff have recently published in a highly cited scientific journal. Read about the previous work here.