By Pavel Odintsov — Jul 22, 2023

Data structures of DDoS detection

For quite long time DDoS detection and mitigation area was like magic with extremely limited access to information about algorithms used by such products. Almost all technologies were proprietary and their algorithms were patented. At FastNetMon we do our best to explain all technical aspects behind DDoS detection logic.

I’ll cover one of the main data structures used by FastNetMon to detect attacks: hash maps (or tables).

To detect network attacks FastNetMon we need to count traffic (number of incoming / outgoing bytes / packets) for every single IPv4 (unsigned 32 bit integer key) or IPv6 (16 byte array) address in customer network. Networks may have from tens of thousands IPs up to tens of millions.

Data about traffic arrives as uneven stream which may include information about single network packet (exactly 1 packet with length up to 1500 bytes) or as aggregated flow (aggregation is implemented on network device side) which may include multiple packets and arbitrary length of so called flow for each IP address. Each time when we receive information about traffic we need to find existing record or create new for specific IP address and then increment byte and packet counters fields for it. You may find example in FastNetMon Community code at GitHub.

List of IPs actually owned by ISP in any particular network is stable enough and we will see new IPs only after tool restart (aka warm up period). After few minutes we expected extremely small number of new IPs. So we mostly care about lookup performance. We do not remove elements unless we reach scalability limitations of unordered_map.

We call this logic host traffic accumulators. As hash map value we use pretty large structure around 200 plus bytes.

So we have total number of packets / bytes in incoming / outgoing directions in our hash map. For purpose of attack detection we need to calculate speed per second for each host. We use different algorithms but most common one is just average calculated every single second. We scan all keys in hash map and then export them to another structure and then set set every element in our traffic accumulator hash to zeroes.

So we’re both interested in element lookup time and full iteration time (aka full scan). We have no requirement about stable address for node or any other artificial standard driven constrains from STL. For most of the counters we rely on std::unordered map from C++ 20 STL library.

Unfortunately, it does not work that great as we expect as insert time is far away from our expectations. Even worse that hash table seriously degrades after crossing around half million of keys.

That’s a reason why we’re in constant search for new hash tables which can perform better for our workload. We maintain hash table performance test as part of our test suite and you can find some measurements here.

Thank you for reading!

Subscribe to Pavel's blog about underlying Internet technologies