Quirks of discarded traffic reporting on JunOS
Hello! I’m Pavel and I’m the CTO and co-founder of FastNetMon LTD, London, 🇬🇧. We’re a cyber security software vendor and we develop a DDoS 🎯 detection and mitigation platform for telecoms.
To detect a DDoS attack, we first need to get information about observed traffic from network equipment. Our customers use a variety of platforms in their networks, and we have complete support for all major vendors on the market (Cisco, Juniper, Arista, Extreme, Nokia, Huawei, Mikrotik, VyOS, Netgate).
We use our own open source parsers (Netflow v9, IPFIX) to decode the telemetry we receive from network equipment. This fairly unique approach lets us look at different implementations of traffic telemetry export both from the perspective of protocol implementation and from an operational standpoint.
In our 12 years of experience, Juniper MX is the best platform for DDoS detection and mitigation available on the market.
Why Juniper MX is great for DDoS defence:
- Mature and well tested BGP Flow Spec support
- Variety of supported telemetry protocols (sFlow, Netflow, IPFIX, inline monitoring services, sampled port mirror, sampled port mirror over GRE)
Unfortunately, there are some issues with telemetry on the Juniper MX platform. Almost all of them have been around for many years, and we do not see any evidence that they will be fixed.
That's why we decided to publish this article. Here we will focus on discarded traffic monitoring, but it's not the only issue we have, and we will talk about the others later.
Why is discarded traffic monitoring so important?
When we defend customer networks from a DDoS attack, we need to see what traffic we dropped. This information is very important because it lets us monitor the scale of the attack and the efficiency of our filtering.
The RFC for the IPFIX protocol has a special element called "Forwarding Status" with Element ID 89, which can encode a limited subset of reasons why a packet was dropped, and we support it in FastNetMon.
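To make the encoding concrete, here is a minimal Python sketch of how a collector could decode this field. It is an illustration only and not FastNetMon's actual code. It assumes the common single-octet encoding, where the top two bits carry the status (unknown, forwarded, dropped, consumed) and the low six bits carry a reason code. The names `ForwardingState`, `ForwardingStatus` and `decode_forwarding_status` are made up for this example, and the small reason table is a partial excerpt that should be checked against the IANA IPFIX registry.

```python
from enum import IntEnum
from typing import NamedTuple


class ForwardingState(IntEnum):
    """Top two bits of the Forwarding Status octet."""
    UNKNOWN = 0b00
    FORWARDED = 0b01
    DROPPED = 0b10
    CONSUMED = 0b11


class ForwardingStatus(NamedTuple):
    state: ForwardingState
    reason: int  # low six bits, meaning depends on the state


# Partial excerpt of "dropped" reasons, keyed by the full octet value.
# Assumption: verify these against the IANA registry before relying on them.
DROP_REASONS = {
    128: "Unknown",
    129: "ACL deny",
    130: "ACL drop",
    131: "Unroutable",
    138: "Policer",
}


def decode_forwarding_status(value: int) -> ForwardingStatus:
    """Split a one-octet Forwarding Status value into state and reason."""
    if not 0 <= value <= 0xFF:
        raise ValueError("this sketch only handles the single-octet encoding")
    return ForwardingStatus(ForwardingState(value >> 6), value & 0x3F)


if __name__ == "__main__":
    for raw in (64, 130, 138):
        status = decode_forwarding_status(raw)
        print(raw, status.state.name, DROP_REASONS.get(raw, "-"))
```

With a field like this, checking whether a flow was dropped is a single comparison against `ForwardingState.DROPPED`, with no dependence on any other field in the record.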
Let's look at how major vendors report discarded traffic in their network traffic telemetry.
Nokia
Nokia SR uses the forwarding status field to encode forwarding information in its IPFIX implementation:
Cisco
Cisco uses the forwarding status field on the Cisco ASR 9000 for both Netflow v9 and IPFIX telemetry:
Huawei
Huawei uses the forwarding status field for both Netflow v9 and IPFIX:
What is the deal with Juniper?
This is how Juniper MX reports information about discarded traffic in IPFIX mode:
Do we have a forwarding status field here? No, it's clearly missing. I repeated my tests on the latest stable JunOS, and even there the forwarding status field was missing.
How can we guess that this traffic was dropped? Well, there is a very obscure guide on the Juniper web site about it:
If the report-zero-oif-gw-on-discard statement is not configured, the flow records display the available information for these elements, which is the default behavior.
After you configure the report-zero-oif-gw-on-discard statement, each sampled packet updates the forwarding action of that flow in the flow record. That is, the last sampled packet of the flow just before export determines the forwarding action of that flow. For example, in the case of a rate-limiting policer, forwarding action taken on a flow is not deterministic. The flow can be treated either as forwarded or as policed based on the forwarding action of the last sampled packet of that flow.
Yes, you read it right.
To get information about discarded traffic, we first need to enable the following configuration option on the Juniper MX platform:
inline-services {
    report-zero-oif-gw-on-discard;
}
After that, we need to look at the "Output Interface" and "Next Hop" fields: a value of 0 in the former and 0.0.0.0 in the latter together have the special meaning "dropped".
Tricky, isn't it?
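In rough terms, a collector ends up with a check along the lines of the Python sketch below. This is a simplified illustration and not the FastNetMon implementation. The `FlowRecord` type, its field names and the `zero_oif_means_discard` flag are all hypothetical. The flag reflects an assumption that the collector has to be told, per router, that report-zero-oif-gw-on-discard is enabled, since the flow record itself does not say so. Only the IPv4 next hop 0.0.0.0 is handled, because that is the only case described above.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

ZERO_NEXT_HOP = ipaddress.IPv4Address("0.0.0.0")


@dataclass
class FlowRecord:
    """Hypothetical, already-parsed flow record (names are illustrative)."""
    egress_interface: int                     # "Output Interface" field
    next_hop: str                             # "Next Hop" field, IPv4 text form
    forwarding_status: Optional[int] = None   # Element ID 89, if the exporter sends it


def is_dropped(flow: FlowRecord, zero_oif_means_discard: bool = False) -> bool:
    """Return True if the flow should be counted as discarded."""
    # Exporters that send Forwarding Status: the top two bits say "dropped".
    if flow.forwarding_status is not None:
        return (flow.forwarding_status & 0xFF) >> 6 == 0b10

    # Without the Juniper knob, zeros only mean "information not available".
    if not zero_oif_means_discard:
        return False

    # Juniper-style encoding: both fields must carry the magic zero values.
    next_hop_is_zero = ipaddress.ip_address(flow.next_hop) == ZERO_NEXT_HOP
    return flow.egress_interface == 0 and next_hop_is_zero


if __name__ == "__main__":
    juniper_flow = FlowRecord(egress_interface=0, next_hop="0.0.0.0")
    print(is_dropped(juniper_flow, zero_oif_means_discard=True))
    print(is_dropped(juniper_flow))
```

Note how the same record gives a different answer depending on router configuration that the collector cannot see in the telemetry itself, which is exactly what makes this encoding awkward to handle.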
To interpret such a complicated encoding, we have a whole layer of logic in FastNetMon:
This approach of encoding data (in this case the "discarded" status) in values that are not supposed to carry such information is pretty close to the magic number concept from software engineering. It is generally not recommended, because it is hard to understand and requires additional logic to interpret correctly.
It's pretty clear that this was the only way to encode "dropped traffic" before the IPFIX RFC emerged as a standard in 2008, but that was 16 years ago. These days we clearly have an RFC compliant way to encode such information and deliver it to Netflow / IPFIX collectors, and the majority of vendors do it right.
I'm not aware of any other Netflow / IPFIX collector that does such a heavy amount of post-processing to detect dropped traffic. If you know of one, please share in the comments.
Conclusions
I hope that some day the JunOS development team will introduce an RFC compliant Forwarding Status field to encode dropped traffic, so that we can remove this complex logic from our product.
Thank you for reading!