The network intelligence game — active scanning v. passive
asset discovery
Dominic Storey
It was in the early 20th century that one of the founding fathers
of quantum physics, Werner Heisenberg, made the startling claim that
you could know either how fast a particle moved or where it was,
but you could never know both precisely. Heisenberg’s Uncertainty
Principle implied that some things will forever remain invisible or
unknown – and that to observe something is to change it forever.
'Invisible or unknown' can easily be applied to assets in corporate
networks. What exactly sits on your network is about much more than
the machine itself. It is about which services, which versions and
which operating systems are running, because these determine a much
more important thing – what is vulnerable on your network?
And vulnerabilities are like currency to a hacker – knowing just
a single vulnerability could hand outsiders the keys to your kingdom.
And Heisenberg’s Uncertainty Principle makes a surprise visit
here too – for it transpires that quite often, to observe
a network is to change a network.
Many companies, in their rush to determine their vulnerabilities,
resort to scanning the entire network. Network scanning is a process
whereby each host is probed by sending it packets – often
deliberately malformed packets. These packets stimulate a response
from the host, which is recorded by the scanner. These records slowly
build up a map of what the network looks like.
This is where Heisenberg comes in. Often, the act of scanning a
host changes the host. This may be due to badly written network
stacks, or to services listening on a port being activated incorrectly
as a result of receiving the scan packet. The result is that the
service can become 'wedged' (unresponsive or totally inoperative)
or, in the worst cases, the entire host may crash. Although it should
rarely happen, the laws of statistics dictate that on a scan of
a large network, somewhere, it will.
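The statistical point can be made concrete with a short calculation.
This is only a sketch: the per-host failure probability of 1 in 10,000
is an assumed, illustrative figure (not a measured one), and hosts are
assumed to fail independently of each other.

```python
def prob_at_least_one_failure(hosts: int, p_fail: float) -> float:
    """Chance that at least one of `hosts` scanned machines fails.

    Assumes every host fails independently with the same probability.
    """
    return 1.0 - (1.0 - p_fail) ** hosts


# Assumed figure for illustration only: 1 host in 10,000 reacts badly.
ASSUMED_P_FAIL = 0.0001

risk = prob_at_least_one_failure(50_000, ASSUMED_P_FAIL)
print(f"Chance of at least one failure across 50,000 hosts: {risk:.1%}")
```

Even a very small per-host risk becomes a near certainty once the
scan covers enough machines.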
Danger, Will Robinson!
So what’s the deal if the odd machine on your network needs
rebooting? Thousands of machines get rebooted every day –
just ask most Windows users! The “deal” arises when the host
is controlling something important, like an industrial process,
a medical scanner or a hospital patient’s life support system.
Imagine the consequences of wedging a host that controls the pouring
of molten steel into moulds? Or rebooting a drug delivery system
which then empties the entire reservoir of chemotherapy drugs into
a cancer patient? These deals quickly – and literally –
become a matter of life or death.
Companies that run control (aka SCADA – Supervisory Control And
Data Acquisition) networks usually implement a “no scan zone”
around such critical assets. This has been an acceptable compromise
whilst these networks have remained isolated, but more and more,
these networks are being connected to the corporate backbone –
and are therefore becoming accessible from the internet. Couple
this with the unhealthy interest terrorists are taking in SCADA
networks run by utility companies and it becomes clear that trouble
is looming.
Painting reality
There is another issue with scanning – and that is time.
It takes a long time to scan machines – for example, to perform
a completely comprehensive scan of a host, you need to send over
131,000 packets to the machine. That’s because a machine can
receive UDP or TCP traffic and each protocol has 65,536 ports. Although
most machines are not scanned in this way, you can see that even
scanning 5% of all ports across a network of 50,000 machines means
that you send out roughly 328 million packets! To avoid excessive traffic
loads, scans are often executed out of business hours – when
network performance is less critical.
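The arithmetic is easy to check. The sketch below assumes a single
probe packet per port (real scanners often send more) and uses the
full 65,536 port numbers for each of TCP and UDP; rounding to 64,000
ports gives the slightly lower totals of 128,000 and 320 million. The
function name is purely illustrative.

```python
PORTS_PER_PROTOCOL = 65_536  # port numbers 0 to 65535
PROTOCOLS = 2                # TCP and UDP


def probe_packets(hosts: int, port_fraction: float = 1.0,
                  packets_per_probe: int = 1) -> int:
    """Minimum number of probe packets for a scan.

    port_fraction is the share of all TCP and UDP ports probed per host.
    packets_per_probe is an assumption; one packet is the best case.
    """
    ports = round(PORTS_PER_PROTOCOL * PROTOCOLS * port_fraction)
    return hosts * ports * packets_per_probe


print(probe_packets(1))             # full scan of one host
print(probe_packets(50_000, 0.05))  # 5% of ports on 50,000 hosts
```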
And this just tells you what your network looks like at one moment
in time. The scan is static – any subsequent change in the
network causes a divergence of reality from the model reflected
by the scan. The only solution is to repeat the scan – again,
again and again. Congratulations – you are now painting the
Forth Bridge.
You learn a great deal by just listening
So what’s the alternative? How else can you learn about your
assets? It turns out that you can discover almost all the things
you need to learn by just listening to the ebb and flow of traffic
on your network. On modern networks every time a host communicates
with another, it does so by sending a TCP, UDP or ICMP packet. The
fact that these packets can be understood at all by machines running
different operating systems is by virtue of the standardization
these protocols bring to the table. However, although these protocols
are standard, there is sufficient variation between, say, Microsoft
and Apple’s implementation of these protocols to make it quite
easy to discriminate hosts’ operating systems and services by just
examining the packets the machines transmit.
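One of the simplest such variations is the initial time-to-live (TTL)
value a stack places in the IP header. The sketch below is a rough
illustration only: the table of defaults is a commonly quoted rule of
thumb that varies by version and configuration, the function names are
hypothetical, and a real passive discovery system combines many more
fields before making a determination.

```python
# Common initial TTL values; a received TTL is the initial value
# minus the number of router hops the packet has crossed.
COMMON_INITIAL_TTLS = (32, 64, 128, 255)

# Rule-of-thumb hints only (an assumption, not a definitive mapping).
TTL_HINTS = {
    32: "legacy stack",
    64: "Linux, macOS or other Unix-like system",
    128: "Windows",
    255: "network device or older Unix",
}


def guess_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    for initial in COMMON_INITIAL_TTLS:
        if observed_ttl <= initial:
            return initial
    raise AssertionError("unreachable: 255 is the maximum TTL")


def os_hint(observed_ttl: int) -> tuple[str, int]:
    """Return (OS family hint, estimated hop count) for a captured TTL."""
    initial = guess_initial_ttl(observed_ttl)
    return TTL_HINTS[initial], initial - observed_ttl


print(os_hint(117))  # a packet seen with TTL 117
print(os_hint(57))   # a packet seen with TTL 57
```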
This process is called passive discovery and can reliably identify
host operating systems. But what about the services a host runs?
How can these be identified? One approach is to look at destination
port numbers of incoming traffic to determine the protocol and therefore
the service. The problem with this approach is that people don’t
always run protocols on standard ports – for example administration
consoles often have a web interface that runs on a high port number
(e.g. 12,500). A better approach is to use a real-time protocol
analyser, which can examine the traffic and make a determination
of protocol without referring to the port number.
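As a rough illustration of the idea, a protocol can often be recognised
from the first few bytes a connection carries, whatever port it uses.
The sketch below covers only three protocols and is nothing like a
production analyser; the function name and the set of checks are
assumptions made for the example.

```python
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ",
                b"OPTIONS ")


def identify_protocol(payload: bytes) -> str:
    """Guess the application protocol from the start of a payload.

    The port number is deliberately not consulted.
    """
    # SSH peers open with an identification string such as "SSH-2.0-..."
    if payload.startswith(b"SSH-"):
        return "ssh"
    # HTTP requests start with a method; responses start with "HTTP/"
    if payload.startswith(HTTP_METHODS) or payload.startswith(b"HTTP/"):
        return "http"
    # A TLS handshake record starts with type 0x16 and major version 0x03
    if len(payload) >= 3 and payload[0] == 0x16 and payload[1] == 0x03:
        return "tls"
    return "unknown"


# A web admin console on a high port is still recognised as HTTP.
print(identify_protocol(b"GET /admin HTTP/1.1\r\nHost: console\r\n\r\n"))
```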
Since passive discovery generates no traffic, it is safe to use
on any network. SCADA networks and other business critical networks
will not be affected. Passive discovery systems can also run continuously
and therefore do not suffer from the divergence from reality that
plagues active scanning.
One objection often raised against passive discovery is that a machine
needs to be active on the network before it can be profiled. What
if the host never talks? The reality is that all hosts talk, usually
quite a lot. Hosts will issue requests for IP addresses, address
resolution requests, time server queries, and domain discovery requests
– all before a user even logs onto the box! What’s more,
the rate of passive network discovery is determined by the amount
of traffic and the number of hosts talking. Contrast this with active
scanning, where the rate of the scan is determined by how fast the
scanner gets a response back from each host being scanned.
What about system obfuscation? Can you hide from a passive discovery
system? Whilst it is possible to alter the banner of, say, a web
service, most passive discovery systems collect many pieces of information
in order to make a determination about a host. It is almost impossible
to disguise them all. What’s more, firewalls don’t help
– whilst a firewall can block an active scan, it cannot
hide a host from a passive discovery system. As soon as the host
initiates or replies to another host, the secret is out. To talk
is to be seen.
What’s the downside of passive discovery? Are there instances
where active scanning would be better? Passive discovery can discover
hosts, services and versions (and therefore vulnerabilities) only
if these details can be discriminated in the network traffic. A
patch that fixes a buffer overflow on a host, but does not change
any network behaviour (for example, the version number exchanged when
systems perform handshakes) cannot be identified. And of
course, a service running on a machine that never receives a packet
will not be known until it does so. In these cases, scanning may
be the better option – however, taken overall, the advantages of
scanning never really outweigh those of passive discovery. Sometimes a
good compromise
is to combine both methods – scan hosts that have been previously
identified by passive discovery to determine further information
about those hosts. This is often referred to as surgical scanning.
If you don’t use it you may as well lose it
What can you do with all this intelligence? How can you put it
to practical use? One way is to integrate it with your intrusion
protection system. Most intrusion protection systems (IDS and IPS)
operate on a rules basis where they examine each packet searching
for suspicious activity such as hacking attempts or virus/worm propagation.
They do not understand the context of the packet in your network –
for example, is the packet bound for a host that is not vulnerable
to the attack anyway? Consequently, most intrusion protection systems
generate a high rate of false positives, placing a burden on your
administrators or, in the worst case, creating a denial of service on
your own network. The intelligence from a passive discovery system
is well suited to addressing these problems. It is continuously
collected and thus can be closely correlated with the attack event
time, and it is passive, so there is no risk of inadvertently triggering
an intrusion alert as there is with active scanning.
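A minimal sketch of this kind of correlation is shown below. All of the
names, fields and sample data are hypothetical and chosen only for
illustration; they do not describe any particular product. The idea is
simply to keep an alert only when the target host, as profiled by
passive discovery, could actually be affected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    dst_ip: str
    affected_os: str       # operating system the attack targets
    affected_service: str  # service the attack targets


@dataclass(frozen=True)
class HostProfile:
    os: str
    services: frozenset


def is_relevant(alert: Alert, inventory: dict) -> bool:
    """Keep an alert unless the target is known not to be vulnerable."""
    profile = inventory.get(alert.dst_ip)
    if profile is None:
        return True  # unknown host: cannot rule the attack out
    return (profile.os == alert.affected_os
            and alert.affected_service in profile.services)


# Hypothetical inventory built up by passive discovery.
inventory = {
    "10.0.0.5": HostProfile("linux", frozenset({"ssh", "http"})),
    "10.0.0.9": HostProfile("windows", frozenset({"smb", "http"})),
}

alerts = [
    Alert("10.0.0.5", "windows", "smb"),  # wrong OS: discard
    Alert("10.0.0.9", "windows", "smb"),  # matches: keep
    Alert("10.0.0.77", "windows", "smb"), # unprofiled host: keep
]

relevant = [a for a in alerts if is_relevant(a, inventory)]
print(len(relevant), "of", len(alerts), "alerts need attention")
```

Treating unknown hosts as relevant is a deliberately cautious choice:
an alert is discarded only when there is positive evidence that the
target cannot be affected.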
Unlike active scanning, the real-time nature of passive discovery
lends itself to compliance monitoring. Alerts can be raised if hosts
appear on the network that run operating systems that are not supported
or services that are forbidden. Discovery events are passed into a
compliance system that runs a rules base listing the allowed
configurations – any exception generates an alert.
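In outline, such a rules base might look like the sketch below. The
policy contents, operating system labels and function name are all
hypothetical, and a real compliance system would express far richer
rules than these two sets.

```python
# Hypothetical policy: supported operating systems and banned services.
ALLOWED_OS = frozenset({"windows-11", "ubuntu-22.04"})
FORBIDDEN_SERVICES = frozenset({"telnet", "ftp"})


def compliance_violations(host_os, services,
                          allowed_os=ALLOWED_OS,
                          forbidden=FORBIDDEN_SERVICES):
    """Return a list of policy exceptions for one discovered host."""
    violations = []
    if host_os not in allowed_os:
        violations.append(f"unsupported operating system: {host_os}")
    # Sort so the alerts come out in a stable order.
    for service in sorted(set(services) & forbidden):
        violations.append(f"forbidden service: {service}")
    return violations


# A newly discovered host running an old OS and a telnet server.
for problem in compliance_violations("windows-xp", {"http", "telnet"}):
    print("ALERT:", problem)
```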
If the compliance system can be given 'teeth', you can use the
discovery data to enforce compliance. This is achieved by extending
the compliance rules with a set of response rules that can communicate
with firewalls and switches – a compliance exception triggers
a response rule that blocks the non-compliant host from the Internet,
for example. One such response rule could be to initiate an active
scan on the non-compliant hosts, i.e. perform a surgical scan.
The Sourcefire 3D system is a good example of such a system. The
three Ds refer to the process of Discovery (discover the events
on your network), Determination (determine what these events mean
to your network) and Defence (defend your network against the threat).
The Sourcefire 3D system combines Intrusion Detection and Prevention
in the form of Snort-powered sensors with passive network discovery
in the form of the Sourcefire Real-time Network Awareness sensors.
The intelligence from these sensors is aggregated, correlated, prioritised
and acted upon in real time according to policy by the Sourcefire
Defense Center. Sourcefire 3D typically achieves a data reduction
rate of about 100x, i.e. 99 out of 100 events can be discarded,
considerably reducing the false positive rate. The rich discovery
and intrusion intelligence can be mined in many different ways,
for example for asset identification, for intrusion detection correlation
(data pivoting) and for network tuning (for example firewall rules).
Compliance can be monitored and enforced and discovery can be configured
to be active, passive or surgical.
Passive network discovery advances the state of the art of asset
detection and has many advantages over active scanning. However,
both methods have their place and both can contribute to reducing
intrusion false positives. What’s ultimately important is not
necessarily the means by which intelligence is collected, but
the use to which it is put.
About the author
Dominic Storey is technical director, Sourcefire EMEA