Provided by: asncounter_0.5.0_all 

NAME
asncounter — collect hits per ASN and netblock
DESCRIPTION
Count the number of hits (HTTP, packets, etc) per autonomous system number (ASN) and related network
blocks.
This is useful when you get a lot of traffic on a server to figure out which network is responsible for
the traffic, to direct abuse complaints or block whole networks, or on core routers to figure out who
your peers are and who you might want to seek particular peering agreements with.
SYNOPSIS
asncounter OPTIONS [ADDRESS ...]
OPTIONS
-h, --help
show this help message and exit
--cache-directory, -C CACHE_DIRECTORY
where to store pyasn cache files, default: ~/.cache/pyasn
--no-prefixes
disable prefix count
--no-asn
disable ASN count
--no-resolve-asn
disable ASN to name resolution in output
--top, -t N
only show top N entries, default: 10
--input, -i INPUT
input file, default: stdin
--input-format, -I {line,tuple,tcpdump,scapy}
input format, default: line
--scapy-filter SCAPY_FILTER
BPF filter to apply to incoming packets, default: ip and not src host 0.0.0.0 and not src net
192.168.0.0/24
--interface [INTERFACE]
open an interface instead of stdin for packets, implies -I scapy, auto-detects by default
--output, -o OUTPUT
write stats or final prometheus metrics to the given file, default: stdout
--output-format, -O {tsv,prometheus,null}
output format, choices: tsv, prometheus, null, default: tsv
--port, -p [PORT]
start a prometheus server on the given port, default disabled, port 8999 if unspecified
--refresh, -R
download a recent RIB cache file and exit
--repl
run a REPL thread in the main loop
--manhole
setup a REPL socket with manhole
--debug
more debugging output
ADDRESS
zero or more IP addresses to parse directly from the command line, before the input stream is
read. This disables the default stdin reading, and --input-format cannot be changed.
INPUT FORMATS
The --input-format option warrants more discussion.
line
The line input format treats each line in the stream as an IP address, counting as one hit.
Empty lines are skipped, and comments – whatever follows the pound (#) sign – are trimmed. Whatever
cannot be parsed as an IP address is logged as a warning and skipped.
This, for example, counts as a hit on two different IP addresses, for a total of two hits. It will also
yield a warning:
192.0.2.1 # comment
2001:DB8::1
# comment
garbage that generates a warning
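The parsing rules above can be sketched in Python; this is an illustration of the behaviour, not asncounter's actual code:

```python
import ipaddress

def parse_line(line):
    """Parse one "line"-format input line: trim comments, skip blanks,
    return an address object or None (with a warning) on garbage."""
    line = line.split("#", 1)[0].strip()  # trim comments
    if not line:
        return None  # empty lines are skipped silently
    try:
        return ipaddress.ip_address(line)
    except ValueError:
        print("WARNING: cannot parse %r as an IP address" % line)
        return None

stream = ["192.0.2.1 # comment", "2001:DB8::1",
          "# comment", "garbage that generates a warning"]
hits = [addr for addr in map(parse_line, stream) if addr]
# two valid addresses, so two hits (and one warning)
```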
tuple
Same as the line input format, except the count is specified in a second, whitespace-separated field.
This, for example, will count one hit for the first IP address, and two for the second one, and will
generate a warning.
192.0.2.1 1 # comment
2001:DB8::1 2
# comment
garbage that generates a warning
The “count” field can represent anything: a count, but also sizes or timings; asncounter doesn’t
care.
The counts are actually parsed as floats, as Python understands them.
The default output format (tsv) will round the numbers to the nearest integer, with ties going to the
even neighbor. This, for example, adds up to 5, which might be surprising to some (because Python
rounds 2.5 to 2, not 3):
192.0.2.2 3.4
192.0.2.2 2.5
This is known as the “rounding half to even” rule, as specified by the IEEE 754 standard.
If the --output-format is set to prometheus, floats will be recorded as accurately as Python allows. In
that context, the above correctly sums up to 5.9.
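Python's rounding behaviour is easy to check interactively:

```python
# round() implements "round half to even" (banker's rounding), per
# IEEE 754: ties go to the even neighbor, so 2.5 rounds down to 2
values = [3.4, 2.5]
rounded_total = sum(round(v) for v in values)  # 3 + 2 = 5
exact_total = sum(values)                      # close to 5.9
print(rounded_total, round(2.5), round(3.5))   # 5 2 4
```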
tcpdump
The tcpdump format is a bit of an oddball: it parses a tcpdump(1) line with a regular expression to
extract the source IP address, and counts that.
It could be extended to count the packet sizes but currently does not do so. Likewise, it only tracks
the left (source) side of packets, and not the destination, but could be extended to track both.
This approach likely can’t deal with a multi-gigabit per second small packet attack (2 million packets
per second or more). But in a real production environment, it could easily deal with the regular 100-200
megabit per second traffic, where tcpdump and asncounter each took about 2% of one core to handle about
3-5 thousand packets per second.
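The mechanism can be sketched like this; the regular expression below is a simplified illustration, not the one asncounter actually ships:

```python
import re

# hypothetical extractor for the source address of a "tcpdump -n" line,
# e.g. "12:00:00.000001 IP 192.0.2.1.443 > 198.51.100.7.52000: ..."
SRC_RE = re.compile(r"\bIP6? (?P<src>[0-9a-fA-F.:]+)\.\d+ >")

def source_ip(line):
    match = SRC_RE.search(line)
    return match.group("src") if match else None

line = "12:00:00.000001 IP 192.0.2.1.443 > 198.51.100.7.52000: Flags [S], length 0"
# source_ip(line) yields "192.0.2.1"
```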
scapy
The scapy input format is also special: instead of parsing text lines, it parses packets.
With the --interface flag, it will open the default interface unless one is provided (e.g. --interface is
generally equivalent to --interface eth0 if eth0 is the primary interface). This requires elevated
privileges.
This is much slower than the tcpdump parser (close to full 100% CPU usage) in a 100-200mbps scenario like
above, but could eventually be leveraged to implement byte counts, which are harder to extract from
tcpdump because of the variability of its output.
This only counts packets, regardless of direction, and, like tcpdump, only keeps track of source IP
addresses. Like tcpdump, it could also be improved by tracking sizes instead of counts, but does not
currently do so.
OUTPUT FORMATS
The --output-format argument also warrants a little more discussion.
tsv
TSV stands for Tab-Separated Values. It’s a poorly designed output format that dumps two tables where
rows are separated by newlines and columns by tabs. One table shows per ASN counts, the other shows per
prefix counts.
As mentioned in the above tuple section, counts are rounded when recorded in tsv mode. This is to
simplify the display; in theory, the underlying
https://docs.python.org/3/library/collections.html#collections.Counter Counter supports floats as well.
If more precision, long term storage, or alerting are needed, the prometheus output format is preferred.
This format is useful because it doesn’t require any dependency outside of the standard library (and,
obviously, pyasn).
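A table in that spirit can be produced from a plain Counter; the column layout below is an illustration, not asncounter's exact output:

```python
from collections import Counter

# made-up hit counts keyed by ASN, in the spirit of the examples below
asn_counter = Counter({66496: 12779, None: 3361, 16276: 337})
total = sum(asn_counter.values())
rows = ["\t".join(["count", "percent", "ASN"])]
for asn, count in asn_counter.most_common():
    percent = round(100 * count / total, 2)
    rows.append("\t".join([str(round(count)), str(percent), str(asn)]))
print("\n".join(rows))
```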
prometheus
The prometheus output format keeps track of counters inside Prometheus data structures. With the --port
flag, it will open up a port (defaulting to 8999) where metrics will be exposed over HTTP, without any
special security, on all interfaces.
Otherwise, upon completion, results will be written in a textfile collector-compatible format.
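For instance, a textfile-collector file might look roughly like this (metric and label names here are illustrative guesses, not necessarily the exact ones asncounter emits):

```
# HELP asncounter_hits_total hits per autonomous system
# TYPE asncounter_hits_total counter
asncounter_hits_total{asn="66496"} 12779
asncounter_hits_total{asn="14061"} 309
```

Such a file can be dropped where the node_exporter textfile collector picks it up.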
null
The null output format doesn’t display anything. It can be used for debugging, but internally uses the
same recorder as the tsv format.
EXAMPLES
Simple web log counter
This extracts the IP addresses from current access logs and reports ratios:
> awk '{print $2}' /var/log/apache2/*access*.log | asncounter
INFO: using datfile ipasn_20250527.1600.dat.gz
INFO: collecting addresses from <stdin>
INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
INFO: finished reading data
INFO: loading /home/anarcat/.cache/pyasn/asnames.json
count percent ASN AS
12779 69.33 66496 SAMPLE, CA
3361 18.23 None None
366 1.99 66497 EXAMPLE, FR
337 1.83 16276 OVH, FR
321 1.74 8075 MICROSOFT-CORP-MSN-AS-BLOCK, US
309 1.68 14061 DIGITALOCEAN-ASN, US
128 0.69 16509 AMAZON-02, US
77 0.42 48090 DMZHOST, GB
56 0.3 136907 HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
53 0.29 17621 CNCGROUP-SH China Unicom Shanghai network, CN
total: 18433
count percent prefix ASN AS
12779 69.33 192.0.2.0/24 66496 SAMPLE, CA
3361 18.23 None
298 1.62 178.128.208.0/20 14061 DIGITALOCEAN-ASN, US
289 1.57 51.222.0.0/16 16276 OVH, FR
272 1.48 2001:DB8::/48 66497 EXAMPLE, FR
235 1.27 172.160.0.0/11 8075 MICROSOFT-CORP-MSN-AS-BLOCK, US
94 0.51 2001:DB8:1::/48 66497 EXAMPLE, FR
72 0.39 47.128.0.0/14 16509 AMAZON-02, US
69 0.37 93.123.109.0/24 48090 DMZHOST, GB
53 0.29 27.115.124.0/24 17621 CNCGROUP-SH China Unicom Shanghai network, CN
This can also be done in real time of course:
tail -F /var/log/apache2/*access*.log | awk '{print $2}' | asncounter
The above report will be generated when the process is killed. Send SIGHUP to show a report without
interrupting the parser:
pkill -HUP asncounter
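The underlying mechanism is an ordinary POSIX signal handler; a minimal sketch (not asncounter's actual code):

```python
import os
import signal

reports = []

def on_hup(signum, frame):
    # asncounter would dump the current statistics here
    reports.append("report")

signal.signal(signal.SIGHUP, on_hup)
os.kill(os.getpid(), signal.SIGHUP)  # what "pkill -HUP asncounter" does
```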
You can count sizes with --input-format=tuple as well. Assuming the size field is in the 10th column,
this will sum sizes instead of just number of hits:
tail -F /var/log/apache2/*access*.log | awk '{print $1, $10}' |
asncounter --input-format=tuple
If logs hold that information, you can also add up processing times, for example.
tcpdump parser
Extract IP addresses from incoming TCP/UDP packets on eth0 and report the top 5:
> tcpdump -c 10000 -q -i eth0 -n -Q in "(udp or tcp)" | asncounter --top 5 --input-format tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
INFO: collecting IPs from stdin, using datfile ipasn_20250523.1600.dat.gz
INFO: loading datfile /root/.cache/pyasn/ipasn_20250523.1600.dat.gz...
INFO: loading /root/.cache/pyasn/asnames.json
ASN count AS
136907 7811 HWCLOUDS-AS-AP HUAWEI CLOUDS, HK
8075 254 MICROSOFT-CORP-MSN-AS-BLOCK, US
62744 164 QUINTEX, US
24940 114 HETZNER-AS, DE
14618 82 AMAZON-AES, US
prefix count
166.108.192.0/20 1294
188.239.32.0/20 1056
166.108.224.0/20 970
111.119.192.0/20 951
124.243.128.0/18 667
A query similar to the HTTP log parser might be:
tcpdump -q -i eth0 -n -Q in "tcp and (port 80 or port 443)" | grep
'Flags \[S\]' | asncounter --input-format=tcpdump --repl
... otherwise you will get different results from a pure packet count, as various connections will yield
different numbers of packets! The above counts connection attempts, which is still different from an
actual HTTP hit, as the connection could be refused before it reaches the webserver or aborted before it
gets logged properly.
It’s still a good estimate, and is especially useful if you do not log IP addresses, for example on high
traffic caching servers.
Note that we use grep above because tcpdump’s tcp[tcpflags] & tcp-syn != 0 only works for IPv4 packets, a
disappointing (but understandable) limitation.
scapy parser
Extract IP addresses directly from the network interface, bypassing tcpdump entirely:
asncounter --interface
REPL
With --repl, you will drop into a Python shell where you can interactively get real-time statistics:
> awk '{print $2}' /var/log/apache2/*access*.log | asncounter --repl --top 2
INFO: using datfile ipasn_20250527.1600.dat.gz
INFO: collecting addresses from <stdin>
INFO: starting interactive console, use recorder.display_results() to show current results
INFO: recorder.asn_counter and .prefix_counter dictionaries have the full data
Python 3.11.2 (main, Apr 28 2025, 14:11:48) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> INFO: loading datfile /home/anarcat/.cache/pyasn/ipasn_20250527.1600.dat.gz...
INFO: finished reading data
>>> recorder.display_results()
INFO: loading /home/anarcat/.cache/pyasn/asnames.json
count percent ASN AS
13008 69.38 66496 SAMPLE, CA
3422 18.25 None None
total: 18748
count percent prefix ASN AS
13008 69.38 192.0.2.0/24 66496 SAMPLE, CA
3422 18.25 None
total: 18748
>>> recorder.asn_counter
Counter({66496: 13008, None: 3422, [...]})
>>> recorder.prefix_counter
Counter({'192.0.2.0/24': 13008, None: 3422, [...]})
So you can get the actual number of hits for an AS, even if it’s not listed in the --top entries with:
>>> recorder.asn_counter.get(66496)
13008
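Under the hood these are plain collections.Counter objects, so the whole Counter API applies; for example, with the made-up numbers from the session above:

```python
from collections import Counter

# mirrors recorder.asn_counter from the sample REPL session
asn_counter = Counter({66496: 13008, None: 3422})
top = asn_counter.most_common(1)  # largest entry first
share = asn_counter[66496] / sum(asn_counter.values())
print(top)  # [(66496, 13008)]
```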
Blocking whole networks
asncounter does not block anything: it only counts. Another mechanism needs to be used to actually block
attackers or act on the collected data.
If you want to block the network blocks, you can use the netblocks shown directly in (say) Linux’s
netfilter firewall, or Nginx’s access or geo modules. For example, this will reject traffic from a
network with iptables:
iptables -I INPUT -s 192.0.2.0/24 -j REJECT
or with nftables:
nft insert rule inet filter INPUT 'ip saddr 192.0.2.0/24 reject'
This will likely become impractical with large numbers of networks; look into IP sets to scale that up.
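For example, with the ipset companion tool, many networks can be matched from a single rule; a sketch (requires root, and IPv6 needs a separate family inet6 set with ip6tables):

```shell
ipset create badnets hash:net        # a set holding whole networks
ipset add badnets 192.0.2.0/24
ipset add badnets 198.51.100.0/24
iptables -I INPUT -m set --match-set badnets src -j REJECT
```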
With Nginx, you can block a network with the deny directive:
deny 192.0.2.0/24;
This will return a 403 status code. If you want to be fancier, you can return a tailored status code and
build a larger list with the geo module:
geo $geo_map_deny {
default 0;
192.0.2.0/24 1;
}
if ($geo_map_deny) {
return 429;
}
Many networks can be listed in the geo block relatively efficiently.
pyasn doesn’t (unfortunately) provide an easy command line interface to extract the data you need to
block an entire AS. For that, you need to resort to some Python. From inside the --repl loop:
print("\n".join(sorted(recorder.asn_all_prefixes(64496))))
This will give you the list of ALL prefixes associated with AS64496, which is actually empty in this
case, as AS64496 is an example AS from RFC5398.
Note the list of prefixes is not aggregated by default. If netaddr is installed, you can pass
aggregate=True to reduce the set.
Aggregating results
It might be worth aggregating large numbers of netblocks for performance reasons. Network block
announcements can be spread over multiple contiguous blocks for various reasons and can often be unified
into smaller sets. For IPv4-only, iprange is good (and fast) enough:
> grep -v :: networks > networks-ipv4
> iprange < networks-ipv4 > networks-ipv4-filtered
> wc -l networks*
588 networks
495 networks-ipv4
181 networks-ipv4-filtered
If you have it installed, the netaddr Python package can also do that for you, and it supports IPv6:
import netaddr
print("\n".join([str(n) for n in netaddr.cidr_merge(recorder.asndb.get_as_prefixes(64496))]))
Note that asncounter can aggregate those results directly now, for example:
print(recorder.asn_all_prefixes_str(66496, aggregate=True))
... but, as above, it requires the netaddr package to be available.
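If netaddr is not available, a similar merge can be illustrated with the standard library's ipaddress module:

```python
import ipaddress

# two contiguous /25 announcements collapse into a single /24
nets = [ipaddress.ip_network("192.0.2.0/25"),
        ipaddress.ip_network("192.0.2.128/25")]
merged = list(ipaddress.collapse_addresses(nets))
print(merged)  # [IPv4Network('192.0.2.0/24')]
```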
Selective blocking
A more delicate approach is to block all network blocks from a specific ASN that have been found in the
result sets, instead of blocking the entire netblock.
The recorder.asn_prefixes and recorder.asn_prefixes_str functions can do this for you, merging multiple
ASNs and aggregating with netaddr as well:
print(recorder.asn_prefixes_str(66496, 66497, aggregate=True))
Note that the asn_prefixes selectors are not implemented in Prometheus mode.
Remember you can extract the list of current ASNs and prefixes just by looking at the dictionary keys as
well:
print("\n".join(str(k) for k in recorder.asn_counter.keys()))
print("\n".join(str(k) for k in recorder.prefix_counter.keys()))
FILES
~/.cache/pyasn/
Default storage location for pyasn cache files.
/run/$UID/asncounter-manhole-$PID or ~/.local/state/asncounter-manhole-$PID
Default location for the debugging manhole socket, if enabled.
LIMITATIONS
• only counts, does not calculate bandwidth, but could be extended to do so
• does not actually do any sort of mitigation or blocking, purely an analysis tool; if you want such
mitigation, hook up asncounter to Prometheus and AlertManager with web hooks; this is not a fail2ban
rewrite
• test coverage is relatively low, 37% as of this writing; most critical paths are covered, although not
the scapy parser or the RIB file download procedures
• requires downloading RIB files, could be improved by talking directly with a BGP router daemon like
Bird or FRR
• only a small set of tcpdump outputs have been tested
• the REPL shell does not have proper readline support (keyboard arrows and control characters like
“control-a” do not work)
Note that this documentation and test code use sample AS numbers from RFC5398, IPv4 addresses from
RFC5737, and IPv6 addresses from RFC3849. Some better-known entities (e.g. Amazon, Facebook) have not
been redacted from the output, for clarity.
Performance considerations
As mentioned above, this is unlikely to tolerate multi-gigabit denial of service attacks. The tcpdump
parser, however, is pretty fast and should be able to sustain a saturated gigabit link under normal
conditions. The scapy parser is slower.
Memory usage seems reasonable: on startup, it uses about 250MB of memory, and a long-running process with
about 40 000 blocks was using about 400MB.
By extrapolation, it is expected that data on the full routing table (currently 1.2 million entries) could
be held within 12 GB of memory, although that would be a rare condition, only occurring on a core router
with traffic from literally the entire internet.
Security considerations
There’s an unknown in the form of the C implementation of a Radix tree in pyasn. asncounter itself
should be fairly safe: it does not trust its inputs, and the worst it can do is likely a resource
exhaustion attack on high traffic.
It can run completely unprivileged as long as it has access to the input files, although in many
scenarios people will not bother to drop privileges before calling it and it will not, itself, attempt to
do so.
Privileges can be dropped with systemd-run, for example:
systemd-run --pipe --property=DynamicUser=yes \
--property=CacheDirectory=asncounter \
--setenv=XDG_CACHE_HOME=/var/cache/asncounter \
-- asncounter
This interacts poorly with the --repl option, as it tries to reopen the tty for stdin. You might have better
luck with sharing a debug socket with --manhole:
systemd-run --pipe --property=DynamicUser=yes \
--property=CacheDirectory=asncounter \
--setenv=XDG_CACHE_HOME=/var/cache/asncounter \
-- asncounter --manhole=/var/cache/asncounter/asncounter-manhole
Then you can open a Python debugging shell for further diagnostics with:
nc -U /var/cache/asncounter/asncounter-manhole
AUTHOR
Antoine Beaupré anarcat@debian.org
SEE ALSO
tcpdump(8), fail2ban(1)
ASNCOUNTER(1)