Irfan Salam

Latest ICT news, Technology Reviews and Tutorials

TCP Global Synchronization

TCP global synchronization can occur across TCP/IP flows during periods of congestion, because every sender reduces its transmission rate at the same time when packet loss occurs.

Routers on the Internet normally have packet queues, to allow them to hold packets when the network is busy, rather than discarding them.

Because routers have limited resources, the size of these queues is also limited. The simplest technique to limit queue size is known as tail drop. The queue is allowed to fill to its maximum size, and then any new packets are simply discarded, until there is space in the queue again.
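Tail drop is simple enough to sketch in a few lines (a toy model, not actual router code; the queue size is illustrative):

```python
from collections import deque

def tail_drop_enqueue(queue, packet, max_size):
    """Tail drop: accept packets until the queue is full,
    then silently discard every new arrival."""
    if len(queue) >= max_size:
        return False  # packet dropped
    queue.append(packet)
    return True
```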

This causes problems when used on TCP/IP routers handling multiple TCP streams, especially when bursty traffic is present. While the network is stable, the queue is constantly full, and there are no problems except that the full queue results in high latency. However, the introduction of a sudden burst of traffic may cause large numbers of established, steady streams to lose packets simultaneously.

TCP has automatic recovery from dropped packets, which it interprets as congestion on the network (which is usually correct). The sender reduces its sending rate for a certain amount of time, and then tries to find out if the network is no longer congested by increasing the rate again subject to a ramp-up. This is known as the slow-start algorithm.

Almost all the senders will use the same time delay before increasing their rates. When these delays expire, at the same time, all the senders will send additional packets, the router queue will again overflow, more packets will be dropped, the senders will all back off for a fixed delay… ad infinitum.

This pattern of each sender decreasing and increasing transmission rates at the same time as other senders is referred to as “global synchronization” and leads to inefficient use of bandwidth, due to the large numbers of dropped packets, which must be retransmitted, and because the senders have a reduced sending rate, compared to the stable state, while they are backed-off, following each loss.
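The lockstep behaviour shows up even in a toy additive-increase/multiplicative-decrease model (a sketch with made-up numbers, not a faithful TCP simulation):

```python
def step(rates, capacity):
    """One round: if the shared link overflows, every sender halves its
    rate (multiplicative decrease); otherwise each adds 1 (additive increase)."""
    loss = sum(rates) > capacity
    return [r / 2 if loss else r + 1 for r in rates]

rates = [10.0, 10.0, 10.0]
for _ in range(5):
    rates = step(rates, capacity=24)
# Because every sender sees the same loss events, all three rates
# stay identical round after round: global synchronization.
```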

This problem has been the subject of much research. The consensus appears to be that the tail drop algorithm is the leading cause of the problem, and other queue size management algorithms such as Random Early Detection (RED) and Weighted RED will reduce the likelihood of global synchronization, as well as keeping queue sizes down in the face of heavy load and bursty traffic.
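The idea behind RED can be sketched as follows: the drop probability ramps up with the average queue length, so different flows see losses at different times instead of all at once (the threshold values here are illustrative):

```python
import random

def red_drop(avg_qlen, min_th=5, max_th=15, max_p=0.1):
    """Random Early Detection (sketch): never drop below min_th,
    always drop above max_th, and in between drop with a probability
    that ramps linearly from 0 up to max_p."""
    if avg_qlen < min_th:
        return False
    if avg_qlen >= max_th:
        return True
    p = max_p * (avg_qlen - min_th) / (max_th - min_th)
    return random.random() < p
```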

source: http://en.wikipedia.org/wiki/TCP_global_synchronization


Tracetcp – Utility to trace network routes on specific ports

As network engineers, we often get stuck troubleshooting the reachability of servers, unsure whether a certain port (HTTP/80 in the case of a website) is accessible from the Internet or is being blocked by some firewall in transit.

The usual traceroute utilities that ship with operating systems don't help in this case, as they don't let you take traces on particular ports. While searching for a solution I found tracetcp, 'a traceroute utility that uses TCP SYN packets to trace network routes'.

This utility lets you run traces on any TCP port (e.g. FTP/21, HTTP/80, SMTP/25) to see whether the port is getting blocked at any hop.

Download link: https://github.com/SimulatedSimian/tracetcp/releases/download/v1.0.0/tracetcp_v1.0.0.zip

Here is an example of a trace on port 80:

$ tracetcp www.ebay.co.uk

Tracing route to 66.135.192.41 [www.ebay.co.uk] on port 80
Over a maximum of 30 hops.
1       1 ms    1 ms    2 ms    192.168.0.1     [wintermute]
2       10 ms   9 ms    11 ms   10.78.128.1
3       10 ms   11 ms   8 ms    62.30.193.33    [gsr01-so.blueyonder.co.uk]
4       10 ms   9 ms    10 ms   172.18.14.45
5       14 ms   13 ms   14 ms   172.18.14.62
6       12 ms   13 ms   14 ms   194.117.136.18  [tele2-witt-pos.telewest.net]
7       12 ms   12 ms   14 ms   166.63.222.37   [zcr1-so-5-0-0.Londonlnt.cw.net]
8       182 ms  164 ms  164 ms  208.172.146.100 [dcr2-loopback.SantaClara.cw.net]
9       163 ms  163 ms  164 ms  208.172.156.198 [bhr1-pos-0-0.SantaClarasc8.cw.net]
10      165 ms  165 ms  167 ms  66.35.194.50    [csr1-ve243.SantaClarasc8.cw.net]
11      165 ms  165 ms  164 ms  66.35.212.190
12      168 ms  169 ms  169 ms  66.135.207.253
13      166 ms  169 ms  171 ms  66.135.207.174
14      *       *       *       Request timed out.
15      Destination Reached in 170 ms. Connection established to 66.135.192.41
Trace Complete.

If we do the same trace on port 135 we can see that it is blocked after hop 2.

$ tracetcp www.ebay.co.uk:135

Tracing route to 66.135.192.41 [www.ebay.co.uk] on port 135
Over a maximum of 30 hops.
1       1 ms    1 ms    1 ms    192.168.0.1     [wintermute]
2       10 ms   13 ms   9 ms    10.78.128.1
3       *       *       *       Request timed out.
4       *       *       *       Request timed out.
5       *       *       *       Request timed out.
6 ... continues until maximum number of hops reached.

More examples: http://simulatedsimian.github.io/tracetcp_examples.html

This utility is perfect for troubleshooting server issues when you want to make sure a service port is not being blocked by an ISP or network firewall.
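If you only need a quick yes/no on whether a TCP port is reachable (without the per-hop detail that tracetcp provides), a few lines of Python can attempt the handshake; the host and port here are whatever you are testing:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a full TCP handshake to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False

# e.g. port_open("www.ebay.co.uk", 80)
```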

Cisco Flex Links – Backup Interface

Cisco Flex Links gives network operators a simple, reliable, and more scalable method of layer 2 redundancy. The Spanning Tree Protocol (STP) is not destined for the scrap bin, but it will certainly fall out of favor with many enterprise networks.

Flex Links are a pair of layer 2 interfaces configured to act as a backup of each other. Configuring Flex Links is very simple, but it’s a manual process. Spanning tree can configure itself if you just enable it, albeit likely a sub-optimal configuration, but a working one nonetheless. Flex Links, on the other hand, require manual setup and layout of your layer 2 network. If you don’t want to leave anything to chance, then Flex Links are preferred over STP.

The benefits of Flex Links include:

  • simplicity, which equals stability.
  • instant failover.
  • rudimentary load balancing capabilities, so one link isn’t wastefully idle.
  • load balancing works across switches in a stack, including port channels.

A Flex Links pair’s primary operating mode is just like spanning tree: one on, one off. With per-VLAN spanning tree, a trunk port can have some VLANs enabled and some blocked at the same time, so on the surface it seems that STP is superior. In reality, you can configure Flex Links to load balance VLANs, and we’ll show you how shortly.

Configuration

Conceptually, you configure a Flex Links pair by telling one link it's the active link, and the other that it's the backup of that primary (active) one. Without VLAN load balancing configured, the switch completely disables the backup link; if the active link goes down, the backup takes over.

For example, to configure port gi1/0/1 as an active link, and gi1/0/2 as its backup, you'd run:

Switch# configure terminal
Switch(conf)# interface gigabitethernet1/0/1
Switch(conf-if)# switchport backup interface gigabitethernet1/0/2

That’s all there is to configuring the basic mode, which gets you failover but no load balancing. Before talking about load balancing, let’s take a look at preemption and “mac address-table move update.”

Preemption

Preemption, that is, which port is preferred for forwarding traffic once a failed link returns, is also configurable. It is most often used when the paired links have differing bandwidth capacities. If you wish to ensure that port 1, a primary port with more bandwidth, returns to being the active link when it comes back up, set switchport backup interface preemption mode bandwidth and switchport backup interface preemption delay. The delay sets the amount of time (in seconds) to wait before allowing port 1 to preempt port 2 and begin taking over traffic again.
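Continuing the earlier gi1/0/1 / gi1/0/2 example, the preemption settings might look like this (the 60-second delay is just an illustrative value):

```
Switch(conf)# interface gigabitethernet1/0/1
Switch(conf-if)# switchport backup interface gigabitethernet1/0/2 preemption mode bandwidth
Switch(conf-if)# switchport backup interface gigabitethernet1/0/2 preemption delay 60
```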

MAC Address-Table Move Update

Enabling the MAC address-table move update feature allows for rapid convergence when a primary link goes down and the backup takes over traffic forwarding duties. Without this feature enabled, neighboring switches may continue to forward traffic for a short time to a dead port, since they have learned MAC addresses associated with that link.

When move update is enabled, the switch containing the Flex Links will broadcast an update packet to let other switches know what happened, and they will in turn un-learn the stale MAC address mappings.

On the switch with Flex Links, simply configure:

Switch(conf)# mac address-table move update transmit

All switches, including ones with Flex Links, need to receive these updates. This is not enabled by default, so you’ll need to run the following command on all of your devices:

Switch(conf)# mac address-table move update receive

To see the status and verify that “move update” is enabled, run: show mac address-table move update. Checking the status of your Flex Links is much the same:

show interfaces [interface-id] switchport backup

Load Balancing

Flex Links should be configured such that both ports forward traffic at the same time. This way, you get load balancing in addition to redundancy. The limitation is that only one port can forward a given VLAN at a time. If we have VLANs 1-200, we need to choose which VLANs are forwarded primarily through which port. The simplest configuration, ignoring traffic requirements, would be for VLANs 1-100 to use port 1, and VLANs 101-200 to use port 2.

Before we get into configuring preferred VLANs, let's talk about multicast, which becomes an issue with this type of setup. If a port passed an IGMP join, and the switch is part of a multicast group, then when that port goes down the switch will no longer be able to receive multicast traffic for the group. The quick fix is to make both Flex Links always be part of learned groups, with the command switchport backup interface gigabitEthernet 1/0/2 multicast fast-convergence.

Now, on to VLAN load balancing. It is quite easy; just specify which VLANs you prefer on which links:

Switch(config-if)# switchport backup interface gigabitEthernet1/0/2 prefer vlan 101-200

If you have VLANs 1-200 on the switch, show interfaces switchport backup will show you:

Vlans Preferred on Active Interface: 1-100
Vlans Preferred on Backup Interface: 101-200

If a link goes down, VLANs that are preferred on that interface will be moved to the other link in the pair. Likewise, when a link returns to service, its preferred VLANs are blocked on the backup and returned to the preferred link.

Be sure to run show interfaces switchport backup detail to see the full status, including link speeds, preemption modes, and the MAC address-table move update status.

In summary, the simplicity of Flex Links makes them a better choice than the ubiquitous Spanning Tree Protocol for carrier and core enterprise networks. STP also provides link-level redundancy, but with Flex Links you have more control and better load balancing capabilities. This certainly means that configuration takes longer, since you are planning the layer 2 network manually, but when you need a stable, no-surprises link-layer network, Flex Links are definitely the way to go.

Written by Charlie Schluting, the author of Network Ninja, a must-read for every network engineer.

Change Speed and Duplex of Ethernet card in Linux

To change the speed and duplex of an Ethernet card, we can use ethtool, a Linux utility for displaying or changing Ethernet card settings.

1. Install ethtool

You can install ethtool by typing one of the following commands, depending upon your Linux distribution.

Install ethtool in Fedora, CentOS, RHEL etc.:

# yum install ethtool

Install ethtool in Ubuntu, Debian etc.

# sudo apt-get install ethtool

2. Get the Speed, Duplex and other information for the interface eth0

To get speed, duplex and other information for the network interface eth0, type the following command as root.

 # ethtool eth0

Sample output:

Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Half 1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 100Mb/s
        Duplex: Half
        Port: Twisted Pair
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: g
        Wake-on: d
        Current message level: 0x000000ff (255)
        Link detected: yes

3. Change the Speed and Duplex settings

The next command disables auto-negotiation, enables full duplex and sets the speed to 100 Mb/s:

 # sudo ethtool -s eth0 autoneg off speed 100 duplex full

Restart the interface to apply changes:

 # ifdown eth0 && ifup eth0

4. Change the Speed and Duplex Settings permanently

Fedora, CentOS, RHEL etc.

To make settings permanent, you need to edit /etc/sysconfig/network-scripts/ifcfg-eth0 file for eth0 interface. This file is used by RHEL, CentOS, Fedora etc.

 # vi /etc/sysconfig/network-scripts/ifcfg-eth0

Append the following line to disable auto-negotiation, enable full duplex and set the speed to 100 Mb/s:

 ETHTOOL_OPTS="speed 100 duplex full autoneg off"

Ubuntu, Debian etc.

If you need these settings applied every time the PC boots, add a command to the /etc/network/interfaces file:

 sudo vim /etc/network/interfaces

Then, under the iface stanza for eth0, add:

 pre-up /usr/sbin/ethtool -s eth0 autoneg off speed 100 duplex full

That will set the parameters before bringing the interface up.

Switch Where You Can, Route Where You Must

By Pravin Mahajan

Director, Corporate Marketing

Back in the days of FDDI optical and 10Base5 Thick Ethernet technologies (you remember them, don’t you?), I came across the phrase, “Switch where you can, route where you must.” As a young professional I wondered about it, and applied it to the situation at hand – designing a campus network for a mobile operator. The Synoptics 5000 system was deployed at several places in the network providing Layer 1 and 2 transmission hub and switching functions, while a single Wellfleet Backbone Concentrator Node (BCN) system provided Layer 3 routing (the two companies had actually merged several months before to form Bay Networks). The operator’s campus infrastructure worked perfectly, scaling as traffic volume exploded driven by the mobile revolution.


I recalled having debates with my manager on the role of OSI Layer 1, 2, 3 technologies in network devices – can or should…


Using Cisco VIRL for CCIE Preparation by Brian McGahan

source: http://blog.ine.com/2014/12/04/using-cisco-virl-for-ccie-preparation/

After long anticipation, Cisco’s Virtual Internet Routing Lab (VIRL) is now publicly available. VIRL is a network design and simulation environment that includes a GNS3-like frontend GUI to visually build network topologies, and an OpenStack based backend which includes IOSv, IOS XRv, NX-OSv, & CSR1000v software images that run on the built-in hypervisor. In this post I’m going to outline how you can use VIRL to prepare for the CCIE Routing & Switching Version 5.0 Lab Exam in conjunction with INE’s CCIE RSv5 Advanced Technologies Labs.

The first step of course is to get a copy of VIRL. VIRL is currently available for purchase from virl.cisco.com in two forms, a “Personal Edition” for a $200 annual license, and an “Academic Version” for an $80 annual license. Functionally these two versions are the same. Next is to install VIRL on a hypervisor of your choosing, such as VMWare ESXi, Fusion, or Player. Make sure to follow the installation guides in the VIRL documentation, because the install is not a very straightforward process. When installing it on VMWare Player I ran into a problem with the NTPd not syncing, which resulted in the license key not being able to register. In my case I had to edit the /etc/ntp.conf file manually to specify a new NTP server, which isn’t listed as a step in the current install guide. If you run into problems during install check the VIRL support community, as it’s likely that someone has already run into your particular install issue, and a workaround may be listed there.

Once VIRL and VM Maestro (the GUI frontend) are up and running, the next step is to build your topology. For the INE CCIE RSv5 Advanced Technology Labs, this topology will be 10 IOS or IOS XE instances that are connected to a single vSwitch. All you need to do to build this is to add the 10 IOS instances, and then connect them all to a single "Multipoint Connection". Logical network segments will then later be built based on the initial configurations that you load on the routers for a specific lab. The end result of the topology should look something like this:

You may also want to add some basic customization to the topology file and the VM Maestro interface. I set the hostnames of the devices to R1 – R10 by clicking on the router icon, then setting the “Name” under the Properties tab.

Next under the File > Preferences > Terminal > Cisco Terminal you can set the options to use your own terminal software instead of the built in one. In my case I set the “Title format” variable to “%s”, which makes it show just the hostname in the SecureCRT tab, and set the “Telnet command” to “C:\Program Files\VanDyke Software\SecureCRT\SecureCRT.exe /T /N %t /TELNET %h %p”, which makes it spawn a SecureCRT tabbed window when I want to open the CLI to the routers. Your options of course may vary depending on your terminal software and its install location.

Next, click the “Launch Simulation” button on the topology to start the routers. Assuming everything is correct with your install, and you have enough CPU & memory resources, the instances should boot and show the “ACTIVE” state, similar to what you see below:

If you right click on the device name you’ll see the option to telnet to the console port. Note that the port number changes every time you restart the simulation, so I found it easier just to launch the telnet sessions from here instead of creating manual sessions under the SecureCRT database.

You should now be able to connect to the consoles of the routers and see them boot, such as you see below:

R1 con0 is now available

Press RETURN to get started.

**************************************************************************
* IOSv is strictly limited to use for evaluation, demonstration and IOS  *
* education. IOSv is provided as-is and is not supported by Cisco's      *
* Technical Advisory Center. Any use or disclosure, in whole or in part, *
* of the IOSv Software or Documentation to any third party for any       *
* purposes is expressly prohibited except as otherwise authorized by     *
* Cisco in writing.                                                      *
**************************************************************************
R1>
R1>enable
R1#show version
Cisco IOS Software, IOSv Software (VIOS-ADVENTERPRISEK9-M), Experimental Version 15.4(20141119:013030) [jsfeng-V154_3_M 107]
Copyright (c) 1986-2014 by Cisco Systems, Inc.
Compiled Tue 18-Nov-14 20:30 by jsfeng

ROM: Bootstrap program is IOSv

R1 uptime is 46 minutes
System returned to ROM by reload
System image file is "flash0:/vios-adventerprisek9-m"
Last reload reason: Unknown reason

This product contains cryptographic features and is subject to United
States and local country laws governing import, export, transfer and
use. Delivery of Cisco cryptographic products does not imply
third-party authority to import, export, distribute or use encryption.
Importers, exporters, distributors and users are responsible for
compliance with U.S. and local country laws. By using this product you
agree to comply with applicable laws and regulations. If you are unable
to comply with U.S. and local laws, return this product immediately.

A summary of U.S. laws governing Cisco cryptographic products may be found at:

http://www.cisco.com/wwl/export/crypto/tool/stqrg.html

If you require further assistance please contact us by sending email to
export@cisco.com.

Cisco IOSv (revision 1.0) with  with 484729K/37888K bytes of memory.
Processor board ID 9B2DD0A36JBLXZY7SLJTF
2 Gigabit Ethernet interfaces
DRAM configuration is 72 bits wide with parity disabled.
256K bytes of non-volatile configuration memory.
2097152K bytes of ATA System CompactFlash 0 (Read/Write)
0K bytes of ATA CompactFlash 1 (Read/Write)
0K bytes of ATA CompactFlash 2 (Read/Write)
1008K bytes of ATA CompactFlash 3 (Read/Write)

Configuration register is 0x0

R1#

With this basic topology you should have the 10 IOSv instances connected on their Gig0/1 interface to the same segment. The Gig0/0 interface is used for scripting inside the VIRL application, and can be shut down for our purposes. The end result after the images boot should be something similar to this:

R1#show cdp neighbor
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater, P - Phone,
                  D - Remote, C - CVTA, M - Two-port Mac Relay 

Device ID        Local Intrfce     Holdtme    Capability  Platform  Port ID
R9.openstacklocal
                 Gig 0/1           177              R B   IOSv      Gig 0/1
R8.openstacklocal
                 Gig 0/1           167              R B   IOSv      Gig 0/1
R3.openstacklocal
                 Gig 0/1           155              R B   IOSv      Gig 0/1
R2.openstacklocal
                 Gig 0/1           177              R B   IOSv      Gig 0/1
R7.openstacklocal
                 Gig 0/1           156              R B   IOSv      Gig 0/1
R6.openstacklocal
                 Gig 0/1           146              R B   IOSv      Gig 0/1
R5.openstacklocal
                 Gig 0/1           129              R B   IOSv      Gig 0/1
R4.openstacklocal
                 Gig 0/1           153              R B   IOSv      Gig 0/1
R10.openstacklocal
                 Gig 0/1           146              R B   IOSv      Gig 0/1

Total cdp entries displayed : 9

Next you can load your initial configs for the lab you want to work on, and you’re up and running! I’ve taken the liberty of converting the CSR1000v formatted initial configs for our Advanced Technologies Labs to the IOSv format, as the two platforms use different interface numbering. Click here to download these initial configs as well as the .virl topology file that I created.

For further discussions on this see the IEOC thread Building INE’s RSv5 topology on VIRL.

See Wifi Password of learned networks in Windows 8 / 8.1

It often happens that someone comes to you and asks for the password of the Wi-Fi network you are connected to. If you are using Windows 8 or 8.1, it becomes frustrating if you have forgotten the passcode and haven't written it down anywhere.

After being irritated several times trying to recover the password of a Wi-Fi network I had previously connected to, I finally found a solution, and it's quite simple.

Open a command prompt (press 'Windows key + R', type 'cmd' and click 'OK') and run:

C:\Users\Irfan>netsh wlan show profile name=wifi-live key=clear

Profile wifi-live on interface Wi-Fi:
=======================================================================
Applied: All User Profile

Profile information
-------------------
    Version                : 1
    Type                   : Wireless LAN
    Name                   : wifi-live
    Control options        :
    Connection mode        : Connect automatically
    Network broadcast      : Connect even if this network is not broadcasting
    AutoSwitch             : Do not switch to other networks

Connectivity settings
---------------------
    Number of SSIDs        : 1
    SSID name              : "wifi-live"
    Network type           : Infrastructure
    Radio type             : [ Any Radio Type ]
    Vendor extension       : Not present

Security settings
-----------------
    Authentication         : WPA2-Personal
    Cipher                 : CCMP
    Security key           : Present
    Key Content            : abcxyz123   <-- the password appears here

Cost settings
-------------
    Cost                   : Unrestricted
    Congested              : No
    Approaching Data Limit : No
    Over Data Limit        : No
    Roaming                : No
    Cost Source            : Default


DNS reflection defense – The Akamai Blog:


Recently, DDoS attacks have spiked well past 100 Gbps several times. A common move used by adversaries is the DNS reflection attack, a category of Distributed Reflected Denial of Service (DRDoS) attack. To understand how to defend against it, it helps to understand how it works.

How DNS works

At the heart of the Domain Name System are two categories of name server: the authoritative name server, which is responsible for providing authoritative answers to specific queries (like use5.akam.net, one of the authoritative name servers for the csoandy.com domain), and the recursive name server, which is responsible for answering any question asked by a client. Recursive name servers (located in ISPs, corporations, and data centers around the world) query the appropriate authoritative name servers around the Internet, and return an answer to the querying client. An open resolver is a resolver that will answer recursive queries from any client, not just those local to it. Because DNS requests are fairly small and lightweight, DNS primarily uses the User Datagram Protocol (UDP), a stateless messaging system. Since UDP requests can be sent in a single packet, the source address is easily forged to any address the true sender desires.


DNS reflection

A DNS reflection attack takes advantage of three things: the forgeability of UDP source addresses, the availability of open resolvers, and the asymmetry of DNS requests and responses. To conduct an attack, an adversary sends a set of DNS queries to open resolvers, altering the source address on the requests to be that of the chosen target. The requests are designed to have much larger responses (often, using an ANY request, a 64-byte request yields a 512-byte response), resulting in the recursive name servers sending about 8 times as much traffic to the target as they themselves received. A DNS reflection attack can directly use authoritative name servers, but that requires more preparation and research, making requests specific to the scope of each DNS authority used.
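The asymmetry is just arithmetic; with the request/response sizes from the example above:

```python
def amplification(request_bytes, response_bytes):
    """Bandwidth multiplier the attacker gains per reflected query."""
    return response_bytes / request_bytes

# 64-byte ANY request eliciting a 512-byte response:
factor = amplification(64, 512)  # 8x the attacker's own outbound bandwidth
```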


Eliminating DNS reflection attacks

An ideal solution would obviously be to eliminate this type of attack, rather than every target needing to defend themselves. Unfortunately, that’s challenging, as it requires significant changes by infrastructure providers across the Internet.


BCP38

No discussion of defending against DRDoS-style attacks is complete without a nod to BCP38. These attacks only work because an adversary, when sending forged packets, has no routers upstream filtering based on the source address. There is rarely a need to permit an ISP user to send packets claiming to originate in another ISP; if BCP38 were adopted and implemented in a widespread fashion, DRDoS would be eliminated as an adversarial capability. That's sadly unlikely, as BCP38 enters its 14th year; the complexity and edge cases are significant.
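BCP38 ingress filtering amounts to a membership check at the customer edge: only forward packets whose source address falls inside the prefixes assigned to that customer. A minimal sketch (the prefix is a documentation range, not a real assignment):

```python
import ipaddress

CUSTOMER_PREFIXES = [ipaddress.ip_network("203.0.113.0/24")]  # hypothetical assignment

def bcp38_permit(src_ip):
    """Forward only packets whose source lies inside the customer's
    assigned prefixes; anything else is spoofed (or misrouted) and dropped."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in CUSTOMER_PREFIXES)
```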


The open resolvers

While a few enterprises have made providing an open resolver into a business (OpenDNS, GoogleDNS), many open resolvers are either historical accidents, or resulting from incorrect configuration. Even MIT has turned off open recursion on its high-profile name servers.
Barring that, recursive name servers should implement rate limiting, especially on infrequent request types, to reduce the multiplication of traffic that adversaries can gain out of them.


Self-defense

Until ISPs and resolver operators implement controls to limit how large attacks can become, attack targets must defend themselves. Sometimes, attacks are targeted at infrastructure (like routers and name servers), but most often they are being targeted at high-profile websites operated by financial services firms, government agencies, retail companies, or whoever has caught the eye of the attacker this week.
An operator of a high-profile web property can take steps to defend their front door. The first step, of course, should be to find that front door and understand what infrastructure it relies on. Then they can evaluate their defenses.


Capacity

The first line of defense is always capacity. Without enough bandwidth at the front of your defenses, nothing else matters. Capacity needs to be measurable both in raw bandwidth and in packets per second, because hardware often has much lower bandwidth capacity as packet sizes shrink. Unfortunately, robust capacity is now measured in the 300+ gigabit-per-second range, well beyond the resources of the average datacenter. However, attacks in the 3-10 gigabit-per-second range are still common, and well within the range of existing datacenter defenses.


Filtering

For systems that aren’t DNS servers themselves, filtering out DNS traffic as far upstream as possible is a good solution, but certainly at a border firewall. One caveat – web servers often need to make DNS queries themselves, so ensure that they have a path to do so. In general, the principal of “filter out the unexpected” is a good filtering strategy.


DNS server protection

Since DNS servers have to process incoming requests (an authoritative name server has to respond to all of the recursive resolvers around the Internet, for instance), merely filtering DNS traffic upstream isn’t an option. So what is perceived as a network problem by non-DNS servers becomes an application problem for the DNS server. Defenses may no longer be simple “block this” strategies; rather, defense can take advantage of application tools to provide different defenses.


Redundancy

The total number of authoritative DNS server IP addresses for a given domain is limited (13 can fit into the 512-byte DNS response packet; generally, 8 is a reasonable number), yet many systems use nowhere near the limit. Servers should be diversified, located in multiple networks and geographies, ensuring that attacks against two name servers aren't traveling across the same links.


Anycast

Since requests come in via UDP, anycasting (the practice of having servers responding on the same IP address from multiple locations on the internet) is quite practical. Done at small scale (two to five locations), this can provide significant increases in capacity, as well as resilience to localized physical outages. However, DNS also lends itself to architectures with hundreds of name server locations sprinkled throughout the internet, each localized to only provide service to a small region of the Internet (possibly even to a single network). Adversaries outside these localities have no ability to target the sprinkled name servers, which continue to provide high quality support to nearby end users. 


Segregation

Based on Akamai's experience running popular authoritative name servers, 95% of all DNS traffic originates from under a million popular name server IP addresses (reaching 99% requires just under 2 million IP addresses). Given that the total IPv4 address space is around 4.3 billion addresses, name servers can be segregated: a smaller pool to handle the "unpopular" name servers, and a larger pool to handle the popular ones. Attacks that reflect off unpopular open resolvers thus don't consume the application resources providing quality of service to the popular name servers.


Response handling

Authoritative name servers should primarily see requests, not responses. Therefore, they should be able to isolate, process, and discard response packets quickly, minimizing impact to resources engaged in replying to requests. This isolation can also apply to less frequent types of request, such that when a server is under attack, it can devote resources to requests that are more likely to provide value.


Rate Limiting

Traffic from any name server should be monitored to see if it exceeds reasonable thresholds, and, if so, aggressively managed. If a requesting name server typically sends a few queries per minute, an authoritative server can decline to answer most queries from one that is suddenly requesting dozens of times per second (these thresholds can and should be dynamic). This works because of the built-in fault tolerance of DNS: if a requesting name server doesn't see a quick response, it will send another request, often to a different authoritative name server (and deprioritize the failed name server for future requests).
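A sliding-window limiter of the kind described might look like this sketch (the threshold and window are illustrative, and as noted above they should really be dynamic):

```python
import time
from collections import defaultdict, deque

class SourceRateLimiter:
    """Answer a source only while it stays under max_q queries per
    window_s seconds; refuse above that and let DNS retry elsewhere."""
    def __init__(self, max_q=30, window_s=1.0):
        self.max_q = max_q
        self.window_s = window_s
        self.history = defaultdict(deque)  # src -> recent query timestamps

    def allow(self, src, now=None):
        now = time.monotonic() if now is None else now
        q = self.history[src]
        while q and now - q[0] > self.window_s:
            q.popleft()  # forget queries older than the window
        if len(q) >= self.max_q:
            return False
        q.append(now)
        return True
```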

As attacks grow past the current few hundred gigabits per second toward terabit-per-second attacks, robust architectures will be increasingly necessary to maintain a presence on the Internet in the face of adversarial action.

 