Apr 08 2020

How Agencies Can Best Cope with Network Strain

Agencies need to properly configure VPNs and may need to buy more network bandwidth to support collaboration tools.

As federal agencies embrace telework solutions on a massive scale, the increase in traffic is placing heavy and unusual stress on federal networks and IT systems.

For example, the Energy Department’s Office of Environmental Management found that it could not have all of its users teleworking simultaneously, according to Melody Bell, associate deputy assistant secretary for resource management. 

“We are considering giving people flexible work hours, so not everybody is on the system at the same time,” Bell said during an ACT-IAC webinar last month, according to FedScoop. “Right now, we are having a critical challenge with bandwidth and everybody taxing the system at the same time, so we were just talking about having people adjust their hours and that we limit the number of people on the Citrix system at the same time.”

Meanwhile, the Transportation Department sent out a memorandum to employees in early March explaining its VPN could handle only 20,000 concurrent users, while its virtual desktop infrastructure could accommodate an additional 12,000. However, a department spokesperson later told Nextgov that by mid-March, the department could support a 100 percent remote workforce “due to interagency coordination and support from vendor partners.”

As agencies continue telework operations for the foreseeable future, handling network strain will become imperative to enabling users to work productively and achieve their missions. 

How to Approach Network Capacity Constraints

There are two issues to consider, says Joel Snyder, a senior partner at Opus One, a Tucson, Ariz.-based IT security consulting firm (and a FedTech contributor). One is capacity to agency data centers, which is the traffic IT administrators likely have running over a VPN. If agencies still have data center-based applications, for example, then they need to backhaul application traffic to the data center, Snyder notes. The second is ensuring that agencies have enough network capacity to handle apps such as videoconferencing.

Most organizations are not running videoconferencing over VPNs, unless they are an office with a very small conferencing system that is used infrequently. “Most enterprises are using cloud-based conferencing solutions, and those that don’t are not using VPNs in any case,” Snyder says. 

If agencies are employing a lot of collaboration and videoconferencing right now, they need to be focused on both sides of the coin, says Snyder, “which means giving end users the tools they need to understand their performance and then upgrading if needed at the home side. On the data center side, it’s more of a simpler capacity planning exercise.”

Ensuring Network Capacity and VPNs Are Configured Properly

Network capacity for any agency should be far in excess of internet/WAN connectivity, according to Snyder, “so an upgrade just means calling the internet service provider and having them bump up the speed.” 

However, IT leaders need to be careful, since certain speed increases, such as those of more than 1 gigabit per second, are likely to incur charges for the agency. If agencies do increase their bandwidth, they should ensure that no old 100-megabit-per-second switches remain at the edge, where they would cap throughput regardless of the upgrade. A capacity bump is the easy part, Snyder says, and agencies also need to bear in mind how security appliances affect network traffic.

“You also need to take a look at all the middleboxes, such as firewalls and IPS devices, that may be sitting between end users and the resources in the data center,” Snyder says. “Check specifications with the vendor to be sure your firewall can handle it. If you are on the edge, then look at changing your security policy — such as reducing some types of UTM, such as virus scanning for inbound ‘trusted’ traffic — to gain greater capacity while you wait for a firewall upgrade.” 
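Snyder's point about middleboxes can be sketched as a simple calculation: the usable throughput of a path is set by its slowest hop, not by the internet circuit alone. The hop names and figures below are hypothetical placeholders for illustration; real values would come from vendor specifications and the ISP contract.

```python
def path_bottleneck(hops):
    """Return (name, mbps) of the slowest hop on a path.

    `hops` maps a hop name to its rated throughput in Mbps.
    """
    name = min(hops, key=hops.get)
    return name, hops[name]


# Hypothetical figures for illustration only.
example_path = {
    "isp_link": 2000,          # upgraded internet circuit
    "edge_switch": 100,        # an old 100 Mbps switch left at the edge
    "firewall_with_utm": 800,  # firewall throughput with scanning enabled
    "vpn_concentrator": 1000,
}

if __name__ == "__main__":
    print(path_bottleneck(example_path))
```

With these made-up numbers, the old edge switch limits the whole path to 100 Mbps, no matter how fast the upgraded circuit is.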

Agencies may also run into VPN capacity issues, “most often in the form of licenses but also in device capacity,” Snyder says. Such issues “will hit you hard and suddenly, so do a mini-audit to be sure you have licenses and device capacity for your expected increase in use.”
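The mini-audit Snyder describes could look something like the following sketch, which compares expected concurrent users (plus some headroom) against both license counts and device limits. The class name, the 20 percent headroom and all figures are assumptions made for illustration, not values from any agency or vendor.

```python
from dataclasses import dataclass


@dataclass
class VpnCapacity:
    licensed_sessions: int    # concurrent sessions the licenses allow
    device_max_sessions: int  # concurrent sessions the hardware is rated for


def audit_vpn(capacity, expected_users, headroom_pct=20):
    """Return needed sessions and any license or device shortfall (0 if none)."""
    # Integer ceiling of expected_users * (1 + headroom_pct / 100).
    needed = (expected_users * (100 + headroom_pct) + 99) // 100
    return {
        "needed_sessions": needed,
        "license_shortfall": max(0, needed - capacity.licensed_sessions),
        "device_shortfall": max(0, needed - capacity.device_max_sessions),
    }


if __name__ == "__main__":
    # Hypothetical example: 6,000 expected users against 5,000 licenses.
    print(audit_vpn(VpnCapacity(5000, 8000), expected_users=6000))
```

In this hypothetical case the hardware would cope, but the agency would be 2,200 licenses short, which is the kind of gap that otherwise shows up "hard and suddenly."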

“IT leaders need to review VPN configuration to ensure they are only VPN’ing the traffic they need to in an emergency situation,” Snyder says. “For example, if an agency was VPN’ing all traffic back to its data center for security reasons, even traffic going to the internet, then this can create capacity bottlenecks.” 

IT leaders who are not already using split-tunnel VPNs should consider them, to ensure that they leave capacity available and do not overwhelm their VPN concentrator or internet link with this “hairpin traffic,” according to Snyder.
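The idea behind split tunneling can be illustrated with a small routing decision: only destinations inside agency prefixes go through the VPN, and everything else goes straight to the internet. This is a conceptual sketch, not the configuration of any particular VPN product; the prefixes and the function name are hypothetical.

```python
import ipaddress

# Hypothetical internal prefixes that should be reached through the VPN.
TUNNELED_PREFIXES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.50.0/24"),
]


def route_for(destination, tunneled=TUNNELED_PREFIXES):
    """Return "vpn" for agency destinations, "direct" for everything else."""
    addr = ipaddress.ip_address(destination)
    return "vpn" if any(addr in net for net in tunneled) else "direct"


if __name__ == "__main__":
    for dest in ("10.1.2.3", "192.168.50.7", "203.0.113.10"):
        print(dest, "->", route_for(dest))
```

Under a full-tunnel policy, the third address (an internet destination) would also be sent to the data center and back out again, which is the hairpin traffic Snyder warns about.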

“Most VPN configurations are optimized for normal traffic patterns, but if you have been customizing your configuration, check settings such as MTU to be sure you are not creating unnecessary fragmentation into the picture,” Snyder adds. “An MTU of 512 is probably too small for modern networks, while one that is too large (above about 1350) can cause extra fragments or even connectivity problems.”

All of this depends on the specific VPN software an agency is using, Snyder notes, “so if you have any doubts, check with the technical support team to be sure you have an optimal MTU configured.” While that is a small issue, it can cause “big performance issues when the network is stressed,” Snyder says. 
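To see why both extremes hurt, the following rough sketch counts IPv4 fragments. It assumes a 20-byte IPv4 header with no options, a 1,500-byte path MTU, and a hypothetical 80 bytes of tunnel overhead; actual overhead depends on the VPN protocol and cipher, so treat the numbers as illustrative only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options


def ipv4_fragments(packet_len, mtu):
    """Number of IPv4 fragments needed to carry one packet over a link.

    packet_len and mtu are in bytes and include the 20-byte IPv4 header.
    """
    if packet_len <= mtu:
        return 1
    payload = packet_len - IPV4_HEADER
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return -(-payload // per_fragment)  # ceiling division


def tunneled_fragments(tunnel_mtu, overhead, path_mtu=1500):
    """Fragments on the outer path for a full-size packet inside the tunnel."""
    return ipv4_fragments(tunnel_mtu + overhead, path_mtu)


if __name__ == "__main__":
    print(ipv4_fragments(1500, 512))     # standard packet over a 512-byte MTU
    print(tunneled_fragments(1350, 80))  # fits inside the path MTU
    print(tunneled_fragments(1450, 80))  # encapsulated packet exceeds it
```

Under these assumptions, a 512-byte MTU splits a standard 1,500-byte packet into four pieces, a 1,350-byte tunnel MTU fits in one outer packet, and a 1,450-byte tunnel MTU spills into two.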

Staggering work shifts will help only in very particular situations, Snyder says, and he cautions against it. “If you are in a special niche where your application traffic is actually the biggest part of network performance, you can try this, but this is going to be a rare case,” he says.

How to Produce Optimal Performance for Collaboration

Agencies should not be running videoconferencing apps over a VPN, for efficiency reasons, according to Snyder. “If you were looking for a reason to move to the cloud, now you have it,” he says. 

Federal CIO Suzette Kent and her staff have been talking with ISPs and telecommunications companies about how to ensure users have enough bandwidth and how networks can be made more secure. 

“We started preparing for this a few weeks ago. Agencies did individual assessments of their capacity and took actions then to size it,” Kent tells Federal News Network. “Right now, over the last week and into this week, we see those investments in modernization, like moving to the cloud and the scaling that comes with it, prove the value and give us the results we wanted to see.” 

Agencies have been able to scale from the traffic volumes they would typically experience on a snow day in a region “to much larger scale volumes across the country,” Kent says. “We’ve done virtual private network testing, and vendors have been very responsive to scale up licenses and with technical tweaks that agencies needed.”

Internet capacity can be a bottleneck, but it is only one of many, Snyder says. Firewalls, VPN concentrators and other edge network equipment are just as likely to be the issue.

“Yes, there is more traffic happening during the day than previously, but the bigger capacity issues are with collaborative tools (such as conferencing systems) that are seeing very high increases in use and a lot of stress,” Snyder says. 

Buying more bandwidth is almost always an inexpensive and easy option, and it will help if agencies are seeing flatline peaks in their usage, where traffic plateaus at the link’s maximum.
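One way to check for flatline peaks before buying bandwidth is to measure how much of the time the link sits at or near its ceiling. The sketch below assumes periodic throughput samples in Mbps and a 95 percent saturation threshold; the function name, threshold and sample values are all illustrative.

```python
def flatline_fraction(samples_mbps, link_capacity_mbps, threshold=0.95):
    """Fraction of samples at or above `threshold` of link capacity."""
    if not samples_mbps:
        return 0.0
    limit = threshold * link_capacity_mbps
    saturated = sum(1 for sample in samples_mbps if sample >= limit)
    return saturated / len(samples_mbps)


if __name__ == "__main__":
    # Hypothetical samples from a 1,000 Mbps circuit during the workday.
    samples = [400, 960, 990, 1000, 985, 970, 500, 300]
    print(flatline_fraction(samples, link_capacity_mbps=1000))
```

A high fraction suggests the circuit itself is the constraint. A low one points toward the other bottlenecks discussed here, such as firewalls or VPN concentrators.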

However, Snyder says, “blindly throwing more bandwidth at the problem without knowing your usage may waste valuable time you should be spending looking at more likely bottlenecks, such as firewalls and even application design.”
