January 22, 2019

By: Marcus Friman, Co-founder and Head of Engineering, Netrounds 

Top Networking Challenges

In a study by EMA Research surveying 150 enterprises in North America, the following three networking challenges ranked among the top eight:

  1. Lack of end-to-end, multi-site network visibility and troubleshooting capabilities
  2. Network latency between internal and external cloud resources
  3. Lack of monitoring and troubleshooting of interconnections between public and private cloud environments

As three of the eight most prevalent challenges, these can be assumed to have a significant negative effect on organizations. Fortunately, they can be brought under control by fairly simple means. How, you ask? Keep reading.

Challenge #1: Lack of End-to-End, Multi-site Network Visibility and Troubleshooting Capabilities

Looking at the data plane of the end-to-end connection, there are at least five separate segments as illustrated above that can cause a sluggish user experience. 

So the question is now: How do we detect and find the issues that cause problems for end-users?

A classical approach to service assurance has been to focus on infrastructure health and to collect syslogs and interface statistics from fault management systems. However, as the network becomes more and more virtualized, this becomes more complex. While you might be able to collect metrics from virtual machines, the dynamic nature of these environments increases the number of components to collect data from, and you do not control the infrastructure in the same way. Together, these factors make it even more difficult than before to correlate an end-user complaint with potential problems in your IT environment. The fact is that these device metrics have very little correlation with the real end-user experience. Still, there are many initiatives in the industry that aim to map these device metrics to service quality.

Instead of reverse engineering end-to-end service quality from device metrics, you can now use software to actively measure high-grade metrics, across all domains that are relevant to your organization. This is done by sending real active traffic on the data plane like an end user, from layer 2 to layer 7, to directly measure the metrics that matter. With the widespread availability of computing resources, from the branch office to the cloud, it is now possible to deploy software for the active traffic generation in a way that has not been practically feasible before.
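As a rough illustration of what an active measurement can look like in code, the sketch below times TCP handshakes against a target, much as a client application would experience them. It is not Netrounds code: the function name `measure_tcp_connect`, the sample count and the timeout are assumptions chosen for this example.

```python
import socket
import statistics
import time


def measure_tcp_connect(host, port, samples=5, timeout=2.0):
    """Time `samples` TCP handshakes to host:port and summarize them.

    Returns a dict with the number of failed attempts and, when at least
    one attempt succeeded, the min/median/max connect time in milliseconds.
    """
    times_ms = []
    failures = 0
    for _ in range(samples):
        start = time.perf_counter()
        try:
            # A full TCP handshake crosses the same data plane a user would.
            with socket.create_connection((host, port), timeout=timeout):
                times_ms.append((time.perf_counter() - start) * 1000.0)
        except OSError:
            # Refused, unreachable and timed-out attempts all count as failures.
            failures += 1
    result = {"sent": samples, "failed": failures}
    if times_ms:
        result.update(
            min_ms=min(times_ms),
            median_ms=statistics.median(times_ms),
            max_ms=max(times_ms),
        )
    return result
```

Running the same function from a branch office and from a data center against the same service gives two directly comparable views of what users at each site experience, without relying on any device counters.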

You can now shift focus from the health of the devices in your network to the metrics that matter for your end users, and to ensuring their satisfaction. The payoff from making this shift is high.

Read our whitepaper "SLA Exposé: Are Users Right in Complaining About Their Networks?" for more information about the effects of MTTR, drawn from a study in which active test agents and measurements were deployed in a number of corporate networks.

Challenge #2: Network Latency between Internal and External Cloud Resources

Cloud computing has many potential benefits for every business and in every industry. The hybrid cloud solution has become mainstream for enterprises as advancements in cloud technology have kept pace with the ever-increasing mobility of end-users who are re-shaping enterprises and creating a hybrid IT.

Some of the likely reasons why this is a leading challenge:

  • The distance between the private and public cloud resources might be large, affecting latency.
  • There might be several different providers involved, where peering agreements and routing may affect the delay, and there may also be places along the way that might introduce queue buildups in the network.
  • Networking is critical to delivering an uninterrupted and fault-free end-to-end user experience and there are many choke points here: from the distributed virtual appliances at the customer site to the overlay SD-WAN and its physical underlay networks comprising a mix of Internet, 5G and MPLS VPNs. There is additionally a lot of plumbing inside the different clouds and between their availability zones; add to this the service chains' performance that will depend on configurations and workload placements.

Thankfully, there are solutions that offer enterprises the end-to-end insight needed to succeed in the cloud. As an active testing and monitoring solution, Netrounds can provide visibility into the latency between your sites. Thanks to Netrounds' reporting capabilities and path analysis, you will be able not only to quickly inform your service provider(s) that there is a problem, but also to provide insight into where the problem is occurring. This makes it possible to significantly reduce Mean Time To Resolution (MTTR), and thus to decrease both frustration and inefficiency in your organization.
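To make the idea of locating a problem along a path more concrete, here is a minimal sketch that takes cumulative round-trip times per hop (as a traceroute-style probe might report them) and picks out the segment that adds the most latency. This is a generic illustration, not a description of how Netrounds' path analysis works; the function name `worst_segment`, the hop names and the numbers are all hypothetical.

```python
def worst_segment(hops):
    """Return the path segment that adds the most latency.

    `hops` is an ordered list of (name, cumulative_rtt_ms) pairs as measured
    from the source. The result is (from_hop, to_hop, added_ms).
    """
    if len(hops) < 2:
        raise ValueError("need at least two hops to form a segment")
    worst = None
    for (prev_name, prev_rtt), (name, rtt) in zip(hops, hops[1:]):
        added = rtt - prev_rtt  # latency contributed by this segment
        if worst is None or added > worst[2]:
            worst = (prev_name, name, added)
    return worst


# Hypothetical measurements from a branch office towards a cloud-hosted VM.
path = [
    ("branch-office", 0.0),
    ("isp-edge", 4.0),
    ("peering-point", 9.0),
    ("cloud-gateway", 61.0),
    ("app-vm", 63.0),
]
print(worst_segment(path))  # ('peering-point', 'cloud-gateway', 52.0)
```

In practice per-hop readings are noisy, so a real tool would compare many samples over time rather than a single snapshot. Even this simple comparison shows the kind of evidence that helps a service provider start looking in the right place.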

Challenge #3: Lack of Monitoring and Troubleshooting of Interconnections between Public and Private Cloud Environments

The more parts of an infrastructure that have transitioned to multi-cloud and hybrid-cloud environments and implemented service chains, the more difficult it becomes to monitor the network in detail using only passive methods. In a recent poll conducted with Light Reading, Netrounds asked individuals from service providers and large enterprises: Where in the network evolution are you currently? The poll revealed that 26.5% had classical networking, 67.3% had mostly classical networking with some virtual network appliances (IaaS), and 6.1% had Network as a Service (NaaS).

Regardless of which stage of network evolution you are at, you will ultimately need to achieve true visibility at all layers, instead of trying to guess based on passively collected data. This becomes all the more important the more black-box networking you have in your infrastructure. Actively measuring the network at all layers is essentially the only way to be certain that the service is working with the desired quality, end to end.

End-to-End Service Quality Monitoring Solution: System Architecture

In order to get a clear picture of how the network behaves, it’s important to have a centralized overview spanning all data centers, offices and links in a single view. Netrounds implements this using a centralized control center where all interactions take place, whether from orchestration systems executing NETCONF or REST calls, or from users manually analyzing results using the web GUI tools.

Test Agents are then deployed in each location where services are hosted or accessed by users, so basically in each data center and relevant branch office. The central control center instructs the Test Agents to send active traffic between each other that is mapped to the corresponding QoS, firewall and routing policies that regular services fall under.
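The agent-to-agent traffic described above amounts to a mesh of measurements. The sketch below enumerates such a mesh: every directed pair of agents, once per traffic class. It is only an illustration of the bookkeeping, not the control center's actual data model; the function `full_mesh`, the agent names, and the use of DSCP values to stand in for QoS policies are assumptions.

```python
from itertools import permutations


def full_mesh(agents, dscp_classes):
    """List every directed agent pair once per traffic class."""
    return [
        {"source": src, "destination": dst, "dscp": dscp}
        for src, dst in permutations(agents, 2)
        for dscp in dscp_classes
    ]


# Hypothetical agent locations; DSCP 0 is best effort, 46 is Expedited Forwarding.
agents = ["dc-east", "dc-west", "branch-1"]
plan = full_mesh(agents, dscp_classes=[0, 46])
print(len(plan))  # 3 agents -> 6 directed pairs, times 2 classes = 12
```

Because the number of directed pairs grows as n × (n − 1), larger deployments would typically test branch offices against data centers only, rather than every site against every other site.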

The Test Agent may also perform external service requests such as accessing video streams, doing HTTP requests, performing DNS lookups and even doing SIP VoIP calls.

For locations where it is not feasible to deploy a Test Agent, it’s possible to measure network performance using reflection technologies such as the Two-Way Active Measurement Protocol (TWAMP).
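The arithmetic behind a reflection-based measurement is simple enough to sketch. A sender timestamps a test packet when it leaves (t1) and when the reply returns (t4), while the reflector reports when it received the packet (t2) and when it sent the reply (t3). The example below computes the two-way delay from those four values; the class and function names are illustrative, and this is a simplified model of the calculation rather than a TWAMP implementation.

```python
from dataclasses import dataclass


@dataclass
class ReflectedProbe:
    t1: float  # sender transmits the test packet (sender clock, seconds)
    t2: float  # reflector receives it (reflector clock, seconds)
    t3: float  # reflector transmits the reply (reflector clock, seconds)
    t4: float  # sender receives the reply (sender clock, seconds)


def round_trip_ms(p):
    """Two-way delay in ms, excluding time spent inside the reflector."""
    return ((p.t4 - p.t1) - (p.t3 - p.t2)) * 1000.0


# The reflector's clock is 5 s ahead of the sender's in this made-up sample.
probe = ReflectedProbe(t1=10.000, t2=15.020, t3=15.021, t4=10.041)
print(round(round_trip_ms(probe), 3))  # 40.0
```

Each subtraction uses timestamps from a single clock, so the offset between sender and reflector cancels out. This is why a round-trip figure can be obtained from a remote device without synchronized clocks, whereas one-way delay cannot.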

In summary you could say that each Test Agent acts as a client on the network, like a normal user, and tests the end-to-end network performance and availability using real traffic, traversing your network and accessing real services. If the Test Agent identifies an issue it is very likely that the issue is also affecting regular users, regardless of the underlying root cause of the issue.

Final words

Many enterprises are eager to reap the benefits of a multicloud strategy, but at the same time realize that there are challenges to overcome. These challenges should be taken seriously, since there is a risk of networking issues resulting in frustrated users, lost productivity and high operational costs that will hinder the implementation of the strategy.

Three of the most prevalent challenges can be quite easily met by introducing an active testing and monitoring solution to measure the metrics that matter for end-users in the data plane between different sites.


Are you interested in learning more? Watch our webinar "A New Level of Visibility in Multi Cloud Environments and Service Chains."

For a more technical deep dive on the topic, check out Netrounds at #NFD20