4 things your network operation team should know about your access networks
Over the last decade, businesses have come to rely on connected associates and customers to plan and conduct their business. At the same time, the popularity of cloud-based Internet of Things (IoT) devices that communicate machine-to-machine (M2M) has created a love-hate relationship with the IT operation team: many of these devices may be operating on the network without the team’s knowledge, posing potentially devastating security and operational risks. The switched network that ultimately connects all of these things has evolved and is becoming more complex from both a configuration and a security point of view. This white paper discusses the state of the switched network and the key attributes that the IT operation team should have visibility into. It also presents best practices that help the team maintain control of the switched network more efficiently, improve collaboration within the IT team, and keep connected people and things up and running.
Today’s switched networks
The role of the switched network has changed since it was first conceived in the early nineties. With more mobile users and more bring-your-own-device (BYOD) products, the switched network has taken on a more important role in connecting increasingly diverse end devices while maintaining the security of the corporate network. Let’s look at the four key functions of today’s switched networks that the network operation team must maintain control of, along with suggested best practices:
|Key Functions||Key attributes to manage|
|Connectivity||Offering Power over Ethernet, Duplex, Speed to the device and the Link Control mechanism to facilitate it.|
|Authentication & Addressing||Device and user-authentication mechanism, and the address and access provisioning service.|
|Routing||The switch and VLAN topology, and packet-routing services such as DNS, Gateway, NAT, that get IP packets from the client to its target.|
|Efficiency||Network path bandwidth, packet loss, delay and jitter characteristic that affects transmission efficiency and hence user experience.|
How a switch network works together to connect and provision access to connected devices
Power over Ethernet (PoE) has become a popular way to power end devices because it reduces the cost of deployment and maintenance. Many networking devices, such as access points, VoIP phones and, more recently, IoT devices, are powered almost exclusively by PoE. PoE power ratings are defined by the IEEE 802.3 standards, and devices are classified by their voltage and wattage. There are two kinds of PoE devices:
- Power Sourcing Equipment (PSE) provides power on the Ethernet cable. In new deployments, the PSE is usually the switch and is commonly referred to as an endspan. A PoE injector, called a midspan, can be placed between a non-PoE switch and the PoE-powered device as a retrofit. Based on the PoE standard that the PSE supports, it falls into a PoE Type, 1 – 4 (see Table 1). A PSE can operate in one of two modes: a Mode A PSE supplies power on pairs 1-2 and 3-6 of the four-pair UTP cable, and a Mode B PSE uses the spare pairs 4-5 and 7-8. It is important to note that the PSE determines the mode in which power is offered. The standard does not require a PSE to support both Mode A and Mode B.
- A Powered Device (PD) is powered by a PSE and thus consumes energy. An 802.3af- or 802.3at-compliant PD must be able to support both Mode A and Mode B. Based on the wattage that the PD draws, it falls into a PoE Class, 0 – 4 (see Table 2).
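The Mode A / Mode B relationship between PSE and PD can be sketched in a few lines of Python. This is only an illustration of the rule described above: the names `MODE_PAIRS` and `can_power`, and the "Mode A only" device, are hypothetical and are not part of any standard or vendor API.

```python
# Wire pairs (RJ-45 pin numbers) that carry power in each PSE mode,
# as described in the list above.
MODE_PAIRS = {
    "A": ((1, 2), (3, 6)),
    "B": ((4, 5), (7, 8)),  # the spare pairs
}


def can_power(pse_mode: str, pd_modes: set) -> bool:
    """Return True if the PD accepts power on the pairs the PSE chose."""
    if pse_mode not in MODE_PAIRS:
        raise ValueError(f"unknown PSE mode: {pse_mode!r}")
    # The PSE decides the mode; the PD has to accept whatever is offered.
    return pse_mode in pd_modes


compliant_pd = {"A", "B"}   # 802.3af/at PDs must accept both modes
mode_a_only_pd = {"A"}      # hypothetical non-compliant PD

print(can_power("B", compliant_pd))    # True
print(can_power("B", mode_a_only_pd))  # False
```

A PD that accepts only one mode works with some PSEs and fails with others, which is why compliance is worth checking before deployment.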
|POE Type||Common Name||Related standard||Pairs Used||Max. Power to PSE Port||Max. Power to PD|
|2||PoE+, PoE Plus||802.3at||2||30W||25.5W|
|3||4-pair PoE, PoE++, UPoE#||802.3bt*||4||60W||51W|
Table 1: PoE PSE Types
#: UPoE is a Cisco proprietary classification reference in their “Digital Ceiling” Solution.
*: 802.3bt is a proposed IEEE standard that is scheduled to be ratified in early 2018.
|PoE PD Class||Type/Standard||DC Voltage at PSE||DC Voltage at PD||Min. Power from PSE Port||Power used by PD|
|0||1 / 802.3af||44-57V||37-57V||15.4W||0.44 – 12.95W|
|1||1 / 802.3af||44-57V||37-57V||4.0W||0.44 – 3.84W|
|2||1 / 802.3af||44-57V||37-57V||7.0W||3.84 – 6.49W|
|3||1 / 802.3af||44-57V||37-57V||15.4W||6.49 – 12.95W|
|4||2 / 802.3at||50-57V||42.5–57V||30W||12.95 – 25.5W|
Table 2: PoE PD Classes
The 802.3 standard defines LLDP as the protocol a PD uses to tell the PSE which class it belongs to, so that the PSE can provision the right voltage and current. However, some PoE devices on the market predate the standard and use proprietary protocols, such as the Cisco Discovery Protocol (CDP). Not all PoE devices are fully standards-compliant, so compliance must be verified.
What could go wrong:
The challenge for the network operation team is that as more PDs of different classes are deployed on the network, both the power budget of the PSE and the interoperability between PDs and PSEs need to be understood and managed. In addition, not all PoE implementations are standards-compliant, and not every existing cabling system is capable of supporting PoE.
|Cannot get power||1. Cable fault:
3. PSE port not enabled to deliver PoE
4. PSE and/or PD are not fully standards-compliant (e.g. PD does not present the 25 kΩ signature resistance on the powered pairs or does not support both Mode A & B).
5. PSE does not have enough power budget to support powering up all the PDs connected to it.
|Intermittently drop off||1. Cable fault:
Best practices:
- Train your staff to understand how PoE works.
- Read the equipment specifications carefully and deploy only standards-compliant devices. Avoid non-compliant midspan PSEs, such as Ethernet Y-cables (already a no-no) or so-called “8-port PoE Passive Splitters” that simply bolt a 48VDC supply to all the “idle” pairs.
- Document the wattage of PSE and PDs.
- Check when changing and adding PDs to ensure that the PSE can support all PDs connected to it.
- Offer standardized tools and procedures for the team to validate the health of the PD, PSE, and cable during deployment and troubleshooting (e.g. verify that the voltage and wattage available at the PD side meet requirements).
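The power-budget check recommended above is simple arithmetic and can be sketched as follows. The per-class wattages are the minimum PSE port power figures of IEEE 802.3af/at; the 370 W budget, the device mix and the function name `check_power_budget` are assumptions made up for this illustration.

```python
# Minimum power (watts) the PSE must be able to deliver per port for each
# PD class, per IEEE 802.3af/at.
PSE_WATTS_BY_CLASS = {0: 15.4, 1: 4.0, 2: 7.0, 3: 15.4, 4: 30.0}


def check_power_budget(pse_budget_w, pd_classes):
    """Return (required_w, headroom_w, ok) for the PDs attached to one PSE."""
    required = sum(PSE_WATTS_BY_CLASS[c] for c in pd_classes)
    headroom = pse_budget_w - required
    return round(required, 1), round(headroom, 1), headroom >= 0


# Hypothetical switch with a 370 W PoE budget and 24 attached PDs:
# 8 Class 4 access points, 10 Class 3 phones, 6 Class 2 sensors.
attached = [4] * 8 + [3] * 10 + [2] * 6
required_w, headroom_w, ok = check_power_budget(370.0, attached)
print(f"required={required_w} W, headroom={headroom_w} W, ok={ok}")
```

Running the same check whenever a PD is added or replaced shows, before the change is made, whether the switch can still power every port at once.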
The other consideration for connected devices is the linking process between the device and the network. First, the cable between the end device and the switch must be able to support the link. Most structured cabling systems today require all four pairs to be connected and certified, with a length under 100 m, during deployment. That is sufficient to support all networks up to 1Gbps. The table below shows the minimum cable standard required for each type of deployment. During an upgrade, it is important to re-certify the cabling system so that wear and tear or undocumented changes do not cause issues.
|Standard||Certification level||Pairs used|
|10BASE-T||Cat3||1-2 & 3-6|
|100BASE-T||Cat5||1-2 & 3-6|
|1000BASE-T||Cat5||1-2, 3-6, 4-5, 7-8|
The link is negotiated between the end device and the switch to establish the speed, duplex and cable pairs used for data communication. This has become less of an issue because auto-negotiation is now the default setting on switch ports and Network Interface Cards (NICs), so interoperability is usually maintained and well understood. The following table shows the hit-and-miss outcome when either the NIC or the switch is manually set to a specific speed and/or duplex. The rule of thumb is that if one side is forced, the other side needs to be forced to the same setting; when one side is auto-negotiating, the other side should be as well. Even when a link is established with one side set to auto and the other fixed, it is highly possible that the auto side will periodically try to re-negotiate, causing temporary loss of link. As the price of 10G-capable switches drops, more switches with 1/10G ports are being deployed. In most cases, a 1/10G switch port does not support half duplex or auto-negotiation, so the switch port and the NIC settings must match for a connection to be established.
Link result for 10/100/1000Mbps Ethernet Switch and NIC based on link settings
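The rule of thumb for 10/100/1000Mbps ports can be written down as a small decision function. This is a sketch of the rule as stated in the text, not a model of the full IEEE auto-negotiation state machine; `PortSetting` and `expected_link` are hypothetical names.

```python
from typing import NamedTuple, Optional


class PortSetting(NamedTuple):
    auto: bool
    speed: Optional[int] = None   # Mbps, only meaningful when auto is False
    duplex: Optional[str] = None  # "full" or "half", only when auto is False


def expected_link(nic: PortSetting, switch: PortSetting) -> str:
    """Apply the rule of thumb: both sides auto, or both forced identically."""
    if nic.auto and switch.auto:
        return "ok: both sides negotiate the best common speed/duplex"
    if not nic.auto and not switch.auto:
        if (nic.speed, nic.duplex) == (switch.speed, switch.duplex):
            return "ok: both sides forced to the same setting"
        return "fail: forced settings do not match"
    # One side auto, the other forced: the hit-and-miss case.
    return "risky: one side auto, one forced; expect mismatch or link flaps"


print(expected_link(PortSetting(True), PortSetting(True)))
print(expected_link(PortSetting(False, 100, "full"), PortSetting(True)))
```

The "risky" outcome is the one that tends to surface later as intermittent reconnects or a lower-than-expected speed/duplex.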
What could go wrong:
|Cannot link (no link light)||Cabling fault
- Open, short on transmit pair
Wrong fiber SFP used: singlemode vs multimode
Mismatch in link setting between switch and NIC
|Less than optimal link speed/duplex & Intermittent reconnect||Either NIC or switch was set as auto-negotiate while the other was set as fixed rate for 10/100/1000Mbps link
- split pairs
Best practices:
- Always use auto-negotiation for NICs and switch ports on 10/100/1000Mbps ports. On a 1/10G switch port, hard-code the port to the speed needed.
- Document switch port settings and structured-cabling path used, and more importantly make the information easily accessible to all team members.
- Offer an easy way to check the current switch port link configuration, either directly using LLDP or via a management system. The best way is to have a passive tool that can be connected inline between the NIC and the switch to observe the link capabilities offered and the speed/duplex that the pair settled on.
Switch Port can be tested for PoE against a PoE Type/Class and display the TruePower™ available to the PD Class.
Link test shows the link capability of the switch port.
Inline analysis showing the speed/duplex advertised and used between the switch and the device.
Before a device can start to communicate with other devices on the network, it needs to go through an authentication process that serves three purposes: security, address provisioning and access provisioning. Authentication allows an authorized device to access the network and prevents a rogue device from connecting to it. Three elements take part in the process:
- Supplicant: the element that wants to be able to access the network, such as the security camera.
- Authenticator: the element through which the supplicant may access the network, such as the switch or the Wi-Fi access point.
- Authentication server: holds the information used to decide whether a supplicant may access the network resources. It is typically a server running the RADIUS protocol.
The authentication mechanism can be based on the MAC address of the device, a user account (such as a guest password to the guest SSID for BYOD), or a private certificate programmed on the smart card of a security camera. The example below shows how a fixed-location security camera gets authenticated. In this case, the EAP protocol is used for added security, as is commonly seen in Wi-Fi end-device authentication.
In the above example, the authenticator serves as a proxy that relays the authentication request to the authentication server. Only after the device is authenticated can it send a DHCP request to the local DHCP server to obtain an IP address. It is important to note that authentication and the pool of IP addresses allocated need to be in sync. Take Wi-Fi user authentication as an example:
After the guest BYOD device is authenticated to the network via the Guest SSID, the AP is set up to send its traffic onto VLAN 1, while corporate users connected to the Company SSID are sent to VLAN 101. These VLANs need to be set up on the switch for the WLAN networks, and each VLAN needs to reach a local DHCP server via the layer 2 broadcast mechanism so that an IP address can be provided to the device. In some cases, a DHCP protocol bridge, such as the Wi-Fi controller, can be used to forward DHCP requests from clients on different VLANs to a single DHCP server. Typically, the IP address pools for the VLANs are mutually exclusive, as shown below, so that clients that belong to different groups can access their distinct sets of network assets:
|User Group||SSID||VLAN||IP Address Pool||Accessible assets|
|Guests||Guests||1||10.10.10.1-10.10.11.255||Limited Internet Bandwidth, Guests Printer|
|Corporate Users||Company||101||18.104.22.168-22.214.171.124||Internet, Corporate VPN, Corporate Servers, Printers…|
|Security Camera||–||201||126.96.36.199-188.8.131.52||Video Servers and Storage|
|Network Admin||NetAdmin||301||184.108.40.206-220.127.116.11||Wi-Fi controller, Switch/Router management ports|
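Whether the address pools really are mutually exclusive can be checked mechanically from the network documentation. The sketch below uses Python's standard `ipaddress` module; the guest range is taken from the table above, while the other ranges and the function name `overlapping_pools` are hypothetical values chosen only for illustration.

```python
from ipaddress import IPv4Address
from itertools import combinations


def overlapping_pools(pools):
    """Return every pair of pool names whose address ranges overlap."""
    ranges = {
        name: (int(IPv4Address(first)), int(IPv4Address(last)))
        for name, (first, last) in pools.items()
    }
    return [
        (a, b)
        for a, b in combinations(sorted(ranges), 2)
        if ranges[a][0] <= ranges[b][1] and ranges[b][0] <= ranges[a][1]
    ]


pools = {
    "Guests (VLAN 1)": ("10.10.10.1", "10.10.11.255"),         # from the table
    "Corporate (VLAN 101)": ("10.10.101.1", "10.10.101.254"),  # hypothetical
    "Cameras (VLAN 201)": ("10.10.201.1", "10.10.201.254"),    # hypothetical
}
print(overlapping_pools(pools))  # an empty list means no pools collide
```

Re-running the check after every DHCP scope change flags a collision before clients in different groups start receiving addresses from the same range.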
In addition to assigning an IP address to the device, DHCP can offer other key information to the end device that is critical to its operation. For example, a VoIP phone receives, via DHCP option code 66 (TFTP server) or 150 (VoIP Configuration Server), the IP address of the configuration server that holds the call manager address and the SIP port number to use.
The table below shows commonly used DHCP options and their codes:
|DHCP Option Code||Description|
|1||Subnet mask (must be sent before the router option, option 3, if both are included)|
|6||DNS servers, should be listed in order of preference|
|15||DNS domain name|
|44||WINS server (NetBIOS name server)|
|45||NetBIOS datagram distribution server (NBDD)|
|46||WINS/NetBIOS node type|
|47||NetBIOS scope ID|
|66||TFTP Server Name (RFC2132) or in name field (RFC2131)|
|150||IP address(es) of VoIP Configuration Server(s) [has precedence over option 66 (RFC5859)]|
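On the wire, each DHCP option is a code byte, a length byte and then the value, with code 0 used as padding and code 255 marking the end of the list. The sketch below decodes such an options field by hand to show where the codes in the table appear; the sample bytes and addresses are made up for illustration, and a real capture would first need the fixed DHCP header and magic cookie stripped.

```python
import ipaddress


def parse_dhcp_options(raw: bytes) -> dict:
    """Decode a DHCP options field (code, length, value) into {code: value}."""
    options = {}
    i = 0
    while i < len(raw):
        code = raw[i]
        if code == 255:   # End option
            break
        if code == 0:     # Pad option is a single byte with no length
            i += 1
            continue
        length = raw[i + 1]
        options[code] = raw[i + 2 : i + 2 + length]
        i += 2 + length
    return options


def as_ip_list(data: bytes) -> list:
    """Interpret an option value as one or more packed IPv4 addresses."""
    return [str(ipaddress.IPv4Address(data[j : j + 4])) for j in range(0, len(data), 4)]


# Hypothetical options field: subnet mask, two DNS servers, one option-150 server.
sample = bytes([
    1, 4, 255, 255, 254, 0,
    6, 8, 10, 10, 10, 2, 10, 10, 10, 3,
    150, 4, 10, 10, 10, 20,
    255,
])
for code, value in parse_dhcp_options(sample).items():
    print(code, as_ip_list(value))
```

Because the parser skips ahead by the length byte, the 255 bytes inside the subnet mask value are not mistaken for the End option.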
What could go wrong:
|Cannot get IP address||Authentication issue: wrong end-device setting (authentication protocol, wrong certificate); no end-device configuration on the authentication server. No IP address available: not enough addresses in the IP address pool; DHCP server not reachable from the VLAN|
|Wrong IP address||Network issue: wrong VLAN assigned. Wrong or rogue DHCP server offers the IP address when multiple DHCP servers are present|
Best practices:
- Document the VLAN configuration on switches, switch-to-switch uplink ports, VLAN to user group/address correlation, and DHCP provisioned for each VLAN/broadcast domain.
- Make the documentation accessible to team members responsible for setting up and troubleshooting the switch networks.
- Have standardized test workflow and procedures that allow any team member to verify the switch configuration from the client location to the proper VLAN and address.
- Have tools that can offer visibility to all DHCP responses from the network to detect rogue DHCP servers, the IP address and options provided to the client by user credential.
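Once all DHCP responses seen on a port are visible, spotting a rogue server is a matter of comparing each responder with the documented list. A minimal sketch, assuming the offers have already been captured by some tool and reduced to plain dictionaries (the field names, the addresses and `find_rogue_offers` are hypothetical):

```python
def find_rogue_offers(offers, authorized_servers):
    """Return the DHCP offers that came from servers not on the approved list."""
    return [o for o in offers if o["server_id"] not in authorized_servers]


# Hypothetical result of listening for DHCP offers on one switch port.
observed = [
    {"server_id": "10.10.10.2", "offered_ip": "10.10.10.57", "vlan": 1},
    {"server_id": "192.168.1.1", "offered_ip": "192.168.1.23", "vlan": 1},
]
authorized = {"10.10.10.2"}

for offer in find_rogue_offers(observed, authorized):
    print("Unexpected DHCP server:", offer["server_id"], "offered", offer["offered_ip"])
```

Keeping the list of authorized servers per VLAN in the shared documentation lets any team member run the same comparison.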
OneTouch AT supports 802.1x with EAP to emulate that of the user.
Determine if multiple DHCP responses are received and what were the parameters offered.
Gain visibility of VLANs configured on switch ports, and other status, such as utilization and # of devices connected.
Once an end device obtains an IP address and key configuration information, it can communicate with other devices on the network. Routing is the fundamental mechanism that the network uses to connect IP-based devices across privately and publicly owned networks. Several fundamental services on the network facilitate this function, and the network operation team should be aware of them:
|VLAN||Virtual LAN is a Layer 2 mechanism to allow switches to group end-devices and switch ports into a broadcast domain.|
|Router||A router is a device that joins networks together and routes traffic between them. A router will have at least two network interface cards (NICs), one physically connected to one network and the other physically connected to another network. Some routers can be configured to allow traffic only on certain well-known ports. Applications that run on protocols with special ports will require a configuration change to open those ports.|
|DNS||A domain name server, also called a DNS server or name server, manages a massive database that maps domain names to IP addresses. When you enter a URL into your web browser, the default DNS server uses its resources to resolve the name into the IP address for the appropriate web server.|
|NAT||Network Address Translation allows a single device, such as a router, to act as an agent between the internet (or "public network") and a local (or "private") network. This means that only a single or a few recognized IP addresses are required to represent an entire group of devices with un-recognized IP addresses.|
The way these services work together can be summarized in the following examples:
Client communicating to a server in the intranet
- Client knows the name of the server
- Client sends a query to the default DNS IP (parameter from DHCP)
a. If the default DNS is not on the same IP subnet, the query is sent to the default router on the client’s VLAN
b. Router forwards the query to the router port connected to the DNS IP subnet, and the VLAN ID will most likely change
- DNS server replies with the IP address of the intranet server, via the router if needed
- Client sends a connection request to the intranet server’s IP address, again via the router if needed
- Router forwards the request to the router port connected to the intranet subnet, and the VLAN ID will change
Client communicating to a server on the internet
- The first four steps are the same as when communicating to an intranet server, except that the server name may be a website entered in a web browser
- Router forwards the IP packet to the router port connected to the internet link
- If NAT is used, the NAT will change the source address of the client to a recognizable public address before forwarding to the internet link
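The "same subnet or via the router" decision that recurs in these steps can be illustrated with Python's standard `ipaddress` module. This is a sketch of the decision only, not of a real IP stack; the addresses and the function name `next_hop` are hypothetical.

```python
from ipaddress import ip_address, ip_interface


def next_hop(client_if: str, default_router: str, destination: str) -> str:
    """Return the address the client sends the packet to at layer 2."""
    iface = ip_interface(client_if)   # client address with prefix length
    dest = ip_address(destination)
    if dest in iface.network:
        return str(dest)              # same subnet: deliver directly
    return str(ip_address(default_router))  # different subnet: use the router


client = "10.10.10.57/23"   # hypothetical guest client
router = "10.10.10.1"       # hypothetical default router from DHCP

print(next_hop(client, router, "10.10.11.20"))   # DNS on the same subnet
print(next_hop(client, router, "10.10.101.5"))   # intranet server elsewhere
```

If the default router or subnet mask handed out by DHCP is wrong, this decision goes wrong too, which is why a client may reach local devices but nothing beyond its own subnet.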
What could go wrong:
|All users on same VLAN cannot connect to intranet server||Incorrect IP address or default DNS failed; router not reachable or failed; broken or oversubscribed VLAN trunk path|
|All users on same VLAN cannot connect to the internet||Router port or link to internet down; router not reachable or failed; DNS not reachable or failed|
|Some applications cannot run||Router may have blocked the protocol port that the application requires|
|VoIP call does not work||Call Manager not reachable; DHCP VoIP configuration server information not available or misconfigured|
Best practices:
- During installation, have standardized tools and procedures so that technicians can sample check accessibility and path to the local router and critical servers, intranet and internet, from a VLAN edge using each credential.
- Document the correct default router, DNS IP address that should be provisioned to the client based on user/device credential for reference during troubleshooting. Make the information accessible to the team.
- For troubleshooting, have tools that can show traceroute and switch path used, and note DNS process pass/fail when accessing asset beyond the local subnet/broadcast domain.
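A basic version of such a check can be scripted with the Python standard library: resolve the server name, open a TCP connection, and time both steps. The sketch below is one possible approach and not the method used by any particular test tool; `tcp_connect_test` and its result fields are hypothetical names.

```python
import socket
import time


def tcp_connect_test(host: str, port: int, timeout: float = 3.0) -> dict:
    """Resolve host, open a TCP connection, and time both steps (ms)."""
    t0 = time.perf_counter()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return {"dns_ok": False, "connect_ok": False, "error": str(exc)}
    dns_ms = (time.perf_counter() - t0) * 1000

    family, socktype, proto, _, sockaddr = infos[0]
    t1 = time.perf_counter()
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError as exc:
        return {"dns_ok": True, "dns_ms": dns_ms, "connect_ok": False, "error": str(exc)}
    connect_ms = (time.perf_counter() - t1) * 1000

    return {
        "dns_ok": True,
        "dns_ms": dns_ms,
        "connect_ok": True,
        "connect_ms": connect_ms,
        "address": sockaddr[0],
    }
```

Calling the function with the name and port of a critical server from a given VLAN edge shows whether a failure lies in name resolution or in the routed path to the server, and the two timings give a baseline to document.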
Conduct TCP connect to verify DNS resolution, connectivity to server and response time of both.
Verify default Gateway/Router is reachable.
Determine the switch paths between the switch port and the target device.
With connectivity, authentication and routing verified, the last, but not least important, task is to ensure that the network delivers application traffic efficiently. Several key network factors can affect the user’s experience of an application:
- Available bandwidth can affect class of service provisioning, especially on WAN links, as well as the amount of load on the network.
- The network path used can affect latency as well as available bandwidth.
- Smart devices, such as load balancers and WAN accelerators, may re-engineer the application transaction.
|Test approach||RFC2544||iPerf||Y.1564|
|Frame Type||UDP only||TCP, UDP||TCP, UDP|
|Key network tests||Information rate, delay, and data loss. Jitter is optional||TCP: Information rate; UDP: Information rate, delay, jitter and data loss||Information rate, delay, jitter and data loss, CBS, and EMS|
|Main tunable parameters||IPv4, DSCP, TOS and VLAN; seven frame sizes (bytes): 64, 128, 256, 512, 1024, 1280, 1518; same port # for send and receive||IPv4 or IPv6, DSCP, TOS and VLAN; TCP: total bytes sent, MTU/MSS, TCP window size, and file to send; UDP: user-defined frame size; different port # for send and receive||IPv4 or IPv6, Layer 3 tag: MPLS, DSCP and COS; stream profile: MTU, CIR, EIR, EMIX; different port # for send and receive|
|# of simultaneous connections||One||Multiple||Multiple|
|HW Platform||Professional Test Equipment||Windows/Linux/Unix based computer||Professional Test Equipment|
|Benefits||Simple configuration for maximum bandwidth||TCP and UDP test; free under BSD license||TCP and UDP test; short test time|
|Limitations||Requires dedicated HW||Transmission rate slave to NIC driver; command line UI||Complex configuration not typical in enterprise LAN; requires dedicated HW|
Of these three test approaches, RFC2544 was the first to be used and is still the most widely used. It has been sufficient to validate end-to-end network performance. iPerf has been gaining popularity in the network engineering community because of its ability to perform bandwidth tests with TCP flows and its low cost of deployment. Y.1564 is used mainly for metro-network link testing, where an SLA is a must. It has not been widely adopted in the enterprise.
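To interpret an RFC2544 throughput result, it helps to know the theoretical maximum frame rate for each of the seven frame sizes. The calculation below assumes standard Ethernet framing, where every frame occupies an extra 20 bytes on the wire (8-byte preamble and start-of-frame delimiter plus a 12-byte minimum inter-frame gap); the function name is hypothetical.

```python
RFC2544_FRAME_SIZES = (64, 128, 256, 512, 1024, 1280, 1518)  # bytes
WIRE_OVERHEAD_BYTES = 20  # 8-byte preamble/SFD + 12-byte inter-frame gap


def max_frames_per_second(line_rate_bps: int, frame_bytes: int) -> int:
    """Theoretical maximum Ethernet frame rate at a given line rate."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps // bits_on_wire


for size in RFC2544_FRAME_SIZES:
    fps = max_frames_per_second(1_000_000_000, size)
    print(f"{size:>5} bytes: {fps:>9,} frames/s")
```

On a 1Gbps link this gives about 1.49 million frames per second at 64 bytes and about 81 thousand at 1518 bytes, so a device that reaches line rate with large frames may still fall short with small ones.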
What could cause an application to be slow?
When a user complains that network performance is poor, there are a few questions to ask to determine whether the network is the issue:
a. What application is affected? Is it real-time voice/video or data traffic?
b. If it is not a corporate application, are the application streams contained within the corporate network?
c. How many clients were affected? How are these clients related?
|Symptom||What may be the problem|
|All users of only one intranet application experienced slow performance||Application or server has issue; the network leading up to the application server(s) is bad|
|All users of one internet application experienced slow performance||Internet application issue; internet application flow blocked|
|One user’s experience with an application was bad||Client device or account configuration; client to network connectivity issue, especially if connected over Wi-Fi|
|Few users in the same VLAN experienced bad performance||VLAN to application network path issue; VLAN group provisioning issue|
Best practices:
- Run a network performance test on the end-to-end links of critical paths, up to the maximum bandwidth of the weakest link and against the SLA requirement of the weakest link. If no SLA parameters are available, the following guideline can be used: one-way end-to-end delay <150msec, jitter <100msec and packet loss <1%.
- Document test result of the link for future reference.
- If it is a TCP application, try to run a TCP Connect test to the server. If the test completes 100% of the time with little delay, it is very likely that the server itself, not network delay, is the issue. The next step is to show that packet loss on the network path between the client and the server is acceptable, which eliminates the network as the cause. Typically, packet loss should be <1% at the maximum information rate required by the application. If the server is outside the corporate network, you only need to verify the path up to the point where the stream leaves the network.
- If the application is real-time voice/video, some VoIP phones may give you jitter and packet loss statistics for the call. Otherwise, you can run an RFC2544 or iPerf test against the target end-point. Typical voice and video require one-way end-to-end delay <150msec, jitter <40msec and packet loss <1%, at a UDP stream rate around that of the voice/media stream.
- Document all test results. If it is not the network, try to capture the application transaction at both ends of the application: near the client and the server. The best approach is to capture the traffic using inline TAP or via SPAN/mirror port.
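The pass/fail guidelines quoted in this list can be captured as data so that every team member applies the same limits. The thresholds below come straight from the text; the names `Limits`, `GENERAL`, `VOICE_VIDEO` and `failed_metrics`, and the sample measurement, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    delay_ms: float   # one-way end-to-end delay must stay below this
    jitter_ms: float  # jitter must stay below this
    loss_pct: float   # packet loss must stay below this


GENERAL = Limits(delay_ms=150, jitter_ms=100, loss_pct=1.0)      # no SLA available
VOICE_VIDEO = Limits(delay_ms=150, jitter_ms=40, loss_pct=1.0)   # real-time media


def failed_metrics(delay_ms, jitter_ms, loss_pct, limits):
    """Return the names of the metrics that do not stay under their limit."""
    measured = {"delay_ms": delay_ms, "jitter_ms": jitter_ms, "loss_pct": loss_pct}
    return [name for name, value in measured.items() if value >= getattr(limits, name)]


# Hypothetical measurement on one path: 80 ms delay, 55 ms jitter, 0.2% loss.
print(failed_metrics(80, 55, 0.2, GENERAL))      # passes the general guideline
print(failed_metrics(80, 55, 0.2, VOICE_VIDEO))  # jitter too high for voice/video
```

The same path can therefore pass for data applications and fail for voice, which is why the application type is one of the first questions to ask.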
Exercise connectivity to application server and verify DNS resolution, routed response time of both.
Ensure performance of the wired link to the router with the OneTouch AT’s performance test. Measure upstream and downstream throughput up to 1Gbps as well as loss, jitter and delay.
Capture packets inline between the switch and device, and apply filter to store relevant info to SD-RAM.
Getting it all together
For the network team to support today’s switched networks efficiently, the team needs to be up to date on the technologies that make it all possible. It is also critical for team members to share information effectively, not only knowledge such as how the network is configured, but also on-site information gathered during troubleshooting or deployment. Despite best efforts, not all members of the team have the same skill level. Many freeware and other tools are available on the market, but not all team members know how to use them or how to interpret and share the test results. Freeware is also notorious for lacking documentation and test reports that can be easily shared. The ability to save and share real-time information about the network not only improves collaboration between teams during troubleshooting, it also serves as evidence when third parties, such as service providers, need to be called in to resolve a problem they caused.
NETSCOUT Handheld Network Test Tools not only give network operation teams the means to gain visibility; the family of tools also offers two key attributes that help the team be more effective.
1. Automated tests to support programmable, standardized test procedures.
The tools offer an AutoTest that will provide visibility to all four aspects of the switch network with the push of a button with user programmable Pass/Fail limits and automated reporting. Three choices are available that offer different levels of detail and depth of testing:
|AutoTest features|| LinkSprinter
|| OneTouch AT
|Connectivity – PoE||Type 1||Type 1 and 2 with
|Type 1 and 2 with
|Connectivity - Link||10/100/1000Mbps
Copper or Fiber
Copper or Fiber and
Up to 802.11ac
|Connectivity – Switch ID||LLDP/CDP reports
|Address||DHCP & Static||DHCP & Static||DHCP & Static
Ping 1 IP device with
Ping 10 IP devices with
Ping, TCP connect,
EMAIL, FTP, IGMP,
WEB test for user
definable # of devices
|Efficiency||Response time for Ping test||Response time for Ping test||Response time for
RFC2544 up to 1Gbps
|Notable Tools||• View test result
over Wi-Fi from
• Powered by PoE or AA battery
• Distance to fault
|• Tone Generator for
• Wiremap for cable
• Distance to fault
|• Packet Capture
• Inline PoE & VoIP Analysis
• Device Discovery &
• Remote Control
• Distance to fault
Table: Key feature comparison between NETSCOUT Handheld Network Test Tools for Switched Networks
To facilitate visibility and collaboration across the network operation team, all NETSCOUT handheld test tools share a cloud-based result and report management database called Link-Live. It is a free cloud-based service that supports automated upload of test results from all handheld tools. During network deployment, a progress report can be easily generated showing the switch ports tested each day, their link speed and duplex distribution, and PoE test results. During troubleshooting, previous test results from a switch port can be compared against current test results to identify changes quickly.
Fig: Link-Live result dashboard showing summary of test results
Fig: Expanded test results showing detail information
Fig: Summary Report from Link-Live showing progress from the perspective of test results over a time period
Switched networks have evolved from simply connecting devices to the network to supplying power, authenticating devices and users, and routing their traffic effectively and automatically. The network operation team needs to maintain its knowledge of the technologies adopted, as well as of how to gain visibility into the switched network and make changes to it, especially around the edge, where devices and users are constantly moving and new M2M devices are added. A best practice that standardizes test procedures and the sharing of information, from design and configuration to real-time, on-location status, will improve the overall efficiency of the team. NETSCOUT handheld network test tools offer best-in-class test features and automated test procedures that allow the network operation team to be efficient and take control of the four key aspects of the switched network.