For service providers deploying LTE, a successful Voice over LTE (VoLTE) launch is a milestone that differentiates their HD audio/video service from over-the-top voice applications and services such as iTalkBB, Phonepower, Skype and Viber. Getting VoLTE right is very important because voice is the most basic service that subscribers expect, and it is also the service where a mishap is most noticeable. It is critical that it is done right the first time.
This whitepaper provides an overview for engineers and technicians deploying and maintaining VoLTE. It describes the key network elements and their role in VoLTE calls, and the typical issues found during VoLTE deployment. It also offers best practices that highlight where to focus when troubleshooting VoLTE issues in live LTE networks.
Key Components for VoLTE
Evolved Packet Core (EPC): eNodeB, SGW, MME, PGW
These components work together to establish and maintain subscriber connectivity to the data network as the user equipment moves across the mobile network. One or more bearers need to be created through the IMS APN, and a special IP address should be assigned to the UE when making VoLTE calls.
IP Multimedia Subsystem (IMS)
The IMS contains application servers, call session controllers and media control functions that support inter-network calls and messaging, AAA and routing. SIP traffic is routed to the IMS after it leaves the EPC. The IMS determines in which network (3G, PSTN or EPC) the callee resides in order to route the SIP and RTP traffic.
Session Border Management
Session Border Management is the part of the IMS that enforces security, quality of service and admission/routing control between the EPC/IMS and other networks such as PSTN and 2G/3G. It governs the manner in which VoIP sessions are initiated, conducted and terminated between the different types of networks' Mobile Switching Center/Media Gateway (MSC/MGW). It works with the Media Gateway to provide inter-networking support and effective management of media codec and signaling traffic between LTE and 2G/3G/PSTN. The Media Resource Function Processor (MRFP), which transcodes the voice codec between LTE and 3G/PSTN, is a key element between the EPC and the 2G/3G/PSTN.
Understanding the Basic VoLTE call process
Connecting to LTE: The UE needs to be authenticated and authorized by the MME before it can be connected to the network.
Connection and Registration to IMS Service: (a) The UE requests data service with the IMS to establish the default bearer (a new IPv4 and/or IPv6 address is assigned to the UE). (b) The UE then “registers” with the IMS to get “provisioned” so that calls can be directed to other VoLTE, 3G/2G or PSTN subscribers through the IMS.
Making the Call: When a UE initiates a call, it sends a SIP “Invite” message over the default bearer to establish a connection with the callee. The IMS receives the SIP message, locates the callee (LTE/3G/PSTN) and establishes the connection. If the callee is on the PSTN or another service provider’s network, SIP and media are routed via the SBC, and codec transcoding is performed by the MRF.
Establishing Media Bearer: The IMS instructs the PGW/APN to initiate establishment of the dedicated bearer that carries the voice packets over RTP and RTCP streams. Based on the 3GPP standard, QoS Class Identifier (QCI) 1 should be assigned to the voice bearer. Finally, the RTP stream (the voice conversation) is transferred over the dedicated bearer. The dedicated bearer is deleted after the voice call ends.
| QCI | Resource Type | Priority | Packet Delay Budget | Packet Error Loss Rate | Example Services |
|---|---|---|---|---|---|
| 1 | GBR | 2 | 100 ms | 10^-2 | Conversational Voice |
| 2 | GBR | 4 | 150 ms | 10^-3 | Conversational Video (Live Streaming) |
| 3 | GBR | 3 | 50 ms | 10^-3 | Real Time Gaming |
| 4 | GBR | 5 | 300 ms | 10^-6 | Non-Conversational Video (Buffered Streaming) |
| 5 | Non-GBR | 1 | 100 ms | 10^-6 | IMS Signaling |
| 6 | Non-GBR | 6 | 300 ms | 10^-6 | Video (Buffered Streaming), TCP-based (e.g., www, e-mail, chat, ftp, p2p file sharing, progressive video, etc.) |
| 7 | Non-GBR | 7 | 100 ms | 10^-3 | Voice, Video (Live Streaming), Interactive Gaming |
| 8 | Non-GBR | 8 | 300 ms | 10^-6 | Video (Buffered Streaming), TCP-based (e.g., www, e-mail, chat, ftp, p2p file sharing, progressive video, etc.) |
Bearers are GTP-U based tunnels that are created to carry data traffic for the subscriber across the EPC. When a subscriber’s equipment connects to the network and establishes connections to data services via PDN Gateways (PGWs), one or more default bearers are created to carry the base communication protocol for each data service. Two types of bearers may be created: Guaranteed Bit Rate (GBR) or non-Guaranteed Bit Rate (nGBR). GBR bearers are assigned guaranteed bandwidth to carry traffic that is sensitive to jitter and packet drops, such as voice over RTP. A voice-carrying GBR bearer consumes network resources, so it is created only when a VoLTE call is set up successfully and is deleted as soon as the call ends. nGBR bearers are typically created for normal data traffic, such as Internet traffic, that is best-effort. Most default bearers (such as the one that carries SIP traffic for VoLTE, or one for non-critical Internet service) are nGBR.
QoS Class Identifier (QCI):
The QCI indicates the QoS parameters (packet delay and loss budget) as well as the priority class for each bearer. QCI assignment is based on the subscriber’s profile in the HSS and on the data service provisioned by the service provider. Although 3GPP offers 9 suggested QCI values as a reference, service providers can assign their own QCI to a data service.
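The relationship between a QCI and its delay and loss budget can be expressed as a simple lookup. The following is a minimal sketch rather than any 3GPP or vendor API: the names `QciProfile`, `QCI_TABLE` and `within_budget` are hypothetical, the values are transcribed from the QCI table above, and QCI 1 to 4 are treated as GBR while QCI 5 to 8 are treated as Non-GBR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QciProfile:
    resource_type: str    # "GBR" or "Non-GBR"
    priority: int
    delay_budget_ms: int  # packet delay budget
    loss_rate: float      # packet error loss rate


# Values transcribed from the QCI table in this paper (QCI 1 to 8 only).
QCI_TABLE = {
    1: QciProfile("GBR", 2, 100, 1e-2),
    2: QciProfile("GBR", 4, 150, 1e-3),
    3: QciProfile("GBR", 3, 50, 1e-3),
    4: QciProfile("GBR", 5, 300, 1e-6),
    5: QciProfile("Non-GBR", 1, 100, 1e-6),
    6: QciProfile("Non-GBR", 6, 300, 1e-6),
    7: QciProfile("Non-GBR", 7, 100, 1e-3),
    8: QciProfile("Non-GBR", 8, 300, 1e-6),
}


def within_budget(qci, measured_delay_ms, measured_loss_rate):
    """Return True if the measured delay and loss fit the budget of this QCI."""
    profile = QCI_TABLE[qci]
    return (measured_delay_ms <= profile.delay_budget_ms
            and measured_loss_rate <= profile.loss_rate)
```

A service provider that assigns its own QCI values would extend or replace the entries in `QCI_TABLE` accordingly.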
High Traffic Volume: All VoLTE traffic is IP based. Call signaling is based on TCP/SIP, and audio is carried over UDP/RTP with AMR-WB as the audio codec. These VoLTE IP flows are buried among all the other IP data flows in the LTE core, including streaming video and Internet traffic.
Different Paths: When a VoLTE call is made, the control signals that build the data bearers take different data paths than the media traffic. In addition, SIP signaling and media traffic also take different paths and traverse different network elements after leaving the EPC. Troubleshooting VoLTE call setup and quality issues requires visibility into, and correlation between, the control signaling and the user bearer it creates.
Segment-to-segment Visibility: QoS of the audio is assured by dynamically creating a dedicated bearer across multiple interfaces in the EPC. End-to-end root cause analysis of audio quality issues requires correlation of, and visibility into, the QCI parameters established across multiple segments.
Asymmetric Media Flows Visibility: A different VLAN may be assigned to each direction of the RTP flow in and around the EPC. Engineers must be able to correlate SIP and RTP flows and extract packets despite this asymmetry in order to see all RTP packets. When troubleshooting abnormal VoLTE call drops, engineers need to extract the packets to analyze the timing and behavior of the RTP payload.
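One common way to correlate SIP to RTP is to read the SDP body carried in the SIP INVITE and its answer, which advertises the address (`c=` line) and port (`m=audio` line) where each side expects to receive media. The sketch below illustrates the idea under stated assumptions: the SIP payload is readable at the capture point, no element between the capture point and the endpoint rewrites addresses, and the flow records are plain dictionaries with hypothetical `dst_ip` and `dst_port` keys. The function names are also hypothetical. Matching on the advertised destination does not depend on the VLAN tag, so both directions are found even when they use different VLANs.

```python
def parse_sdp_audio_endpoint(sdp_text):
    """Return (address, port) advertised for the first audio stream, or None."""
    session_addr = None   # c= line that appears before any m= line
    audio_addr = None     # c= line inside the first audio media section
    audio_port = None
    section = "session"
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            if fields and fields[0] == "audio" and audio_port is None:
                section = "audio"
                audio_port = int(fields[1].split("/")[0])
            else:
                section = "other"
        elif line.startswith("c="):
            # c=<nettype> <addrtype> <connection-address>
            fields = line[2:].split()
            if len(fields) < 3:
                continue
            addr = fields[2].split("/")[0]
            if section == "session":
                session_addr = addr
            elif section == "audio":
                audio_addr = addr
    if audio_port is None:
        return None
    addr = audio_addr or session_addr
    return (addr, audio_port) if addr else None


def flows_for_call(offer_sdp, answer_sdp, rtp_flows):
    """Select the RTP flows whose destination matches either SDP endpoint."""
    endpoints = {parse_sdp_audio_endpoint(offer_sdp),
                 parse_sdp_audio_endpoint(answer_sdp)}
    endpoints.discard(None)
    return [f for f in rtp_flows
            if (f["dst_ip"], f["dst_port"]) in endpoints]
```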
Best Practices for VoLTE Troubleshooting
- First, connect and capture traffic to gain visibility of traffic across c-plane and u-plane interfaces. Typically, an aggregation switch such as one from VSS Monitoring or Brocade can filter, aggregate and load-balance traffic to the tool with very low latency.
- From the traffic captured, analyze the VoLTE related bearer setup and QoS parameters provisioned.
- Analyze SIP flows across the EPC to IMS and SBC for calls to 2G/3G/PSTN.
- Track RTP flows correlated to calls made across interfaces in the EPC and across MRFP and SBC for calls to CDMA/PSTN.
Network Time Machine offers high-performance packet capture at up to 20 Gbps with post-capture analysis that correlates c-plane to u-plane and SIP to RTP, so the user may go back in time to conduct end-to-end root cause analysis of VoLTE issues.
Time Synchronization Consideration
All of this analysis, including the correlation of c-plane to u-plane and of SIP to RTP traffic, can be difficult if the traffic is not all captured on the same device, where the timestamps of all traffic are synchronized. When multiple capture devices are used, their timestamp mechanisms must be synchronized using external NTP or PTP/GPS clock sources. As an alternative, advanced aggregation switches, such as those from VSS Monitoring, accept external clock sources and can add timestamps to the trailer of each packet. Capture devices that use this supplied timestamp to reconstruct the packet when exporting for correlated analysis make life much easier for engineers.
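To illustrate why a common time base matters, the following sketch merges packet records from several capture devices into one timeline after applying a per-device clock correction. It is a simplified illustration, not a description of any product's behavior: the function name `merge_captures`, the record layout and the offset values are all hypothetical, and in practice the offsets would come from the NTP or PTP/GPS synchronization described above.

```python
import heapq


def merge_captures(captures, clock_offsets):
    """Merge per-device packet lists into a single time-ordered list.

    captures:      dict mapping device name to a list of
                   (timestamp_seconds, packet_id) sorted by timestamp.
    clock_offsets: dict mapping device name to the number of seconds to add
                   to that device's timestamps to align it with the
                   reference clock (missing devices default to 0.0).
    """
    def corrected(device):
        offset = clock_offsets.get(device, 0.0)
        for ts, packet_id in captures[device]:
            yield (ts + offset, device, packet_id)

    streams = [corrected(device) for device in captures]
    # heapq.merge assumes every input stream is already sorted.
    return list(heapq.merge(*streams, key=lambda record: record[0]))
```

Without the correction, packets from a device whose clock runs behind would appear earlier than they really were, and request/response pairs seen on different interfaces could appear out of order.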
The user cannot make calls at all.
- Default bearer was not set up
- User did not register with IMS (Authentication issue, IMS overloaded)
For these issues, connect to the S1 and S11 interfaces to examine the initial connection and bearer setup process of the UE, and to check whether there are SIP flows from the UE. An analyzer such as Network Time Machine can capture all traffic at S1 and S11 at up to 20 Gbps, select the UE of interest and show the default bearer and dedicated bearer setup procedure with the IMS.
Note that the default bearer with ID 5, which carries SIP traffic, was set up, and the dedicated bearer for voice (RTP) with ID 6 was set up with a QCI (QoS Class Identifier) of 1 and a GBR/MBR setting of 80 kbps under the IMS APN (voice66.testnetz….). If the UE has not registered with the IMS, no dedicated bearer is possible. Once the symptom has been verified, the packets can be extracted to allow the appropriate party, such as the equipment manufacturer or provisioning engineers, to resolve the issue. The scenario below shows a SIP Register that completes successfully:
Why can a user not make calls to PSTN users?
With the support of the IMS, calls from VoLTE subscribers can be routed to the PSTN. When this fails, the SIP flows in each segment, end to end, need to be captured and examined to determine if and where the call setup process failed. For engineers responsible for the EPC, it is important to capture the signaling traffic around the demarcation point between the IMS and the Regional Core, i.e. between the PGW and the IMS, around the SBC, and across the MRFP for media conversion. Errors and delays in call setup should be noted, and the payload of the packets at the origin of the failure should be examined for root cause analysis.
During troubleshooting, look for the SIP flow that exhibited the failure. The SIP response code provides a hint as to why a call failed. For example, a 503 Service Unavailable may mean that a service is either overloaded or misconfigured.
Network Time Machine can analyze the captured SIP traffic and present statistics on SIP errors together with the calls that triggered them.
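Independently of any particular analyzer, a per-code tally of SIP errors can be sketched in a few lines. The example below is an illustration under stated assumptions, not a description of how Network Time Machine works: it assumes the first line and the Call-ID of each SIP message have already been extracted from the capture, and the name `tally_sip_errors` is hypothetical. Responses with a status code of 400 or above are counted as errors.

```python
import re
from collections import Counter, defaultdict

# A SIP response starts with a status line such as "SIP/2.0 503 Service Unavailable".
STATUS_LINE = re.compile(r"^SIP/2\.0 (\d{3}) (.*)$")


def tally_sip_errors(messages):
    """Count SIP error responses and record which calls triggered them.

    messages: iterable of (call_id, first_line) pairs, one per SIP message.
    Returns (counts, calls): a Counter keyed by status code, and a dict
    mapping each status code to the set of Call-IDs that received it.
    """
    counts = Counter()
    calls = defaultdict(set)
    for call_id, first_line in messages:
        match = STATUS_LINE.match(first_line.strip())
        if not match:
            continue  # a request line (INVITE, BYE, ...), not a response
        code = int(match.group(1))
        if code >= 400:
            counts[code] += 1
            calls[code].add(call_id)
    return counts, calls
```

Sorting `counts.most_common()` then gives the most frequent failure codes first, and `calls[code]` identifies the calls whose packets are worth extracting.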
Why is the call quality poor?
- Quality issue (packet loss or jitter in LTE Radio, packet drop in backhaul, SGi, wrong QCI provisioned)
- Call drop (LTE Radio quality issue, Network Equipment reset, dropped because S1 handover failed)
- One way audio (transcoding signaling or processing issue, persistent silence suppression stream)
For quality issues, the analyzer should be able to show the calls that the UE made and track the RTP flows and their quality parameters, such as MOS, packet loss and jitter, across each EPC u-plane interface. For one way audio (OWA), it is important to verify that both uplink and downlink streams are present and that the correct codec is applied. The RTP stream should then be easy to extract in order to investigate how network elements may induce the packet loss, jitter or silence suppression that caused the OWA. The ability to play back the voice helps to identify when and how the OWA starts, which speeds root cause analysis.
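Packet loss and jitter for a single RTP flow can be derived from the RTP sequence numbers, RTP timestamps and packet arrival times. The sketch below uses the interarrival jitter estimator defined in RFC 3550 and a simple sequence-number count for loss. It makes several simplifying assumptions that are not from this paper: the function name `rtp_stats` and the input layout are hypothetical, sequence-number wraparound is ignored, and the default clock rate of 16000 Hz corresponds to the AMR-WB RTP payload format.

```python
def rtp_stats(packets, clock_rate=16000):
    """Estimate jitter and loss for one direction of an RTP flow.

    packets: list of (sequence_number, rtp_timestamp, arrival_seconds)
             in arrival order. Sequence wraparound is not handled.
    """
    if not packets:
        raise ValueError("no packets supplied")
    jitter = 0.0          # running estimate, in RTP timestamp units
    prev_transit = None
    seqs = []
    for seq, rtp_ts, arrival in packets:
        # Relative transit time: arrival time in clock units minus RTP timestamp.
        transit = arrival * clock_rate - rtp_ts
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0   # RFC 3550 smoothing factor
        prev_transit = transit
        seqs.append(seq)
    expected = max(seqs) - min(seqs) + 1
    lost = expected - len(set(seqs))
    return {
        "jitter_ms": jitter / clock_rate * 1000.0,
        "expected": expected,
        "lost": lost,
        "loss_rate": lost / expected,
    }
```

Running the same calculation on the same call at each u-plane interface shows the segment in which loss or jitter is introduced.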
The test solution should be able to show the QCI value of the dedicated bearer that is set up, as well as the actual CoS-related parameters of the flow in the dedicated bearer, such as 802.1p/Q, DSCP and MPLS parameters, across each u-plane interface of the network.
NTM allows users to observe the c-plane bearer setup for the VoLTE call examined. In the above example, the dedicated bearer with ID#6 was established with a QCI of 5 instead of 1 and it is an nGBR bearer with no GBR and MBR value. If the MOS score of the call was bad during a busy traffic hour, this could be the root cause.
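The check described above, comparing the provisioned dedicated bearer against what a voice bearer should look like, can be written down as a small rule set. This is a minimal sketch with hypothetical names (`Bearer`, `voice_bearer_findings`); it assumes the bearer parameters have already been decoded from the c-plane signaling and encodes only the two conditions mentioned in this paper: QCI 1, and the presence of GBR and MBR values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bearer:
    bearer_id: int
    qci: int
    gbr_kbps: Optional[int] = None   # None when no GBR value was signaled
    mbr_kbps: Optional[int] = None   # None when no MBR value was signaled


def voice_bearer_findings(bearer):
    """Return a list of problems found on a dedicated voice bearer."""
    findings = []
    if bearer.qci != 1:
        findings.append(
            f"bearer {bearer.bearer_id}: QCI is {bearer.qci}, "
            "expected 1 for conversational voice")
    if bearer.gbr_kbps is None or bearer.mbr_kbps is None:
        findings.append(
            f"bearer {bearer.bearer_id}: no GBR/MBR value, "
            "so no bandwidth is guaranteed")
    return findings
```

For example, `voice_bearer_findings(Bearer(6, 5))` reports both problems for a bearer like the misprovisioned one described above, while `voice_bearer_findings(Bearer(6, 1, 80, 80))` returns an empty list.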
When troubleshooting an OWA issue, engineers may extract both SIP and RTP packets to examine the reason field in the BYE message. For example, a BYE originated by the UE with an RTP/RTCP timeout as its reason may be caused by the MRFP not sending silence suppression packets to keep the audio alive, which causes the RTP timeout after 10 seconds.
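The reason field mentioned above is carried in the SIP `Reason` header, whose general form is a protocol name followed by `cause` and `text` parameters. The sketch below extracts it from a raw SIP message. It is a simplified illustration: the function name `parse_reason_headers` is hypothetical, it assumes one reason value per header line and no semicolons inside the quoted text, and the header value used in the usage example is made up for illustration rather than taken from a real capture.

```python
def parse_reason_headers(sip_message):
    """Return a list of dicts describing each Reason header in a SIP message."""
    reasons = []
    for line in sip_message.splitlines():
        name, sep, value = line.partition(":")
        if not sep or name.strip().lower() != "reason":
            continue
        # Form: <protocol> ;cause=<number> ;text="<free text>"
        parts = [p.strip() for p in value.split(";")]
        params = {}
        for part in parts[1:]:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
        cause = params.get("cause")
        reasons.append({
            "protocol": parts[0],
            "cause": int(cause) if cause and cause.isdigit() else None,
            "text": params.get("text"),
        })
    return reasons


# Usage with a hypothetical BYE message (header value is illustrative only).
example_bye = (
    "BYE sip:user@example.invalid SIP/2.0\r\n"
    "Call-ID: abc123\r\n"
    'Reason: SIP ;cause=200 ;text="RTP/RTCP timeout"\r\n'
    "\r\n"
)
example_reasons = parse_reason_headers(example_bye)
```

Grouping dropped calls by the parsed `text` value makes it easier to separate media timeouts from normal call clearing.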
VoLTE is a very important application for service providers deploying LTE. To manage the service, engineers must have the right tools that can provide visibility to traffic even under heavy loads, as well as to correlate u-plane to c-plane and SIP to RTP so problems can be quickly verified and packets can be extracted for root cause analysis.
Network Time Machine offers fast capture, is easy to deploy and provides smart analysis that allows engineers to quickly capture and visualize the packets that reveal the root cause of VoLTE issues.