This month’s Application Focus discusses the evolution of the EVS codec and two use cases applicable to wireless network operators.
We begin with a short reminder of what EVS is, how it is being adopted around the world, and the potential - and challenges - it presents to network operators.
The 3GPP organization held a competition in which commercial companies were invited to submit candidate codecs to meet specific requirements laid down by 3GPP. It happened that the majority of the candidates met a high proportion of the requirements, and 12 companies announced their intention in September 2013 to cooperate on the development of a single codec for submission to 3GPP. They planned to do this by combining the strongest features of their individual developments. The companies were Ericsson, Fraunhofer IIS, Huawei, Nokia, NTT, NTT DoCoMo, Orange, Panasonic, Qualcomm, Samsung, VoiceAge and ZTE Corporation. There is a symmetry to this group, which consists of three Chipset/Technology Vendors, three Terminal Vendors, three Infrastructure Vendors and three Mobile Operators. (It is also interesting to note that ten of those organizations own MultiDSLA systems.)
The ensuing 3GPP Selection Phase used multiple subjective test organizations to analyze the behavior of the emerging codec through 24 ‘experiments’, in two languages. As a result, the ‘single joint candidate’ was selected in August 2014 and the EVS codec specifications were approved in September the same year. The Verification and Characterization Phases followed, consisting of further intensive subjective and objective testing designed to encompass all the features of EVS and to investigate specific use cases in detail. The standardization process was completed in December 2014. The first EVS-enabled smartphones appeared in early 2016 and the first commercial service was announced by T-Mobile USA in April 2016: https://www.t-mobile.com/news/volte-enhanced-voice-services.
Other early adopters of EVS include Vodafone Germany, Deutsche Telekom Germany and NTT DoCoMo Japan.
The criteria for the specification of EVS were based on emerging and anticipated requirements for (primarily) mobile networks, including:
- a design for packet-switched networks, mobile VoIP and 4G/5G VoLTE in particular, improving performance and resilience to network impairments, with advanced jitter buffer management;
- support for a range of input/output sampling rates, independent of audio bandwidth;
- support for variable and fixed bit-rates up to 128 kbps, enabling transmission of speech, audio and mixed speech/audio;
- the need for enhanced voice quality, including “full HD Voice”, through extended audio bandwidth, whilst making economical use of data bandwidth. N.B.: The use of the AMR-WB codec in 3G and 4G networks does not deliver the potential “super-wideband” voice performance of over-the-top (OTT) applications such as Skype, FaceTime and WhatsApp;
- backwards compatibility with the nine AMR-WB bit-rates, allowing inter-operability with legacy devices.
The potential benefits of EVS over legacy codecs include these:
- maintaining AMR and AMR-WB voice quality with reduced network bandwidth, resulting in increased capacity;
- providing enhanced voice quality in wideband and super-wideband modes, compared to AMR narrowband and wideband modes respectively, using the same network bandwidth;
- approximating AAC audio quality at wideband and super-wideband but with lower delay;
- improved resilience to packet errors.
Background and technical details, with clear graphics:
3GPP Technical Report, EVS:
Use Cases Introduction
The use cases described below discuss the use of the Opale Systems MultiDSLA test system with VPP+ nodes (Use Case 1) and DSLAII and VPP+ nodes (Use Case 2).
DSLAII is a family of hardware products featuring analog interfaces to mobile devices (user equipment, UE), conventional analog lines or analog lines derived from VoIP networks by an analog telephone adaptor (ATA) or similar gateway device. Products in the DSLAII family support 2, 4 or 6 such interfaces.
VPP+ is a family of software products supporting VoIP using the SIP protocol and a wide range of codecs, including EVS.
The Use Cases below concern the evaluation and selection of EVS modes and the evaluation of VoLTE devices, both in the context of voice quality optimization.
Use Case 1: EVS Mode Selection
Purpose: To predict the voice quality performance of the various EVS codec modes with any required network conditions.
The image shows two distinct VPP+ installations for clarity, but in practice both the MultiDSLA Controller and VPP+2 (VPP+ with two ‘nodes’) may be installed on the same PC, making this a very compact test system.
SIP calls are made between the two VPP+ nodes using the required EVS mode settings. The ‘clean network’ must not introduce significant packet transmission errors, so it is wise to disable any Internet connections and work in isolation from any other local network. Using a single isolated PC for both MultiDSLA and VPP+2 will normally achieve this objective.
Packet Impairment Simulation
This use case employs the powerful Packet Impairment Overlay feature of VPP+ to simulate network impairments (packet loss and jitter). Unlike traditional impairment generators, VPP+ generates repeatable patterns of loss and jitter, so no matter how many times you use a particular packet impairment overlay, it always has precisely the same effect on the RTP stream. With the impairment pattern fixed, the EVS mode is the only variable that changes between runs, so comparisons between modes are accurate.
The Packet Impairment Designer shown above allows the user to generate any required number of Overlays. Each Overlay is a profile containing a particular pattern of loss and jitter, with precisely definable characteristics.
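The principle of a repeatable impairment pattern can be illustrated with a short Python sketch. This is not the VPP+ overlay format or its generation method, which are not described here; the function name `make_overlay`, its parameters and the per-packet `(lost, jitter_ms)` representation are assumptions made purely for illustration. The point is that a pattern generated once from a fixed seed is identical every time it is used.

```python
import random


def make_overlay(num_packets, loss_rate, max_jitter_ms, seed):
    """Build a hypothetical overlay: one (lost, jitter_ms) tuple per RTP packet.

    Illustrative representation only; not the VPP+ overlay file format.
    """
    rng = random.Random(seed)  # private generator: same seed gives same pattern
    overlay = []
    for _ in range(num_packets):
        lost = rng.random() < loss_rate
        # A lost packet never arrives, so it carries no jitter value.
        jitter_ms = 0.0 if lost else rng.uniform(0.0, max_jitter_ms)
        overlay.append((lost, jitter_ms))
    return overlay


first = make_overlay(num_packets=400, loss_rate=0.03, max_jitter_ms=40.0, seed=7)
second = make_overlay(num_packets=400, loss_rate=0.03, max_jitter_ms=40.0, seed=7)
print(first == second)  # True: the pattern is identical on every run
```

A traditional impairment generator behaves more like the same function called without a fixed seed: the statistics match from run to run, but the individual packets affected do not.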
The table below shows a sample Test Plan built around a number of packet impairment overlays which are used with each of a number of EVS Mode selections in order to characterize and compare the modes in terms of voice quality and network bandwidth efficiency. The ‘Average POLQA Score’ would typically be calculated from the individual scores obtained for two different female and two different male voices.
Sample Test Plan – Use Case 1
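As a rough sketch of how such a plan is organized, the Python below builds one row per combination of EVS mode and overlay, then averages the scores for the four voices. The mode labels, overlay names, function names and score values are placeholders chosen for illustration; they are not MultiDSLA settings or measured results.

```python
from itertools import product
from statistics import mean

# Placeholder labels: substitute the EVS modes and overlays actually under test.
evs_modes = ["EVS WB 13.2 kbps", "EVS SWB 13.2 kbps", "EVS SWB 24.4 kbps"]
overlays = ["clean", "loss 3%", "loss 3% + jitter"]
voices = ["Female 1", "Female 2", "Male 1", "Male 2"]


def build_test_plan(modes, overlay_names):
    """Return one row per (mode, overlay) combination, with no scores yet."""
    return [
        {"mode": mode, "overlay": overlay, "scores": {}}
        for mode, overlay in product(modes, overlay_names)
    ]


def average_polqa(scores_by_voice, required_voices):
    """Mean score over the required voices, or None until all are present."""
    if any(voice not in scores_by_voice for voice in required_voices):
        return None
    return round(mean(scores_by_voice[voice] for voice in required_voices), 2)


plan = build_test_plan(evs_modes, overlays)
print(len(plan))  # 9 rows: 3 modes x 3 overlays

# Placeholder scores for the first row only (not measurements).
plan[0]["scores"] = {"Female 1": 4.0, "Female 2": 4.2, "Male 1": 4.4, "Male 2": 4.2}
for row in plan[:2]:
    print(row["mode"], "|", row["overlay"], "|", average_polqa(row["scores"], voices))
```

Returning `None` for an incomplete row avoids reporting an average that covers fewer than the four voices.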
- Because each packet impairment overlay is completely repeatable, it is not normally necessary to run more than a few tests for each combination of settings. It is always wise to run at least two iterations, however, to make sure that results are stable. Any instability in the scores would suggest that the ‘clean’ network itself is contributing to the packet impairments.
- The effect of packet loss on voice quality depends on whether the loss occurs during an active speech segment or during a silence interval. The Overlay method ensures that the loss and jitter pattern maps precisely to the speech sample used, thus ensuring complete repeatability. It follows that if the loss and jitter pattern were to be offset in relation to the speech sample, different results could be obtained. This can be easily exploited to extend the usefulness of each Overlay, by specifying an offset of say 0 to 99 packets, with a step size of 1 packet. Using this simple setup, shown below, a set of 100 results per speech sample is obtained for a single Overlay. This sophisticated test is achieved with a simple MultiDSLA tasklist, as seen here:
This Trend report graph shows a set of 100 scores using the above configuration with the Female 1 speech sample. The graph maps the successive POLQA scores as the Overlay impairment pattern is advanced by one packet at a time, in relation to the speech sample. The dotted red line is set at 3.00.
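The reason the score varies with offset can be seen in a toy model. The sketch below counts how many lost packets coincide with active speech as a fixed loss pattern is shifted one packet at a time, then summarizes a set of scores against a threshold such as the 3.00 line. The function names, the 20-packet toy data and the assumption that packets before the offset are unimpaired are illustrative only; they do not describe how MultiDSLA applies the offset, and the scores are placeholders rather than measurements.

```python
def lost_speech_packets(loss_pattern, speech_active, offset):
    """Count lost packets that fall on active speech for a given offset.

    Assumption: the pattern starts `offset` packets into the sample and
    packets before that point are unimpaired.
    """
    count = 0
    for n, active in enumerate(speech_active):
        k = n - offset  # index into the loss pattern for packet n
        if active and 0 <= k < len(loss_pattern) and loss_pattern[k]:
            count += 1
    return count


def summarize(scores, threshold=3.0):
    """Minimum, mean and number of scores below the threshold."""
    return {
        "min": min(scores),
        "mean": sum(scores) / len(scores),
        "below": sum(1 for s in scores if s < threshold),
    }


# Toy sample: silence, then 10 packets of speech, then silence.
speech_active = [False] * 5 + [True] * 10 + [False] * 5
# Toy loss pattern: a two-packet burst at positions 3 and 4.
loss_pattern = [False] * 20
loss_pattern[3] = loss_pattern[4] = True

sweep = [lost_speech_packets(loss_pattern, speech_active, off) for off in range(15)]
print(sweep)  # [0, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 0, 0, 0]

# Placeholder POLQA scores, one per offset in a real sweep.
print(summarize([3.4, 3.1, 2.8, 3.6])["below"])  # 1
```

At offset 0 the burst lands entirely in silence and costs nothing; a few packets later the same burst removes two speech packets. A real sweep over 100 offsets exposes the same sensitivity in the POLQA scores.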
Use Case 2: VoLTE Device Voice Quality Analysis
Purpose: To ensure optimized voice quality performance before new VoLTE devices are released for sale. This may include, for example:
- verification of compatibility with EVS modes selected for the network;
- verification of compatibility of a device’s EVS (AMR-WB Interop Mode) with the network’s AMR-WB (i.e. before EVS is adopted in the network);
- comparison of devices against competing products, or against a preferred ‘gold standard’ device.
This use case employs the MultiDSLA test system with a DSLAII and a VPP+, plus a VoLTE reference base station (base station emulator). A clean, isolated network is required, as for Use Case 1; this is ideally provided by an Ethernet switch as shown here. The packet impairment techniques outlined for Use Case 1 may also be used in this application.
The key benefit of Use Case 2 is that voice quality can be assessed separately for uplink and downlink paths, completely independent of production and test networks, for a wide range of packet impairment conditions, and for all required modes of EVS and AMR codecs. The analysis of downlink voice quality is likely to be the more important, since this encompasses the way the device handles network jitter. The jitter buffer implementation will be a determinant of voice quality and this can be exercised extensively using the RTP packet impairment capabilities of VPP+.
This diagram shows the MultiDSLA application controlling both the DSLAII and the VPP+. Test calls are made between the VoLTE device (DUT) and the VPP+, and both are registered to the IMS server associated with the reference base station. Either terminal may originate the call, so both Mobile Originated (MO) and Mobile Terminated (MT) modes of call setup are possible.
This type of testing is discussed further in an Opale Systems blog article here, and may be extended to audio quality testing using the Perceptual Evaluation of Audio Quality (PEAQ) metric available for MultiDSLA. The streaming of music over EVS is a potential value-added service option for operators with EVS-enabled networks.