
We added POLQA (ITU-T P.863) to our MultiDSLA product some seven months ago. Since then, we have been busy discovering what works and what doesn’t work quite so well in the new algorithm. Most of this learning comes from working with our early adopters of POLQA. We continue to gain further knowledge about using POLQA and how to understand results that don’t make sense. We thought it would be useful to share these odd behaviours over the coming weeks. Here is the third one.
A telephone network introduces a delay between the moment we speak and the moment the other person hears us. This is the one-way bulk transmission delay, and it results from geographical distance as well as the processing delay introduced by network equipment. Delay variation, or jitter, describes how the transmission delay varies around the bulk delay. Jitter is a relatively new phenomenon, first seen when VoIP networks were introduced in the late 1990s. A circuit-switched network, by contrast, has a constant delay once a call is established.
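To make the distinction between bulk delay and jitter concrete, here is a minimal sketch. It assumes the bulk delay can be approximated by the mean of the per-packet one-way delays; the function name `delay_summary` and the sample values are purely illustrative and are not taken from MultiDSLA.

```python
def delay_summary(delays_ms):
    """Split per-packet one-way delays into a bulk delay and jitter.

    Assumption: the bulk delay is approximated by the mean delay.
    Returns (bulk_delay_ms, deviations_ms, mean_abs_jitter_ms).
    """
    bulk = sum(delays_ms) / len(delays_ms)
    # Jitter: how far each packet's delay sits from the bulk delay.
    deviations = [d - bulk for d in delays_ms]
    mean_abs_jitter = sum(abs(x) for x in deviations) / len(deviations)
    return bulk, deviations, mean_abs_jitter


# Illustrative one-way delays in milliseconds for five packets.
bulk, deviations, jitter = delay_summary([100, 104, 98, 102, 96])
```

On a circuit-switched call every deviation would be zero, because the delay is constant once the call is established.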
Packet networks inherently exhibit jitter in the arrival time of packets. In VoIP networks, the jitter in the arriving packet stream must be removed before the audio is played to the user. This is done with a jitter buffer. Variable playback delay first appeared when VoIP developers realised that they could reduce the bulk delay of a connection by shrinking the jitter buffer whenever the jitter observed on the arriving stream was low. Rather than needing a large buffer to cope with all network types, the terminal monitors packet arrival times and increases or decreases the size of the buffer according to the level of jitter observed.
This type of delay variation results in step changes in the delay of the received audio signal, preferably during silence periods, where they pass unnoticed by the listener, but sometimes also during speech. PESQ is good at dealing with this type of delay change, and in MultiDSLA we provide a couple of analysis views to help understand where it occurs. Analysing changes in delay can help determine whether a drop in speech quality is the result of a high level of jitter in the underlying packet network.
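The adaptive behaviour described above can be sketched as follows. This is a simplified illustration, not how any particular terminal implements it: the class name, the 20 ms packet interval, the safety factor and the rule of resizing only during silence are all assumptions made for the example.

```python
from collections import deque


class AdaptiveJitterBuffer:
    """Toy adaptive jitter buffer: depth follows recently observed jitter."""

    def __init__(self, min_ms=20, max_ms=200, window=50, safety=2.0):
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.safety = safety          # headroom multiplier (assumed value)
        self.depth_ms = max_ms        # start large, shrink when jitter is low
        self._deviations = deque(maxlen=window)

    def observe(self, interarrival_ms, packet_ms=20):
        """Record how far a packet's inter-arrival time is from nominal."""
        self._deviations.append(abs(interarrival_ms - packet_ms))

    def target_depth_ms(self):
        """Depth that would cover the worst recent deviation, with headroom."""
        if not self._deviations:
            return self.depth_ms
        peak = max(self._deviations)
        return max(self.min_ms, min(self.max_ms, self.safety * peak))

    def maybe_resize(self, in_silence):
        """Resize only during silence; return the step delay change in ms."""
        if not in_silence:
            return 0
        target = self.target_depth_ms()
        step = target - self.depth_ms
        self.depth_ms = target
        return step


buf = AdaptiveJitterBuffer()
for gap in (20, 22, 18, 21):          # a quiet network: low jitter
    buf.observe(gap)
step = buf.maybe_resize(in_silence=True)   # buffer shrinks in one step
```

Each non-zero value returned by `maybe_resize` corresponds to one step delay change in the received audio. A real terminal may also have to resize during speech, which this sketch deliberately leaves out.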
Codec compression techniques have been developed which add a new form of delay variation, often called time-warping, time-stretching or time-scaling. Stretching a signal by a number of milliseconds, or compressing it in time, without changing the pitch was found to be almost imperceptible to the listener, while delivering a compression gain or improved packet loss concealment. PESQ does not handle this type of coding well and produces a lower score than is seen in a subjective test.
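Unlike a jitter buffer resize, time-scaling does not move the delay in a single step: the delay drifts gradually across the stretched or compressed segment. The sketch below illustrates only that arithmetic. It assumes a single segment played back at a constant scaling factor; the function name and all values are hypothetical, and no actual codec or pitch-preserving algorithm is modelled.

```python
def delay_after_time_scaling(t_ms, base_delay_ms, start_ms, end_ms, factor):
    """Delay experienced by the audio at input time t_ms.

    The segment [start_ms, end_ms) is played back `factor` times slower
    (factor > 1, stretching) or faster (factor < 1, compression).
    """
    # Amount of the scaled segment that has already been played at time t.
    scaled_so_far = min(max(t_ms - start_ms, 0), end_ms - start_ms)
    # Each scaled millisecond adds (factor - 1) ms of extra delay.
    return base_delay_ms + scaled_so_far * (factor - 1.0)


# A 200 ms segment stretched by 5 % on a connection with 100 ms bulk delay.
profile = [
    delay_after_time_scaling(t, 100, 1000, 1200, 1.05)
    for t in (900, 1000, 1100, 1200, 1500)
]
```

In this example the delay rises smoothly by about 10 ms over the 200 ms segment, with no single instant at which a step change could be located.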
Contact Opale Systems or your distributor for more information.