A look at Skype VoLTE (Voice over LTE)

How does Skype maintain speech continuity?

There is much talk about LTE and OTT (over-the-top) apps. Skype is one such app, and we made some measurements using two iPhones connected to LTE and WiFi. Speech signals were applied to and collected from the handsfree ports of the iPhones.

The measurements below demonstrate very clearly how Skype maintains speech continuity in adverse network conditions by buffering delayed packets and then speeding up playback from the jitter buffer. This is done in such a way that correct pitch is preserved; all that may be heard is that the speech ‘speeds up’ for a short time. The cost of this is that speech delay can be considerably greater than in a conventional telephone call, with a risk that conversational flow is disrupted.
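
As a rough illustration of the buffering and catch-up behaviour described above, the sketch below simulates a jitter buffer that plays faster whenever its backlog exceeds a target depth. It is not Skype's actual algorithm, which is not documented here. The frame size, target depth and speed-up factor are assumed values, and `simulate_playout` is a hypothetical name. The pitch-preserving time compression itself is not modelled; only the playback rate decision is.

```python
FRAME_MS = 20        # assumed duration of one speech frame
TARGET_FRAMES = 3    # assumed backlog above which playback speeds up
FAST_RATE = 1.25     # assumed speed-up factor while catching up


def simulate_playout(arrivals_per_tick):
    """Simulate one playout decision per 20 ms tick.

    arrivals_per_tick[i] is how many frames the network delivered during
    tick i. Returns a list of (playback_rate, backlog_ms_after_playout).
    A rate of 0.0 marks an underrun (nothing buffered to play).
    """
    backlog = 0.0  # buffered speech, in frames
    log = []
    for arrived in arrivals_per_tick:
        backlog += arrived
        # Speed up only while more speech is queued than the target depth.
        rate = FAST_RATE if backlog > TARGET_FRAMES else 1.0
        consumed = min(backlog, rate)
        backlog -= consumed
        log.append((rate if consumed > 0 else 0.0, backlog * FRAME_MS))
    return log


# Two frames on time, a 60 ms stall, then the late frames arrive in a burst.
arrivals = [1, 1, 0, 0, 0, 4] + [1] * 10
log = simulate_playout(arrivals)
for tick, (rate, backlog_ms) in enumerate(log):
    print(f"tick {tick:2d}: rate {rate:.2f}, backlog {backlog_ms:5.1f} ms")
```

With these assumed values, each speeded-up tick drains an extra 5 ms of backlog, so the burst is absorbed over a few ticks and playback then returns to normal speed. A real client would also have to conceal the underrun and time-compress the audio without shifting its pitch.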

Across the LTE network, the score is around 2 (Poor) using the ITU-T Rec. P.863 POLQA v2.4 superwideband (SWB) model.

There were some substantial delay changes, both during active speech and during the silence between utterances. Listen closely and you may be able to hear the speech slowing down in the second utterance.

Over a WiFi LAN, the one-way delay is typically 130ms and the score is around 3 (Satisfactory). Delay adjustments can still be necessary.

The glitch at about 6.5s is audible and seems to correspond to the delay variation shown above at about the same time.

The IP connection between an LTE phone app and a phone app on an office WiFi network is likely to be complex. It is no surprise that the one-way delay can be very long.

It is interesting to see how Skype deals with the delay. Compressing the active speech in time can reduce delay. This is very clear in the screenshot below.

We can see how the delay decreased from 840ms to 760ms at the end of the first utterance, by a further 120ms (to 640ms) during the silence, and then again from 640ms to 560ms at the beginning of the second utterance.
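
The delay figures quoted above can be tallied with a few lines of Python. This is only a worked check of the arithmetic, not part of the measurement tooling. The helper `delay_saved_ms` and the 400ms / 1.25x example are hypothetical, chosen to show how time-compressed playback translates into recovered delay.

```python
# Delay readings (ms) quoted in the text: (where, before, after)
steps = [
    ("end of first utterance", 840, 760),
    ("silence between utterances", 760, 640),
    ("start of second utterance", 640, 560),
]

reductions = {where: before - after for where, before, after in steps}
total_reduction_ms = steps[0][1] - steps[-1][2]


def delay_saved_ms(speech_ms, rate):
    """Delay recovered by playing speech_ms of speech at the given rate."""
    return speech_ms - speech_ms / rate


for where, saved in reductions.items():
    print(f"{where}: {saved} ms")
print(f"total: {total_reduction_ms} ms")
# Hypothetical figures: 400 ms of speech played at 1.25x normal speed
print(f"saved: {delay_saved_ms(400, 1.25):.0f} ms")
```

In total the delay fell by 280ms, of which 160ms was recovered during active speech and 120ms during silence.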

The POLQA Superwideband scale

The score is Satisfactory on the POLQA Superwideband scale. In part, this will be due to the restricted audio bandwidth that the app allows. Skype between laptops, for example, may show speech extending to 12kHz. Listen to the reference speech.

Several brief intervals of poor speech quality are visible, but for most of the time the score is around 3.5.

With thanks to Chris at GCom for setting up the experiments.

Contact Opale Systems or your distributor for more information.
