POLQA – Notes on Odd Behaviours (Part 1)

Olivier Willi

Thursday, 23rd June 2022

We added POLQA (ITU-T P.863) to our MultiDSLA product some seven months ago. Since then, we have been busy discovering what works and what doesn’t work quite so well in the new algorithm. Most of this learning comes from working with our early adopters of POLQA. We continue to gain further knowledge about using POLQA and how to understand results that don’t make sense. We thought it would be useful to share these odd behaviours over the coming weeks. Here is the first one.

1. Don’t use more than 10s of speech in narrowband mode

POLQA, like previous ITU voice quality metrics, has been calibrated and tested against a large quantity of subjective test material. The subjective tests, and hence the subjective test material, conform to ITU-T P.800 recommendations and contain speech recordings of around 8 to 10 seconds. This length of material was found optimal in achieving repeatable subjective listening quality scores.

This signal length recommendation is also made for objective measurements and certainly we have always promoted the use of 8s per measurement. This is because it is difficult to correlate long recordings with subjective test data. However, there are times when both shorter and longer test sequences are useful. For example, long speech files can show up issues with jitter buffers or signal processing and in these cases we would recommend that the score is used as an indicator only.

The problem is that in POLQA narrowband mode a long recording of, say, 32 seconds can score around 1.1 lower than the average of four discrete 8s measurements. This was not identified earlier because all tests were performed using 8s subjective test material. It does not happen in POLQA super-wideband mode, where scores remain consistent as the file length is increased.

The root cause of the problem has been identified, but a new release of the standard that includes a fix will take some time. Meanwhile we recommend you split long speech sequences into 8-10s sections containing two sentences. Obtain the score for each section and then calculate a mean of the scores to represent the overall quality for the longer sequence.
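The split-and-average workaround can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `score_fn` is a hypothetical stand-in for whatever call returns a POLQA score for one reference/degraded pair (no real POLQA or MultiDSLA API is implied), the two signals are assumed to be time-aligned and equal in length, and the cuts are made at fixed intervals, whereas in practice you would place them in the silence between sentence pairs.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer signature: (reference, degraded, sample_rate) -> score.
Scorer = Callable[[Sequence[float], Sequence[float], int], float]


def section_bounds(total_samples: int, sample_rate: int,
                   section_seconds: float = 8.0) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices for consecutive sections.

    A trailing remainder shorter than half a section is merged into the
    previous section so that no very short section is scored on its own.
    """
    step = int(section_seconds * sample_rate)
    if step <= 0:
        raise ValueError("section_seconds must be positive")
    bounds = [(start, min(start + step, total_samples))
              for start in range(0, total_samples, step)]
    if len(bounds) > 1 and (bounds[-1][1] - bounds[-1][0]) < step // 2:
        _, last_end = bounds.pop()
        prev_start, _ = bounds.pop()
        bounds.append((prev_start, last_end))
    return bounds


def sectioned_score(reference: Sequence[float], degraded: Sequence[float],
                    sample_rate: int, score_fn: Scorer,
                    section_seconds: float = 8.0) -> Tuple[List[float], float]:
    """Score each section separately and return (section scores, mean)."""
    if len(reference) != len(degraded):
        raise ValueError("signals must be time-aligned and equal in length")
    bounds = section_bounds(len(reference), sample_rate, section_seconds)
    if not bounds:
        raise ValueError("no samples to score")
    scores = [score_fn(reference[a:b], degraded[a:b], sample_rate)
              for a, b in bounds]
    return scores, mean(scores)
```

With the default 8s section length, a 32s recording yields four sections, and a 34s recording yields three 8s sections plus one 10s section, which keeps every section within the recommended 8-10s range.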

Note: PESQ applies a lower weighting to impairments that occur early in a long recording when calculating its score.

Part 2: Ensure reference files pass the transparency test

Part 3: Expect reporting of small delay variations which are not actually there

Part 4: Understand how POLQA time aligns signals before analysing surface views

Contact Opale Systems or your distributor for more information.
