This article was produced and financed by The Research Council of Norway

Children communicating via a speech device are still forced to use a synthetic adult voice. (Photo: Shutterstock)

Success in creating artificial child’s voice

It is very difficult to get a PC to recognise the voice of a child. Synthesising speech in a child’s voice is equally problematic. Simple, effective solutions have now been found.

Norunn K. Torheim

Else Lie

Article from The Research Council of Norway

Published 27 February 2012 - 05:00

This article is more than ten years old and may contain outdated information.

The Research Council of Norway

The Research Council of Norway is a government agency responsible for awarding grants for research as well as promoting research and science. It also advises the government in matters related to research.

Synthesised speech has grown more and more similar to human speech. Yet children communicating via a speech device are still forced to use a synthetic adult voice.

This drawback was the driver behind a collaborative research project that is developing a synthesised childlike voice. It is a first in Norway, and very little research has been carried out on the subject internationally.

Now they are putting an entirely new method to the test.

Modified master voice

Professor Torbjørn Svendsen from the Norwegian University of Science and Technology. (Photo: NTNU)

“We start with what is known as a master voice, which is the product of three or four adult speakers recording several thousand phrases,” says Torbjørn Nordgård of the software company Lingit, who is also a professor of linguistics at the University of Nordland.

Then the researchers record a single child reading a smaller number of phrases aloud. This recording is used to modify the master voice, making it sound like a child’s voice.

The phrases recorded by the child have been selected to include a number of the most essential sounds found in the language.
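The article does not say how the phrases were chosen. One common way to pick a small set of phrases that still covers a list of target sounds is a greedy cover, sketched here in Python. The phrase names and sound labels are made up for illustration and do not come from the project.

```python
def pick_phrases(candidates, target_sounds):
    """Greedy cover: keep choosing the phrase that adds the most missing sounds.

    candidates maps a phrase (hypothetical here) to the set of sounds it contains.
    Returns the chosen phrases and any target sounds that no phrase could cover.
    """
    remaining = set(target_sounds)
    chosen = []
    while remaining:
        best = max(candidates, key=lambda p: len(remaining & candidates[p]))
        gained = remaining & candidates[best]
        if not gained:
            break  # nothing left in the candidate list covers the missing sounds
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Illustrative data only: four invented phrases and six invented sound labels.
CANDIDATES = {
    "phrase A": {"a", "s", "t"},
    "phrase B": {"s", "k"},
    "phrase C": {"k", "o", "r"},
    "phrase D": {"t", "o"},
}
TARGETS = {"a", "s", "t", "k", "o", "r"}

chosen, missing = pick_phrases(CANDIDATES, TARGETS)
print("record:", chosen, "still missing:", sorted(missing))
```

In this toy case two of the four phrases are enough to cover all six sounds, which is the kind of saving that keeps the child's recording session short.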

“The master voice still carries the intonation, i.e. a phrase’s melody. The result sounds rather like a child with unusual elocution skills, but it’s still much better than the voice of an adult,” says Nordgård.

FUNDING

Synthesised childlike voice is a collaborative research project involving Media LT, a company developing tools to assist disabled persons, and Lingit, a software company. Funding is granted by the Research Council programme ICT for the disabled (IT Funk).

The second research project is carried out in cooperation with the researchers in the Voice control in multimodal dialogue (SMUDI) project, which received funding from the Research Council’s Large-scale programme Core Competence and Value Creation in ICT (VERDIKT) and the Ministry of Education and Research.

Everything is now in place to start testing trial versions of the child’s voice.

“We hope to have a beta version in place this summer,” says Magne Lunde, Managing Director of Media LT, a company developing tools to assist disabled persons.

Verbal commands

Lunde and his colleagues are also researching voice control, such as using verbal commands to operate a PC.

In order to operate a computer by means of speech, the machine must successfully decipher what is being said. Interpreting the speech of individuals on both the young and the older end of the scale is especially challenging since the distance from their vocal cords to their lips is shorter than that of the average adult.
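As a rough illustration of why a shorter vocal tract matters, a textbook approximation treats the vocal tract as a uniform tube, closed at the vocal cords and open at the lips, whose resonances fall at odd multiples of c / 4L. The sketch below uses that approximation. The tube lengths and the speed of sound are assumed round figures, not measurements from the project.

```python
# Uniform-tube approximation of the vocal tract (a textbook simplification).
SPEED_OF_SOUND_M_S = 350.0  # assumed round value for warm, humid air


def tube_resonances(length_m, count=3):
    """Resonances in Hz of a tube closed at one end: (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_M_S / (4 * length_m)
            for n in range(1, count + 1)]


ADULT_TRACT_M = 0.175  # assumed typical adult length
CHILD_TRACT_M = 0.125  # assumed length for a young child

for label, length in (("adult", ADULT_TRACT_M), ("child", CHILD_TRACT_M)):
    peaks = ", ".join(f"{hz:.0f} Hz" for hz in tube_resonances(length))
    print(f"{label}: {peaks}")
```

With these assumed lengths the adult resonances come out near 500, 1500 and 2500 Hz and the child's near 700, 2100 and 3500 Hz, so the same sound lands in a different part of the spectrum than a recogniser trained on adults expects.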

“Teaching a speech recognition program to understand the pronunciation of the various sounds of a language requires a relatively large amount of recorded speech. Unfortunately, insufficient data exist today in terms of actual children’s speech,” states Professor Torbjørn Svendsen from the Norwegian University of Science and Technology.

Professor Svendsen and his research partners have come up with a very elegant, yet simple method of overcoming the challenges associated with speech recognition and children – they have synthesised children’s voices and used the results to compile a collection of data.

A vast improvement in quality

The length of the vocal tract affects the frequency distribution of the speech energy. The researchers are using technology to render the energy distribution of adult speech so that it more closely resembles that of a child.
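The article does not name the conversion technique. A simple way to picture the idea is a linear frequency warp, which stretches a spectrum so that its energy peaks move upward, as they would for a shorter vocal tract. The Python sketch below is an illustration under that assumption, with a made-up warp factor and a toy spectrum, and is not the researchers' implementation.

```python
def warp_spectrum(magnitudes, alpha):
    """Linearly warp a magnitude spectrum: output bin k takes input bin k / alpha.

    alpha > 1 moves energy towards higher frequencies. Values between input
    bins are linearly interpolated; bins with no source data are set to 0.0.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(magnitudes)
    warped = []
    for k in range(n):
        src = k / alpha
        lo = int(src)
        if lo + 1 < n:
            frac = src - lo
            value = (1 - frac) * magnitudes[lo] + frac * magnitudes[lo + 1]
        elif lo == n - 1 and src == lo:
            value = magnitudes[lo]  # exactly on the last input bin
        else:
            value = 0.0  # source position lies beyond the input spectrum
        warped.append(value)
    return warped


# Toy spectrum: a single energy peak at bin 12 of 32.
adult_like = [0.0] * 32
adult_like[12] = 1.0

child_like = warp_spectrum(adult_like, alpha=1.25)  # assumed warp factor
print("peak moved from bin 12 to bin", child_like.index(max(child_like)))
```

A real system would apply such a warp frame by frame to recorded speech, and would likely use a more refined warping function than a single constant factor.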

“The converted adult speech resembles the way children speak in terms of sound as well. Thus, we could apply our conversion technique to a large database of adult speech and generate a functional database of artificial childlike voices. We then used this to train a separate speech recognition program for children,” explains Professor Svendsen.

This process greatly improved the recognition of children’s speech: the error rate was reduced by 50 to 70 per cent.
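The 50 to 70 per cent figure is a relative reduction, so what it means in practice depends on the starting point, which the article does not report. The short Python sketch below works through the arithmetic for a purely hypothetical baseline error rate.

```python
def error_after_reduction(baseline_error, relative_reduction):
    """Error rate that remains after a relative reduction (both as fractions)."""
    return baseline_error * (1.0 - relative_reduction)


HYPOTHETICAL_BASELINE = 0.40  # invented starting error rate, not from the article

for reduction in (0.50, 0.70):
    remaining = error_after_reduction(HYPOTHETICAL_BASELINE, reduction)
    print(f"{reduction:.0%} reduction: {HYPOTHETICAL_BASELINE:.0%} -> {remaining:.0%}")
```

Under that invented baseline, four errors in every ten words would fall to between roughly one and two in ten.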

Norwegian: a tough language

The Norwegian language poses a number of especially steep challenges to speech recognition experts.

“In general, the degree of variation in any language is large enough to make it difficult to model. But Norwegian is especially tricky; there are two distinct written forms of the language, countless dialects and a wide range of accepted alternatives for words, declensions and compounds. On top of all this, there is no single pronunciation standard,” stresses Torbjørn Svendsen.

Svendsen also points out that people can experience considerable difficulty when faced with voice-controlled devices. A video clip of two Scotsmen using a speech-operated lift illustrates this rather humorously.

“It is easy to get caught up in our fascination with speech recognition and the many possibilities it holds, so it’s important not to replace existing technology when it remains the best option for getting something done – like using buttons to operate a lift,” he concludes.

Translated by: Glenn Wells and Carol B. Eckmann
