“If people like Google, Apple and Amazon are putting money into this stuff, localization needs to be a part of it,” said Simon Musgrave, a lecturer at Monash University in Melbourne.
Current voice recognition software, he explained, draws on a training database of sounds and uses statistical modeling to match audio to vowels, consonants and words.
“They’re going to need a body of recordings from Australians with good annotations” that connect audio to meaning, he said.
The Australian accent, for example, is distinct from American English in that it is non-rhotic — the “r” sound is not pronounced. Australian English also has a lot of diphthongs and triphthongs — “multiple vowels within the same space,” said Howard Manns, a lecturer in linguistics at Monash University.
“And that might be difficult for a computer or voice recognition to make sense of,” Mr. Manns said.
The software is getting better. In May, Sundar Pichai, the chief executive of Google, announced that the company’s voice recognition software had a word error rate of less than 5 percent, an improvement on the 23 percent error rate it had in 2013.
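A 5 percent word error rate means roughly one word in twenty is misrecognized. The metric itself is the word-level edit distance between a reference transcript and the recognizer’s output, divided by the number of reference words. A minimal sketch of that calculation (the function name and the example sentences are illustrative, not from Google’s system):

```python
# Word error rate (WER): word-level edit distance between a reference
# transcript and the recognizer's hypothesis, divided by the number of
# words in the reference.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + sub)  # substitution/match
    return dp[len(ref)][len(hyp)] / len(ref)

# One substituted word out of five: WER = 0.2
print(wer("turn on the kitchen light", "turn on the kitten light"))
```

Dropping from a 23 percent to a sub-5-percent rate, by this measure, means going from roughly one error in every four or five words to fewer than one in twenty.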
Jobs that tie linguistics and technology are becoming more common.
For example, Appen, a company that collects data for machine learning for technology companies, employs more than 70 linguists and can call on thousands more as consultants.
Linguists annotate recordings of people speaking, down to the pauses in their voices, which are then fed into an algorithm that connects the audio with the meaning, said Phil Hall, the senior vice president of Appen’s language resources.
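The output of that annotation work is, in essence, time-aligned labels: each stretch of audio is tagged with the word (or pause) it contains, and those pairs become training data. A minimal sketch of what such a record might look like — the schema here is made up for illustration, and real annotation formats and Appen’s pipeline will differ:

```python
# A toy time-aligned annotation record. The Segment schema and the
# "<pause>" marker are illustrative assumptions, not a real format.
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds into the recording
    end: float
    label: str    # a word, or a marker such as "<pause>"

# One annotated utterance: words and the pauses between them.
utterance = [
    Segment(0.00, 0.42, "turn"),
    Segment(0.42, 0.55, "<pause>"),
    Segment(0.55, 0.91, "on"),
    Segment(0.91, 1.40, "the"),
    Segment(1.40, 1.95, "lights"),
]

# The pairs a recognizer would train on: (time span, label).
training_pairs = [((s.start, s.end), s.label) for s in utterance]

# The spoken words alone, with non-speech markers filtered out.
words = [s.label for s in utterance if not s.label.startswith("<")]
print(" ".join(words))  # "turn on the lights"
```

Scaled up to thousands of hours of recordings, this is the “commonly categorized information” that the recognizer consumes.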
“So the linguist is the person that knows how to turn thousands of hours of voice recordings into commonly categorized information that the recognizer can use,” said Mark Brayan, the company’s chief executive.
As for the Amazon listing, it’s an attractive prospect for any linguist wanting to work on a cutting-edge project.
“I’m going to apply for that job right now,” Mr. Manns joked.