Breakthrough for new Swedish speech-to-text engine: “Significantly better than Google”

Voxo CEO Johan Wadenholt and Voxo Head of New Technology and Machine Learning Amanda Eliasson.

Nearly 1.8 billion people across the world speak either English, Spanish, or Mandarin Chinese. In comparison, 10 million people use Swedish. Many speech-to-text engines focus on the most popular languages — an issue Swedish tech company Voxo is trying to solve with its new engine.

The new speech-to-text engine, which is fine-tuned for the Swedish language, is 60 percent more accurate than alternatives like Google and Speechmatics, according to the company. Voxo plans to add more Nordic languages to its engine, building on its original platform, which analyzes and visualizes voice conversation data.

“This is a big step for us, enabling us to go from being a solutions provider, to also being a technology provider,” Voxo CEO Johan Wadenholt said. “We already have clients who are implementing our new engine, which is a huge milestone for us.”

The company began using AI and machine learning in 2016 to document and understand financial advisory meetings. In 2018, Voxo also began helping businesses in other industries, and today the company has customers in more than 10 industries.

Wadenholt says that the company had relied on third-party speech-to-text providers but realized that their quality was not improving fast enough to meet demand for the Nordic languages. This led the company to prioritize building its own speech technology.

“In recent years we have had access to supercomputer clusters in the EU through ENCCS, which has enabled us to accelerate the development of our speech technology.”

Wadenholt says that the increase in quality makes analysis of things like sentiment and meeting context more accurate, and enables completely new solutions.

“Our new speech-to-text engine has exceeded our own expectations”

“When we started to do more detailed analysis for the clients, we realized that it wasn’t good enough,” Wadenholt said. “The smaller European languages were far behind, due to their limited number of speakers and market size.”

“It’s almost like Google and the other ones are trying to fit a square in a round hole,” Wadenholt said. “We use more of a phonetic interpretation, so we might get one letter wrong, but if we don’t know what it is, we’re not trying to fit something else in there.”

Voxo also plans to release sound-wave-based technology soon that uses advanced analysis to help find emotional cues in conversations.

“As our customer base grows, we have seen a great need to own the entire chain right down to the sound waves,” Wadenholt said. “This is where we are now reaping the fruits with our new speech engine.”

Voxo is planning to bring the new engine to the rest of the Nordics within the year.
