January 10, 2025

Google unveils AudioPaLM, new language model that can translate text with your voice

There have been many recent advances in large language models (LLMs). These models are a type of artificial neural network with many parameters, trained on large amounts of text data using self-supervised or semi-supervised learning.

These large language models power new generative AI tools such as Google Bard and OpenAI’s ChatGPT. Recently, Google researchers have unveiled a new language model called AudioPaLM, which can perform well at listening, speaking, and translating.

AudioPaLM is a multimodal architecture that combines the advantages of two existing models: PaLM-2 and AudioLM. The system can process and generate both text and speech, and it can be applied to speech recognition or to speech translation that preserves the original speaker’s voice.

PaLM-2 is a text-based language model that is skilled at comprehending text-specific linguistic knowledge. AudioLM is adept at retaining paralinguistic information like speaker identity and tone.

By combining these two models, AudioPaLM draws on PaLM-2’s linguistic ability and AudioLM’s preservation of paralinguistic information, resulting in deeper comprehension and generation of both text and speech.

The model can also perform zero-shot speech-to-text translation for many languages, including language combinations that it did not see during training. This capability can be useful for real-world applications such as real-time multilingual communication.

AudioPaLM can also transfer voices across languages based on short spoken prompts, and it can capture and reproduce distinct voices in different languages.

AudioPaLM has achieved top results on speech translation benchmarks and has demonstrated competitive performance on speech recognition tasks.

Google Search’s Perspectives filter

Google announced a new filter for Google Search known as ‘Perspectives’ at its annual developers’ conference, Google I/O 2023, last month. Now, almost a month and a half later, the company has begun rolling out the new Perspectives filter to all Google Search users globally.

Google made the announcement via a post on its social media handles. “Last month at #GoogleIO we shared updates we’re making to Search to help you find and explore diverse perspectives from experts and everyday people. Today you’ll be able to try it out,” the company wrote in a post on its official Twitter handle.

Google Search’s new Perspectives filter adds a human aspect to search results. At present, the search results that users see on the platform are shaped by the company’s algorithm, based on factors such as dates, authors, ratings, and proximity, among others. The new Perspectives feature changes that by bringing in views and suggestions from real human beings.

The post Google unveils AudioPaLM, new language model that can translate text with your voice appeared first on Techlusive.