November 2, 2024

Google updates its policy on use of data to train AI models: Here’s what it says

 

We are seeing a proliferation of large language models such as ChatGPT and PaLM, which are the brains behind generative AI tools. These models are far more capable than their predecessors and can generate text, images, audio and more from natural-language inputs. To build these capabilities, the models need large amounts of training data, and tech giants are turning to publicly available online data for this purpose.

Google has recently updated its privacy policy to reflect how it will collect and use publicly available online information. Under the ‘publicly accessible sources’ heading in its Privacy and Terms, Google has replaced the term ‘language models’ with ‘AI models’ and has stated that it will use this data to build products such as Bard and Cloud AI capabilities.

Earlier, the policy mentioned the use of publicly available data only for building features like Google Translate. The updated policy clarifies that Google can use anything people post publicly online to train Bard, its future versions and any other generative AI product it creates.

“We may collect information that’s publicly available online or from other public sources to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities. Or, if your business’s information appears on a website, we may index and display it on Google services,” according to Google’s updated Privacy and Terms.

Tech leaders and experts are concerned about how these companies collect and use publicly available online data to train their generative AI models. OpenAI, for example, was sued in a class action lawsuit that claimed it scraped “huge amounts of personal data from the web,” including “private information that was stolen,” to train its GPT models without asking for permission, as reported by Engadget.

Some social media platforms are also changing how users access them in order to prevent data scraping by tech giants training their AI models.

Recently, Reddit started charging for access to its API, and Elon Musk announced measures such as blocking browsing on Twitter for users who are not logged in to an account and a ‘view limit’ on the number of posts a Twitter user can see in a day.

“It is rather galling to have to bring large numbers of servers online on an emergency basis just to facilitate some AI startup’s outrageous valuation,” Elon Musk said. 

 

The post Google updates its policy on use of data to train AI models: Here’s what it says appeared first on Techlusive.

 

 
