Yahoo releases biggest-ever cache of Internet behavior data

In a Thursday announcement, beleaguered US Internet firm Yahoo! Inc. said that it has released the biggest-ever machine learning dataset to researchers and the public.

The massive cache of Internet data released by Yahoo is essentially a collection of records showing how users interact with Yahoo's Web services, including the Yahoo homepage, Yahoo News, Yahoo Sports, Yahoo Finance, Yahoo Real Estate, and Yahoo Movies.

According to the details shared by Yahoo, the data includes the clicks, hovers, and scrolls of nearly 20 million anonymous users on Yahoo pages covering sports, news, finance, and real estate, among others.

Yahoo also said that the dataset does not include sensitive personal information about its users. Apart from clicks and other interactions with Yahoo Web properties, the company said, the cache largely consists of basic demographic information such as age, gender, and city.

Yahoo also said that the Internet-behavior data trove will be available only to universities, so that researchers can better understand the online behavior of Internet users. The company added that analyzing the machine learning problems captured in the dataset will allow researchers to gain a deep understanding of several areas, including search ranking, information retrieval, computational advertising, and core machine learning.