Web Profiling: Real-time and batch analytics with large data processing engines

“The evolution of real-time and batch analytics, in combination with big data processing engines, makes it possible to track people’s habits, activities, and whereabouts with greater accuracy, says IBM engineer Jeff Jonas. He compares big data to puzzle pieces that are integrated by analytics engines such as Hadoop and Cassandra. ‘It will change our existing notions of privacy,’ Jonas says.

‘A surveillance society is not only inevitable, it’s worse. It’s irresistible.’ He notes that the ability to analyze big data over a period of time can provide even more insights into a person’s behavior. ‘This is super food [for big data analytics],’ Jonas says. ‘With 87 percent certainty, I can tell you where you’ll be next Thursday at 5:35 p.m.’

Google’s Alfred Spector foresees a time when totally transparent processing will be available to Web developers via distributed computer systems, and he says that ‘we want to make these capabilities available to users through a prediction [application programming interface]. You can provide data sets and train machine algorithms on those data sets.’ Hadoop’s largest contributor so far has been Yahoo!, which has 43,000 servers, many of which are configured in Hadoop clusters, says Yahoo!’s Todd Papaioannou. By year’s end he expects his server farms to have 60,000 machines because the site is producing 50 terabytes of data daily and has stored more than 200 petabytes.”

Source: Computerworld via ACM TechNews