Big Data: promises, threats, and challenges
Since 2011, the growing hype around Big Data has provoked mixed reactions among statisticians. For a while at least, there will be jobs in the data science industry, which provides an argument when trying to recruit students into a statistics curriculum. But beyond this opportunity, the next question is: how connected to statistics is this emerging data science? Comparing trends in the use of terms like "big data analytics" and "big data statistics" suggests that the Big Data industry is not reducible to statistics. I will not try to define here what Big Data is or could be; several articles in this Bulletin have already safely avoided this point. But I will comment on the connections between (mathematical) statistics and computer science that are emphasized by the so-called Data Deluge.
Big Data is sometimes said to have been hijacked by computer science at the expense of statistics. This feeling is a real concern on the applied side, where databases, business intelligence, analytics, visualization, and reporting tools get most of the attention, while advances in computational statistics and statistical learning remain in the shadows.