Forget Big Data, we need Big Insights


A recent article in the UK Computing journal suggests that the new frontier in industry is extracting insights from big data. While a decade or more ago data ingestion was the greatest challenge to companies collecting large swathes of data, that problem has now largely been overcome.

Of 300 IT professionals interviewed, only 13% saw raw data access as the biggest challenge in their work. The majority were split between transforming raw data into useful data and extracting actionable insights from the useful data.

Platforms such as SQL databases and Hadoop, along with other tools, have largely standardised the warehousing and accessing of data across big data consumers. Insight extraction, however, is the next frontier, and there is still no runaway leader in this space. It is difficult to build a one-size-fits-all solution to the problems a traditional business analyst might work on.

But we will likely see more and more contenders coming to the fore in this potentially lucrative space. It will be an exciting arms race to watch play out in the data science world!

2014 data science survey out now

The annual data science skills and salary survey from O’Reilly is now freely available from their website. The survey uses responses from 800 participants from over 50 countries.

Inside are comparisons of the different tools used by data science practitioners and the corresponding salary they can expect to earn. The data is also cut by geographic location, career level, academic record, and industry type amongst others.

A lot of the key findings are expected: R, Python, and SQL are the most widely used tools; top USA salaries are in California. But some results are more surprising: Spark has emerged as a popular tool in 2014; the ‘Entertainment’ industry boasts the highest median salary for data scientists.

Highlights in this edition include a cluster analysis of the tools used, which showed the emergence of a new cluster around Mac OS X, MySQL, and D3. There is also a salary regression model which puts a dollar weight on geographic, demographic, and company predictors to give an in-sample R² of over 50%.
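
For readers unfamiliar with the term, here is a minimal sketch of what a "dollar weight" and an in-sample R² mean in a regression like this. It is not the survey's model: it uses a single made-up yes/no predictor (a hypothetical `in_region` flag) and invented salaries in thousands of dollars, purely for illustration. The fitted slope is the "dollar weight" on the predictor, and R² is the share of salary variance the fit explains on the same data it was trained on.

```python
def fit_simple_ols(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def in_sample_r2(x, y, intercept, slope):
    """R^2 computed on the same data used for fitting (hence 'in-sample')."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Invented numbers for illustration only; NOT taken from the survey.
in_region = [0, 0, 0, 0, 1, 1, 1, 1]                 # hypothetical location flag
salary_k = [70, 90, 80, 100, 100, 120, 110, 130]     # salary in $ thousands

intercept, slope = fit_simple_ols(in_region, salary_k)
r2 = in_sample_r2(in_region, salary_k, intercept, slope)

print(f"baseline salary: {intercept:.1f}k")
print(f"weight on in_region: {slope:.1f}k")
print(f"in-sample R^2: {r2:.3f}")
```

Because R² here is measured on the fitting data, it tends to flatter the model; a held-out sample would usually give a lower figure.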

It is a shame the number of respondents is so low, but all in all it is a good read that gives a directional sense of the state of play in 2014 and what might be up and coming in 2015.