I've been watching some of the tweets coming out of Data Incite 2013. A few have raised my eyebrows.
A caveat on this: tweets are always a bit dangerous because people have to get their entire thought across in 140 characters, and you don't know the context they are saying these things in. Reading some of my past tweets, I sometimes wonder what I was drinking at the time :-)
We need 100s more Data Scientists ...but manual doesn't scale. We need Automation
How do you automate curiosity, creativity and innovation, three of the qualities a Data Scientist needs? Tricky!
There's a great book called The Genesis Machine by James P. Hogan. In the story, one of the scientists crucial to a project leaves. The government paymasters tell the remaining scientists just to reproduce his work and carry on. The scientists try to explain that it's just not possible to reproduce genius to order: "You can't tell a Rembrandt to go paint a masterpiece today." It's a bit like that for Data Science: we can automate some of the processes that surround it, but we can't automate the core, the creativity and the curiosity.
Traditional Manual Analytics is Dead
Hmmmm, I don't think so, at least not for a very long time.
Firstly, manual analytics is one of the primary tools of a data scientist. You look at some data, you see something interesting, you do a bit of quick and dirty analysis on it. It's interesting but not quite right, so you tweak it, you run some more analysis, you add some more data to the mix. It's getting better, so you ... repeat as necessary.
Secondly, it's too deeply embedded :-) There was a great keynote at the Strata EU Conference earlier this week from Felienne Hermans called "Spreadsheets: The Dark Matter of IT ...". She makes the point that while we have great BI tools now, people are not going to stop using Excel. It's too useful and too easy to use. And guess where people do a lot of their manual analytics - ummm, that'd be Excel.