Nothing Special

So, to start this, please excuse my English skills. I was born and raised in California and have never left the state; however, I often find myself incapable of expressing or communicating a thought.

To start, Data Science is a big, beautiful thing. It allows for complex systems of measurement: Pandora Song Recommendations, Amazon, NSA Wire Tapping, Targeted Advertisement, Determining Political Elections, Turning Water to Wine, The Miracle of Child Birth, etc., etc. Okay, some of those things may have been overstatements. However, it isn’t rocket science. Over the course of these bite-sized tutorials, hopefully I can take some of that magic away, and if you do the Data Mining project come week 5, we can convince you that Big Data really is “Nothing Special”.

Finding the Good Stuff

The problem in the 21st century is finding what we need. iTunes has 11 million tracks to choose from, Spotify has over 15 million songs, and Amazon has 2 million books. Netflix has over 100,000 movies, Hulu has 50,000, and Amazon Prime has another 100,000. Geez la Wheez, just look at YouTube. GEEEZ. How does it work? Magic? Mice? Employees? Spies? Unicorns!? NO!!!


The problem of dealing with the vastness of the world is what we will be focusing on within this series of bite-sized guides. We will be dealing with existential questions as well as practical ones. What does it mean that my entire life can be reduced to a data point among millions? Absolutely nothing, lmao.

But seriously, I hope you have fun. It really is beautiful that we can deal with data of this scale using equations and algorithms that take only 10 lines of code.


W-wait, what are you talking about and why does this matter?

I’ll explain with a passage from A Programmer’s Guide to Data Mining: The Ancient Art of the Numerati by Ron Zacharski.

“There’s lots of stuff out there (movies, music, books, rice cookers). There’s going to be a huge growth in the amount of stuff out there. The problem with having all this stuff available is finding the stuff that is relevant to us. Of all the movies out there, what movie should I watch. What’s the next book I should read? This problem of identifying relevant stuff is what data mining is about. Most websites will have some component dealing with ‘finding stuff’. In addition to the movies, music, books, and rice cookers mentioned above, you might want recommendations about what friends to follow. How about a personalized newspaper showing just the news you are most interested in? If you are a programmer, particularly a web developer, it would be useful to know data mining techniques.”

