In response to "Your Data Science Portfolio: Math Skills Don't Matter"

Tuesday, January 13, 2015



Quote:








This article is so out of touch with what Data Science is that I struggle to justify the time it will take to comment on it. First off, any Data Scientist worth their salary knows that ETL as we know it is virtually dead. It survives only because of old legacy applications (built by people concerned with "pipelines") and outdated BI (what is BI, exactly?) with bloated data warehouses that are good for...building dashboards that nobody uses? Building software with no connection to the science that underlies WHY algorithms do what they do, and to the assumptions they make about the knowledge representation that surfaces, is like building a rocket with no payload. Engineering without science (the language of which is math) is just a pretty tool that does nothing truly useful or competitive.



Software alone is nothing but a tool that allows organizations to do what they have always done more efficiently, and probably with a prettier interface. If companies wanted to do what they have always done, they wouldn't hire us. Writing code has absolutely nothing to do with the strategic direction of the company or with bringing fresh insight into how an organization can compete analytically. 15-year-old children can write an app and sell it for millions. How many 15-year-olds are winning Kaggle competitions? Black-boxing machine learning techniques assumes that everyone's data is going to look the same. Anyone who has actually worked with real-world data knows this is not even close to being true. EVERY situation requires a deep understanding of the mathematical frameworks that underlie the approaches, as it is the MATH that uncovers the structure in the data, learns the concepts of the domain, and generalizes out to unseen instances. It is the math that is making all the assumptions and all the predictions.



Science is making all the difference here because we finally have the volume and variety of data to apply our scientific theories in machine learning and AI to real-world data. This requires, above all else, a deep understanding of the science and mathematics of how these algorithms work. It requires a deep understanding of the scientific approach to problem solving and to vetting hypotheses. This has nothing to do with coding or pipelining...these are mere vehicles that deliver the goods. Data Science is about math and science. Building an ETL pipeline (do people still use these?) with no science is like building a rocket with no payload. What are you going to deliver with your fancy pipeline? Packaged algorithms that nobody understands and that therefore have no relevance to the business you're building them for? Is your client going to compete analytically because you pressed the go button on some vendor's so-called machine learning development kit?



Anyone can build a Hadoop cluster and lay some Mahout on top...then tell the client they are doing Data Science. This is beyond dishonest; it takes advantage of the Data Science hype. You have to do SCIENCE to model the massive amounts of data underlying the client's business. You have to understand the assumptions being made by any data cleaning and modeling techniques applied. You must understand how to PROPERLY parameterize models to predict incoming data and to adapt as the markets change. You need to understand the trade-offs when managing various models with varying prediction accuracies. This is all spoken in the language of MATH.



The math degree is the #1 degree when ranked by salary plus work-life balance. Anyone focused on mere tool-building should stay away from Data Science. The day will come when real scientists ask you about "your" models...what will you say? "They seem to work well with my pipeline and cluster"? Good luck with that.



http://ift.tt/1DSWXfR



The response was written by this guy:



http://ift.tt/1BhfUcp



The original post was written by this guy:



http://ift.tt/1DSWV7x




