
What kind of skills are required to become a data scientist?

In industry, the reality is that data science is so new that companies do not yet have a well-defined career path for it. How do you get hired for a data scientist position? How many years of experience are required? What skills do you need to bring to the table? Math, statistics, machine learning, information technology, computer science, and what else?

Well, the answer is probably a little bit of everything plus one more critical skill: domain-specific expertise.

There is an ongoing debate about whether applying generic data science techniques to a dataset, without an intimate understanding of its meaning, leads to the desired business outcome. Many companies are leaning toward making sure data scientists have a substantial amount of domain expertise, the rationale being that without it you may unknowingly introduce bias at any step, such as when filling gaps during the data cleansing phase or during the feature selection process, and ultimately build models that may well fit a given dataset but still end up being worthless. Imagine a data scientist with no chemistry background studying unwanted molecule interactions for a pharmaceutical company developing new drugs. This is also probably why we're seeing a multiplication of statistics courses specialized in a particular domain, such as biostatistics for biology, or supply chain analytics for analyzing operations management related to supply chains, and so on.

To summarize, a data scientist should, in theory, be somewhat proficient in the following areas:

  • Data engineering / information retrieval
  • Computer science
  • Math and statistics
  • Machine learning
  • Data visualization
  • Business intelligence
  • Domain-specific expertise

Note

If you are thinking about acquiring these skills but don't have the time to attend traditional classes, I strongly recommend using online courses.

I particularly recommend this course on Coursera (https://www.coursera.org/): https://www.coursera.org/learn/data-science-course.

The classic Drew Conway Venn diagram provides an excellent visualization of what data science is and why data scientists are a bit of a unicorn:

Drew Conway's Data Science Venn Diagram

By now, I hope it has become clear that the perfect data scientist who fits the preceding description is more the exception than the norm and that, most often, the role involves multiple personas. Yes, that's right: the point I'm trying to make is that data science is a team sport, and this idea will be a recurring theme throughout this book.
