
A running example

There will be many practical use cases throughout the book, sometimes a couple in each chapter. We will also have one running example: building a search engine. This problem is interesting for a number of reasons:

  • It is fun
  • Businesses in almost any domain can benefit from a search engine
  • Many businesses already have text data; often it is not used effectively, and its use can be improved
  • Processing text requires a lot of effort, and it is useful to learn to do this effectively

We will try to keep it simple, yet with this example we will touch on all the technical parts of the data science process throughout the book:

  • Data Understanding: Which data can be useful for the problem? How can we obtain this data?
  • Data Preparation: Once the data is obtained, how can we process it? If it is HTML, how do we extract text from it? How do we extract individual sentences and words from the text?
  • Modeling: Ranking documents by their relevance with respect to a query is a data science problem and we will discuss how it can be approached.
  • Evaluation: The search engine can be tested to see if it is useful for solving the business problem or not.
  • Deployment: Finally, the engine can be deployed as a REST service or integrated directly into the live system.
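
To make the Data Preparation and Modeling steps a little more concrete, here is a minimal sketch in plain Java. It is not the approach developed later in the book: the class and method names (TinySearchSketch, stripTags, tokenize, score, rank) are hypothetical, the regular-expression tag stripping is a deliberate simplification that a real HTML parser would replace, and the relevance score is only an assumed stand-in that counts how many distinct query words occur in a document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not the search engine built in the book.
public class TinySearchSketch {

    // Crude tag removal with a regular expression (assumption: input is simple HTML).
    static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Lowercase the text and split it on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Score = number of distinct query words that occur in the document.
    static int score(String query, String html) {
        Set<String> docWords = new HashSet<>(tokenize(stripTags(html)));
        Set<String> queryWords = new HashSet<>(tokenize(query));
        queryWords.retainAll(docWords);
        return queryWords.size();
    }

    // Return a copy of the documents sorted by descending score.
    static List<String> rank(String query, List<String> docs) {
        List<String> ranked = new ArrayList<>(docs);
        ranked.sort(Comparator.comparingInt((String d) -> score(query, d)).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "<p>Cats sleep all day.</p>",
            "<html><body><h1>Java search</h1><p>Build a search engine in Java.</p></body></html>",
            "<p>Search is fun.</p>");
        String query = "java search engine";
        for (String doc : rank(query, docs)) {
            System.out.println(score(query, doc) + "  " + stripTags(doc).trim());
        }
    }
}
```

Counting shared words ignores word frequency, document length, and word order, which is one reason ranking is treated as a modeling problem rather than a simple lookup.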

We will obtain and prepare the data in Chapter 2, Data Processing Toolbox; understand the data in Chapter 3, Exploratory Data Analysis; build simple models and evaluate them in Chapter 4, Supervised Machine Learning - Classification and Regression; look at how to process text in Chapter 6, Working with Text - Natural Language Processing and Information Retrieval; see how to apply it to millions of webpages in Chapter 9, Scaling Data Science; and, finally, learn how we can deploy it in Chapter 10, Deploying Data Science Models.
