Algorithm selection

Creating the algorithm is a complex problem, and we need to iterate on it. This entails exploring the data to gain a deep understanding of the underlying variables. Once we have an idea of the kind of algorithm we want to apply, we'll need to prepare the data further, possibly combining it with other data sources (for example, census data). In our example, this could mean creating a song similarity matrix. Once we have the data, we can train a model so that it is capable of making predictions, and then test that model against holdout data to see how it performs. Several considerations make this process complex:

  • How the data is encoded (for example, how the song matrix is constructed)
  • What algorithm is used (for example, collaborative filtering or content-based filtering)
  • What parameter values your model takes (for example, values for smoothing constants or prior distributions)
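To make these three choices concrete, here is a minimal sketch in Python with NumPy. Everything in it is an assumption made for illustration: the toy play-count matrix, the helper names (cosine_similarity_matrix, predict_score, holdout_errors), the use of cosine similarity for the song matrix, and the way the smoothing constant is added to the denominator. It shows one possible encoding (a users-by-songs matrix), one possible algorithm (item-based collaborative filtering), and one parameter (smoothing) being checked against a single held-out value.

```python
import numpy as np


def cosine_similarity_matrix(play_counts):
    """Return a songs x songs cosine similarity matrix from a users x songs matrix."""
    norms = np.linalg.norm(play_counts, axis=0)
    norms[norms == 0] = 1.0  # avoid dividing by zero for songs nobody played
    normalized = play_counts / norms
    return normalized.T @ normalized


def predict_score(play_counts, similarity, user, song, smoothing=1.0):
    """Similarity-weighted average of the user's plays of the other songs.

    The smoothing constant in the denominator is an illustrative choice:
    larger values shrink the prediction toward zero.
    """
    weights = similarity[song].copy()
    weights[song] = 0.0  # do not let the song vote for itself
    return float(weights @ play_counts[user] / (weights.sum() + smoothing))


def holdout_errors(play_counts, user, song, smoothing_values):
    """Hide one known entry, fit on the rest, and report the absolute error per smoothing value."""
    train = play_counts.copy()
    train[user, song] = 0.0  # the held-out observation is removed from training
    similarity = cosine_similarity_matrix(train)
    actual = play_counts[user, song]
    return {
        s: abs(predict_score(train, similarity, user, song, smoothing=s) - actual)
        for s in smoothing_values
    }


# Hypothetical toy data: rows are users, columns are songs, values are play counts.
play_counts = np.array(
    [
        [5, 4, 0, 0],
        [4, 5, 0, 1],
        [0, 0, 5, 4],
        [0, 1, 4, 5],
    ],
    dtype=float,
)

similarity = cosine_similarity_matrix(play_counts)
errors = holdout_errors(play_counts, user=0, song=1, smoothing_values=(0.1, 1.0, 10.0))
for smoothing, error in errors.items():
    print(f"smoothing={smoothing}: absolute error on held-out entry = {error:.3f}")
```

Changing any one of the three choices (the matrix construction, the similarity-based predictor, or the smoothing value) changes the holdout error, which is why the process has to be iterated rather than done once. A real evaluation would hold out many entries, not one.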

Our goal in this book is to make this step easier for you by presenting the iterations a data scientist would go through when building a successful model, using real-world applications as examples.
