官术网_书友最值得收藏!

Choosing the right algorithm

In the previous section, we learned the difference between various types of machine learning algorithms. So, we understood the basic principles that underlie the different techniques. Now it's time to ask ourselves the following question: What is the right algorithm for my needs?
Unfortunately there is no common answer for everyone, except the more generic: It depends. But what does it depend on? It mainly depends on the data available to us: the size, quality, and nature of the data. It depends on what we want to do with the answer. It depends on how the algorithm has been expressed in instructions for the computer. It depends on how much time we have. There is no best method or one-size-fits-all. The only way to be sure that the algorithm chosen is the right one is to try it.

However, to understand what is most suitable for our needs, we can perform a preliminary analysis. Beginning from what we have (data), what tools we have available (algorithms), and what objectives we set for ourselves (the results), we can obtain useful information on the road ahead.

If we start from what we have (data), it is a classification problem, and two options are available:

  • Classify based on input: We have a supervised learning problem if we can label the input data. If we cannot label the input data but want to find the structure of the system, then it is unsupervised. Finally, if our goal is to optimize an objective function by interacting with the environment, it is a reinforcement learning problem.
  • Classify based on output: If our model output is a number, we have to deal with a regression problem. But it is a classification problem if the output of the model is a class. Finally, we have a clustering problem if the output of the model is a set of input groups.

The following is a figure that shows two options available in the classification problem:


Figure 1.9: Preliminary analysis

After classifying the problem, we can analyze the tools available to solve the specific problem. Thus, we can identify the algorithms that are applicable and focus our study on the methods to be implemented to apply these tools to our problem.

Having identified the tools, we need to evaluate their performance. To do this, we simply apply the selected algorithms on the datasets at our disposal. Subsequently, on the basis of a series of carefully selected evaluation criteria, we carry out a comparison of the performance of each algorithm.

主站蜘蛛池模板: 婺源县| 舟山市| 饶阳县| 金平| 修文县| 株洲县| 嘉善县| 灵丘县| 信阳市| 韩城市| 孝义市| 洪雅县| 金塔县| 益阳市| 崇信县| 西城区| 洪泽县| 抚州市| 漳平市| 兴文县| 漠河县| 施秉县| 正宁县| 泸溪县| 泰州市| 鄂伦春自治旗| 共和县| 广西| 九龙城区| 乐都县| 申扎县| 陇川县| 大庆市| 承德县| 新闻| 阳高县| 巧家县| 阳东县| 永定县| 吴忠市| 溧阳市|