- Python Machine Learning Blueprints
- Alexander Combs, Michael Roman
- 344 words
- 2021-07-02 13:49:39
groupby
Let's now look at an operation that is highly useful but often difficult for new pandas users to get their heads around: the .groupby() method. We'll walk through several examples step by step to illustrate its most important functionality.
The groupby operation does exactly what it says: it splits the data into groups based on the values of one or more columns you choose. Let's take a look at a simple example using our iris dataset. We'll go back and reimport our original iris dataset, and run our first groupby operation:
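A minimal sketch of that first groupby call might look like the following. The nine rows are a small hand-typed sample standing in for the full iris dataset, and the column names (`petal length`, `petal width`, `species`) are assumptions about how the DataFrame was labeled on import.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, sample rows only)
df = pd.DataFrame({
    "petal length": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9],
    "petal width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# Split the rows by species, then average every numeric column within each group
means = df.groupby("species").mean()
print(means)
```

The result has one row per species and one column per numeric feature.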

Here, data for each species is partitioned and the mean for each feature is provided. Let's take it a step further now and get full descriptive statistics for each species:
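One way to get those per-species descriptive statistics is to call `.describe()` on the grouped object. This is a sketch on a hand-typed sample with assumed column names, not the full dataset.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, sample rows only)
df = pd.DataFrame({
    "petal length": [1.4, 1.4, 1.3, 4.7, 4.5, 4.9, 6.0, 5.1, 5.9],
    "petal width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# describe() reports count, mean, std, min, quartiles, and max for each group
stats = df.groupby("species").describe()
print(stats)
```

The columns of `stats` form a two-level index of (feature, statistic), so a single value can be read with, for example, `stats.loc["setosa", ("petal width", "mean")]`.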

And now, we can see the full breakdown bucketed by species. Let's now look at some other groupby operations we can perform. We saw previously that petal length and width had some relatively clear boundaries between species. Now, let's examine how we might use groupby to see that:
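A sketch of that idea, assuming the same column names as before: group by petal width and collect the unique species seen at each width. The sample rows are hand-typed for illustration.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, sample rows only)
df = pd.DataFrame({
    "petal width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 1.5],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# For every distinct petal width, list the species that have that measurement
by_width = df.groupby("petal width")["species"].unique().to_frame()
print(by_width)
```

A width that maps to a single species marks a clean boundary, while a width that maps to two species shows where the classes overlap.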
In this case, we have grouped the rows by petal width and listed the unique species associated with each width. This is a manageable number of measurements to group by, but if it were to become much larger, we would likely need to partition the measurements into brackets. As we saw previously, that can be accomplished by means of the apply function.
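The bracketing idea could be sketched as follows. The 1.0 cutoff, the bracket labels, and the `width bracket` column name are hypothetical choices made for illustration only.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, sample rows only)
df = pd.DataFrame({
    "petal width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# Map each measurement to a coarse bracket with apply (hypothetical cutoff of 1.0)
df["width bracket"] = df["petal width"].apply(
    lambda w: "narrow" if w < 1.0 else "wide"
)

# Group by the bracket instead of the raw measurement
brackets = df.groupby("width bracket")["species"].unique()
print(brackets)
```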
Let's now take a look at a custom aggregation function:
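A sketch of such an aggregation, written with pandas named aggregation (available since pandas 0.25); the original listing may have used a different calling style. The column names and sample rows are assumptions.

```python
import pandas as pd

# Small stand-in for the iris DataFrame (assumed column names, sample rows only)
df = pd.DataFrame({
    "petal width": [0.2, 0.2, 0.2, 1.4, 1.5, 1.5, 2.5, 1.9, 2.1],
    "species": ["setosa"] * 3 + ["versicolor"] * 3 + ["virginica"] * 3,
})

# Per species: the spread (max minus min), the max, and the min of petal width
summary = df.groupby("species")["petal width"].agg(
    delta=lambda x: x.max() - x.min(),  # custom aggregation: the range
    max="max",
    min="min",
)
print(summary)
```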

In this code, we grouped petal width by species and aggregated it with the .max() and .min() functions, plus a lambda function that returns the maximum petal width minus the minimum petal width, that is, the range within each species.
Hopefully, you now have a solid base-level understanding of how to manipulate and prepare data for our next step, which is modeling. We will now move on to discuss the primary libraries in the Python machine learning ecosystem.