- Machine Learning with Swift
- Alexander Sosnovshchenko
- 2021-06-24 18:54:55
Converting categorical variables
As you may have already noticed, a data frame can contain columns of different data types. To see the type of each column, check the dtypes attribute of the data frame. You can think of Python attributes as being similar to Swift properties:
In []:
df.dtypes
Out[]:
length    float64
color      object
fluffy       bool
label      object
dtype: object
While the length and fluffy columns have the expected data types, the types of color and label are less transparent. What are those objects? The object dtype means that a column can hold Python objects of any type. At the moment, these columns hold strings, but what we really want them to be are categorical variables. In case you don't remember from the previous chapter, categorical variables are like Swift enums. Fortunately for us, a data frame has handy methods for converting columns from one type to another:
In []:
df.color = df.color.astype('category')
df.label = df.label.astype('category')
That's it. Let's check:
In []:
df.dtypes
Out[]:
length     float64
color     category
fluffy        bool
label     category
dtype: object
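The conversion above can be tried end to end with a small stand-in data frame. This is a minimal sketch: the column names follow the text, but the rows (including the label values) are made up for illustration, and the look at the integer codes is only there to make the comparison with Swift enums more concrete.

```python
import pandas as pd

# Hypothetical sample rows: column names follow the text, values are invented.
df = pd.DataFrame({
    "length": [27.5, 12.1, 40.4],
    "color": ["space gray", "pink gold", "space gray"],
    "fluffy": [False, True, False],
    "label": ["class_a", "class_b", "class_a"],
})

# Before conversion, the string columns show up as object
# (or as a dedicated string dtype in newer pandas versions).
print(df.dtypes)

# Convert the string columns to categorical variables.
for column in ["color", "label"]:
    df[column] = df[column].astype("category")

# After conversion, color and label are reported as category.
print(df.dtypes)

# Each value is stored as an integer code that indexes into the list of
# categories, loosely like an enum case backed by a raw value.
color_codes = df["color"].cat.codes.tolist()
print(color_codes)
```

Converting column by column, as in the text, keeps the numeric and Boolean columns untouched.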
The color and label columns are categorical now. To see all the categories of the color column, execute:
In []:
colors = df.color.cat.categories.get_values().astype('string')
colors
Out[]:
array(['light black', 'pink gold', 'purple polka-dot', 'space gray'],
      dtype='|S16')
As expected, we have four colors. The dtype '|S16' denotes fixed-width strings of 16 characters, which is the length of the longest color name, 'purple polka-dot'.
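The call chain above appears to target an older, Python 2 era stack; in particular, get_values() was deprecated and later removed from pandas. A minimal sketch of the same inspection for a recent pandas and NumPy follows. The Series here is a stand-in for df.color, built from the four color names in the output above, and the use of tolist() and a byte-string array is an assumption about how one might reproduce the result, not the author's method.

```python
import numpy as np
import pandas as pd

# Stand-in for df.color: the four color names come from the output above,
# the order and the repeated value are invented.
color = pd.Series(
    ["space gray", "pink gold", "light black", "purple polka-dot", "pink gold"],
    dtype="category",
)

# The categories are the unique values, sorted when inferred from the data.
colors = color.cat.categories.tolist()
print(colors)

# Converting to fixed-width byte strings reproduces the S16 dtype:
# the width is set by the longest name, 'purple polka-dot'.
as_bytes = np.array(colors, dtype="S")
print(as_bytes.dtype)
```

If only the names are needed, the plain Python list in colors is usually enough; the byte-string array is shown only to relate the result to the '|S16' dtype in the output above.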