- Machine Learning for Cybersecurity Cookbook
- Emmanuel Tsukerman
Summarizing large data using principal component analysis
Suppose that you would like to build a predictor for an individual's expected net fiscal worth at age 45. There are a huge number of variables to consider: IQ, current fiscal worth, marital status, height, geographical location, health, education, career status, age, and many others you might come up with, such as number of LinkedIn connections or SAT scores.
The trouble with having so many features is several-fold. First, the sheer amount of data incurs high storage costs and long computation times for your algorithm. Second, with a large feature space, a large amount of data is needed for the model to be accurate; that is to say, it becomes harder to distinguish the signal from the noise. For these reasons, when dealing with high-dimensional data such as this, we often employ dimensionality reduction techniques, such as PCA. More information on the topic can be found at https://en.wikipedia.org/wiki/Principal_component_analysis.
PCA allows us to take our features and return a smaller number of new features, formed from our original ones, with maximal explanatory power. In addition, since the new features are linear combinations of the old features, this allows us to anonymize our data, which is very handy when working with financial information, for example.
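A minimal sketch of this idea using scikit-learn's `PCA` might look as follows. The dataset here is randomly generated as a stand-in for the financial features described above; in practice you would substitute your own feature matrix. Standardizing first is a common choice so that features measured on large scales (such as net worth) do not dominate the components:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 100 individuals, 10 numeric features
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(100, 10))

# Standardize so each feature contributes on a comparable scale
X_std = StandardScaler().fit_transform(X)

# Passing a float to n_components keeps the smallest number of
# components that explains at least that fraction of the variance
pca = PCA(n_components=0.9)
X_reduced = pca.fit_transform(X_std)

print(X_reduced.shape)                      # fewer columns than the original
print(pca.explained_variance_ratio_.sum())  # at least 0.9
```

Each column of `X_reduced` is a linear combination of all the original features, which is what makes the transformed data both compact and hard to map back to any single sensitive attribute.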