Hands-On Data Science with R
R is the most widely used programming language, and when used in association with data science, this powerful combination will solve the complexities involved with unstructured datasets in the real world. This book covers the entire data science ecosystem for aspiring data scientists, right from zero to a level where you are confident enough to get hands-on with real-world data science problems. The book starts with an introduction to data science and introduces readers to popular R libraries for executing data science routine tasks. This book covers all the important processes in data science such as data gathering, cleaning data, and then uncovering patterns from it. You will explore algorithms such as machine learning algorithms, predictive analytical models, and finally deep learning algorithms. You will learn to run the most powerful visualization packages available in R so as to ensure that you can easily derive insights from your data. Towards the end, you will also learn how to integrate R with Spark and Hadoop and perform large-scale data analytics without much complexity.
Table of Contents (231 chapters)
- coverpage
- Title Page
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the authors
- About the reviewer
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Getting Started with Data Science and R
- Introduction to data science
- Key components of data science
- Computer science
- Predictive analytics (machine learning)
- Domain knowledge
- Active domains of data science
- Finance
- Healthcare
- Pharmaceuticals
- Government
- Manufacturing and retail
- Web industry
- Other industries
- Solving problems with data science
- Using R for data science
- Key features of R
- Our first R program
- UN development index
- Summary
- Quiz
- Descriptive and Inferential Statistics
- Measures of central tendency and dispersion
- Measures of central tendency
- Calculating mean, median, and mode with base R
- Measures of dispersion
- Useful functions to draw automated summaries
- Statistical hypothesis testing
- Running t-tests with R
- Decision rule – a brief overview of the p-value approach
- Be careful
- Running z-tests with R
- Elaborating a little longer
- A/B testing – a brief introduction and a practical example with R
- Summary
- Quiz
- Data Wrangling with R
- Introduction to data wrangling with R
- Data types, formats, and sources
- Data extraction, transformation, and load
- Basic tools of data wrangling
- Using base R for data manipulation and analysis
- Applying families of functions
- Aggregation functions
- Merging DataFrames
- Using tibble and dplyr for data manipulation
- Basic dplyr usage
- Using select
- Filtering with filter
- Using arrange for sorting
- Summarise
- Sampling data
- The tidyr package
- Converting wide tables into long tables
- Converting long tables into wide tables
- Joining tables
- dbplyr – databases and dplyr
- Using data.table for data manipulation
- Grouping operations
- Adding a column
- Ordering columns
- What is the advantage of searching using key by?
- Creating new columns in data.table
- Deleting a column
- Pivots on data.table
- The melt functionality
- Reading and writing files with data.table
- A special note on dates and/or time
- Miscellaneous topics
- Checking data quality
- Reading other file formats – Excel, SAS, and other data sources
- On-disk formats
- Working with web data
- Web APIs
- Tutorial – looking at airline flight times data
- Summary
- Quiz
- KDD, Data Mining, and Text Mining
- Good practices of KDD and data mining
- Stages of KDD
- Scraping a dwarf name
- Retrieving text from the web
- Legality of web scraping
- Web scraping made easy with rvest
- Retrieving tweets from R community
- Creating your Twitter application
- Fetching the number of tweets
- Cleaning and transforming data
- Looking for patterns – peeking, visualizing, and clustering data
- Peeking data
- Visualizing data
- Cluster analysis
- Summary
- Quiz
- Data Analysis with R
- Preparing data for analysis
- Data categories
- Data types in R
- Reading data
- Managing data issues
- Mixed data types
- Missing data
- Handling strings and dates
- Handling dates using POSIXct or POSIXlt
- Handling strings in R
- Reading data
- Combining strings
- Simple pattern matching and replacement with R
- Printing results
- Data visualisation
- Types of charts – basic primer
- Histograms
- Line plots
- Scatter plots
- Boxplots
- Bar charts
- Heatmaps
- Summarizing data
- Saving analysis for future work
- Packrat
- Checkpoint
- Rocker
- Summary
- Quiz
- Machine Learning with R
- What is machine learning?
- Machine learning everywhere
- Machine learning vocabulary
- Generic problems solved by machine learning
- Linear regression with R
- Tricks for lm
- Tree models
- Strengths and weaknesses
- The Chilean plebiscite data
- Starting with decision trees
- Growing trees with tree and rpart
- Random forests – a collection of trees
- Support vector machines
- What about regressions?
- Hierarchical and k-means clustering
- Neural networks
- Introduction to feedforward neural networks with R
- Summary
- Quiz
- Forecasting and ML App with R
- The UI and server
- Forecasting machine learning application
- Application details
- Summary
- Quiz
- Neural Networks and Deep Learning
- Daily neural nets
- Overview – NNs and deep learning
- Neuroscience inspiration
- ANN nodes
- Activation functions
- Layers
- Training algorithms
- NNs with Keras
- Getting things ready for Keras
- Getting practical with Keras
- Further tips
- Summary
- Quiz
- Markovian in R
- Markovian-type models
- Markovian models – real-world applications
- The Markov chain
- Programming an HMM with R
- Summary
- Quiz
- Visualizing Data
- Retrieving and cleaning data
- Crafting visualizations
- Summary
- Quiz
- Going to Production with R
- What is R Shiny?
- How to build a Shiny app
- Building an application inside R
- The reactive and isolate functions
- The observeEvent and eventReactive functions
- Approach for creating a data product from statistical modeling and web UI
- Some advice about Shiny
- Summary
- Quiz
- Large Scale Data Analytics with Hadoop
- Installing the package and Spark
- Manipulating Spark data using both dplyr and SQL
- Filtering and aggregating Spark datasets
- Using Spark machine learning or H2O Sparkling Water
- Providing interfaces to Spark packages
- Spark DataFrames within the RStudio IDE
- Summary
- Quiz
- R on Cloud
- Cloud computing
- Cloud types
- Things to look for
- Why Azure?
- Azure registration
- Azure Machine Learning Studio
- How modules work
- Building an experiment that uses R
- Summary
- Quiz
- The Road Ahead
- Growing your skills
- Gathering data
- Content to stay tuned to
- Meeting Stack Overflow
- Other Books You May Enjoy
- Leave a review - let other readers know what you think
Updated: 2021-06-10 19:13:14