What this book covers

Chapter 1, Jupyter and Data Science, covers the details of the Jupyter user interface: what objects it works with and what actions you can take in Jupyter. We'll see what the display tells us about the data, what tools are available, and some real-life examples from the industry showing R and Python coding. We will also see some of the ways to share our notebook with other users and how to protect it with different security mechanisms.

Chapter 2, Working with Analytical Data in Jupyter, covers using Python to scrape a website to gather data for analysis. Then we use Python NumPy, pandas, and SciPy functions for in-depth computations of results. The chapter goes further into pandas and explores manipulating data frames. Lastly, it shows examples of sorting and filtering data frames.

Chapter 3, Data Visualization and Prediction, demonstrates prediction models from Python and R under Jupyter. It then uses Matplotlib for data visualization and interactive plotting (under Python), and covers several graphing techniques available in Jupyter as well as density maps with SciPy. We use histograms to visualize social data. Lastly, we generate a 3D plot in Jupyter.

Chapter 4, Data Mining and SQL Queries, covers Spark Context. We show examples of using Hadoop map/reduce and use SQL with Spark data. Then we combine data frames, operate on the resulting set, import JSON data, and manipulate it with Spark. Lastly, we look at using a pivot to gather information about a data frame.

Chapter 5, R on Jupyter, covers setting up R to be one of the engines available for a notebook. Then we use some rudimentary R to analyze voter demographics for a presidential election and trends in college admissions. Finally, we look at using a predictive model to determine whether some flights would be delayed or not.

Chapter 6, Data Wrangling, teaches reading in CSV files and performing some quick analysis of the data, including visualizations to help understand the data. Next, we consider some of the functions available in the dplyr package. We also use piping to more easily transfer the results of one operation into another operation. Lastly, we look into using the tidyr package to clean up or tidy up our data.

Chapter 7, Jupyter Dashboards, covers visualizing data graphically using glyphs to emphasize important aspects of the data. We use markdown to annotate a notebook page and Shiny to generate an interactive application. We show a way to host notebooks outside of Jupyter.

Chapter 8, Statistical Modeling, teaches converting a JSON file to a CSV file. We evaluate the Yelp cuisine review dataset, determining the top-rated and most-rated firms. We use Python to perform a similar evaluation of Yelp business ratings, finding very similar distributions of the data.

Chapter 9, Machine Learning Using Jupyter, covers several machine learning algorithms in both R and Python so that the two can be compared and contrasted. We use naive Bayes to determine how the data might be used. We apply nearest neighbor in a couple of different ways to see results. We also use decision trees to come up with an algorithm for predictions and a neural net to explain housing prices. Finally, we use a random forest algorithm to do the same.

Chapter 10, Optimizing Jupyter Notebooks, covers deploying your notebook so that others can access it. It shows optimizations you can make to increase your notebook's performance. Then we look at securing the notebook and the mechanisms for sharing it.
