Introduction

This book targets intermediate to advanced users who are familiar with Python, IPython, and scientific computing. In this chapter, we will briefly recap the fundamental tools used throughout this book: IPython, the notebook, pandas, NumPy, and matplotlib.

In this introduction, we will give a broad overview of IPython and the Python scientific stack for high-performance computing and data science.

What is IPython?

IPython is an open source platform for interactive and parallel computing. It offers powerful interactive shells and a browser-based notebook. The notebook combines code, text, mathematical expressions, inline plots, interactive plots, and other rich media within a sharable web document. This platform provides an ideal framework for interactive scientific computing and data analysis. IPython has become essential to researchers, data scientists, and teachers.

IPython can be used with the Python programming language, but the platform also supports many other languages such as R, Julia, Haskell, or Ruby. The architecture of the project is indeed language-agnostic, consisting of messaging protocols and interactive clients (including the browser-based notebook). The clients are connected to kernels that implement the core interactive computing facilities. Therefore, the platform can be useful to technical and scientific communities that use languages other than Python.
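The separation between clients and kernels can be illustrated with a toy model. The following is only a minimal sketch, not IPython's actual implementation: the names ToyKernel, ToyClient, and make_message are hypothetical, the message layout is a simplified assumption loosely modeled on the real protocol, and messages are passed by direct function calls rather than over a network transport. It shows two clients sharing the state held by a single kernel.

```python
import contextlib
import io
import uuid


def make_message(msg_type, content, parent=None):
    """Build a dictionary loosely shaped like a kernel message (simplified)."""
    return {
        "header": {"msg_id": uuid.uuid4().hex, "msg_type": msg_type},
        "parent_header": parent["header"] if parent else {},
        "content": content,
    }


class ToyKernel:
    """Hypothetical kernel: owns the user namespace and executes code."""

    def __init__(self):
        self.namespace = {}
        self.execution_count = 0

    def handle(self, request):
        # Only one request type is modeled in this sketch.
        if request["header"]["msg_type"] != "execute_request":
            return make_message("error", {"status": "error"}, parent=request)
        self.execution_count += 1
        stdout = io.StringIO()
        try:
            # Capture printed output so it can be sent back to the client.
            with contextlib.redirect_stdout(stdout):
                exec(request["content"]["code"], self.namespace)
            status = "ok"
        except Exception as exc:
            status = "error"
            stdout.write(repr(exc))
        content = {
            "status": status,
            "execution_count": self.execution_count,
            "stdout": stdout.getvalue(),
        }
        return make_message("execute_reply", content, parent=request)


class ToyClient:
    """Hypothetical client: knows nothing about the kernel's language."""

    def __init__(self, kernel):
        self.kernel = kernel

    def run(self, code):
        request = make_message("execute_request", {"code": code})
        reply = self.kernel.handle(request)
        return reply["content"]


kernel = ToyKernel()
notebook = ToyClient(kernel)  # stands in for the browser-based notebook
console = ToyClient(kernel)   # stands in for a terminal client

notebook.run("x = 21")
result = console.run("print(x * 2)")
print(result["stdout"])
```

The client only builds and reads messages, so it never needs to know which language the kernel runs. This is the property that makes the architecture language-agnostic. In this sketch, the captured output is returned inside the reply for brevity; that is a simplification of our own.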

In July 2014, Project Jupyter was announced by the IPython developers. This project will focus on the language-independent parts of IPython (including the notebook architecture), whereas the name IPython will be reserved for the Python kernel. In this book, for the sake of simplicity, we will just use the term IPython to refer to either the platform or the Python kernel.

A brief historical retrospective on Python as a scientific environment

Python is a high-level general-purpose language originally conceived by Guido van Rossum in the late 1980s (the name was inspired by the British comedy Monty Python's Flying Circus). This easy-to-use language is often used as a glue language, that is, as the basis of scripts that tie different software components together. In addition, Python comes with an extremely rich standard library (the batteries-included philosophy), which covers string processing, Internet protocols, operating system interfaces, and many other domains.

In the late 1990s, Travis Oliphant and others started to build efficient tools to deal with numerical data in Python: Numeric, Numarray, and finally, NumPy. SciPy, which implements many numerical computing algorithms, was also created on top of NumPy. In the early 2000s, John Hunter created matplotlib to bring scientific graphics to Python. At the same time, Fernando Perez created IPython to improve interactivity and productivity in Python. With these fundamental tools in place, Python could become a great open source high-performance framework for scientific computing and data analysis.

Note

It is worth noting that Python as a platform for scientific computing was built slowly, step-by-step, on top of a programming language that was not originally designed for this purpose. This fact might explain a few minor inconsistencies or weaknesses of the platform, which do not preclude it from being one of the most popular open frameworks for scientific computing at this time. (You can also refer to http://cyrille.rossant.net/whats-wrong-with-scientific-python/.)

Notable competing open source platforms for numerical computing and data analysis include R (which focuses on statistics) and Julia (a young, high-level language that focuses on high performance and parallel computing). We will see these two languages very briefly in this book, as they can be used from the IPython notebook.

In the late 2000s, Wes McKinney created pandas for the manipulation and analysis of numerical tables and time series. At the same time, the IPython developers started to work on a notebook client inspired by mathematical software such as Sage, Maple, and Mathematica. Finally, IPython 0.12, released in December 2011, introduced the HTML-based notebook that has now gone mainstream.

In 2013, the IPython team received a grant from the Sloan Foundation and a donation from Microsoft to support the development of the notebook. IPython 2.0, released in early 2014, brought many improvements and long-awaited features.

What's new in IPython 2.0?

Here is a short summary of the changes brought by IPython 2.0 (succeeding v1.1):

  • The notebook comes with a new modal user interface:
    • In the edit mode, we can edit a cell by entering code or text.
    • In the command mode, we can edit the notebook by moving cells around, duplicating or deleting them, changing their types, and so on. In this mode, the keyboard is mapped to a set of shortcuts that let us perform notebook and cell actions efficiently.
  • Notebook widgets are JavaScript-based GUI widgets that interact dynamically with Python objects. This major feature considerably expands the possibilities of the IPython notebook. Writing Python code in the notebook is no longer the only possible interaction with the kernel. JavaScript widgets and, more generally, any JavaScript-based interactive element, can now interact with the kernel in real-time.
  • We can now open notebooks in different subfolders with the dashboard, using the same server. A REST API maps local URIs to the filesystem.
  • Notebooks are now signed to prevent untrusted code from executing when notebooks are opened.
  • The dashboard now contains a Running tab with the list of running kernels.
  • The tooltip now appears when pressing Shift + Tab instead of Tab.
  • Notebooks can be run in an interactive session via %run notebook.ipynb.
  • The %pylab magic is discouraged in favor of %matplotlib inline (to embed figures in the notebook) and import matplotlib.pyplot as plt. The main reason is that %pylab clutters the interactive namespace by importing a huge number of variables. Also, it might harm the reproducibility and reusability of notebooks.
  • Python 2.6 and 3.2 are no longer supported. IPython now requires Python 2.7 or >= 3.3.
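The namespace issue mentioned above for %pylab can be illustrated with the standard library alone. The following is a minimal sketch by analogy, not a reproduction of what %pylab does internally: the math module stands in for the numerical libraries, and the helper user_names is a hypothetical name introduced for this example. It compares the names created by a star import with those created by an explicit, aliased import (in the style of import matplotlib.pyplot as plt).

```python
def user_names(ns):
    """Return the names added to a namespace, ignoring __builtins__."""
    return sorted(name for name in ns if name != "__builtins__")


# Namespace after a star import (the style of import that clutters
# the interactive namespace).
star_ns = {}
exec("from math import *", star_ns)

# Namespace after an explicit, aliased import.
explicit_ns = {}
exec("import math as m", explicit_ns)

print(len(user_names(star_ns)), "names after the star import")
print(user_names(explicit_ns), "after the explicit import")

# A star import can also silently replace an existing name:
# here, math.pow shadows the built-in pow and returns a float.
shadow_ns = {}
exec("from math import *\nresult = pow(2, 3)", shadow_ns)
print(type(shadow_ns["result"]))
```

With the explicit import, every function call states where it comes from (m.sqrt rather than sqrt), which makes a notebook easier to read and to reuse outside an interactive session.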

Roadmap for IPython 3.0 and 4.0

IPython 3.0 and 4.0, planned for late 2014/early 2015, should facilitate the use of non-Python kernels and provide multiuser capabilities to the notebook.

References

Here are a few references:
