
Installing the Data Science Toolbox

The Data Science Toolbox (DST) is a virtual environment based on Ubuntu for data analysis using Python and R. Since DST is a virtual environment, we can install it on various operating systems. We will install DST locally, which requires VirtualBox and Vagrant. VirtualBox is a virtual machine application originally created by Innotek GmbH in 2007. Vagrant, created by Mitchell Hashimoto, is a wrapper around virtual machine applications such as VirtualBox.

Getting ready

You need on the order of 2 to 3 GB of free disk space for VirtualBox, Vagrant, and DST itself. The exact amount may vary by operating system.
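
As a rough sketch of how to check this requirement up front, the following snippet compares the free space on the filesystem that holds your home folder with a 3 GB threshold. The has_free_gb helper and the choice of 3 GB are assumptions made for illustration, and the snippet presumes a Unix-like shell (Linux, OS X, or similar), not Windows.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least $2 gigabytes available.
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P prints one line per filesystem in POSIX format;
    # -k reports sizes in 1024-byte blocks. Field 4 is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((need_gb * 1024 * 1024)) ]
}

# Assumption: the VM ends up under the home folder, so that is what we check.
if has_free_gb "${HOME:-.}" 3; then
    echo "At least 3 GB free: enough for VirtualBox, Vagrant, and DST"
else
    echo "Less than 3 GB free: consider freeing up disk space first"
fi
```

If you keep your virtual machines on another disk, pass that directory to has_free_gb instead of the home folder.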

How to do it...

Installing DST requires the following steps:

  1. Install VirtualBox by downloading an installer for your operating system and architecture from https://www.virtualbox.org/wiki/Downloads (retrieved July 2015) and running it. I installed VirtualBox 4.3.28-100309, but you can install the most recent version available.
  2. Install Vagrant by downloading an installer for your operating system and architecture from https://www.vagrantup.com/downloads.html (retrieved July 2015) and running it. I installed Vagrant 1.7.2; again, you can install a more recent version if one is available.
  3. Create a directory to hold the DST and navigate to it with a terminal. Run the following commands:
    $ vagrant init data-science-toolbox/dst
    $ vagrant up
    

    The first command creates a configuration file named Vagrantfile. Most of its content is commented out, but the file does contain links to documentation that might be useful. The second command creates the DST virtual machine and starts a download that could take a few minutes.

  4. Connect to the virtual environment as follows (on Windows, use an SSH client such as PuTTY):
    $ vagrant ssh
    
  5. View the preinstalled Python packages with the following command:
    vagrant@data-science-toolbox:~$ pip freeze
    

    The list is quite long; in my case it contained 32 packages. The DST Python version as of July 2015 was 2.7.6.

  6. When you are done with the DST, log out and suspend the VM (to shut it down completely, use vagrant halt instead of vagrant suspend):
    vagrant@data-science-toolbox:~$ logout
    Connection to 127.0.0.1 closed.
    $ vagrant suspend
    ==> default: Saving VM state and suspending execution...
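
If step 3 fails, it can help to confirm that both tools from steps 1 and 2 are reachable from the terminal. The following is a minimal sketch, assuming a Unix-like shell and that the installers put the VBoxManage and vagrant commands on your PATH (this is not guaranteed on every system). The have_tool helper is a name made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the command named $1 is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in VBoxManage vagrant; do
    if have_tool "$tool"; then
        # Both commands accept --version and print their version string.
        echo "$tool: $("$tool" --version)"
    else
        echo "$tool: not found on PATH"
    fi
done
```

A "not found" message does not necessarily mean that the installation failed; the installation directory may simply be missing from PATH.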
    

How it works...

Virtual machines (VMs) emulate computers in software. VirtualBox is an application that creates and manages VMs. VirtualBox stores its VMs in your home folder, and this particular VM takes about 2.2 GB of storage.
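
To see how much space the VMs take on your own machine, a sketch like the following can be used. It assumes a Unix-like host and that VirtualBox uses its default machine folder, a directory named VirtualBox VMs in your home folder; if you changed that setting, the path will differ. The report_vm_usage function is a hypothetical helper written for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the total size of directory $1 in kilobytes,
# or fail with a message if the directory does not exist.
report_vm_usage() {
    if [ -d "$1" ]; then
        # -s prints a single summary line, -k reports 1024-byte units.
        du -sk "$1"
    else
        echo "No such directory: $1" >&2
        return 1
    fi
}

# Assumption: VirtualBox stores its VMs in the default machine folder.
report_vm_usage "${HOME:-.}/VirtualBox VMs" ||
    echo "Pass the machine folder configured in VirtualBox instead"
```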

Ubuntu is an open source Linux operating system, and its license allows us to create virtual machines from it. Ubuntu has several versions; we can find out which one the DST uses with the lsb_release command:

    vagrant@data-science-toolbox:~$ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 14.04 LTS
    Release:        14.04
    Codename:       trusty
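
When a script needs only a single value, such as the release number, reading the /etc/os-release file is an alternative to parsing the lsb_release output. The following is a minimal sketch; it assumes that the file exists and uses the usual KEY="value" layout, and os_field is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the value of key $2 from the
# os-release style file $1, with surrounding double quotes removed.
os_field() {
    sed -n "s/^$2=//p" "$1" | tr -d '"'
}

# Assumption: the system provides /etc/os-release.
if [ -r /etc/os-release ]; then
    echo "Distribution: $(os_field /etc/os-release NAME)"
    echo "Release: $(os_field /etc/os-release VERSION_ID)"
else
    echo "/etc/os-release not found; fall back to lsb_release -a"
fi
```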

Vagrant used to work only with VirtualBox, but it currently also supports VMware, KVM, Docker, and Amazon EC2. Vagrant calls virtual machines boxes. Some of these boxes are publicly available at http://www.vagrantbox.es/ (retrieved July 2015).

See also
