To get the most out of this book

The examples have been implemented in Scala, Java, R, and Python on 64-bit Linux. You will also need the following installed on your machine, or be prepared to install it (preferably the latest version):

  • Spark 2.3.0 (or higher)
  • Hadoop 3.1 (or higher)
  • Flink 1.4
  • Java (JDK and JRE) 1.8+
  • Scala 2.11.x (or higher)
  • Python 2.7+/3.4+
  • R 3.1+ and RStudio 1.0.143 (or higher)
  • Eclipse Mars or IntelliJ IDEA (latest)
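
One way to check a machine against the list above is a small script that asks each tool for its version. The following is a minimal sketch, not part of the book's code: the command names (`spark-submit`, `hadoop`, `python3`, and so on) are assumptions about a typical installation, the helper names are hypothetical, and the version parsing is a heuristic that takes the first dotted number in the output. It assumes Python 3.7 or later.

```python
import re
import shutil
import subprocess

# Minimum versions are taken from the list above. The commands used to query
# each tool are assumptions about a typical installation; adjust as needed.
REQUIREMENTS = {
    "java": (["java", "-version"], (1, 8)),
    "scala": (["scala", "-version"], (2, 11)),
    "python": (["python3", "--version"], (3, 4)),
    "R": (["R", "--version"], (3, 1)),
    "spark": (["spark-submit", "--version"], (2, 3, 0)),
    "hadoop": (["hadoop", "version"], (3, 1)),
}


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    found = found + (0,) * (width - len(found))
    minimum = minimum + (0,) * (width - len(minimum))
    return found >= minimum


def check_tool(command, minimum):
    """Return 'ok', 'too old', 'unknown', or 'missing' for one tool."""
    if shutil.which(command[0]) is None:
        return "missing"
    result = subprocess.run(command, capture_output=True, text=True)
    # Some tools (for example java) print their version to stderr.
    found = parse_version(result.stdout + result.stderr)
    if found is None:
        return "unknown"
    return "ok" if meets_minimum(found, minimum) else "too old"


if __name__ == "__main__":
    for name, (command, minimum) in REQUIREMENTS.items():
        print(f"{name}: {check_tool(command, minimum)}")
```

Because the parser takes the first dotted number it sees, a warning line printed before the version banner (for instance, one containing an IP address) can produce a wrong reading, so treat any surprising result as a prompt to run the command by hand.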

Regarding the operating system: Linux distributions are preferable (including Debian, Ubuntu, Fedora, RHEL, and CentOS). For Ubuntu specifically, a complete 64-bit installation of 14.04 LTS or later is recommended, whether on physical hardware or in VMware Player 12 or VirtualBox. You can also run the code on Windows (XP/7/8/10) or Mac OS X (10.4.7+).

Regarding hardware configuration: a Core i3, Core i5 (recommended), or Core i7 (for the best results) processor. More cores provide faster data processing and better scalability. At least 8 GB of RAM is recommended for standalone mode, at least 32 GB for a single VM, and more for a cluster. You also need enough storage for running heavy jobs (depending on the size of the datasets you will be handling), preferably at least 50 GB of free disk space (for standalone mode and the SQL warehouse).
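
The memory and disk figures above can be checked with a few lines of standard-library Python. This is a rough sketch under stated assumptions: the function names are hypothetical, the memory query uses POSIX `sysconf` and is assumed to run on Linux, and the default thresholds are the 8 GB and 50 GB standalone figures from the paragraph above.

```python
import os
import shutil

GIB = 1024 ** 3


def total_ram_gib():
    """Total physical memory in GiB via POSIX sysconf (assumes Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") / GIB


def free_disk_gib(path="/"):
    """Free space in GiB on the filesystem that contains path."""
    return shutil.disk_usage(path).free / GIB


def summarize(ram_gib, disk_gib, cores, min_ram_gib=8, min_disk_gib=50):
    """Compare measured resources with the standalone-mode recommendations."""
    return {
        "cores": cores,
        "ram_ok": ram_gib >= min_ram_gib,
        "disk_ok": disk_gib >= min_disk_gib,
    }


if __name__ == "__main__":
    print(summarize(total_ram_gib(), free_disk_gib(), os.cpu_count()))
```

The operating system usually reports slightly less memory than the nominal size of the installed modules, so a machine sold as 8 GB may fail the `ram_ok` check by a small margin; read the result as a rough guide rather than a hard verdict.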
