- Learning Hadoop 2
- Garry Turkington Gabriele Modena
- 276 words
- 2021-07-23 20:57:38
AWS – infrastructure on demand from Amazon
AWS is a set of cloud-computing services offered by Amazon. We will use several of these services in this book.
Simple Storage Service (S3)
Amazon's Simple Storage Service (S3), found at http://aws.amazon.com/s3/, is a storage service built on a simple key-value model. Through web, command-line, or programmatic interfaces you create objects, which can be anything from text files to images to MP3s, and then store and retrieve them within a two-level hierarchy: you create buckets, and buckets contain objects. Each bucket has a unique identifier, and within each bucket every object is uniquely named. This simple strategy enables an extremely powerful service for which Amazon takes complete responsibility: service scaling as well as the reliability and availability of the data.
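The bucket-and-object model described above can be pictured with a small in-memory sketch. This is only an illustration of the naming rules, not the AWS API: the class `ToyObjectStore`, its methods, the bucket name `example-logs`, and the object names are all hypothetical, and real access would go through the S3 console, command-line tools, or an SDK.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bucket:
    """Toy stand-in for a bucket: a mapping from object name to content."""
    name: str
    objects: Dict[str, bytes] = field(default_factory=dict)


class ToyObjectStore:
    """In-memory illustration of the bucket/object model (hypothetical, not S3 itself)."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Bucket] = {}

    def create_bucket(self, name: str) -> Bucket:
        # Each bucket has a unique identifier, so a repeated name is rejected.
        if name in self._buckets:
            raise ValueError(f"bucket name already taken: {name}")
        bucket = Bucket(name)
        self._buckets[name] = bucket
        return bucket

    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        # Within a bucket, the object name is the key; writing it again replaces the value.
        self._buckets[bucket].objects[key] = data

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket].objects[key]

    def list_keys(self, bucket: str, prefix: str = "") -> List[str]:
        # Slashes in object names give a folder-like feel through prefix filtering.
        return sorted(k for k in self._buckets[bucket].objects if k.startswith(prefix))


store = ToyObjectStore()
store.create_bucket("example-logs")
store.put_object("example-logs", "2014/01/access.log", b"GET /index.html\n")
store.put_object("example-logs", "2014/02/access.log", b"GET /about.html\n")
january = store.get_object("example-logs", "2014/01/access.log")
```

Retrieval always takes the pair (bucket, object name), which is what makes the model a key-value store rather than a filesystem.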
Elastic MapReduce (EMR)
Amazon's Elastic MapReduce (EMR), found at http://aws.amazon.com/elasticmapreduce/, is essentially Hadoop in the cloud. Using any of its interfaces (web console, CLI, or API), you define a Hadoop workflow with attributes such as the number of Hadoop hosts required and the location of the source data, provide the Hadoop code that implements the MapReduce jobs, and press the virtual Go button.
In its most impressive mode, EMR can pull source data from S3, process it on a Hadoop cluster that it creates on EC2 (Amazon's on-demand virtual host service), push the results back into S3, and then terminate the Hadoop cluster and the EC2 virtual machines hosting it. Naturally, each of these services has a cost (usually based on GB stored and server time used), but access to such data-processing capabilities with no need for dedicated hardware is a powerful option.
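The lifecycle just described (create a cluster, pull from S3, process, push back to S3, terminate) can be sketched as a toy simulation. Nothing here calls AWS: the `JobFlow` fields, the `run_transient_job_flow` function, the `s3://example-bucket/...` locations, and the line-counting job are all assumptions made for illustration, and a plain dictionary stands in for S3.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class JobFlow:
    """Hypothetical workflow description; field names are illustrative only."""
    num_hosts: int                   # number of Hadoop hosts requested
    input_location: str              # where the source data lives
    output_location: str             # where the results should be written
    job: Callable[[bytes], bytes]    # stand-in for the supplied MapReduce code


def run_transient_job_flow(flow: JobFlow, storage: Dict[str, bytes]) -> List[str]:
    """Simulate a transient cluster run and return the ordered lifecycle events."""
    events = [f"started cluster with {flow.num_hosts} hosts"]
    try:
        data = storage[flow.input_location]
        events.append("pulled source data")
        result = flow.job(data)
        events.append("ran job")
        storage[flow.output_location] = result
        events.append("pushed results")
    finally:
        # The cluster is torn down whether or not the job succeeded.
        events.append("terminated cluster")
    return events


# A dictionary stands in for S3; the job simply counts input lines.
storage = {"s3://example-bucket/input/log.txt": b"a\nb\nc\n"}
flow = JobFlow(
    num_hosts=3,
    input_location="s3://example-bucket/input/log.txt",
    output_location="s3://example-bucket/output/count.txt",
    job=lambda data: str(len(data.splitlines())).encode(),
)
events = run_transient_job_flow(flow, storage)
```

Putting the termination step in `finally` mirrors the point of a transient cluster: server time is billed while the hosts exist, so they should go away even when a job fails.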