AWS – infrastructure on demand from Amazon

AWS is a set of cloud-computing services offered by Amazon. We will use several of these services in this book.

Simple Storage Service (S3)

Amazon's Simple Storage Service (S3), found at http://aws.amazon.com/s3/, is a storage service that provides a simple key-value storage model. Through web, command-line, or programmatic interfaces, you create objects, which can be anything from text files to images to MP3s, and then store and retrieve them using a hierarchical model. In this model, you create buckets that contain objects. Each bucket has a unique identifier, and within each bucket, every object is uniquely named. This simple strategy enables an extremely powerful service for which Amazon takes complete responsibility: it handles the scaling of the service as well as the reliability and availability of the data.
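To make the bucket and object model concrete, here is a minimal in-memory sketch in Python. It is not the S3 API; the class `ToyStore`, its method names, the bucket name, and the object keys are all hypothetical and exist only to illustrate that bucket identifiers are unique and that object names are unique within a bucket.

```python
from typing import Dict, List


class ToyStore:
    """In-memory stand-in for the bucket/object model (not the AWS API)."""

    def __init__(self) -> None:
        # bucket name -> {object key -> object bytes}
        self._buckets: Dict[str, Dict[str, bytes]] = {}

    def create_bucket(self, name: str) -> None:
        # Each bucket has a unique identifier, so duplicates are rejected.
        if name in self._buckets:
            raise ValueError(f"bucket already exists: {name}")
        self._buckets[name] = {}

    def put_object(self, bucket: str, key: str, data: bytes) -> None:
        # Keys are unique within a bucket; in this toy model, writing to an
        # existing key simply replaces the stored object.
        self._buckets[bucket][key] = data

    def get_object(self, bucket: str, key: str) -> bytes:
        # Retrieval is a plain key lookup inside the named bucket.
        return self._buckets[bucket][key]

    def list_keys(self, bucket: str, prefix: str = "") -> List[str]:
        # Return the keys in one bucket that start with the given prefix.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))


store = ToyStore()
store.create_bucket("example-book-data")  # hypothetical bucket name
store.put_object("example-book-data", "logs/2012-01-01.txt", b"first log")
store.put_object("example-book-data", "images/cover.png", b"\x89PNG")
print(store.get_object("example-book-data", "logs/2012-01-01.txt"))
print(store.list_keys("example-book-data", prefix="logs/"))
```

Note that the slashes in the keys are just part of each object's name in this sketch; the only real containment is objects inside buckets.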

Elastic MapReduce (EMR)

Amazon's Elastic MapReduce (EMR), found at http://aws.amazon.com/elasticmapreduce/, is basically Hadoop in the cloud. Using any of its interfaces (web console, CLI, or API), you define a Hadoop workflow with attributes such as the number of Hadoop hosts required and the location of the source data. You then provide the Hadoop code that implements the MapReduce jobs and press the virtual Go button.
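The attributes mentioned above can be pictured as a small record that is filled in before the workflow is submitted. The following Python sketch is only an illustration under stated assumptions: `JobFlow`, its field names, and the `s3://` locations are hypothetical and do not correspond to the actual EMR API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobFlow:
    """Hypothetical description of a Hadoop workflow (not the EMR API)."""

    name: str
    num_hosts: int           # number of Hadoop hosts required
    source_location: str     # where the source data lives
    job_code_location: str   # where the MapReduce job code lives

    def validate(self) -> None:
        # Check the attributes before pressing the virtual Go button.
        if self.num_hosts < 1:
            raise ValueError("a workflow needs at least one Hadoop host")
        if not self.source_location:
            raise ValueError("the source data location is required")
        if not self.job_code_location:
            raise ValueError("the MapReduce job code location is required")


flow = JobFlow(
    name="word-count",                                        # hypothetical
    num_hosts=4,
    source_location="s3://example-book-data/input/",          # hypothetical
    job_code_location="s3://example-book-data/wordcount.jar",  # hypothetical
)
flow.validate()
print(flow)
```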

In its most impressive mode, EMR can pull source data from S3, process it on a Hadoop cluster that it creates on EC2 (Amazon's on-demand virtual host service), push the results back into S3, and then terminate the Hadoop cluster and the EC2 virtual machines hosting it. Naturally, each of these services has a cost (usually based on the number of GB stored and the amount of server time used), but the ability to access such powerful data-processing capabilities with no need for dedicated hardware is compelling.
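As a rough illustration of that pricing basis, the sketch below adds a per-GB storage charge to a per-host-hour compute charge. The rates and quantities are made-up placeholders chosen to keep the arithmetic simple; they are not Amazon's prices, and the function name `estimate_cost` is hypothetical.

```python
def estimate_cost(gb_stored: float, hosts: int, hours: float,
                  rate_per_gb: float, rate_per_host_hour: float) -> float:
    """Estimate a bill from storage volume and server time (toy model)."""
    storage_cost = gb_stored * rate_per_gb              # charged per GB stored
    compute_cost = hosts * hours * rate_per_host_hour   # charged per host-hour
    return storage_cost + compute_cost


# Placeholder figures only: 100 GB stored and a 4-host cluster run for 2 hours.
total = estimate_cost(gb_stored=100, hosts=4, hours=2,
                      rate_per_gb=0.125, rate_per_host_hour=0.25)
print(f"{total:.2f}")  # prints 14.50
```

Because the cluster is terminated when the job finishes, the compute term in this model stops growing at that point, while the storage term depends only on what remains stored.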
