
Big data

Big data refers to large volumes of data that combine structured data (rows and columns, as in a table) and unstructured data (text documents, voice recordings, images, and so on). Because of its volume, the data does not fit into the main memory of the machine on which the ML algorithms run, so separate strategies are needed to work with it. One strategy is to distribute the processing across several machines and then combine the results; MapReduce is a well-known programming model for this. Another is to process the data sequentially, loading only as much as fits in main memory at a time and storing each partial result on disk, and repeating until all of the data has been processed. The partial results are then combined to obtain the final result for the whole dataset.
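The sequential strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a production approach: the helper names (`read_in_chunks`, `summarize`, `combine`, `chunked_mean`) are hypothetical, the "dataset" is a plain range of numbers standing in for a file too large for memory, and the partial results are kept in memory rather than written to disk. Each chunk is reduced to a small partial result, and the partial results are then merged, which mirrors the map and reduce steps in miniature.

```python
from functools import reduce
from itertools import islice


def read_in_chunks(records, chunk_size):
    """Yield successive lists of at most chunk_size records."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def summarize(chunk):
    """'Map' step: reduce one chunk to a small partial result (sum, count)."""
    return (sum(chunk), len(chunk))


def combine(left, right):
    """'Reduce' step: merge two partial results into one."""
    return (left[0] + right[0], left[1] + right[1])


def chunked_mean(records, chunk_size):
    """Compute the mean while holding only one chunk in memory at a time."""
    partials = (summarize(chunk) for chunk in read_in_chunks(records, chunk_size))
    total, count = reduce(combine, partials, (0, 0))
    return total / count if count else float("nan")


# Stand-in for a large dataset: the numbers 1 to 1,000,000.
print(chunked_mean(range(1, 1_000_001), chunk_size=10_000))  # prints 500000.5
```

The same pattern applies to any statistic whose partial results can be merged, such as sums, counts, minima, and maxima. Statistics that cannot be merged this simply, an exact median for example, need a different approach.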

Specialized technologies such as Hadoop and Spark are commonly used to perform ML on big data. Applying ML algorithms successfully with these technologies requires specialized skills of its own.
