Filtering patterns

Also known as transformation patterns, filtering patterns find a subset of data, whether it be small, like a top ten listing, or large, like the results of a deduplication.

Four patterns are presented in this chapter: filtering, bloom filtering, top ten, and distinct.

As the most basic pattern, filtering serves as an abstract pattern for some of the other patterns. Filtering simply evaluates each record separately and decides, based on some condition, whether it should stay or go. Filter out records that are not of interest and keep ones that are. Consider an evaluation function f that takes a record and returns a Boolean value of true or false. If this function returns true, keep the record; otherwise, toss it out.
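The evaluation function described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the book's Hadoop code: the names filter_records and is_error, and the log records themselves, are hypothetical and chosen for the example.

```python
from typing import Callable, Iterable, Iterator


def filter_records(
    records: Iterable[dict], f: Callable[[dict], bool]
) -> Iterator[dict]:
    """Yield only the records for which the evaluation function f returns True."""
    for record in records:
        # Each record is judged on its own; no other record is consulted.
        if f(record):
            yield record


def is_error(record: dict) -> bool:
    """Hypothetical condition: keep only records logged at ERROR level."""
    return record.get("level") == "ERROR"


# Made-up sample input, for illustration only.
logs = [
    {"level": "INFO", "msg": "started"},
    {"level": "ERROR", "msg": "disk full"},
    {"level": "DEBUG", "msg": "tick"},
    {"level": "ERROR", "msg": "timeout"},
]

kept = list(filter_records(logs, is_error))
print(kept)
```

Because f looks at one record at a time, the records can be split across workers and filtered independently, which is why this pattern maps naturally onto a job that needs only a map phase.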

The SingleMapper job seen earlier is a good example of a filtering pattern.

Depending on the use case, a transformation pattern can be customized to generate the intended output.
