Left anti join

A left anti join returns only those rows from the left-hand table that have no match in the right-hand table. Use it when you want to keep a row from the left table only if it is not present in the right table. It can perform well, because only the left table is fully considered and the right table is only checked against the join condition.
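To make the semantics concrete, here is a minimal sketch in plain Java collections, outside of MapReduce. The class name `LeftAntiJoinSketch`, the method `leftAntiJoin`, and the sample city IDs are hypothetical and are used only for illustration; they are not part of the book's code or data files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LeftAntiJoinSketch {

    // Returns the names of cities (left side) whose cityID does not appear
    // among the cityIDs that have temperature records (right side).
    static List<String> leftAntiJoin(Map<String, String> cities,
                                     Set<String> cityIdsWithTemperatures) {
        List<String> unmatched = new ArrayList<>();
        for (Map.Entry<String, String> city : cities.entrySet()) {
            // The right side is only probed for the key; none of its columns are read.
            if (!cityIdsWithTemperatures.contains(city.getKey())) {
                unmatched.add(city.getValue());
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        // Hypothetical sample data (assumed for this sketch).
        Map<String, String> cities = new LinkedHashMap<>();
        cities.put("1", "Boston");
        cities.put("2", "New York");
        cities.put("3", "Las Vegas");

        Set<String> cityIdsWithTemperatures = new HashSet<>(Arrays.asList("1", "2"));

        // prints [Las Vegas]
        System.out.println(leftAntiJoin(cities, cityIdsWithTemperatures));
    }
}
```

Only the left side contributes output rows. The right side is reduced to a set of keys, which is why the join needs nothing more than a membership test.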

In the following example, we join the cities with the temperatures and keep a city only if its cityID has a name but no temperature records, as shown in the following code:

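// Imports required at the top of the enclosing source file:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

private static class LeftAntiJoinReducer
        extends Reducer<Text, Text, Text, IntWritable> {
    private final IntWritable result = new IntWritable();
    private final Text cityName = new Text();

    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        // Reset per key so that a name from a previous cityID is never reused.
        cityName.set("Unknown");
        int n = 0; // number of temperature records seen for this cityID

        for (Text val : values) {
            String strVal = val.toString();
            if (strVal.length() <= 3) {
                // Values of at most three characters are treated as temperature readings.
                n += 1;
            } else {
                // Longer values are treated as the city name.
                cityName.set(strVal);
            }
        }

        // Left anti join: emit only cities with no matching temperature record.
        if (n == 0) {
            result.set(0); // no measurements, so the average is reported as 0
            context.write(cityName, result);
        }
    }
}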

The output will be as follows:

Las Vegas 0 // Las Vegas has no temperature measurements in temperature.csv