- Big Data Analytics with Hadoop 3
- Sridhar Alla
- 2021-06-25 21:26:19
Right outer join
A right outer join returns every row in the right-side table, together with the rows common to both the left and right tables (the inner join). Use it when you need all of the rows in the right table, whether or not they have a match on the left. Where a right-side row has no match in the left table, the left-side columns are filled with NULL. The performance here is similar to that of the left outer join previously mentioned in this table:
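The semantics described above can be sketched in plain Java, without Hadoop. This is a minimal illustration, not the book's implementation: the class name RightOuterJoinSketch, the method rightOuterJoin, the city IDs, and the city "Denver" are all hypothetical, and each city is assumed to have a single temperature value for simplicity.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RightOuterJoinSketch {

    // Joins left (cityId -> cityName) with right (cityId -> temperature).
    // Every right-side row is kept; a missing left-side match becomes null.
    public static List<String> rightOuterJoin(Map<Integer, String> left,
                                              Map<Integer, Integer> right) {
        List<String> rows = new ArrayList<>();
        for (Map.Entry<Integer, Integer> r : right.entrySet()) {
            String name = left.get(r.getKey()); // null when the left table has no match
            rows.add(r.getKey() + "," + name + "," + r.getValue());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample data for illustration only.
        Map<Integer, String> cities = new LinkedHashMap<>();
        cities.put(1, "Boston");
        cities.put(7, "Denver"); // left-only row: dropped by a right outer join

        Map<Integer, Integer> temps = new LinkedHashMap<>();
        temps.put(1, 22);
        temps.put(6, 22); // right-only row: kept, with a null city name

        for (String row : rightOuterJoin(cities, temps)) {
            System.out.println(row);
        }
    }
}
```

Iterating over the right-side map is what makes this a right outer join: left-only keys are never visited, while right-only keys still produce a row.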

We emit a result for a cityID only if it has both a city name and temperature measurements, or temperature measurements alone, as shown in the following code:
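// Assumes imports of java.io.IOException, org.apache.hadoop.io.IntWritable,
// org.apache.hadoop.io.Text, and org.apache.hadoop.mapreduce.Reducer.
private static class RightOuterJoinReducer
    extends Reducer<Text, Text, Text, IntWritable> {
  private final IntWritable result = new IntWritable();
  private final Text cityName = new Text();

  @Override
  public void reduce(Text key, Iterable<Text> values,
      Context context) throws IOException, InterruptedException {
    // Reset the name for every key so that a city without a name record
    // does not inherit the previous key's name; fall back to "city-<ID>".
    cityName.set("city-" + key.toString());
    int sum = 0;
    int n = 0;
    for (Text val : values) {
      String strVal = val.toString();
      if (strVal.length() <= 3) {
        // Values of at most three characters are treated as temperatures.
        sum += Integer.parseInt(strVal);
        n += 1;
      } else {
        // Longer values are treated as the city name.
        cityName.set(strVal);
      }
    }
    // Emit only cities that have at least one temperature measurement.
    if (n != 0) {
      result.set(sum / n); // integer average
      context.write(cityName, result);
    }
  }
}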
The output will be as follows:
Boston 22
New York 23
Chicago 23
Philadelphia 23
San Francisco 22
city-6 22 // city ID 6 has no name in cities.csv, only temperature measurements
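To see how the three possible cases for a key play out, here is a small Hadoop-free sketch of the reducer's per-key logic. The class ReducerLogicSketch, its reduce helper, the individual readings, and the city "Denver" are hypothetical; the city-<ID> fallback name is an assumption inferred from the sample output above.

```java
import java.util.Arrays;
import java.util.List;

public class ReducerLogicSketch {

    // Mirrors the reducer's per-key logic without Hadoop: values of at most
    // three characters are treated as temperatures, anything longer as the
    // city name. Returns null when the key has no temperature readings.
    public static String reduce(String cityId, List<String> values) {
        String cityName = "city-" + cityId; // assumed fallback when no name arrives
        int sum = 0;
        int n = 0;
        for (String v : values) {
            if (v.length() <= 3) {
                sum += Integer.parseInt(v);
                n++;
            } else {
                cityName = v;
            }
        }
        return n == 0 ? null : cityName + " " + (sum / n); // integer average
    }

    public static void main(String[] args) {
        // Name and temperatures present: emitted with the real name.
        System.out.println(reduce("1", Arrays.asList("Boston", "21", "23")));
        // Temperatures only: emitted with the fallback name.
        System.out.println(reduce("6", Arrays.asList("22", "22")));
        // Name only: nothing is emitted (null here).
        System.out.println(reduce("7", Arrays.asList("Denver")));
    }
}
```

Note that the length check is a heuristic: a city name of three characters or fewer would be passed to Integer.parseInt and fail, so tagging each value with its source in the mapper would be a more robust design.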