
Analytics challenges

Analytics often requires deciding whether to fill in or ignore missing values. Either choice may lead to a dataset that is not representative of reality.
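The effect of this choice can be seen in a minimal sketch. The readings below are hypothetical, and the three strategies shown (dropping, mean filling, forward filling) are common options rather than a recommendation; each gives a different picture of the same device.

```python
from statistics import mean

# Hypothetical hourly temperature readings; None marks a missing report.
readings = [20.0, 21.0, None, None, 35.0, 36.0]

def drop_missing(values):
    """Ignore missing values entirely."""
    return [v for v in values if v is not None]

def fill_with_mean(values):
    """Fill each missing value with the mean of the observed values."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]

def forward_fill(values):
    """Fill each missing value with the last observed value before it.

    Assumes the first value is present; a leading None stays None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled

dropped_mean = mean(drop_missing(readings))
mean_filled_mean = mean(fill_with_mean(readings))
forward_filled_mean = mean(forward_fill(readings))

print(f"dropped:        {dropped_mean:.2f}")
print(f"mean filled:    {mean_filled_mean:.2f}")
print(f"forward filled: {forward_filled_mean:.2f}")
```

Dropping and mean filling agree on the average here, but mean filling invents two readings that were never observed and shrinks the apparent variability. Forward filling pulls the average down because the gap happens to follow a low reading. None of the three is guaranteed to match what the device actually experienced.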

As an example of how this can affect results, consider the inaccurate political poll results of recent years. Many experts believe polling is now in near crisis because much of the world has shifted to using a mobile number as their only phone number. For pollsters, it is cheaper and easier to reach people on landline numbers. This can lead to the overrepresentation of people with landlines, who tend to be both older and wealthier than mobile-only respondents.

The response rate has also dropped from near 80% in the 1970s to about 8% (if you are lucky) today. This makes it more difficult (and expensive) to obtain a representative sample, leading to many embarrassingly wrong poll predictions.
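To make the polling problem concrete, here is a small sketch of how an over-sampled group skews an estimate. All numbers are invented for illustration, and the population shares are assumed to be known from some outside source. Reweighting each group to its population share is one common correction, shown here only as an example.

```python
# Hypothetical poll: landline respondents are heavily over-sampled.
sample = {
    "landline": {"respondents": 800, "support": 0.60},
    "mobile_only": {"respondents": 200, "support": 0.40},
}

# Assumed true share of each group in the population (illustrative only).
population_share = {"landline": 0.40, "mobile_only": 0.60}

def raw_estimate(sample):
    """Average support across respondents, ignoring who was over-sampled."""
    total = sum(group["respondents"] for group in sample.values())
    supporters = sum(
        group["respondents"] * group["support"] for group in sample.values()
    )
    return supporters / total

def weighted_estimate(sample, population_share):
    """Weight each group's support by its share of the population."""
    return sum(
        population_share[name] * group["support"]
        for name, group in sample.items()
    )

raw = raw_estimate(sample)
weighted = weighted_estimate(sample, population_share)

print(f"raw estimate:      {raw:.2f}")
print(f"weighted estimate: {weighted:.2f}")
```

With these made-up figures the raw sample suggests majority support, while the reweighted estimate does not. The correction is only as good as the assumed population shares, and it cannot help if one group barely responds at all.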

There can also be outside influences, such as environmental conditions, that are not captured in the data. Winter storms can lead to power failures that affect which devices are able to report back data. You may end up drawing conclusions based on a non-representative sample of data without realizing it. This can affect the results of IoT analytics, and it will not be clear why.
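One way to notice this kind of silent gap is to track how many of the expected devices actually reported in each period. The sketch below assumes a known device registry and uses hypothetical device IDs, dates, and a hypothetical 75% threshold.

```python
# Hypothetical registry of devices that should report every day.
expected_devices = {"dev-a", "dev-b", "dev-c", "dev-d"}

# Hypothetical record of which devices reported on each day.
reports_by_day = {
    "2024-01-01": {"dev-a", "dev-b", "dev-c", "dev-d"},
    "2024-01-02": {"dev-a", "dev-b"},            # e.g. a storm cut power
    "2024-01-03": {"dev-a", "dev-b", "dev-c"},
}

def coverage_by_day(reports, expected):
    """Fraction of expected devices that reported on each day."""
    return {
        day: len(seen & expected) / len(expected)
        for day, seen in reports.items()
    }

def low_coverage_days(coverage, threshold=0.75):
    """Days where coverage fell below the threshold, in date order."""
    return sorted(day for day, share in coverage.items() if share < threshold)

coverage = coverage_by_day(reports_by_day, expected_devices)
flagged = low_coverage_days(coverage)

for day in sorted(coverage):
    print(f"{day}: {coverage[day]:.0%} of devices reported")
print("days to treat with caution:", flagged)
```

A check like this does not explain why devices went quiet, but it marks the periods where any conclusion rests on a partial, possibly skewed set of devices.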

Since connectivity is new for many devices, there is also often a lack of historical data on which to base predictive models. This can limit the type of analytics that can be done with the data.

It can also lead to a recency bias in datasets, as newer products are overrepresented in the data simply because a higher percentage of them are now part of the IoT.

This leads us to the author's number one rule in IoT analytics:

Never trust data you don't know.

Treat it like a stranger offering you candy.
