Ethical implications of manipulating data

Manipulating data carries ethical implications and risks that you need to know about. We live in a world where most deep learning algorithms will have to be corrected, by retraining them, because they were found to be biased or unfair. That is very unfortunate; you want to be a person who practices responsible AI and produces carefully thought-out models.

When manipulating data, be careful about removing outliers from the data just because you think they are decreasing your model's performance. Sometimes, outliers represent information about protected groups or minorities, and removing those perpetuates unfairness and introduces bias toward the majority groups. Avoid removing outliers unless you are absolutely sure that they are errors caused by faulty sensors or human error. 
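One way to act on this advice is to look at who the flagged points belong to before deleting anything. The following is a minimal sketch under stated assumptions: it uses the common 1.5 × IQR rule to flag outliers, and the helper names (`iqr_outlier_flags`, `outlier_share_by_group`) and the toy data are hypothetical, made up for illustration only.

```python
from collections import Counter
from statistics import quantiles


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (illustrative rule, not the only one)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def outlier_share_by_group(values, groups, k=1.5):
    """Return, for each group, the fraction of its members flagged as outliers."""
    flags = iqr_outlier_flags(values, k)
    totals = Counter(groups)
    flagged = Counter(g for g, is_out in zip(groups, flags) if is_out)
    return {g: flagged[g] / totals[g] for g in totals}


# Hypothetical toy data: one numeric feature and a group label per sample.
feature = [30, 32, 31, 29, 33, 30, 31, 32, 95, 98]
groups = ["A"] * 8 + ["B"] * 2

shares = outlier_share_by_group(feature, groups)
print(shares)  # {'A': 0.0, 'B': 1.0}
```

In this toy case every flagged point belongs to group B, so dropping the outliers would erase that group from the data entirely. A result like this is a signal to investigate the points, not to remove them.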

Be careful of the way you transform the distribution of the data. Altering the distribution is fine in most cases, but if you are dealing with demographic data, you need to pay close attention to what you are transforming.

When dealing with demographic information such as gender, encoding female and male as 0 and 1 could be risky if you are considering proportions; you need to be careful not to promote an equality (or inequality) that does not reflect the reality of the community that will use your models. The exception is when the current reality shows unlawful discrimination, exclusion, and bias. In that case, your models (based on your data) should not reflect this reality, but the lawful reality that the community wants. That is, you prepare good data to create models that do not perpetuate societal problems, but that reflect the society we want to become.
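A simple check that supports this kind of decision is to compare the group proportions in your prepared data against the proportions you intend the data to reflect. The sketch below assumes you already have such reference proportions; the function names, the sample, and the 50/50 reference are hypothetical and only illustrate the comparison.

```python
from collections import Counter


def group_proportions(labels):
    """Return the fraction of samples that carry each label."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


def proportion_gaps(labels, reference):
    """Observed proportion minus reference proportion, per reference group."""
    observed = group_proportions(labels)
    return {group: observed.get(group, 0.0) - target
            for group, target in reference.items()}


# Hypothetical sample and hypothetical reference proportions.
sample = ["female"] * 3 + ["male"] * 7
reference = {"female": 0.5, "male": 0.5}

gaps = proportion_gaps(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A negative gap means the group is underrepresented relative to the reference. Whether the reference should be the current reality or the lawful reality the community wants is the judgment call described above; the code only makes the difference visible.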
