Body Mass Index

The Body Mass Index (BMI) is defined as a person's weight in kilograms, divided by the square of their height in meters:

Figure 2.28: Expression for BMI
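To make the definition concrete, here is a minimal sketch of the formula in Python. The function name `compute_bmi` and its parameter names are hypothetical and are not part of the dataset or the original code; note that the height must be given in meters.

```python
def compute_bmi(weight_kg, height_m):
    """Return the BMI: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Illustrative values only: a person weighing 70 kg who is 1.75 m tall
example_bmi = compute_bmi(70, 1.75)
print(round(example_bmi, 2))  # 22.86
```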

BMI is a universal way to classify people as underweight, healthy weight, overweight, and obese, based on tissue mass (muscle, fat, and bone) and height. The following plot indicates the relationship between weight and height for the various categories:

Figure 2.29: Body Mass Index categories (source: https://en.wikipedia.org/wiki/Body_mass_index)

According to the preceding plot, we can build the four categories (underweight, healthy weight, overweight, and obese) based on the BMI values:

def get_bmi_category(bmi):
    """Return the BMI category for a given BMI value."""
    if bmi < 18.5:
        category = "underweight"
    elif bmi < 25:
        category = "healthy weight"
    elif bmi < 30:
        category = "overweight"
    else:
        category = "obese"
    return category

# compute BMI category
preprocessed_data["BMI category"] = (
    preprocessed_data["Body mass index"].apply(get_bmi_category)
)
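The same binning could also be expressed without a custom function, using `pandas.cut`. The following is only a sketch of that alternative, not the approach used in this chapter; the `sample` DataFrame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy BMI values for illustration only (not taken from the dataset)
sample = pd.DataFrame({"Body mass index": [17.0, 18.5, 24.9, 25.0, 29.9, 30.0]})

# right=False makes every bin closed on the left and open on the right,
# which matches the thresholds used above: [18.5, 25), [25, 30), [30, inf)
sample["BMI category"] = pd.cut(
    sample["Body mass index"],
    bins=[-np.inf, 18.5, 25, 30, np.inf],
    labels=["underweight", "healthy weight", "overweight", "obese"],
    right=False,
)
print(sample)
```

One difference worth noting is that `pandas.cut` returns an ordered categorical column rather than plain strings, so the category order is carried along with the data.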

We can plot the number of entries for each category:

# plot number of entries for each category
plt.figure(figsize=(10, 6))
sns.countplot(data=preprocessed_data, x='BMI category',
              order=["underweight", "healthy weight",
                     "overweight", "obese"],
              palette="Set2")
plt.savefig('figs/bmi_categories.png', format='png', dpi=300)

The following is the output of the preceding code:

Figure 2.30: BMI categories

We can see that there are no entries in the underweight category, and that the data is almost uniformly distributed among the remaining three categories. This is an alarming indicator, as more than 60% of the entries belong to employees who are either overweight or obese.
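A share such as the one quoted above can be read off numerically rather than estimated from the bar heights. The following sketch uses a small made-up series in place of `preprocessed_data["BMI category"]`, so the printed numbers are illustrative only and say nothing about the real data.

```python
import pandas as pd

# Toy data for illustration only; in the analysis this would be
# preprocessed_data["BMI category"]
categories = pd.Series(
    ["healthy weight", "overweight", "obese", "obese", "overweight"],
    name="BMI category",
)

# Fraction of entries in each category
shares = categories.value_counts(normalize=True)

# Combined share of the overweight and obese categories
overweight_or_obese = shares.reindex(["overweight", "obese"], fill_value=0).sum()
print(shares)
print(f"overweight or obese: {overweight_or_obese:.0%}")
```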

Now, let's check how the different BMI categories are related to the reason for absence. More precisely, we would like to see how many entries there are for each combination of BMI category and reason for absence. This can be done with the following code:

# plot BMI categories vs Reason for absence

plt.figure(figsize=(10, 16))
ax = sns.countplot(data=preprocessed_data,
                   y="Reason for absence", hue="BMI category",
                   hue_order=["underweight", "healthy weight",
                              "overweight", "obese"],
                   palette="Set2")
ax.set_xlabel("Number of employees")
plt.savefig('figs/reasons_bmi.png', format='png', dpi=300)

The output will be as follows:

Figure 2.31: Absence reasons, based on BMI category

Unfortunately, no clear pattern arises from the preceding plot. In other words, for each reason for absence, the entries are split (almost) equally among the different BMI categories.
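With this many bars, the counts are hard to compare by eye. One possible numeric check is a row-normalized cross-tabulation, which gives the share of each BMI category within every reason for absence. The sketch below runs on a made-up DataFrame; the reason labels "A" and "B" are placeholders, not values from the dataset.

```python
import pandas as pd

# Toy data for illustration only; the reason labels here are made up
toy = pd.DataFrame({
    "Reason for absence": ["A", "A", "A", "B", "B", "B"],
    "BMI category": ["healthy weight", "overweight", "obese",
                     "healthy weight", "obese", "obese"],
})

# Each row sums to 1: share of BMI categories within one reason for absence
row_shares = pd.crosstab(
    toy["Reason for absence"], toy["BMI category"], normalize="index"
)
print(row_shares.round(2))
```

Rows whose shares differ strongly from the others would point to a reason for absence that is more common in one BMI category.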

We can also investigate the distribution of absence hours for the different BMI categories:

# plot distribution of absence time, based on BMI category

plt.figure(figsize=(8, 6))
sns.violinplot(x="BMI category",
               y="Absenteeism time in hours",
               data=preprocessed_data,
               order=["healthy weight", "overweight", "obese"])
plt.savefig('figs/bmi_hour_distribution.png', format='png')

The output will be as follows:

Figure 2.32: Absence time in hours, based on the BMI category

As we can observe from Figure 2.31 and Figure 2.32, there is no evidence that BMI and obesity levels influence the employees' absenteeism.
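The visual impression from the violin plot can be complemented with a few summary statistics per category. The following is a minimal sketch on made-up data; with the real data, `toy` would be replaced by `preprocessed_data`, and similar values across categories would be consistent with the conclusion above.

```python
import pandas as pd

# Toy data for illustration only (not taken from the dataset)
toy = pd.DataFrame({
    "BMI category": ["healthy weight", "healthy weight",
                     "overweight", "overweight", "obese", "obese"],
    "Absenteeism time in hours": [2, 8, 3, 8, 4, 8],
})

# Number of entries, mean, and median absence time for each category
summary = (
    toy.groupby("BMI category")["Absenteeism time in hours"]
    .agg(["count", "mean", "median"])
)
print(summary)
```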
