
Making sense of data

It is crucial to identify the type of data under analysis. In this section, we are going to learn about the different types of data that you can encounter during analysis. Different disciplines store different kinds of data for different purposes. For example, medical researchers store patients' data, universities store students' and teachers' data, and real estate industries store house and building datasets. A dataset contains many observations about a particular object. For instance, a dataset about patients in a hospital can contain many observations, one per patient. A patient can be described by a patient identifier (ID), name, address, weight, date of birth, email, and gender. Each of these features that describes a patient is a variable, and each observation has a specific value for each variable. For example, a patient can have the following:

PATIENT_ID = 1001
Name = Yoshmi Mukhiya
Address = Mannsverk 61, 5094, Bergen, Norway
Date of birth = 10th July 2018
Email = yoshmimukhiya@gmail.com
Weight = 10
Gender = Female
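
As a minimal sketch, the observation above can be written as a single Python record. The class name `Patient` and the field names are illustrative assumptions, not something defined in the text. The unit of the weight is not stated, so the value is stored as given.

```python
from dataclasses import dataclass, fields
from datetime import date


@dataclass
class Patient:
    """One observation; each field is one variable (hypothetical names)."""
    patient_id: int
    name: str
    address: str
    date_of_birth: date
    email: str
    weight: float  # unit is not stated in the text
    gender: str


# Values copied from the example observation above.
patient = Patient(
    patient_id=1001,
    name="Yoshmi Mukhiya",
    address="Mannsverk 61, 5094, Bergen, Norway",
    date_of_birth=date(2018, 7, 10),
    email="yoshmimukhiya@gmail.com",
    weight=10,
    gender="Female",
)

# The variables are the field names; the observation supplies one value for each.
variable_names = [f.name for f in fields(patient)]
print(variable_names)
```

Here the dataclass plays the role of the variable list, and each instance of it is one observation.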

Datasets like this are stored by hospitals and made available for analysis. Most of this data is kept in some sort of database management system, organized into tables and schemas. An example of a table for storing patient information is shown here:

            
PATIENT_ID   NAME                   ADDRESS                      DOB          EMAIL                     GENDER   WEIGHT
001          Suresh Kumar Mukhiya   Mannsverk, 61                30.12.1989   skmu@hvl.no               Male     68
002          Yoshmi Mukhiya         Mannsverk 61, 5094, Bergen   10.07.2018   yoshmimukhiya@gmail.com   Female   1
003          Anju Mukhiya           Mannsverk 61, 5094, Bergen   10.12.1997   anjumukhiya@gmail.com     Female   24
004          Asha Gaire             Butwal, Nepal                30.11.1990   aasha.gaire@gmail.com     Female   23
005          Ola Nordmann           Danmark, Sweden              12.12.1789   ola@gmail.com             Male     75

 

To summarize the preceding table, there are five observations (001, 002, 003, 004, 005). Each observation is described by seven variables (patient ID, name, address, date of birth, email, gender, and weight). Most datasets broadly fall into two groups: numerical data and categorical data.
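
The counts in this summary can be checked with a short sketch that loads the table into plain Python structures. The variable names (`columns`, `rows`, `patients`) are illustrative assumptions, and the column names are the table headers in uppercase. The final step is only a rough first pass at the numerical group: it checks Python value types and does not decide how identifiers or dates should be treated.

```python
# Column names and rows copied from the patient table above.
columns = ["PATIENT_ID", "NAME", "ADDRESS", "DOB", "EMAIL", "GENDER", "WEIGHT"]
rows = [
    ("001", "Suresh Kumar Mukhiya", "Mannsverk, 61", "30.12.1989", "skmu@hvl.no", "Male", 68),
    ("002", "Yoshmi Mukhiya", "Mannsverk 61, 5094, Bergen", "10.07.2018", "yoshmimukhiya@gmail.com", "Female", 1),
    ("003", "Anju Mukhiya", "Mannsverk 61, 5094, Bergen", "10.12.1997", "anjumukhiya@gmail.com", "Female", 24),
    ("004", "Asha Gaire", "Butwal, Nepal", "30.11.1990", "aasha.gaire@gmail.com", "Female", 23),
    ("005", "Ola Nordmann", "Danmark, Sweden", "12.12.1789", "ola@gmail.com", "Male", 75),
]

# Each row becomes one observation: a mapping from variable name to value.
patients = [dict(zip(columns, row)) for row in rows]

n_observations = len(patients)  # number of rows
n_variables = len(columns)      # number of columns

# Rough check: a column counts as numeric here only if every value is an int or float.
numeric_columns = [
    c for c in columns
    if all(isinstance(p[c], (int, float)) for p in patients)
]

print(n_observations, n_variables)  # 5 7
print(numeric_columns)              # ['WEIGHT']
```

Because the patient IDs are stored as strings such as "001" to keep their leading zeros, only the weight column passes the numeric check in this sketch.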
