This compilation of Organisation of Data notes makes students’ exam preparation simpler and more organised.
Raw Data, Classification of Data and Variables
Suppose you have data on the weights of all the students in your school. The data set will be very large, as it contains information about every student. Can you draw any conclusions just by looking at this raw mass of data? Of course not. That’s when the classification of data saves the day! Let’s study raw data, classification of data, and variables.
Raw data is the unorganized data we are left with once the collection stage is complete. It is similar to a lump of clay: it has no identity and no practical use. We therefore need to organize this raw data, because organized data facilitates comparison and meaningful conclusions.
Further, to organize the data we need to look for similarities or group the data. In this way, we effectively convert heterogeneous data into homogeneous data. To do so, an investigator has to classify the data in the form of a series.
A series refers to data arranged in some order or sequence. Thus, if we arrange the weights from the example in the introduction according to the classes in your school, we classify the data in the form of a statistical series. Note that we could also arrange the students according to their heights. Hence, the basis of the arrangement of raw data varies from purpose to purpose.
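As a rough illustration of how the same raw data can be arranged into different series, here is a minimal Python sketch. The student names, classes, and weights are hypothetical values invented for the example.

```python
# Hypothetical raw data in the order it was collected:
# (student name, school class, weight in kg)
raw = [
    ("Asha", 7, 41.5),
    ("Ravi", 6, 45.0),
    ("Meena", 7, 44.2),
    ("Kabir", 6, 36.4),
]

# One series: arranged by school class, then by weight within each class.
by_class = sorted(raw, key=lambda row: (row[1], row[2]))

# Another series from the same data: arranged by weight only.
by_weight = sorted(raw, key=lambda row: row[2])

for name, school_class, weight in by_class:
    print(school_class, name, weight)
```

The raw data is unchanged in both cases; only the basis of arrangement differs, and it is chosen to suit the purpose.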
A variable is simply something that can vary with time and whose variation we can measure. In other words, a variable is a characteristic or a phenomenon that is capable of being measured and that changes its value over time. Variables are of two types: discrete and continuous.
A discrete variable’s value changes only in whole numbers, that is, it increases in jumps. The phenomenon or characteristic that a discrete variable represents must therefore be such that its value cannot be in fractions but only in whole numbers. For example, the number of children in a family can be 2, 3, 4, etc. but not 2.5, 3.5, etc.
A continuous variable can assume fractional values, so its value does not increase in jumps. Examples include the heights of students, the weights of students, and so on.
Classification of Data
The main objective of the organization of data is to arrange the data in such a form that it becomes fairly easy to compare and analyze. Generally, we can do this by distributing data into various classes on the basis of some attribute or characteristic. This distribution of data into classes is the classification of data. Further, each division of data is a class. All in all, through the process of classification we can group and divide data into classes according to a general attribute, which facilitates comparison and analysis.
Objectives of Classification
- Simplification and Briefness: Classification presents data in a brief manner. Hence, it becomes fairly easy to analyze the data.
- Utility: As classification highlights the similarity in the data, it brings out its utility.
- Distinctiveness: With the help of grouping data into different classes, classification also brings out the distinctiveness in data.
- Comparability: As already mentioned, it facilitates the comparison of data.
- Scientific Arrangement: Classification arranges data on scientific lines. Thus it also increases the reliability of data.
- Attractive and Effective: Lastly, through the process of classification, data becomes effective and attractive.
Characteristics of a Good Classification
- Comprehensiveness: Classification should cover all the items of the data. In other words, it should be so comprehensive that it classifies all items in some group or class.
- Clarity: There should be no confusion about the placement of any data item in a group or class. That is, classification should be absolutely clear.
- Homogeneity: The items within a specific group or class should be similar to each other.
- Suitability: The attribute or characteristic according to which classification is done should agree with the purpose of classification.
- Stability: A particular kind of investigation should be carried out on the same set of classifications each time.
- Elasticity: As the purpose of classification changes, one should be able to change the basis of classification.
Basis of Classification
We can classify given data according to various characteristics, depending on the purpose of our study. Hence, there are various bases of classification.
When we classify data according to different locations, it is termed a geographical classification of data. For example, a classification of the data about the number of children aged 3 to 8 according to the various cities in India.
In chronological classification, we classify data according to time, i.e. it follows a chronological sequence. For example, the classification of the data about the number of deaths in India according to the years.
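The two bases described so far, location and time, can be sketched in Python as grouping the same records by a different key. This is only an illustration: the records, the figures, and the helper name `classify` are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (city, year, count). The figures are made up.
records = [
    ("Delhi", 2020, 120),
    ("Mumbai", 2020, 95),
    ("Delhi", 2021, 130),
    ("Mumbai", 2021, 100),
]

def classify(rows, key_index):
    """Total the counts for each distinct value found at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)

geographical = classify(records, 0)   # classes are cities
chronological = classify(records, 1)  # classes are years

print(geographical)
print(chronological)
```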
In qualitative classification, we classify data according to the qualities or attributes of data. One key point to remember is that an attribute is qualitative in nature, i.e. we cannot measure an attribute in quantitative terms like 5, 1, 2, etc. This classification is further of two types:
1. Simple: In the simple qualitative classification of data, we classify data into exactly two groups. One group has data items that exhibit the quality, the other group doesn’t. Hence, it is also known as classification according to a dichotomy. Examples of classes can be educated-uneducated, male-female, and so on.
2. Manifold: Here we classify data according to more than one characteristic of an attribute. This means that once we classify data into two groups according to an attribute, the two groups are further divided into two according to another attribute. As a result, there can be many levels of classification coupled with more than just two classes. For example, the classification of data about students in a class, according to their gender, followed by classification according to whether they are fat or not.
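The difference between simple and manifold classification can be sketched in Python as counting by one attribute versus counting by a combination of attributes. The list of people below is hypothetical, and the attributes are borrowed from the male-female and educated-uneducated examples above.

```python
from collections import Counter

# Hypothetical people, each described by two attributes:
# (gender, whether the person is educated)
people = [
    ("male", True),
    ("female", True),
    ("female", False),
    ("male", False),
    ("female", True),
]

def label(educated):
    return "educated" if educated else "uneducated"

# Simple (dichotomous) classification: one attribute, exactly two classes.
simple = Counter(label(educated) for _, educated in people)

# Manifold classification: first by gender, then each group by education.
manifold = Counter((gender, label(educated)) for gender, educated in people)

print(simple)
print(manifold)
```

With two attributes of two classes each, the manifold classification produces up to four classes instead of two.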
Quantitative or Numerical Classification
Unlike qualitative classification, quantitative classification allows numerical division of data into classes. Here, each class represents a range of numerical values for the phenomenon under consideration. Accordingly, we frame each class with a lower and an upper value, chosen according to the range of the data.
Again, the phenomenon should be such that it can be expressed in numerical terms. As the data is divided into classes with different ranges of values, this classification effectively represents the change in the value of a phenomenon over time or across different regions, which means its value varies. Accordingly, quantitative classification is also known as classification by variables.
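A minimal Python sketch of quantitative classification follows. The weights, the starting value, and the class width are all assumptions made for the example, as is the choice to include the lower value of each class and exclude the upper value.

```python
# Hypothetical weights in kg.
weights = [36.4, 38.0, 41.5, 44.2, 47.9, 52.3, 55.0]

# Assumed classes: 35-40, 40-45, 45-50, 50-55, 55-60.
lowest, width, class_count = 35, 5, 5

frequency = {}
for i in range(class_count):
    low = lowest + i * width
    high = low + width
    # Assumption: the lower value is included, the upper value is excluded.
    frequency[(low, high)] = sum(1 for w in weights if low <= w < high)

for (low, high), count in frequency.items():
    print(f"{low}-{high}: {count}")
```

Each class covers a range of values, and every weight falls into exactly one class, which matches the comprehensiveness and clarity requirements listed earlier.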
List the characteristics of good classification.
The characteristics of a good classification are: