Data Analysis & Interpretation
Once necessary information has been collected through observation or survey or experiment, the following steps are to be taken.
1 Editing the data
Information may have been noted in haste and may now need to be deciphered. Data should be edited before being presented as information, to ensure that figures and words are accurate. Editing can be done manually, by computer, or both, depending on the medium (paper or electronic).
Editing is done on two levels: micro and macro. In micro-editing, the basic records are corrected. Usually, all records are scrutinized one by one for apparent mistakes. The intent is to check the consistency of the data. For example, distances may be given in miles in one place and in km in another. Or there may be an obvious mistake, such as showing a distance of 100 km where it should be only 10 km or less.
At the macro level, aggregates are compared with data from other surveys or files, or with earlier versions of the same data. This is done to determine compatibility. For example, one survey has estimated the total number of residents in a sector at 2,000. In another survey, of family size, the total number of residents works out to 2,500. Obviously, one of the estimates is wrong. If the figure of 2,000 is considered correct because it has been double-checked, the second would have to be reviewed for mistakes in totaling or multiplication.
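The macro-level comparison can be sketched in a few lines of Python. This is only an illustration: the function name `macro_check` and the 5% tolerance are assumptions, not part of any standard procedure.

```python
def macro_check(estimate_a, estimate_b, tolerance=0.05):
    """Return True when two aggregate estimates agree within a relative tolerance.

    The 5% default tolerance is an assumed value chosen for illustration.
    """
    baseline = max(abs(estimate_a), abs(estimate_b))
    if baseline == 0:
        return True  # both estimates are zero, so they agree
    return abs(estimate_a - estimate_b) / baseline <= tolerance


residents_first_survey = 2000   # total from the first survey
residents_family_survey = 2500  # total implied by the family-size survey

compatible = macro_check(residents_first_survey, residents_family_survey)
print("Estimates compatible:", compatible)
```

When the check fails, as it does for 2,000 against 2,500, one of the two sources has to be reviewed by hand.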
Several types of data edits are available. In validity edits, it is ensured that the specified units of measure (like kg, liters or sq. meters) are written. In range edits, one checks that the values are within pre-established or common-sense limits. Similarly, there are edits for duplication, consistency and history.
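A minimal sketch of validity, range and duplication edits might look like the following. The record layout (`id`, `distance`, `unit`), the allowed unit and the 0 to 10 km limits are all assumptions made for the example.

```python
ALLOWED_UNITS = {"km"}        # assumed: the survey asks for kilometres
DISTANCE_RANGE = (0.0, 10.0)  # assumed common-sense limits, in km


def edit_record(record):
    """Return a list of problems found in one record (empty list means clean)."""
    problems = []
    unit = record.get("unit")
    if unit not in ALLOWED_UNITS:
        problems.append(f"validity: unexpected unit {unit!r}")
    low, high = DISTANCE_RANGE
    value = record.get("distance")
    if value is None or not (low <= value <= high):
        problems.append(f"range: distance {value!r} outside {low}-{high}")
    return problems


def find_duplicates(records):
    """Return the ids that appear more than once, in order of discovery."""
    seen, dupes = set(), []
    for record in records:
        if record["id"] in seen:
            dupes.append(record["id"])
        seen.add(record["id"])
    return dupes


# Hypothetical records, including the 100 km mistake described earlier.
records = [
    {"id": 1, "distance": 4.5, "unit": "km"},
    {"id": 2, "distance": 100.0, "unit": "km"},
    {"id": 3, "distance": 3.0, "unit": "miles"},
    {"id": 3, "distance": 3.0, "unit": "miles"},
]

for record in records:
    print(record["id"], edit_record(record))
duplicates = find_duplicates(records)
print("Duplicate ids:", duplicates)
```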
Editing also has to deal with data errors such as (i) unasked questions, (ii) unrecorded answers and (iii) inappropriate responses.
Sometimes a researcher is confronted with an exceptional but true figure, like a very unusual temperature of 90 F (32.2 C). Such a value is an “unrepresentative” or “outlying” observation in a data set. What should we do about the “outliers” in a sample? Whether such data should be deleted is for the researcher to decide.
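Before deciding, it helps to see which values stand apart. The sketch below flags (but does not delete) outliers using the common 1.5 x interquartile-range rule. That rule is an assumption brought in for illustration, and the temperature readings are made up.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k interquartile ranges outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical daily temperatures in degrees Fahrenheit.
temperatures_f = [68, 70, 71, 72, 73, 74, 75, 90]
outliers = flag_outliers(temperatures_f)
print("Flagged for review:", outliers)
```

The function only lists the suspicious readings; keeping or dropping them remains the researcher's call.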
2 Handling blank response
If more than 25% of a questionnaire is left blank, discard the questionnaire. In other cases, drop only the particular unanswered question. Sometimes a mid-point value is assigned to the blank so as not to distort the average.
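As a rough sketch, the 25% rule and the mid-point option could be coded as below. The 1 to 5 answer scale (mid-point 3) and the use of `None` for a blank are assumptions.

```python
BLANK_THRESHOLD = 0.25  # from the rule above: discard when more than 25% is blank
SCALE_MIDPOINT = 3      # assumed: answers are on a 1-5 scale


def handle_blanks(answers):
    """Return cleaned answers, or None if the questionnaire should be discarded.

    A blank answer is represented by None (an assumption of this sketch).
    """
    if not answers:
        return None
    blank_share = sum(a is None for a in answers) / len(answers)
    if blank_share > BLANK_THRESHOLD:
        return None
    # Mid-point option: fill each remaining blank with the scale mid-point.
    return [SCALE_MIDPOINT if a is None else a for a in answers]


print(handle_blanks([5, None, 4, 2]))     # one blank in four: kept and filled
print(handle_blanks([None, None, 4, 2]))  # half blank: discarded
```

The alternative mentioned above, dropping the unanswered question instead of filling it, would simply filter out the `None` entries.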
3 Coding
In the data collection process, the final phase is quantification of the qualitative data. It is the transformation of answers into a format that a computer can understand.
Alphanumeric codes are used to convert responses such as good and bad, or poor and strong. Data are transferred, if necessary, to coding sheets, and responses to negatively worded questions are reversed so that all items run in the same direction.
Coding is a “systematic way in which to condense extensive data sets into smaller analyzable units through the creation of categories and concepts derived from the data.”
It is the “process by which verbal data are converted into variables and categories of variables using numbers, so that the data can be entered into computers for analysis.”
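Coding and the reversal of negatively worded items can be illustrated as follows. The five-point agreement scale and its numeric codes are assumptions; any consistent code book would do.

```python
# Assumed code book for a five-point agreement scale.
CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
SCALE_MAX = 5


def code_response(text, negatively_worded=False):
    """Convert a verbal answer to its numeric code.

    For a negatively worded question the code is reversed (1 <-> 5, 2 <-> 4)
    so that a high number always points in the same direction.
    """
    code = CODES[text.strip().lower()]
    return (SCALE_MAX + 1 - code) if negatively_worded else code


print(code_response("Agree"))                          # positively worded item
print(code_response("Agree", negatively_worded=True))  # reversed item
```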
[Figure: Coding the data]
4 Categorizing
To categorize data, the researcher puts them into categories, classes or segments which are mutually exclusive. Examples are gender, age, religion etc. Nominal scales are used for this purpose.
The categories will be determined by the query. Is it about income levels, education, customs or habits? The categories are drawn up accordingly and the items listed in a table.
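To show what mutually exclusive categories mean in practice, here is a small sketch that places incomes into bands and tabulates them. The band boundaries, labels and income figures are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical band boundaries (annual income) and labels.
BOUNDS = [20000, 50000, 100000]
LABELS = ["low", "lower-middle", "upper-middle", "high"]


def income_category(income):
    """Place an income in exactly one band; a boundary value goes to the upper band."""
    return LABELS[bisect_right(BOUNDS, income)]


incomes = [12000, 20000, 48000, 75000, 130000, 19999]
table = Counter(income_category(x) for x in incomes)
for label in LABELS:
    print(f"{label:<13}{table[label]}")
```

Because every value falls into one and only one band, the counts in the table add up to the number of respondents.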
5 Entering Data
Technology has made life easy. Data can be collected on scannable answer sheets, which enable a researcher to enter them directly into a computer file. In other cases, raw data are manually entered into the computer as a data file. Here, software such as the SPSS Data Editor can be used to enter, edit and view the contents. It is easy to add, change or delete values after the data have been entered.
Data Analysis
There are three objectives of the data analysis:
- getting a feel of the data,
- validity and reliability, and
- testing the hypotheses of the investigation.
1. Feel of the Data
Lists or statements are summarized to get a feel of the data. Descriptive statistics help reduce a large body of data to meaningful indicators showing central tendency and spread. Three measures of central tendency are commonly used in statistical analysis: the mode, the median and the mean. Each measure is designed to represent a typical score. The choice depends upon the shape of the distribution (whether normal or skewed) and the variable’s level of measurement (nominal, ordinal or interval).
Averages do not tell us everything and can at times be misleading. The per capita income of Brunei is US$ 53,100. One tends to feel that there would hardly be any poor people. But there are people below the poverty level even in such a rich country. Such information is disclosed by measures of dispersion, deviation or spread. These measures tell us how wealth is distributed in the country. The presence of both super-rich and very poor people shows up as a large spread or standard deviation.
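The point about misleading averages is easy to see with numbers. The sketch below uses Python's standard `statistics` module on a made-up, heavily skewed set of incomes (in thousands of US$); the figures are purely illustrative.

```python
import statistics

# Hypothetical incomes in thousands of US$: seven modest earners and one very rich.
incomes = [12, 15, 15, 18, 20, 22, 25, 400]

mean = statistics.mean(incomes)      # pulled upward by the single large value
median = statistics.median(incomes)  # middle of the sorted values
mode = statistics.mode(incomes)      # most frequent value
spread = statistics.stdev(incomes)   # sample standard deviation

print(f"mean={mean}, median={median}, mode={mode}, stdev={spread:.1f}")
```

Here the mean is several times the median, and the standard deviation is larger than the mean itself, which signals that the average alone says little about a typical person.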
[Figure: Feel of the data]
2. Reliability & Validity
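The data should be both reliable and valid. Reliability refers to trustworthiness and dependability, that is, whether the measurement gives consistent results; validity refers to appropriateness, authenticity or genuineness, that is, whether the measurement captures what it is meant to measure.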
[Figure: Sample problem using a sampling distribution]
3. Hypotheses Testing
Once any doubt as to reliability and validity has been cleared, the researcher can go ahead and test the hypotheses already formed for the report.
A car manufacturing company plans to test a new engine in order to find out whether it meets new air pollution standards. The average emission should be less than 20 parts per million of carbon. Ten engines were picked for testing and, after their emission levels were determined, the average was found to be 17.17. Apparently this is well below 20, but it is based on a small sample, and the average emission of all engines produced may be higher.
The General Manager wants to find out whether the engines meet the pollution standard. This should be evaluated at 99% confidence, that is, with a 1% chance of error.
NULL HYPOTHESIS: The engines do not meet the requirement, the average being 20
ALTERNATIVE HYPOTHESIS: The engines meet the requirement, the average being less than 20
REJECTION REGION: For α = 0.01 and df (degrees of freedom) = n − 1 = 9, the one-tailed rejection region is t < −t(0.01, 9) = −2.821
Using the standard formula t = (sample mean − 20) / (s / √n), where s is the sample standard deviation, the General Manager found the t-value to be −3.00, which falls below the critical value of −2.821 from the table and therefore lies in the rejection region. The null hypothesis is rejected, which means that the engines do meet the requirement. (Under these conditions, the chance of wrongly rejecting the null hypothesis is at most one in one hundred, which is a very low probability.)
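The calculation can be reproduced with a short sketch. Note one assumption: the sample standard deviation is not given in the text, so the value 2.98 used here is a hypothetical figure chosen because it gives a t-value of about −3.00. The critical value 2.821 is the table value quoted above.

```python
import math


def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of the mean."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


SAMPLE_MEAN = 17.17  # from the example
STANDARD = 20.0      # emission limit under the null hypothesis
SAMPLE_SD = 2.98     # ASSUMED: not stated in the text, picked to match t = -3.00
N = 10               # engines tested
T_CRITICAL = 2.821   # table value for alpha = 0.01, df = 9 (one tail)

t_stat = one_sample_t(SAMPLE_MEAN, STANDARD, SAMPLE_SD, N)
reject_null = t_stat < -T_CRITICAL  # lower-tail test

print(f"t = {t_stat:.2f}, reject null hypothesis: {reject_null}")
```

With real data, the standard deviation would be computed from the ten measured emission levels rather than assumed.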
Conclusion
Once the necessary data have been collected through surveys or experiments, they should be edited to ensure that only correct data are used. Next, they are coded, categorized and entered in tables or computers.
Based on the query or question, hypotheses should be developed and tested using appropriate and reliable measures.
The results should then be interpreted and a decision taken to solve the problem.
Comments
That is so true.
Thanks a lot Sir!
RUFI SHAHZADA
Sir great hub very informative and very well organized. Thanks for sharing.
Thanks for such a nice blog post....i was searching for something like that.
DEAR HAFEEZ UR REHMAN, YOUR ARTICLE ON DATA ANALYSIS AND INTERPRETATION IS VERY INFORMATIVE AND BRINGS MUCH INFORMATION TO ME I HOPE YOU WILL PROVIDING ME INFORMATION AND KNOWLEDGE IN FUTURE.
AND I APPRICIAT YOUR WORK YOU ARE DOING A GREAT WORK WHICH IS DEMAND OF TIME, NO ONE CAN IGNOR YOUR EFFORT IN THIS ASPECT.
THANKS.
sir..wow...this is deep..ive made a copy of this...reading again and again. worthy to note and implement in organizations. thnx
While reading your blog it seems that you research on this topic very much. I must tell you that your blog is very informative and it helps other also.
Dear Sir...
Your article of Data Analysis and Interpretation is very good after going through this article all concepts are clear and this has enhance my knowledge.
Thanks for sharing this information this has given me valuable insight and I will also implement in organization.
It's fascinating and worthy collection of information for one who is interested to have quick knowledge on data analysis. It would be much valuable if one more topic "Data Management" could be included.
This is such a great blog post. I have been searching for something like this for ages.
sirji thanks very good information about data analysis again thanks
Reading through such an excellent piece of writing is always amusing for me where i can found an element of captivation along with some informative material.
your hub is essential and informative which used as a short hand reference book for the beginner researchers and students! if you try to inculcate others such as regression(both linear and logistics) it becomes really smart and tough HUB!
congratulations sir hazeez honestly this is a good job.keep it up and may God/Allah bless u abundantly.










Rufi Shahzada 2 years ago
Dear Sir,
Marvelous HUB on DATA ANALYSIS AND INTERPRETATION, but I have one question that many people are very much biased and not even read the questions of the questionnaire and just tick randomly to save time or whatsoever reason, So how come we could eliminate or possibly reduce the element of biasness in such conditions?
Regards,
Rufi Shahzada