Hi,
I ran tables for two variables of income:
iadl_income_1 - "How many persons are living in this household (including participant)?"
iadl_income_2 "How many persons in the household (including participant) earned any money last year from any income source such as job or employment, SS, interest or dividends, other pensions
alimony or child support, unemployment compensation, public aid, armed forces or veteran's allotment and rent."
These are the tables:
The total # of obs is 637.
How do I handle the ".", the fact that 3 people answered "0", and the "missing."
The total # of obs is 637.
What, if anything do I need to do with the "."
How do I correct the "missing" or do I have to do anything?
How do you want to handle them? We don't even know the purpose of this analysis, or what you plan to do with this analysis; without that, no answer is even possible.
You could add the MISSING option in TABLES statement to have those with all blanks included in the total count.
You could exclude the all blank, period and missing values from the analysis completely by using a WHERE statement.
You could recode the period and missing values to blanks to have PROC FREQ consider them missing.
Personally I would do that latter and treat the all blank values (what PROC FREQ considered missing) and the period and the missing values as missing.
Was the question totally open ended? Or does the value '12' really mean '12 or more'? If so then it would be difficult to treat the value as a numeric. Otherwise you could make a numeric variable from the existing character one and then you could calculate things like the average number of people. Note that SAS has multiple codes for missing values. So you could convert the three different types of missing values to three different numeric missing values.
data want;
set have;
if iadl_income_1 = ' ' then iadl_income_1_num = .B ;
else if iadl_income_1 = '.' then iadl_income_1_num = . ;
else if iadl_income_1 = 'missing' then iadl_income_1_num = .M;
else iadl_income_1_num = input(iadl_income_1,32.);
run;
Character values that should have counts, such as in response to "How many" type questions means that the data was brought into SAS incorrectly. Either all of ., "missing" and the apparent blank should all be missing numeric values, or perhaps special missing indicating why missing.
Second I am going out on a limb and guessing this is some sort of survey. Since missing or 0 for the Iadl_income_1 variable means there are no persons living in the household then the question really becomes "who answered the survey?" A neighbor? A pet? An alien from Mars?
No one living in a household would mean there wasn't anyone there to answer questions so any other values from those records I would eye with extreme prejudice as to if they were collected correctly. A not uncommon problem can be with a poor/inconsistent file format that means the data may have been read incorrectly in some fashion and you only think the first question has those values. At this point in way too many surveys I had to look at raw files (i.e before reading into SAS) for the entire record to make sure there are not things in the data that corrupted the data like extra delimiters, line feed characters in the middle of text meaning some records are split between lines and read partially.
In one brand of survey data collection software we discovered that entering tab characters in text boxes such as First and Last name, created new fields for some records shifting columns for a few responses because the software internally used those for fields and when exporting the data to file for use honored those false fields.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.