Hello all,
I want to create a frequency table for my newly categorized variables. There is no error in the log, but when I run PROC FREQ it doesn't show all the categories I created for the first variable, which is race. It showed just 2 instead of 5.
Here is my syntax:
DATA MEPSIS.RACEDU;
SET MEPSIS.TEMP5;
IF RACE = 1 THEN RACECAT = 'HISPANIC';
IF RACE = 2 THEN RACECAT = 'NON-HISPANIC WHITE ONLY';
IF RACE = 3 THEN RACECAT = 'NON-HISPANIC BLACK ONLY';
IF RACE = 4 THEN RACECAT = 'NON-HISPANIC ASIAN ONLY';
IF RACE = 5 THEN RACECAT = 'NON-HISPANIC OTHER RACE OR MULTI-RACE';
IF RACE < 1 THEN DELETE;
IF EDUCATION = 1 THEN EDUCAT = '=<8TH GRADE';
IF EDUCATION = 2 THEN EDUCAT = '9TH TO 12TH GRADE';
IF EDUCATION = 13 THEN EDUCAT = 'GED OR HIGH SCHOOL DIPLOMA';
IF EDUCATION = 14 THEN EDUCAT = '>HIGH SCHOOL BUT WITHOUT 4 YEAR DEGREE';
IF EDUCATION = 15 THEN EDUCAT = 'COLLEGE OR BACHELORS DEGREE';
IF EDUCATION = 16 THEN EDUCAT = 'GRADUATE OR PROFESSIONAL DEGREE';
IF EDUCATION < 1 THEN DELETE;
RUN;
/** PRINTING TO CHECK**/
PROC PRINT DATA = MEPSIS.RACEDU (OBS = 5);
VAR RACECAT EDUCAT;
TITLE "CROSSCHECKING QUESTION 4";
RUN;
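Is it possible that one of these two conditions is true for the records you are missing?
- IF RACE < 1 THEN DELETE;
- IF EDUCATION < 1 THEN DELETE;
Check the observations with missing values; they may satisfy these conditions, because SAS treats a missing numeric value as smaller than any number.
I propose replacing the DELETE with a placeholder value (the code below uses "CHECKME"):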
DATA MEPSIS.RACEDU;
SET MEPSIS.TEMP5;
length RACECAT EDUCAT $50.;
IF RACE = 1 THEN RACECAT = 'HISPANIC';
ELSE IF RACE = 2 THEN RACECAT = 'NON-HISPANIC WHITE ONLY';
ELSE IF RACE = 3 THEN RACECAT = 'NON-HISPANIC BLACK ONLY';
ELSE IF RACE = 4 THEN RACECAT = 'NON-HISPANIC ASIAN ONLY';
ELSE IF RACE = 5 THEN RACECAT = 'NON-HISPANIC OTHER RACE OR MULTI-RACE';
*ELSE IF RACE < 1 THEN DELETE;
ELSE RACECAT="CHECKME";
IF EDUCATION = 1 THEN EDUCAT = '=<8TH GRADE';
ELSE IF EDUCATION = 2 THEN EDUCAT = '9TH TO 12TH GRADE';
ELSE IF EDUCATION = 13 THEN EDUCAT = 'GED OR HIGH SCHOOL DIPLOMA';
ELSE IF EDUCATION = 14 THEN EDUCAT = '>HIGH SCHOOL BUT WITHOUT 4 YEAR DEGREE';
ELSE IF EDUCATION = 15 THEN EDUCAT = 'COLLEGE OR BACHELORS DEGREE';
ELSE IF EDUCATION = 16 THEN EDUCAT = 'GRADUATE OR PROFESSIONAL DEGREE';
ELSE IF EDUCATION < 1 THEN DELETE;
ELSE EDUCAT = "CHECKME";
RUN;
proc freq data=mepsis.racedu;
table race*racecat education*educat racecat*educat;
run;
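A note on the LENGTH statement in the code above, offered as a likely explanation rather than something confirmed in this thread: when no length is declared, SAS takes the length of a new character variable from the first value assigned to it. In the original DATA step that value is 'HISPANIC' (8 characters), so every longer label would be cut to 'NON-HISP', and PROC FREQ would then report only two levels. This assumes RACECAT does not already exist in MEPSIS.TEMP5. A minimal sketch on made-up data (WORK.DEMO and its three rows are hypothetical, used only for illustration):

```sas
/* Hypothetical toy data set, not part of the original thread */
data work.demo;
  input race;
  /* No LENGTH statement: RACECAT takes its length (8) from 'HISPANIC' */
  if race = 1 then racecat = 'HISPANIC';
  else if race = 2 then racecat = 'NON-HISPANIC WHITE ONLY';
  else if race = 3 then racecat = 'NON-HISPANIC BLACK ONLY';
  datalines;
1
2
3
;
run;

/* Cross-tabulate the source code against the derived label */
proc freq data=work.demo;
  tables race*racecat / list missing;
run;
```

Adding a statement such as length racecat $50; before the first assignment, as in the reply's code, should keep the full labels.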
OK, thank you. I uploaded the wrong picture, sorry about that. It actually created all the categories I wanted; I scanned through the dataset. I just observed that the frequency table grouped the categories into 2 instead of 5, whereas there are 5 categories in the created dataset. What does "CHECKME" mean, please?
It literally means go check those records. They are falling through your IF/ELSE conditions, which means you are not accounting for some values or deletions. Coding them to a value rather than deleting them lets you verify the logic before you delete anything. This style of programming helps you find errors in your code before they propagate.
The proper way to check this variable derivation is like this:
proc freq data=mepsis.racedu;
tables race*racecat / list missing;
run;
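To look at the flagged records themselves, something like the following could work. It is only a sketch: it assumes the dataset was built with the "CHECKME" version of the DATA step posted earlier in this thread, and the OBS= limit of 20 is an arbitrary choice.

```sas
/* Print up to 20 rows that fell through either IF/ELSE chain */
proc print data=mepsis.racedu (obs=20);
  where racecat = 'CHECKME' or educat = 'CHECKME';
  var race racecat education educat;
run;
```

If this prints nothing, every remaining RACE and EDUCATION value was covered by one of the conditions.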