Hi,
I have 5 Race Variable ( RaceBlack, RaceAsian, RaceNativeAM, RaceNativePacefic and RaceOther) they have dichotomous response 1.Yes , 2.No. I want to create a new Variable "Race" and code it; Black =2, Asian = 3, Other =4 and White =1;
I am using following code it doesn't work,
Data Race;
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceBlace = 2 then delete;
If RaceAsian = 1 then Race = 3;
If RaceAsian = 2 then delete;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
run;
please try the below code
Data Race;
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceAsian = 1 then Race = 3;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 1;
if RaceOther=1 then Race = 4;
run;
Thank you for your response,
i want to code RaceNativeAM and RacePacefic as 4. Others and RaceOther as 1=white;
you can try the below code , but is it working and you are getting the expected output
Data Race;
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceAsian = 1 then Race = 3;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
if RaceOther=1 then Race = 1;
run;
Still There is error, 8944 subjects are missing. Please find the attached file.
Thanks;
Subhash
use the below code in proc freq step
where race ne . ;
No luck...
It shows me missing values in each single variable.
Please find the following code file.
Thanks;
Please stop posting code and log a screenshots, post them as text using the window opened by clicking on the {i}-icon.
Personally I would recode 1=Yes , 2=No to 1=Yes and 0=No as the first step. Too lazy to type out your long variable names but this shows some of why I would use the more traditional 1/0 coding.
data example; do a= 0,1; do b= 0,1; do c= 0,1; do d= 0,1; do e= 0,1; if sum(a,b,c,d,e)>1 then Race=4; else if a then race=2; else if b then race=3; else if c or d then race=4; else if e then race=1; output; end; end; end; end; end; label a='Black' b='Asian' c='AI/AN' d='NHOPI' e='White or Other' end; run; proc freq data=example; table a*b*c*d*e*race /list nocum nopercent missing; run;
NOTE: You have not provided any clue what you want for respondents that answer NO to all of your 5 questions. From experience you might have Hispanic respondents do that with moderate frequency as so many people refer to "Hispanic" as a race and depending on your actual data collection instrument that may happen.
Also note the way I provided example data for all of the possible Yes/No responses for 5 variables, as code others could use.
And the PROC Freq shows results in a way that is easy to see if the coding is doing what is intended.
I think we are close to solution!! after running this code observation length increased from 9K to 30K.
data example; 109 set keep; 110 do RaceBlack= 0,1; 111 do RaceAsian= 0,1; 112 do RaceNativeAm= 0,1; 113 do RacNativePacefic= 0,1; 114 do RaceOther= 0,1; 115 if sum(RaceBlack,RaceAsian,RaceNativeAm,RacNativePacefic,RaceOther)>1 then Race=4; 116 else if RaceBlack then race=2; 117 else if RaceAsian then race=3; 118 else if RaceNativeAm or RacNativePacefic then race=4; 119 else if RaceOther then race=1; 120 output; 121 end; 122 end; 123 end; 124 end; 125 end; 126 label 127 RaceBlack='Black' 128 RaceAsian='Asian' 129 RaceNativeAm='AI/AN' 130 RacNativePacific='NHOPI' 131 RaceOther='White or Other' 132 end; 133 run; NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column). 110:18 110:20 111:18 111:20 112:21 112:23 114:18 114:20 NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 115:14 115:24 115:34 115:64 116:15 117:15 118:16 119:15 NOTE: There were 9546 observations read from the data set WORK.KEEP. NOTE: The data set WORK.EXAMPLE has 305472 observations and 50 variables. NOTE: DATA statement used (Total process time): real time 0.65 seconds cpu time 0.32 seconds
Hi @SChander Please take a look at Proc format aka user defined format. All syntax and examples are available online everywhere. Define your formats once and look up as when you need.
IF THEN statements are tedious and boring. IMHO not worth the time in your case.
Please do not past pictures of the LOG. It is every so much more helpful to copy the text from the LOG and paste the text into a code box opened using the forum's {I} or "running man" icons.
For instance, it is for all intents impossible to copy one or two lines of text from a picture and highlight or show the change needed. I am not going to retype a bunch of text from a picture.
Second, since you have 5 variables, what do you want to happen when there are more than 1 that have a 1 value? I don't see any description of the expected result.
Second
@SChander wrote:
Hi,
I have 5 Race Variable ( RaceBlack, RaceAsian, RaceNativeAM, RaceNativePacefic and RaceOther) they have dichotomous response 1.Yes , 2.No. I want to create a new Variable "Race" and code it; Black =2, Asian = 3, Other =4 and White =1;
I am using following code it doesn't work,
Data Race; set ABC; If RaceBlack = 1 then Race = 2; If RaceBlace = 2 then delete; If RaceAsian = 1 then Race = 3; If RaceAsian = 2 then delete; IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4; If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete; else Race = 1; run;
First thing, it appears that you have a spelling error.
Second is you are deleting entire records based on single values of each variable. Since you may have raceblack=2 and RaceAsian=1 for the same record then you delete one of your Asian expected results. And again if raceasian=2 then either the raceblack=2 or raceasian=2 would have deleted your Race=4 expected result.
The first thing you need to learn is the meaning of "ELSE". It does not refer to all the previous IF statements being false. Rather, it refers to the most recent IF statement only, what to do is that one was false. So in your program you have:
If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
These will set RACE to 1 for ALL observations that were not deleted by the previous statement. So that ELSE logic needs to change.
Given that you haven't used ELSE until the last statement, I would recommend removing this statement entirely:
else Race = 1;
In its place, just add this statement after the SET statement:
Race = 1;
There are definitely more sophisticated ways to handle this, but you have to be more familiar with ELSE first. For example:
If RaceBlack = 1 then Race = 2;
Else If RaceBlace = 2 then delete;
Else If RaceAsian = 1 then Race = 3;
Else If RaceAsian = 2 then delete;
Else IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
Else If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.