Hi,
I have 5 Race Variable ( RaceBlack, RaceAsian, RaceNativeAM, RaceNativePacefic and RaceOther) they have dichotomous response 1.Yes , 2.No. I want to create a new Variable "Race" and code it; Black =2, Asian = 3, Other =4 and White =1;
I am using following code it doesn't work,
Data Race; 
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceBlace = 2 then delete;
If RaceAsian = 1 then Race = 3;
If RaceAsian = 2 then delete;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
run;please try the below code
Data Race; 
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceAsian = 1 then Race = 3;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 1;
if RaceOther=1 then Race = 4;
run;Thank you for your response,
i want to code RaceNativeAM and RacePacefic as 4. Others and RaceOther as 1=white;
you can try the below code , but is it working and you are getting the expected output
Data Race;
set ABC;
If RaceBlack = 1 then Race = 2;
If RaceAsian = 1 then Race = 3;
IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
if RaceOther=1 then Race = 1;
run;Still There is error, 8944 subjects are missing. Please find the attached file.
Thanks;
Subhash
use the below code in proc freq step
where race ne . ; No luck...
It shows me missing values in each single variable.
Please find the following code file.
Thanks;
Please stop posting code and log a screenshots, post them as text using the window opened by clicking on the {i}-icon.
Personally I would recode 1=Yes , 2=No to 1=Yes and 0=No as the first step. Too lazy to type out your long variable names but this shows some of why I would use the more traditional 1/0 coding.
data example;
   do a= 0,1;
   do b= 0,1;
   do c= 0,1;
   do d= 0,1;
   do e= 0,1;
      if sum(a,b,c,d,e)>1 then Race=4;
      else if a then race=2;
      else if b then race=3;
      else if c or d then race=4;
      else if e then race=1;
      output;
   end;
   end;
   end;
   end;
   end;
   label
      a='Black'
      b='Asian'
      c='AI/AN'
      d='NHOPI'
      e='White or Other'
   end;
run;
proc freq data=example;
 table a*b*c*d*e*race /list nocum nopercent missing;
run;
NOTE: You have not provided any clue what you want for respondents that answer NO to all of your 5 questions. From experience you might have Hispanic respondents do that with moderate frequency as so many people refer to "Hispanic" as a race and depending on your actual data collection instrument that may happen.
Also note the way I provided example data for all of the possible Yes/No responses for 5 variables, as code others could use.
And the PROC Freq shows results in a way that is easy to see if the coding is doing what is intended.
I think we are close to solution!! after running this code observation length increased from 9K to 30K.
  data example;
109  set keep;
110     do RaceBlack= 0,1;
111     do RaceAsian= 0,1;
112     do RaceNativeAm= 0,1;
113     do RacNativePacefic= 0,1;
114     do RaceOther= 0,1;
115        if sum(RaceBlack,RaceAsian,RaceNativeAm,RacNativePacefic,RaceOther)>1 then Race=4;
116        else if RaceBlack then race=2;
117        else if RaceAsian then race=3;
118        else if  RaceNativeAm or RacNativePacefic then race=4;
119        else if RaceOther then race=1;
120        output;
121     end;
122     end;
123     end;
124     end;
125     end;
126     label
127        RaceBlack='Black'
128        RaceAsian='Asian'
129        RaceNativeAm='AI/AN'
130       RacNativePacific='NHOPI'
131        RaceOther='White or Other'
132     end;
133  run;
NOTE: Numeric values have been converted to character values at the places given by:
      (Line):(Column).
      110:18   110:20   111:18   111:20   112:21   112:23   114:18   114:20
NOTE: Character values have been converted to numeric values at the places given by:
      (Line):(Column).
      115:14   115:24   115:34   115:64   116:15   117:15   118:16   119:15
NOTE: There were 9546 observations read from the data set WORK.KEEP.
NOTE: The data set WORK.EXAMPLE has 305472 observations and 50 variables.
NOTE: DATA statement used (Total process time):
      real time           0.65 seconds
      cpu time            0.32 seconds
					
				
			
			
				
			
			
			
			
			
			
			
		Hi @SChander Please take a look at Proc format aka user defined format. All syntax and examples are available online everywhere. Define your formats once and look up as when you need.
IF THEN statements are tedious and boring. IMHO not worth the time in your case.
Please do not past pictures of the LOG. It is every so much more helpful to copy the text from the LOG and paste the text into a code box opened using the forum's {I} or "running man" icons.
For instance, it is for all intents impossible to copy one or two lines of text from a picture and highlight or show the change needed. I am not going to retype a bunch of text from a picture.
Second, since you have 5 variables, what do you want to happen when there are more than 1 that have a 1 value? I don't see any description of the expected result.
Second
@SChander wrote:
Hi,
I have 5 Race Variable ( RaceBlack, RaceAsian, RaceNativeAM, RaceNativePacefic and RaceOther) they have dichotomous response 1.Yes , 2.No. I want to create a new Variable "Race" and code it; Black =2, Asian = 3, Other =4 and White =1;
I am using following code it doesn't work,
Data Race; set ABC; If RaceBlack = 1 then Race = 2; If RaceBlace = 2 then delete; If RaceAsian = 1 then Race = 3; If RaceAsian = 2 then delete; IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4; If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete; else Race = 1; run;
First thing, it appears that you have a spelling error.
Second is you are deleting entire records based on single values of each variable. Since you may have raceblack=2 and RaceAsian=1 for the same record then you delete one of your Asian expected results. And again if raceasian=2 then either the raceblack=2 or raceasian=2 would have deleted your Race=4 expected result.
The first thing you need to learn is the meaning of "ELSE". It does not refer to all the previous IF statements being false. Rather, it refers to the most recent IF statement only, what to do is that one was false. So in your program you have:
If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
These will set RACE to 1 for ALL observations that were not deleted by the previous statement. So that ELSE logic needs to change.
Given that you haven't used ELSE until the last statement, I would recommend removing this statement entirely:
else Race = 1;In its place, just add this statement after the SET statement:
Race = 1;
There are definitely more sophisticated ways to handle this, but you have to be more familiar with ELSE first. For example:
If RaceBlack = 1 then Race = 2;
Else If RaceBlace = 2 then delete;
Else If RaceAsian = 1 then Race = 3;
Else If RaceAsian = 2 then delete;
Else IF RaceNativeAM = 1 or RaceNativePacefic = 1 then Race = 4;
Else If RaceNativeAm = 2 or RaceNativePacefic = 2 then delete;
else Race = 1;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.
