BookmarkSubscribeRSS Feed
klongway
Calcite | Level 5
My IRB wants me to collapse my racial data for participant privacy but I am struggling to do so. Am I on the right track below, if I'm trying to take Cambodian, Laotian, Korean and Burmese and collapse into Asian?

Data newrace
Set partrace

Asian= if Kor=1 or Bur=1 or Lal=1 or Cam=1 then Asian.

Thanks!!!
4 REPLIES 4
Astounding
PROC Star

Similar to what you tried:

if kor=1 or bur=1 or lal=1 or cam=1 then asian=1;

A bit different but worth considering:

Asian = max(Kor, bur, cam, lal) ;

ScottBass
Rhodochrosite | Level 12

Sample code:

 

data have;
   kor=1;
   bur=0;
   lal=0;
   cam=0;
run;

data want1;
   format asian;
   set have;
   if kor=1 or bur=1 or lal=1 or cam=1 then asian=1;
run;

data want2;
   format asian;
   set have;
   asian=(kor=1 or bur=1 or lal=1 or cam=1);
run;

data want3;
   format asian;
   set have;
   asian=(kor or bur or lal or cam);
run;

data want4;
   format asian;
   set have;
   asian=max(kor,bur,lal,cam);
run;

data want5;
   format asian;
   set have;
   asian=max(kor,bur,lal,cam) ne 0;
run;

Explanation:

 

In SAS, False=0, and True=anything except 0, missing, or special missing (I'll explain this below).

However, typically we use true boolean, i.e. False=0, True=1.

A comparison operator (in this case = ) returns a boolean value, False=0, True=1.

 

Details:

 

want1: 

Here we are using an IF statement, which returns a boolean value - an IF statement is either True or False.

Within the IF statement, we are using a specific comparison operator (=), which returns a boolean value based on that comparison.

We are using the OR operator to "connect" our boolean comparisons, and the OR operator "says" "if any comparison is True, the statement is True".

We then use an assignment statement to set asian=1 if the statement is true.

Otherwise, since asian is not retained, it will be set to missing on each iteration of the data step, and thus will remain missing if the IF statement is False.

 

want2:

But remember that the comparison operator returns True, and the OR operator connects the comparisons to return True or False for the entire statement.

Also remember that SAS returns False=0, True=1.

So you can put the comparison on the right side of the equals sign - you don't have to use an If statement.

The asian variable will then capture the results (0 or 1) of the comparison.

 

want3:

Again remember that 0 or missing=False and anything else=True.

So, we don't even need the comparison operator to evaluate True or False.

However, notice the difference in the asian variable if you set say Kor=5 or Kor=-5.

In want3, asian will still be 1 if Kor is any value other than 0, missing, or special missing.

This may, or may not, be what you want.  Should asian=1 if Kor=-5?

 

want4:

The max function will return the maximum value of all the supplied variables.

So if all values are 0, the max is 0, otherwise the max is returned.

This is likely what you want if your source variables are boolean (0 or 1 only).

Otherwise, it may not be what you want.  For example, if Kor=-5 ("True") and all other variables=0, the max is 0.

Test this with say Kor=5 or Kor=-5.

 

want5:

But you may want asian to be exactly 0 or 1.  If so, use a comparison operator, in this case not equal.

Note that you can't use != in place of ne since the statement is on the right of the equals sign.

 

Note that you get different results with the various approaches if your source variables are not boolean. 

If your flag variables are exactly 0 or 1 (I'll also begrudgingly allow missing but I don't like it), then any of these approaches will give similar results. 

I use want3 (comparison on the right side of the equals sign) all the time.

 

In summary:

 

data _null_;
   length result $5;
   result=ifc(.z,'True','False');put result=;   
   result=ifc(.a,'True','False');put result=;   
   result=ifc(. ,'True','False');put result=;   
   put '======';
   result=ifc(-123,'True','False');put result=;   
   result=ifc(-1,  'True','False');put result=;   
   result=ifc(1,   'True','False');put result=;   
   result=ifc(123, 'True','False');put result=;   
run;

 

Hope this helps...


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
andreas_lds
Jade | Level 19

Why do you have so many variables for racial data? Is it possible, that more than variable is 1 at all?

 

Having just one variable for racial data, would allow to use a format to change the value when displayed. The value is, of course, not changed, but another text  will be displayed.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 491 views
  • 0 likes
  • 5 in conversation