BookmarkSubscribeRSS Feed
Primohunter
Obsidian | Level 7

Hello everyone,

below you can find an example of a data set that i want to turn to the OUTPUT example below as well.

Have DATASET:

Var1     Var2     Var3     Var4

Nick      ran1       Y        N

Nick      ran1       N        Y

John     ran2       Y         N

John     ran2       N         Y

George  ran3      N        Y

 

Output (Want) Dataset:

 

Var1     Var2     Var3     Var4

Nick      ran1       Y        Y

John     ran2       Y         Y

George  ran3      N        Y

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Please explain the logic that allows you to transform the HAVE data set into the WANT data set.

--
Paige Miller
Primohunter
Obsidian | Level 7

lets just say that for some reason nick and john are both qualified for having VAR 3 AND VAR 4  (e.g. 'Y') but the input program could only accept  only one VAR input at a time.

For example

nick wears a black and white shirt 

so the dataset becomes as follows:

 

 PROBLEMATIC INPUT                                                    DESIRED FORMAT FOR FURTHER ANALYSIS

Name     Status       White   Black              --->                    Name     Status       White   Black  

Nick      Dressed        Y         N                                                Nick      Dressed        Y         Y        

Nick      Dressed        N         Y

PaigeMiller
Diamond | Level 26

This would be much easier if you used 0s and 1s instead of N and Y.

 

data have;
    input var1 $ var2 $ var3 var4;
    cards;
nick ran1 1 0
nick ran1 0 1
john ran2 1 0
john ran2 0 1
george ran3 0 1
;
proc summary data=have nway;
    class var1 var2;
    var var3 var4;
    output out=want max=;
run;

 

 

As a side issue, please don't use different variable names in your examples than in your original data set.

--
Paige Miller
Primohunter
Obsidian | Level 7
Thanks for your response
will keep that in mind. i just used different real life variable names to make the reasoning easier.

Is there another way to do it with those character variables?
PaigeMiller
Diamond | Level 26

@Primohunter wrote:
Thanks for your response
will keep that in mind. i just used different real life variable names to make the reasoning easier.

Reasoning is easier if you use the same variable names in your examples and actual data. We're trying to help you, but you have to help us too.

 

Is there another way to do it with those character variables?

 

Convert char to numeric, use PROC SUMMARY as I have shown, and then convert 0 and 1 back to N and Y.

--
Paige Miller
FreelanceReinh
Jade | Level 19

@Primohunter wrote:
Is there another way to do it with those character variables?

You could use PROC SQL, but the syntax is less elegant than the succinct "max=" of PROC SUMMARY's OUTPUT statement:

data have;
input (var1-var4)($);
cards;
Nick ran1 Y N
Nick ran1 N Y
John ran2 Y N
John ran2 N Y
George ran3 N Y
;

proc sql;
create table want as
select var1, var2, max(var3) as var3, max(var4) as var4
from have
group by 2, 1;
quit;

The "maximum" here is based on the alphabetical order 'N' < 'Y'.

PaigeMiller
Diamond | Level 26

And so I still believe, despite the fine example from @FreelanceReinh , that handling binary variables via numeric 0s and 1s is superior for a number of reasons, and I urge all readers of this thread to use numeric 0s and 1s for binary variables.

 

  • If the real problem has 25 variables, instead of the 4, much less typing in PROC SUMMARY than in PROC SQL and therefore also less chance for typographical errors
  • PROC SUMMARY will compute the percent of 1 simply by asking for the MEAN (the mean of a 0/1 variable is the percent of ones)
  • Less typing to enter a 0 or 1 compared to 'N' or 'Y' and hence less chance for typographical errors
  • If a different question was asked, for example does any ID have more than 3 observations with 'Y', if you use 0/1s again PROC SUMMARY makes the solution easy, while otherwise you have to count the 'Y' values within an ID, which is certainly do-able, but not as easy as using PROC SUMMARY 
--
Paige Miller

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 764 views
  • 7 likes
  • 3 in conversation