BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
HB
Barite | Level 11 HB
Barite | Level 11

 

 

Here is some sample made up data to show what I want.

 

data successes;
   input code group something success;
datalines;
1104 1 8 0
1104 1 7 0
1105 1 2 0
1105 1 3 1
1105 1 4 0
1106 1 5 0
1106 1 6 1
1106 1 7 0
1106 1 8 0
1106 1 9 0
1106 1 1 0
1107 1 1 0
1107 1 3 0
1107 1 5 0
1107 1 7 0
1107 1 9 1
1108 1 2 0
1109 1 4 0
1109 1 6 0
1110 1 3 1
1110 1 2 1
1110 1 1 1
1111 2 2 0
1111 2 3 1
1111 2 4 0
1112 2 5 0
1112 2 6 1
1112 2 7 0
1112 2 8 0
1112 2 9 0
;
run;

For this data set, there are a number of observations for each code (in the real data anywhere between 1 and 25). A "1" in success means that that code was successful. In this data, codes 1104, 1108, and 1109 had no success in group 1. 1105, 1106, 1107, and 1110 had success.

 

What I want, is to adjust the data in such a way that I can do a Proc Freq to come up with a success percent by group. Group 1 has a success rate of 4/7 = 57%.

 

My thoughts were that I would make a successflag of some sort that was a 1 if any observation of a code had a success value of 1. Then de-duplicate based on code and successflag to create a data set I could run the FREQ against.

 

I tried

data mysuccesses;
	mysuccessflag = 0;
	if group = 1;
	do until (last.code);
		set successes;
		by code;
		if success = 1 then mysuccessflag = 1;
	end;
run;

but that is wrong in many ways. Looping in the data step error. Help here appreciated.

  

I was trying to figure out how to loop through all the observations for a particular code and then set my flag to 1 if any observation had a success of 1.

 

I think I want a data set that looks like

 

data successes;
   input code group something success mysuccessflag;
datalines;
1104 1 8 0 0
1104 1 7 0 0
1105 1 2 0 1
1105 1 3 1 1
1105 1 4 0 1
1106 1 5 0 1
1106 1 6 1 1
1106 1 7 0 1
1106 1 8 0 1
1106 1 9 0 1
1106 1 1 0 1
1107 1 1 0 1
1107 1 3 0 1
1107 1 5 0 1
1107 1 7 0 1
1107 1 9 1 1
1108 1 2 0 0
1109 1 4 0 0
1109 1 6 0 0
1110 1 3 1 1
1110 1 2 1 1
1110 1 1 1 1
1111 2 2 0 1
1111 2 3 1 1
1111 2 4 0 1
1112 2 5 0 1
1112 2 6 1 1
1112 2 7 0 1
1112 2 8 0 1
1112 2 9 0 1
;
run;

With that I could

proc sort data = successes out = cleanedsuccesses nodupkey;
	by code;
run;

proc freq data = cleanedsuccesses;
	where group = 1;
	tables mysuccessflag;
run;

which gets me where I want to be.

  

But I'm open to suggestions. I can do this with SQL but I want to do it something like this.

 

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Assuming you want a data set with the percent of any success by group, success is always 0/1 as described then this may work:

proc summary data=successes nway ;
   class group code;
   var success;
   output out=successmax max=;
run;

proc summary data=successmax nway;
   class group;
   var success;
   output out= perc mean=;
run;

The value of Success in the final set Perc is decimal percentage so instead of 57% it has 0.5714285714. Display as needed.

 

Note that use of Class variables to group the data doesn't require explicit sorting though the output data will be.

You can use data set options to drop the _type_ and _freq_ variables. _freq_ may help with verifying the number of codes used to calculate the percentage if concerned.

View solution in original post

5 REPLIES 5
ballardw
Super User

Assuming you want a data set with the percent of any success by group, success is always 0/1 as described then this may work:

proc summary data=successes nway ;
   class group code;
   var success;
   output out=successmax max=;
run;

proc summary data=successmax nway;
   class group;
   var success;
   output out= perc mean=;
run;

The value of Success in the final set Perc is decimal percentage so instead of 57% it has 0.5714285714. Display as needed.

 

Note that use of Class variables to group the data doesn't require explicit sorting though the output data will be.

You can use data set options to drop the _type_ and _freq_ variables. _freq_ may help with verifying the number of codes used to calculate the percentage if concerned.

HB
Barite | Level 11 HB
Barite | Level 11

proc summary data=successes nway ;

class group code;

var success;

output out=successmax max=;

run;

 

 

So, we are summarizing the data by group and code. 

The nway option collapses that down to one observation per combo of group and code.

Output creates a file.

Max= is where I kind of get lost.  That somehow magically assigns the maximum value of the variable named (var = success) as the value of the summarized success variable.  I think.

 

I don't understand it fully but it works.   If I add

 

PROC FORMAT;

   VALUE success_fmt

   0 = "No"

   1 = "Yes";

RUN;

 

proc freq data = successmax;

   where group = 1;

   format success success_fmt.;

   tables success;

run;

 

 

working right off the first created data set I get

                                          The SAS System           09:10 Friday, March 2, 2018   8

                                        The FREQ Procedure

                                                       Cumulative    Cumulative
                   success    Frequency     Percent     Frequency      Percent
                   
                       No            3       42.86             3        42.86
                       Yes           4       57.14             7       100.00

 

 

 

which gets me there. 

 

Very cool.  Thank you.

ballardw
Super User

@HB wrote:

proc summary data=successes nway ;

class group code;

var success;

output out=successmax max=;

run;

 

 

So, we are summarizing the data by group and code. 

The nway option collapses that down to one observation per combo of group and code.

Output creates a file.

Max= is where I kind of get lost.  That somehow magically assigns the maximum value of the variable named (var = success) as the value of the summarized success variable.  I think.

 

 

That is exactly what the MAX statistic does, report the maximum value of the variable(s) within the combinations of the class variable(s).

Since your description was sort of "I want a 'success' for the group/code combination if any of the individual records were a success" then one way to think of that is the maximum value.

0/1 coded variables are very nice for use with some of the summary statistics: n is the number of valid values; mean is percent and sum is the number of 1 values (successes, valid, true, what have you for interpretation).

Reeza
Super User

Try a DoW loop instead. 

 

 

novinosrin
Tourmaline | Level 20

Dow fun:

data successes;
   input code group something success;
datalines;
1104 1 8 0
1104 1 7 0
1105 1 2 0
1105 1 3 1
1105 1 4 0
1106 1 5 0
1106 1 6 1
1106 1 7 0
1106 1 8 0
1106 1 9 0
1106 1 1 0
1107 1 1 0
1107 1 3 0
1107 1 5 0
1107 1 7 0
1107 1 9 1
1108 1 2 0
1109 1 4 0
1109 1 6 0
1110 1 3 1
1110 1 2 1
1110 1 1 1
1111 2 2 0
1111 2 3 1
1111 2 4 0
1112 2 5 0
1112 2 6 1
1112 2 7 0
1112 2 8 0
1112 2 9 0
;
run;

data want;
_success=0;__success=0;
do until(last.group);
	_found=0;
	do until(last.code);
	set successes;
	by group code;
	if not _found and success=1 then _success+success;
	if success=1 then do;__success+success;_found=1;end;
	end;
	if last.group then success_pct=divide(_success,__success)*100;
end;
drop _: code something success;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1808 views
  • 6 likes
  • 4 in conversation