DATA Step, Macro, Functions and more

Getting max value by group - complex table

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 143
Accepted Solution

Getting max value by group - complex table

[ Edited ]

 Hi all,

 

Could you please help in the following case - how to find a max within each group in one step:

  

data test;
infile datalines missover;
input group $ value1 value 2 value 3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 ;
run;

 

I wanted to get this output:

 

group    value

g1         30

g2        76

g3        84

g4        94

 

Thank You!


Accepted Solutions
Solution
‎09-11-2017 09:07 AM
Super User
Super User
Posts: 7,997

Re: Getting max value by group - complex table

Posted in reply to DmytroYermak

Ah, I see, then:

data test;
  infile datalines missover;
  input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 
;
run;

data want (keep=group max_val);
  set test;
  by group;
  retain max_val;
  if first.group then max_val=.;
  if max(of value:) > max_val then max_val=max(of value:);
  if last.group then output;
run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,997

Re: Getting max value by group - complex table

Posted in reply to DmytroYermak

Something like:

data test;
infile datalines missover;
input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73
;
run;
proc sql;
create table WANT as
select GROUP,
sum(sum(VALUE1,VALUE2,VALUE3,VALUE4)) as VALUE
from TEST
group by GROUP;
quit;

However your output want dataset doesn't seem to make sense to me.  Why would G1=30, when there is 12 and 30?

Much like G4, how can you get 94 when there are 9 values to add together?

Frequent Contributor
Posts: 143

Re: Getting max value by group - complex table

I have received the output where  sums were calculated but I need max values within each group. That is the case.

 

The SAS System

Obs group VALUE 1 2 3 4
g142
g2192
g3122
g4386
Solution
‎09-11-2017 09:07 AM
Super User
Super User
Posts: 7,997

Re: Getting max value by group - complex table

Posted in reply to DmytroYermak

Ah, I see, then:

data test;
  infile datalines missover;
  input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 
;
run;

data want (keep=group max_val);
  set test;
  by group;
  retain max_val;
  if first.group then max_val=.;
  if max(of value:) > max_val then max_val=max(of value:);
  if last.group then output;
run;
Frequent Contributor
Posts: 143

Re: Getting max value by group - complex table

Thank you, very useful!
Super User
Posts: 5,518

Re: Getting max value by group - complex table

I'm not sure this is all that different:

 

data want;

do until (last.group);

   set have;

   by group;

   maxval = max(maxval, of value1-value4);

end;

keep group maxval;

run;

Frequent Contributor
Posts: 143

Re: Getting max value by group - complex table

Posted in reply to Astounding
It is definitely a solution as well. Thank you!
Contributor
Posts: 41

Re: Getting max value by group - complex table

Posted in reply to Astounding

That looks clever, but I don't understand it.  How does BY function inside a DO loop?  I've never seen that before.

 

--Dav

Regular Contributor
Posts: 240

Re: Getting max value by group - complex table

@Davanden, here is how i understand it :

 

do until (last.group);

   set have;

   by group;

   maxval = max(maxval, of value1-value4);

end;

 

The interest of this loop is to treat an entire group for each iteration of the datastep.

By putting the set ... by instructions inside the loop, you can read each record

of the group without outputting or reinitializing the PDV so that the value of MAXVAL for the group is kept

between records..

 

It is only at the run instruction that the result is output and the PDV is reset before handling the next group.

 

You end up with a shorter program than if you explicitely reset Maxval on first.group and output

on last.group.

Clever indeed.

 

Super User
Posts: 5,518

Re: Getting max value by group - complex table

[ Edited ]

Each SET statement is permitted to use a BY statement.  That sets up FIRST. and LAST. variables (regardless of whether inside a loop or not).

 

One of the key issues here is how the SET statement operates.  It is not a label that shows where the data is coming from.  Rather, it is an instruction to read the next observation from a SAS data set.  It only reads one observation at a time (whether inside or outside a DO loop).  So the BY statement merely identifies whether that observation is the first (or last) one for a value of GROUP.

 

***** EDITED HERE:

 

Maybe I shouldn't add this ... it might add to the confusion.  Yet some of the community will find this interesting.  Technically, the BY statement refers to the most recent SET / MERGE / UPDATE statement.  There is no reason that it has to immediatley follow a SET, MERGE, or UPDATE statement.  So this variation would work exactly the same way:

 

data want;

do until (last.group);

   set have;

   maxval = max(maxval, of value1-value4);

end;

by group;

keep group maxval;

run;

PROC Star
Posts: 283

Re: Getting max value by group - complex table

Posted in reply to Astounding

@DmytroYermak  My apologies for the late entry to the party. Proc sql is the best solution for this kind of problem coz you can avoid a sort and can do it one select query-

 

data test;

infile datalines missover;

input group $ value1 value2 value3 value4;

datalines;

g1 12 30

g2 14 37 15

g2 16 76 34

g3 12 22

g3 4 84

g4 7 15 10

g4 24 21 17 67

g4 14 18 94

g4 26 73

;

run;

 

proc sql;

create table want as

select group, max(value1, value2, value3, value4) as value

from test

group by group

having value=max(value);

quit;

 

Regards,

Naveen Srinivasan

 

 

Super User
Posts: 5,518

Re: Getting max value by group - complex table

Posted in reply to novinosrin

@novinosrin,

 

One thing to be wary of ... I believe the SQL approach can give you multiple observations for a GROUP, if you have a tie for the largest value.  If you're good enough at SQL (which rules me out), you might be able to get around that using DISCREET.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 243 views
  • 9 likes
  • 6 in conversation