BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
DmytroYermak
Lapis Lazuli | Level 10

 Hi all,

 

Could you please help in the following case - how to find a max within each group in one step:

  

data test;
infile datalines missover;
input group $ value1 value 2 value 3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 ;
run;

 

I wanted to get this output:

 

group    value

g1         30

g2        76

g3        84

g4        94

 

Thank You!

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, I see, then:

data test;
  infile datalines missover;
  input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 
;
run;

data want (keep=group max_val);
  set test;
  by group;
  retain max_val;
  if first.group then max_val=.;
  if max(of value:) > max_val then max_val=max(of value:);
  if last.group then output;
run;

View solution in original post

11 REPLIES 11
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like:

data test;
infile datalines missover;
input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73
;
run;
proc sql;
create table WANT as
select GROUP,
sum(sum(VALUE1,VALUE2,VALUE3,VALUE4)) as VALUE
from TEST
group by GROUP;
quit;

However your output want dataset doesn't seem to make sense to me.  Why would G1=30, when there is 12 and 30?

Much like G4, how can you get 94 when there are 9 values to add together?

DmytroYermak
Lapis Lazuli | Level 10

I have received the output where  sums were calculated but I need max values within each group. That is the case.

 

The SAS System

Obs group VALUE 1 2 3 4
g142
g2192
g3122
g4386
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, I see, then:

data test;
  infile datalines missover;
  input group $ value1 value2 value3 value4;
datalines;
g1 12 30
g2 14 37 15
g2 16 76 34
g3 12 22
g3 4 84
g4 7 15 10
g4 24 21 17 67
g4 14 18 94
g4 26 73 
;
run;

data want (keep=group max_val);
  set test;
  by group;
  retain max_val;
  if first.group then max_val=.;
  if max(of value:) > max_val then max_val=max(of value:);
  if last.group then output;
run;
DmytroYermak
Lapis Lazuli | Level 10
Thank you, very useful!
Astounding
PROC Star

I'm not sure this is all that different:

 

data want;

do until (last.group);

   set have;

   by group;

   maxval = max(maxval, of value1-value4);

end;

keep group maxval;

run;

DmytroYermak
Lapis Lazuli | Level 10
It is definitely a solution as well. Thank you!
Davanden
Obsidian | Level 7

That looks clever, but I don't understand it.  How does BY function inside a DO loop?  I've never seen that before.

 

--Dav

gamotte
Rhodochrosite | Level 12

@Davanden, here is how i understand it :

 

do until (last.group);

   set have;

   by group;

   maxval = max(maxval, of value1-value4);

end;

 

The interest of this loop is to treat an entire group for each iteration of the datastep.

By putting the set ... by instructions inside the loop, you can read each record

of the group without outputting or reinitializing the PDV so that the value of MAXVAL for the group is kept

between records..

 

It is only at the run instruction that the result is output and the PDV is reset before handling the next group.

 

You end up with a shorter program than if you explicitely reset Maxval on first.group and output

on last.group.

Clever indeed.

 

Astounding
PROC Star

Each SET statement is permitted to use a BY statement.  That sets up FIRST. and LAST. variables (regardless of whether inside a loop or not).

 

One of the key issues here is how the SET statement operates.  It is not a label that shows where the data is coming from.  Rather, it is an instruction to read the next observation from a SAS data set.  It only reads one observation at a time (whether inside or outside a DO loop).  So the BY statement merely identifies whether that observation is the first (or last) one for a value of GROUP.

 

***** EDITED HERE:

 

Maybe I shouldn't add this ... it might add to the confusion.  Yet some of the community will find this interesting.  Technically, the BY statement refers to the most recent SET / MERGE / UPDATE statement.  There is no reason that it has to immediatley follow a SET, MERGE, or UPDATE statement.  So this variation would work exactly the same way:

 

data want;

do until (last.group);

   set have;

   maxval = max(maxval, of value1-value4);

end;

by group;

keep group maxval;

run;

novinosrin
Tourmaline | Level 20

@DmytroYermak  My apologies for the late entry to the party. Proc sql is the best solution for this kind of problem coz you can avoid a sort and can do it one select query-

 

data test;

infile datalines missover;

input group $ value1 value2 value3 value4;

datalines;

g1 12 30

g2 14 37 15

g2 16 76 34

g3 12 22

g3 4 84

g4 7 15 10

g4 24 21 17 67

g4 14 18 94

g4 26 73

;

run;

 

proc sql;

create table want as

select group, max(value1, value2, value3, value4) as value

from test

group by group

having value=max(value);

quit;

 

Regards,

Naveen Srinivasan

 

 

Astounding
PROC Star

@novinosrin,

 

One thing to be wary of ... I believe the SQL approach can give you multiple observations for a GROUP, if you have a tie for the largest value.  If you're good enough at SQL (which rules me out), you might be able to get around that using DISCREET.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 7976 views
  • 9 likes
  • 6 in conversation