DATA Step, Macro, Functions and more

Max by group + keep variables

Accepted Solution Solved
Reply
Contributor fre
Contributor
Posts: 35
Accepted Solution

Max by group + keep variables

[ Edited ]

Dear community,

 

I have this dataset to start from:

 

case  /   name   /    valueX

 

001  /  A   /   100

002  /  A   /    80

002  /  B   /    20

003  /  A   /    20

003  /  B   /    50

003  /  C   /    30

 

What I need is for each case, the maximum for valueX, but the output need also to have the column "name".

So, I want:

 

001 / A /  100

002 / B /  80

003 / B /  50

 

What I tried:

proc sql;

create table want as

select case, name, max(valueX) as max_value

from have

group by case;

quit;

 

but this results in:

001  /  A  /  100

002  /  A  /  80

002  /  B  /  80

003  /  A   /    50

003  /  B   /    50

003  /  C   /    50

 

What am I doing wrong?

 

 

 

 

 

 

 

 


Accepted Solutions
Solution
‎10-05-2017 08:36 AM
Regular Contributor
Posts: 191

Re: Max by group + keep variables

Try the having clause:

data one;

length cas name $32 valueX 8;

input cas name valueX;

datalines;

001 A 100

002 A 80

002 B 20

003 A 20

003 B 50

003 C 30

;

quit;

proc sql;

create table two as

select cas, name, max(valueX) as max_valueX

from one

group by cas

having valueX = max_valueX

;

quit;

 

//Fredrik

View solution in original post


All Replies
Regular Contributor
Posts: 240

Re: Max by group + keep variables

Hello,

 

GROUP BY only indicates which observations are taken into account when calculating the max

but you still keep all observations. You have to add a where close on the condition

valueX=max(valueX)

Super User
Super User
Posts: 7,997

Re: Max by group + keep variables

@gamotte has answered the SQL part, you could also just do a sort and datastep (an be included in other code):

proc sort data=have out=want;
  by case name descending valuex;
run;
data want;
  set have;
  by case name;
  if first.name;
run;

 

Solution
‎10-05-2017 08:36 AM
Regular Contributor
Posts: 191

Re: Max by group + keep variables

Try the having clause:

data one;

length cas name $32 valueX 8;

input cas name valueX;

datalines;

001 A 100

002 A 80

002 B 20

003 A 20

003 B 50

003 C 30

;

quit;

proc sql;

create table two as

select cas, name, max(valueX) as max_valueX

from one

group by cas

having valueX = max_valueX

;

quit;

 

//Fredrik

Contributor fre
Contributor
Posts: 35

Re: Max by group + keep variables

Hi FredrikE, gamotte, RW9,

 

thank you all very much for your answers.  There are more possibilities, but I can only mark 1 as the solution, so I picked the one closest to my original code.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 138 views
  • 4 likes
  • 4 in conversation