BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tommy1
Quartz | Level 8

Hi SAS Community,

 

I have a data set where I am trying to find the max grade for each ID. This is simple enough using Proc SQL and then using a group by statement.

 

There are two problems that I am running into , the first is I am trying to identify the model where the max grade comes from and assign that model for each row with the same ID. In the example below,  for ID #1 model b has a higher grade of 2 than a that only has a grade of 1, so model b would be assigned as the max_ model for ID #1.. 

 

The second problem that I am trying to tackle is the same as the first problem with an added nuance, If there are multiple models that have the same highest grade for the ID, I want to pick the one one with the highest amount value. In the example below, for ID #2 has two models have the same max grade of 3. Since model a has a higher value, I would assign a to every row for ID 2

 

This is an example of my data

id         model       grade       amount

1             a               1               10

1             b               2               10

2             a               3               20

2             b               3               10

 

This is what I envision my data to look like in the end

id         model       grade       amount    max_grade max_model

1             a               1               10            2                 b

1             b               2               10            2                 b

2             a               3               20            3                 a

2             b               3               10            3                 a

 

Please reach out if you have any questions or need me to clarify anything.

 

Best,

Tom

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You will need a data step:

 

proc sort data=have; by id descending grade descending amount; run;

data want;
set have;
by id;
retain max_grade max_model;
if first.id then do;
	max_grade = grade;
	max_model = model;
	end;
run;

solves both problems.

PG

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

PROC SUMMARY allows you to identify the ID value (where ID can be any variable) that produces the maximum. Here is an example:

 

https://documentation.sas.com/?docsetId=proc&docsetVersion=9.4&docsetTarget=n18axszmd7up03n1densx922...

 

As far as the second problem, I think you'd have to do some coding in a DATA step or SQL.

--
Paige Miller
Reeza
Super User

Use BY group + SORT

 

Something like this, untested.

 

proc sort data=have;
by ID descending grade descending amount;
run;

data max;
set have;

by ID ;

if first.ID;
rename grade=max_grade model = max_model;
drop amount;
run;

data want;
merge have max;
by ID;
run;

@Tommy1 wrote:

Hi SAS Community,

 

I have a data set where I am trying to find the max grade for each ID. This is simple enough using Proc SQL and then using a group by statement.

 

There are two problems that I am running into , the first is I am trying to identify the model where the max grade comes from and assign that model for each row with the same ID. In the example below,  for ID #1 model b has a higher grade of 2 than a that only has a grade of 1, so model b would be assigned as the max_ model for ID #1.. 

 

The second problem that I am trying to tackle is the same as the first problem with an added nuance, If there are multiple models that have the same highest grade for the ID, I want to pick the one one with the highest amount value. In the example below, for ID #2 has two models have the same max grade of 3. Since model a has a higher value, I would assign a to every row for ID 2

 

This is an example of my data

id         model       grade       amount

1             a               1               10

1             b               2               10

2             a               3               20

2             b               3               10

 

This is what I envision my data to look like in the end

id         model       grade       amount    max_grade max_model

1             a               1               10            2                 b

1             b               2               10            2                 b

2             a               3               20            3                 a

2             b               3               10            3                 a

 

Please reach out if you have any questions or need me to clarify anything.

 

Best,

Tom


 

PaigeMiller
Diamond | Level 26

For huge data sets, wouldn't a SORT be more time consuming than PROC SUMMARY?

--
Paige Miller
Reeza
Super User
Yes, but this handles both conditions at once which is more efficient from a programming standpoint 🙂
JeffMaggio
Obsidian | Level 7

I think this tackles it:

data have;
id=1;             model='a';      grade=1;       amount=10; output;
id=1;             model='b';      grade=2;       amount=10; output;
id=2;             model='a';      grade=3;       amount=20; output;
id=2;             model='b';      grade=3;       amount=10; output;
run;

proc sort data=have out=have_sort;
by id descending grade descending amount model;
run;

data want (drop=max_amount);
set have_sort;
by id;
retain max_grade max_amount max_model;
if first.id then do; max_grade = grade; max_model = model; max_amount=amount; end;
if grade > max_grade then do; max_grade = grade; max_model = model; end;
else if grade=max_grade and amount > max_amount then do; max_grade = grade; max_model = model; end;
run;

You could also do this in proc sql qith nested queries, but that second criteria would be a bit tricky. 

novinosrin
Tourmaline | Level 20

Not hard in SQL if you know what to group and boolean expressions

 

data have;
input id         model  $     grade       amount;
cards;
1             a               1               10
1             b               2               10
2             a               3               20
2             b               3               10
;

proc sql;
create table want(drop=t) as
select *, max(t) as max_model
from 
(select *,ifc(max(amount)=amount,model,' ') as t
from
(select *, max(grade) as max_grade from have group by id)
group by id,max_grade)
group by id,max_grade;
quit;
novinosrin
Tourmaline | Level 20


data have;
input id         model  $     grade       amount;
cards;
1             a               1               10
1             b               2               10
2             a               3               20
2             b               3               10
;

data want;
 if _n_=1 then do;
  dcl hash H () ;
  h.definekey  ("max_grade") ;
  h.definedata ("_amt", "_m") ;
  h.definedone () ;
 end;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  length max_model _m $32;
  if grade<max_grade then continue;
  max_grade=grade;
  max_model=model;
  if h.find()=0 and _amt>amount then max_model=_m;
  else do; _amt=amount;_m=model;end;
  h.replace();
 end;
 do _n_=1 to _n_;
  set have;
  output;
 end;
 drop _:;
 h.clear();
run;
PGStats
Opal | Level 21

You will need a data step:

 

proc sort data=have; by id descending grade descending amount; run;

data want;
set have;
by id;
retain max_grade max_model;
if first.id then do;
	max_grade = grade;
	max_model = model;
	end;
run;

solves both problems.

PG
Tommy1
Quartz | Level 8

Wow! Thank you all so much for all the solutions! The knowledge and speed that you responded to my post blew me away. This shows just how wonderful a place the SAS community is. I wish I could accept multiple solutions because there were so many different ways that you all helped me approach the problem. I am not sure if there is a certain etiquette for which solution to pick, but I went with the one that was the most simple and easiest for me to implement with my data.

 

Thank You! 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1988 views
  • 5 likes
  • 6 in conversation