BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Tommy1
Quartz | Level 8

Hi SAS Community,

 

I have a data set where I am trying to find the max grade for each ID. This is simple enough using Proc SQL and then using a group by statement.

 

There are two problems that I am running into , the first is I am trying to identify the model where the max grade comes from and assign that model for each row with the same ID. In the example below,  for ID #1 model b has a higher grade of 2 than a that only has a grade of 1, so model b would be assigned as the max_ model for ID #1.. 

 

The second problem that I am trying to tackle is the same as the first problem with an added nuance, If there are multiple models that have the same highest grade for the ID, I want to pick the one one with the highest amount value. In the example below, for ID #2 has two models have the same max grade of 3. Since model a has a higher value, I would assign a to every row for ID 2

 

This is an example of my data

id         model       grade       amount

1             a               1               10

1             b               2               10

2             a               3               20

2             b               3               10

 

This is what I envision my data to look like in the end

id         model       grade       amount    max_grade max_model

1             a               1               10            2                 b

1             b               2               10            2                 b

2             a               3               20            3                 a

2             b               3               10            3                 a

 

Please reach out if you have any questions or need me to clarify anything.

 

Best,

Tom

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You will need a data step:

 

proc sort data=have; by id descending grade descending amount; run;

data want;
set have;
by id;
retain max_grade max_model;
if first.id then do;
	max_grade = grade;
	max_model = model;
	end;
run;

solves both problems.

PG

View solution in original post

9 REPLIES 9
PaigeMiller
Diamond | Level 26

PROC SUMMARY allows you to identify the ID value (where ID can be any variable) that produces the maximum. Here is an example:

 

https://documentation.sas.com/?docsetId=proc&docsetVersion=9.4&docsetTarget=n18axszmd7up03n1densx922...

 

As far as the second problem, I think you'd have to do some coding in a DATA step or SQL.

--
Paige Miller
Reeza
Super User

Use BY group + SORT

 

Something like this, untested.

 

proc sort data=have;
by ID descending grade descending amount;
run;

data max;
set have;

by ID ;

if first.ID;
rename grade=max_grade model = max_model;
drop amount;
run;

data want;
merge have max;
by ID;
run;

@Tommy1 wrote:

Hi SAS Community,

 

I have a data set where I am trying to find the max grade for each ID. This is simple enough using Proc SQL and then using a group by statement.

 

There are two problems that I am running into , the first is I am trying to identify the model where the max grade comes from and assign that model for each row with the same ID. In the example below,  for ID #1 model b has a higher grade of 2 than a that only has a grade of 1, so model b would be assigned as the max_ model for ID #1.. 

 

The second problem that I am trying to tackle is the same as the first problem with an added nuance, If there are multiple models that have the same highest grade for the ID, I want to pick the one one with the highest amount value. In the example below, for ID #2 has two models have the same max grade of 3. Since model a has a higher value, I would assign a to every row for ID 2

 

This is an example of my data

id         model       grade       amount

1             a               1               10

1             b               2               10

2             a               3               20

2             b               3               10

 

This is what I envision my data to look like in the end

id         model       grade       amount    max_grade max_model

1             a               1               10            2                 b

1             b               2               10            2                 b

2             a               3               20            3                 a

2             b               3               10            3                 a

 

Please reach out if you have any questions or need me to clarify anything.

 

Best,

Tom


 

PaigeMiller
Diamond | Level 26

For huge data sets, wouldn't a SORT be more time consuming than PROC SUMMARY?

--
Paige Miller
Reeza
Super User
Yes, but this handles both conditions at once which is more efficient from a programming standpoint 🙂
JeffMaggio
Obsidian | Level 7

I think this tackles it:

data have;
id=1;             model='a';      grade=1;       amount=10; output;
id=1;             model='b';      grade=2;       amount=10; output;
id=2;             model='a';      grade=3;       amount=20; output;
id=2;             model='b';      grade=3;       amount=10; output;
run;

proc sort data=have out=have_sort;
by id descending grade descending amount model;
run;

data want (drop=max_amount);
set have_sort;
by id;
retain max_grade max_amount max_model;
if first.id then do; max_grade = grade; max_model = model; max_amount=amount; end;
if grade > max_grade then do; max_grade = grade; max_model = model; end;
else if grade=max_grade and amount > max_amount then do; max_grade = grade; max_model = model; end;
run;

You could also do this in proc sql qith nested queries, but that second criteria would be a bit tricky. 

novinosrin
Tourmaline | Level 20

Not hard in SQL if you know what to group and boolean expressions

 

data have;
input id         model  $     grade       amount;
cards;
1             a               1               10
1             b               2               10
2             a               3               20
2             b               3               10
;

proc sql;
create table want(drop=t) as
select *, max(t) as max_model
from 
(select *,ifc(max(amount)=amount,model,' ') as t
from
(select *, max(grade) as max_grade from have group by id)
group by id,max_grade)
group by id,max_grade;
quit;
novinosrin
Tourmaline | Level 20


data have;
input id         model  $     grade       amount;
cards;
1             a               1               10
1             b               2               10
2             a               3               20
2             b               3               10
;

data want;
 if _n_=1 then do;
  dcl hash H () ;
  h.definekey  ("max_grade") ;
  h.definedata ("_amt", "_m") ;
  h.definedone () ;
 end;
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  length max_model _m $32;
  if grade<max_grade then continue;
  max_grade=grade;
  max_model=model;
  if h.find()=0 and _amt>amount then max_model=_m;
  else do; _amt=amount;_m=model;end;
  h.replace();
 end;
 do _n_=1 to _n_;
  set have;
  output;
 end;
 drop _:;
 h.clear();
run;
PGStats
Opal | Level 21

You will need a data step:

 

proc sort data=have; by id descending grade descending amount; run;

data want;
set have;
by id;
retain max_grade max_model;
if first.id then do;
	max_grade = grade;
	max_model = model;
	end;
run;

solves both problems.

PG
Tommy1
Quartz | Level 8

Wow! Thank you all so much for all the solutions! The knowledge and speed that you responded to my post blew me away. This shows just how wonderful a place the SAS community is. I wish I could accept multiple solutions because there were so many different ways that you all helped me approach the problem. I am not sure if there is a certain etiquette for which solution to pick, but I went with the one that was the most simple and easiest for me to implement with my data.

 

Thank You! 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 721 views
  • 5 likes
  • 6 in conversation