max() group by without sql

Accepted Solution Solved
Reply
Valued Guide
Posts: 858
Accepted Solution

max() group by without sql

Happy New Year,

I would like to perform the following using sas code, not proc sql.  I know using the max function for two variables and a group by statement would work but I'm looking for another solution.  Thank You in advance:

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd npl

00001,01/01/2000,pd

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

data want;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2001,pd npl

00002,02/01/2010,pd npl

;

run;


Accepted Solutions
Solution
‎01-01-2015 07:43 PM
Super User
Posts: 19,157

Re: max() group by without sql

The answer from @stat@sas isn't a true max, but it takes the first record.  Simply switching the first two records in your sample data will break the logic.

If you're looking for the true max from each field you can do two sorts/passes of the data or use first/last and manually calculate one. See an example below, of breaking the solution as well as an alternative. If your data was pre-sorted by acct already you could manually  calculate each max similarly.

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd

00001,01/01/2000,pd npl

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

*Manually calculating max assuming a sorted dataset, by acct date;

data want2;

set have;

by acct date;

retain max_default;

if first.acct then max_default=default;

else if default>max_default then max_default=default;

if last.acct;

drop default;

rename max_default=default;

run;


*Manually calculating max, assuming sorted only by acct;

*You'll need to do the rename/dropping of variables;

data want3;

set have;

by acct notsorted;

retain max_date max_default;

if first.acct then do;

  max_default=default;

  max_date=date;

end;

else do;

  if default>max_default then max_default=default;

  if date>max_date then max_date=date;

end;

if last.acct;

run;

View solution in original post


All Replies
Valued Guide
Posts: 2,177

Re: max() group by without sql

Gave you looked at

PROC MEANS

Valued Guide
Posts: 858

Re: max() group by without sql

I have not, I am not familiar with getting this result using proc means.

Super User
Posts: 19,157

Re: max() group by without sql

I don't think proc means will work because of the text variables.

Whats the logic, max of each field?

Valued Guide
Posts: 858

Re: max() group by without sql

I want the most recent date for the date field and the longest length for the default field.  I believe the sql max() performs the same way.

Thanks,

Trusted Advisor
Posts: 1,228

Re: max() group by without sql

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

Valued Guide
Posts: 858

Re: max() group by without sql

stat@sas, very nice.  Thank you very much.  If you wouldn't mind can you explain why this works the way that it does?

Thank You,

Mark

Trusted Advisor
Posts: 1,228

Re: max() group by without sql

Hi Mark,

As dataset is sorted by acct and date. Last date within each value of acct variable will be the maximum one. Introduced a new variable default1 assigned maximum value of default variable at the first occurrence of each value of acct and retained until the last occurrence.  Finally, if last condition outputs the last occurrence of each acct value along with maximum date and maximum value of default variable.

Hoping to see Steelers in super bowl this year.

Regards,

Naeem

Valued Guide
Posts: 858

Re: max() group by without sql

haha, me too!

I am having one issue.  The default field is null in some cases.  Where this happens the result is always null.

Solution
‎01-01-2015 07:43 PM
Super User
Posts: 19,157

Re: max() group by without sql

The answer from @stat@sas isn't a true max, but it takes the first record.  Simply switching the first two records in your sample data will break the logic.

If you're looking for the true max from each field you can do two sorts/passes of the data or use first/last and manually calculate one. See an example below, of breaking the solution as well as an alternative. If your data was pre-sorted by acct already you could manually  calculate each max similarly.

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd

00001,01/01/2000,pd npl

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

*Manually calculating max assuming a sorted dataset, by acct date;

data want2;

set have;

by acct date;

retain max_default;

if first.acct then max_default=default;

else if default>max_default then max_default=default;

if last.acct;

drop default;

rename max_default=default;

run;


*Manually calculating max, assuming sorted only by acct;

*You'll need to do the rename/dropping of variables;

data want3;

set have;

by acct notsorted;

retain max_date max_default;

if first.acct then do;

  max_default=default;

  max_date=date;

end;

else do;

  if default>max_default then max_default=default;

  if date>max_date then max_date=date;

end;

if last.acct;

run;

Valued Guide
Posts: 858

Re: max() group by without sql

Very nice, thank you.

Trusted Advisor
Posts: 1,228

Re: max() group by without sql

As suggested solution was based on the given dataset and problem description. Reeza has raised on interesting point but not sure whether this is true for all of the text comparisons. A small modification in the syntax will fix the problem raised by Reeza as well as if the default field contains null values.

Thanks,

Naeem

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

length default1 $8.;

retain default1;

if first.acct then default1='';

if default='pd npl' then default1=default;

if last.acct;

run;

Super User
Posts: 19,157

Re: max() group by without sql

@Naeem your solution is too specific to the sample data, not to the actual programming problem.  What would happen if the text had an upper cased letter, extra leading or trailing space? And assume this is a sample of the problem that is meant to illustrate the problem, not the exact data/problem the OP is presenting.

The problem seems to be:

I know using the max function for two variables and a group by statement would work but I'm looking for another solution.


Or restated:

How to find the max of two variables within a group NOT using proc sql.


Hope that makes sense!

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 540 views
  • 1 like
  • 4 in conversation