BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Steelers_In_DC
Barite | Level 11

Happy New Year,

I would like to perform the following using sas code, not proc sql.  I know using the max function for two variables and a group by statement would work but I'm looking for another solution.  Thank You in advance:

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd npl

00001,01/01/2000,pd

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

data want;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2001,pd npl

00002,02/01/2010,pd npl

;

run;

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

The answer from @stat@sas isn't a true max, but it takes the first record.  Simply switching the first two records in your sample data will break the logic.

If you're looking for the true max from each field you can do two sorts/passes of the data or use first/last and manually calculate one. See an example below, of breaking the solution as well as an alternative. If your data was pre-sorted by acct already you could manually  calculate each max similarly.

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd

00001,01/01/2000,pd npl

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

*Manually calculating max assuming a sorted dataset, by acct date;

data want2;

set have;

by acct date;

retain max_default;

if first.acct then max_default=default;

else if default>max_default then max_default=default;

if last.acct;

drop default;

rename max_default=default;

run;


*Manually calculating max, assuming sorted only by acct;

*You'll need to do the rename/dropping of variables;

data want3;

set have;

by acct notsorted;

retain max_date max_default;

if first.acct then do;

  max_default=default;

  max_date=date;

end;

else do;

  if default>max_default then max_default=default;

  if date>max_date then max_date=date;

end;

if last.acct;

run;

View solution in original post

12 REPLIES 12
Peter_C
Rhodochrosite | Level 12

Gave you looked at

PROC MEANS

Steelers_In_DC
Barite | Level 11

I have not, I am not familiar with getting this result using proc means.

Reeza
Super User

I don't think proc means will work because of the text variables.

Whats the logic, max of each field?

Steelers_In_DC
Barite | Level 11

I want the most recent date for the date field and the longest length for the default field.  I believe the sql max() performs the same way.

Thanks,

stat_sas
Ammonite | Level 13

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

Steelers_In_DC
Barite | Level 11

stat@sas, very nice.  Thank you very much.  If you wouldn't mind can you explain why this works the way that it does?

Thank You,

Mark

stat_sas
Ammonite | Level 13

Hi Mark,

As dataset is sorted by acct and date. Last date within each value of acct variable will be the maximum one. Introduced a new variable default1 assigned maximum value of default variable at the first occurrence of each value of acct and retained until the last occurrence.  Finally, if last condition outputs the last occurrence of each acct value along with maximum date and maximum value of default variable.

Hoping to see Steelers in super bowl this year.

Regards,

Naeem

Steelers_In_DC
Barite | Level 11

haha, me too!

I am having one issue.  The default field is null in some cases.  Where this happens the result is always null.

Reeza
Super User

The answer from @stat@sas isn't a true max, but it takes the first record.  Simply switching the first two records in your sample data will break the logic.

If you're looking for the true max from each field you can do two sorts/passes of the data or use first/last and manually calculate one. See an example below, of breaking the solution as well as an alternative. If your data was pre-sorted by acct already you could manually  calculate each max similarly.

data have;

infile cards dsd;

informat date mmddyy10.;

format date date9.;

input acct $ date default $;

cards;

00001,01/01/2000,pd

00001,01/01/2000,pd npl

00001,01/01/2001,pd

00002,02/01/2000,pd npl

00002,02/01/2000,pd

00002,02/01/2010,pd

;

run;

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

retain default1;

if first.acct then default1=default;

if last.acct;

run;

*Manually calculating max assuming a sorted dataset, by acct date;

data want2;

set have;

by acct date;

retain max_default;

if first.acct then max_default=default;

else if default>max_default then max_default=default;

if last.acct;

drop default;

rename max_default=default;

run;


*Manually calculating max, assuming sorted only by acct;

*You'll need to do the rename/dropping of variables;

data want3;

set have;

by acct notsorted;

retain max_date max_default;

if first.acct then do;

  max_default=default;

  max_date=date;

end;

else do;

  if default>max_default then max_default=default;

  if date>max_date then max_date=date;

end;

if last.acct;

run;

Steelers_In_DC
Barite | Level 11

Very nice, thank you.

stat_sas
Ammonite | Level 13

As suggested solution was based on the given dataset and problem description. Reeza has raised on interesting point but not sure whether this is true for all of the text comparisons. A small modification in the syntax will fix the problem raised by Reeza as well as if the default field contains null values.

Thanks,

Naeem

proc sort data=have;

by acct date;

run;

data want(drop=default rename=default1=default);

set have;

by acct date default notsorted;

length default1 $8.;

retain default1;

if first.acct then default1='';

if default='pd npl' then default1=default;

if last.acct;

run;

Reeza
Super User

@Naeem your solution is too specific to the sample data, not to the actual programming problem.  What would happen if the text had an upper cased letter, extra leading or trailing space? And assume this is a sample of the problem that is meant to illustrate the problem, not the exact data/problem the OP is presenting.

The problem seems to be:

I know using the max function for two variables and a group by statement would work but I'm looking for another solution.


Or restated:

How to find the max of two variables within a group NOT using proc sql.


Hope that makes sense!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1212 views
  • 1 like
  • 4 in conversation