BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zhige50
Obsidian | Level 7

I am trying to find the max/min value of certain variables and the associated product of that max/min value account. 

 

Have:

cust_id account prod limit balance
111 1234 A         10,000                 500
111 5678 B         20,000                 1,000
111 2345 C           3,000                 2,000
111 6789 B           2,000                 3,000
222 4321 A           1,000                 2,000
222 8765 A           4,000                 1,000
222 7654 C           6,000                    5,000

Want:

cust_id max_lmt prod_max_lmt min_bal prod_min_bal
111 20,000 B 500 A
222 6,000 C 1,000 A

 

I could use a few data steps to achieve that, but are there any efficient way? Thanks in advance.

/****get max_lmt prod_max_lmt****/

proc sort data=have;

by cust_id limit;

run;

data have1(keep=cust_id prod limit rename=(limit=max_lmt prod=max_prod_lmt));

set have;

by cust_id limit;

if last.cust_id;

run;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

Since you now have a rule for treating tied maximum limits or tied minimum balances, here is a program to accomodate:

 

data want (keep=cust_id max_lmt prod_max_lmt  min_bal prod_min_bal);
  set have ;
  by cust_id;

  retain prod_max_lmt prod_min_bal '       '; 

  if first.cust_id then do;
    set have (keep=limit balance rename=(limit=max_lmt  balance=min_bal)) point=_n_;
    prod_max_lmt=prod;
    prod_min_bal=prod;
  end;
  else do;
    if limit>=max_lmt then do; 
      if limit>max_lmt then prod_max_lmt=prod;
      else prod_max_lmt=catx('&',prod_max_lmt,prod);
      max_lmt=limit;
    end;
    if balance<=min_bal then do; 
      if balance<min_bal then prod_min_bal=prod;
      else prod_min_bal=catx('&',prod_min_bal,prod);
      min_bal=balance;
    end;
  end;
  if last.cust_id;
run;

Note the retain statement initializes prod_max_lmt and prod_min_bal to a string of 7 blanks, thereby giving those variables a length of $7, which would accommodate up to 4 single-char PRODs separated by '&'.  Use a longer initial series of blanks if you anticipate there will be more than 4 ties.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
mkeintz
PROC Star

And what do you want to do if there are multiple observations with the max limit, but different products?  Or tied minimum balance with different products?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
zhige50
Obsidian | Level 7

Good question. For example, if product A & B have same max limit, then the prod_max_lmt should be 'A&B'

 

ballardw
Super User

Proc means/summary comes close until you no longer have a unique Max/Min identfier

proc means data=have nway noprint ;
   class cust_id;
   var limit balance;
   output out=want (drop=_type_ _freq_)
      idgroup ( max(limit) out(limit prod )= max_lmt prod_max_lmt 	  )
      idgroup ( min(balance) out(balance prod )= min_lmt prod_min_bal 	  )
   ;
run;
andreas_lds
Jade | Level 19

Slightly different solution:

 max_limit = 0;
      min_balance = constant('big');
      call missing(prod_max_limit, prod_min_balance);
   end;

   if limit > max_limit then do;
      max_limit = limit;
      prod_max_limit = prod;
   end;
   else do;
      if limit = max_limit then do;
         prod_max_limit = catx('&', prod_max_limit, prod);
      end;
   end;

   if balance < min_balance then do;
      min_balance = balance;
      prod_min_balance = prod;
   end;
   else do;
      if balance = min_balance then do;
         prod_min_balance = catx('&', prod_min_balance, prod);
      end;
   end;

   if last.cust_id then do;
      output;
   end;
run;
mkeintz
PROC Star

Assuming you have no ties for max_lmt or min_bal, then this program will do what you want:

 

data want (keep=cust_id max_lmt prod_max_lmt  min_bal prod_min_bal);
  set have ;
  by cust_id;
  if first.cust_id then do;
    set have (keep=limit prod rename=(limit=max_lmt   prod=prod_max_lmt)) point=_n_;
    set have (keep=balance prod rename=(balance=min_bal prod=prod_min_bal)) point=_n_;
  end;
  if limit>max_lmt then do; 
    max_lmt=limit;
    prod_max_lmt=prod;
  end;
  if balance<min_bal then do; 
    min_bal=balance;
    prod_min_bal=prod;
  end;
  if last.cust_id;
run;

Every time this program encounters a new cust_id, it re-reads that observation with via the IF FIRST.ID THEN DO and  twiSET .... POINT= ... ; statements.  The variables read by these conditional set statements are renamed, and will therefore not be overwritten by a new SET action until the next cust_id is encountered.  These variables are retained over the current cust_id, and updated as a new maximum limit or minimum balance is encountered.

 

The last statement "if last.custid;" is a "subsetting if".  Only the values reached at the end of the cust_id will be output to the new dataset.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

Since you now have a rule for treating tied maximum limits or tied minimum balances, here is a program to accomodate:

 

data want (keep=cust_id max_lmt prod_max_lmt  min_bal prod_min_bal);
  set have ;
  by cust_id;

  retain prod_max_lmt prod_min_bal '       '; 

  if first.cust_id then do;
    set have (keep=limit balance rename=(limit=max_lmt  balance=min_bal)) point=_n_;
    prod_max_lmt=prod;
    prod_min_bal=prod;
  end;
  else do;
    if limit>=max_lmt then do; 
      if limit>max_lmt then prod_max_lmt=prod;
      else prod_max_lmt=catx('&',prod_max_lmt,prod);
      max_lmt=limit;
    end;
    if balance<=min_bal then do; 
      if balance<min_bal then prod_min_bal=prod;
      else prod_min_bal=catx('&',prod_min_bal,prod);
      min_bal=balance;
    end;
  end;
  if last.cust_id;
run;

Note the retain statement initializes prod_max_lmt and prod_min_bal to a string of 7 blanks, thereby giving those variables a length of $7, which would accommodate up to 4 single-char PRODs separated by '&'.  Use a longer initial series of blanks if you anticipate there will be more than 4 ties.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User
prov sql;
create table max as
  select *
  from have
  group by cust_id
  having limit = max(limit)
;
quit;

data want;
set max;
by cust_id;
length products $100;
retain products;
if first.cust_id then products = "";
products = catx(",",products,prod);
if last.cust_id;
drop prod;
run;

Add additional code in the DATA step for other variables you want in the output.

Ksharp
Super User
data have;
infile cards expandtabs;
input cust_id account  prod $  limit : comma. balance : comma.;
cards;
111 1234 A         10,000                 500
111 5678 B         20,000                 1,000
111 2345 C           3,000                 2,000
111 6789 B           2,000                 3,000
222 4321 A           1,000                 2,000
222 8765 A           4,000                 1,000
222 7654 C           6,000                    5,000
;

proc sort data=have out=temp1;by cust_id descending limit;run;
data temp11;
 set temp1;
 by cust_id descending limit;
 if first.cust_id then n=0;
 if first.limit then n+1;
 if n=1;
run;
data part1;
do until(last.cust_id);
 set temp11;
 by cust_id;
length prod_max_lmt  $ 200;
prod_max_lmt=catx('&',prod_max_lmt,prod);
end;
keep cust_id prod_max_lmt limit;
rename limit=max_lmt;
run;


proc sort data=have out=temp2;by cust_id  balance;run;
data temp21;
 set temp2;
 by cust_id balance ;
 if first.cust_id then n=0;
 if first.balance then n+1;
 if n=1;
run;
data part2;
do until(last.cust_id);
 set temp21;
 by cust_id;
length prod_min_bal  $ 200;
prod_min_bal=catx('&',prod_min_bal,prod);
end;
keep cust_id prod_min_bal balance;
rename balance=min_bal;
run;



data want;
merge part1 part2;
by cust_id;
run;

Catch up on SAS Innovate 2026

Dive into keynotes, announcements and breakthroughs on demand.

Explore Now →
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 3760 views
  • 7 likes
  • 6 in conversation