BookmarkSubscribeRSS Feed
Justin9
Obsidian | Level 7

Hi, 

 

Can someone tell me what the best way to keep all previous records if one account has type = 136 (type is a character variable)? In the example below, since account numbers 12 and 18 have one record where type=136, I want to keep all instances where the same account appears (in order to track what the previous types were). Since my dataset has over 500,000 accounts, is there code that would quickly do this ?

 

Justin9_1-1713181448863.png

 

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

UNTESTED CODE (I can't program code to pull numbers out of a screen capture)

 

data have2;
    set have;
    if type=136 then flag=1; else flag=0;
run;
proc summary data=have2;
     class account_number;
     var flag;
     output out=maxx(drop=_:) max=;
run;
data want;
    merge have maxx;
    by account_number;
    if flag=1;
run;

 

 

 

--
Paige Miller
mkeintz
PROC Star

If your data are sorted by account_number:

 

data want;
  merge have (where=(type='136') in=wanted)
        have;
  by account_number;
  if wanted=1;
run;

The WANTED dummy variable will be a 1 for all account numbers in have that include at least 1 instance of "136".

 

That's due to the way a MERGE statement followed by a BY statement works.  For each account number, whichever merge data set has fewer obs will have its last instance matched against the "excess" obs in the other data set, including the dummy varaible associated with the "shorter" data set.  The "136" obs are the shorted dataset in this case.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ItsMeAG
Fluorite | Level 6

Hi @Justin9 ,

 

Please see the code below :

 
data test1 test2;
set test;
by acct ;
if first.acct then n =1; else n+1;
if type = '136' then f=1;
output test1 ;
if f=1 then output test2;
run;
 
data test3;
merge test1(in=a) test2(in=b keep=acct n rename=(n=n_) );
by acct;
if a and b and  n le n_ ;
drop n n_ f;
run;
mkeintz
PROC Star

@Justin9 

 

There is some ambiguity in your request.  You start out by saying

 


Can someone tell me what the best way to keep all previous records if one account has type = 136 (type is a character variable)?

 but later you say

 


where type=136, I want to keep all instances where the same account appears

So which is it?  All obs for the identified account?  Or just the preceding obs?

And if it's the preceding obs, do you want the type=136 obs also?

 

 

My solution, and @PaigeMiller's provide all the obs.  But @ItsMeAG's response selects every obs through the last type="136".

 

BTW, it would simplify code to know:

  1. Can your data have more than one type="136" for a single account?
  2. Is your data sorted by type, within account?  You sample data is.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

After seeing @ItsMeAG 's response, it occurred to me that you want only obs up through the last type='136', but not any obs that follow.  If so, here is a possibility:

 

data want (drop=_:);
  set have (where=(type='136') in=found136)
      have (in=secondpass);
  by account_number;

  if first.account_number=1 then _n136=0;
  _n136+found136;

  if secondpass and _n136>0;    *subsetting IF*; 
  if type='136' then _n136=_n136-1;
run;

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
proc sql;
create table want as
select * from have
where account_number in
(select distinct account_number from have where type='136')
;
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1686 views
  • 1 like
  • 5 in conversation