BookmarkSubscribeRSS Feed
Justin9
Obsidian | Level 7

Hi, 

 

Can someone tell me what the best way to keep all previous records if one account has type = 136 (type is a character variable)? In the example below, since account numbers 12 and 18 have one record where type=136, I want to keep all instances where the same account appears (in order to track what the previous types were). Since my dataset has over 500,000 accounts, is there code that would quickly do this ?

 

Justin9_1-1713181448863.png

 

 

6 REPLIES 6
PaigeMiller
Diamond | Level 26

UNTESTED CODE (I can't program code to pull numbers out of a screen capture)

 

data have2;
    set have;
    if type=136 then flag=1; else flag=0;
run;
proc summary data=have2;
     class account_number;
     var flag;
     output out=maxx(drop=_:) max=;
run;
data want;
    merge have maxx;
    by account_number;
    if flag=1;
run;

 

 

 

--
Paige Miller
mkeintz
PROC Star

If your data are sorted by account_number:

 

data want;
  merge have (where=(type='136') in=wanted)
        have;
  by account_number;
  if wanted=1;
run;

The WANTED dummy variable will be a 1 for all account numbers in have that include at least 1 instance of "136".

 

That's due to the way a MERGE statement followed by a BY statement works.  For each account number, whichever merge data set has fewer obs will have its last instance matched against the "excess" obs in the other data set, including the dummy varaible associated with the "shorter" data set.  The "136" obs are the shorted dataset in this case.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ItsMeAG
Calcite | Level 5

Hi @Justin9 ,

 

Please see the code below :

 
data test1 test2;
set test;
by acct ;
if first.acct then n =1; else n+1;
if type = '136' then f=1;
output test1 ;
if f=1 then output test2;
run;
 
data test3;
merge test1(in=a) test2(in=b keep=acct n rename=(n=n_) );
by acct;
if a and b and  n le n_ ;
drop n n_ f;
run;
mkeintz
PROC Star

@Justin9 

 

There is some ambiguity in your request.  You start out by saying

 


Can someone tell me what the best way to keep all previous records if one account has type = 136 (type is a character variable)?

 but later you say

 


where type=136, I want to keep all instances where the same account appears

So which is it?  All obs for the identified account?  Or just the preceding obs?

And if it's the preceding obs, do you want the type=136 obs also?

 

 

My solution, and @PaigeMiller's provide all the obs.  But @ItsMeAG's response selects every obs through the last type="136".

 

BTW, it would simplify code to know:

  1. Can your data have more than one type="136" for a single account?
  2. Is your data sorted by type, within account?  You sample data is.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

After seeing @ItsMeAG 's response, it occurred to me that you want only obs up through the last type='136', but not any obs that follow.  If so, here is a possibility:

 

data want (drop=_:);
  set have (where=(type='136') in=found136)
      have (in=secondpass);
  by account_number;

  if first.account_number=1 then _n136=0;
  _n136+found136;

  if secondpass and _n136>0;    *subsetting IF*; 
  if type='136' then _n136=_n136-1;
run;

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
proc sql;
create table want as
select * from have
where account_number in
(select distinct account_number from have where type='136')
;
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 322 views
  • 1 like
  • 5 in conversation