BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Mirisage
Obsidian | Level 7

Hi SAS Forum,

I need to stack the attached two SAS tables so that I will get a table like below.  In below table, the first 5 records are coming from attahced Book2 table.

The "bottom-most bolded record" of below table is coming from the attahced Book1 table.

It is the corresponding counterpart account in Book1 data set where arrerars_band = writoff (I find difficult to explain this).  

Bank_numberCurrent_dateAccount_numberShort_nameCurrency_codeProduct_codeCurrency_short_nameCountryTransitBalanceProductArrears_Band
1028-Feb-101000000000OOOOOOOOO RRRRRRRR01VVVDDDDDDDDRetail1700Personal LoanNPNA
1031-Mar-101000000000OOOOOOOOO RRRRRRRR01VVVDDDDDDDDRetail1700Personal LoanNPNA
1030-Apr-101000000000OOOOOOOOO RRRRRRRR01VVVDDDDDDDDRetail1700Personal LoanNPNA
1031-May-101000000000OOOOOOOOO RRRRRRRR01VVVDDDDDDDDRetail1700Personal LoanNPNA
1030-Jun-101000000000OOOOOOOOO RRRRRRRR01VVVDDDDDDDDRetail1700Personal LoanNPNA
106-Jul-101000000000missing missingmissing missing missing missing 2216Personal Loanwritoff

The account_name called ‘8000000000’in attahced Book2 data set does not have a corresponding counterpart account in Book1 data set where arrerars_band = writoff.  

So, I do not want the account_number called 8000000000 to show up in my final stacked output file.

AUTiger and Haikuo helped me to do the below code to tackle a "kind of similar situation" but the current situation in my hand is bit different.

So, this code is not capable of doing it.

proc sort data=A.BOOK1 out=file_A;

   by bank_number Account_number ;

run;

proc sort data=A.BOOK2  Out=file_B;

     by bank_number account_number ;

run;

data merged_file;

  set file_B

      file_A

           ;

by bank_number account_number ;

run;

Would there be way to handle this new situation?

Thanks

Mirisage

1 ACCEPTED SOLUTION

Accepted Solutions
AUTigers
Calcite | Level 5

it works for your sample data.

proc sort data=ds.BOOK1 out=file_A;

   by bank_number Account_number ;

run;

proc sort data=ds.BOOK2  Out=file_B;

     by bank_number account_number ;

run;

data merged_file;

retain lastacct;

  set file_A(in=ina)

      file_B (in=inb)

           ;

by bank_number account_number ;

sortord=Ina*2+inb*1;

if ina and first.account_number then lastacct=account_number;

if ina or ( inb and lastacct=account_number) then output;

run;

proc sort data=merged_file;

by  bank_number account_number sortord ;

run;

View solution in original post

5 REPLIES 5
ballardw
Super User

One way if what you want is records that come from both data sets is :

data merged_file;

  set file_B (in= inb)

      file_A (in= ina)

           ;

by bank_number account_number ;

if inb and ina;

run;

Tom
Super User Tom
Super User

Sounds like you only want to append records from BOOK1 when the account is also in BOOK2. But that you want to keep all records from BOOK2 whether there is a record in BOOK1 or not.

In this case the accounts you want will always have their first record coming from BOOK2.  Note that this code is dependent on the order the datasets are listed in the SET statement.

data want ;

  retain keep;

  drop keep;

  set book2(in=in2) book1(in=in1);

  by bank_number account_number;

  if first.account_number then do;

    if in2 then keep=1;

    else keep=0;

   end;

  if keep;

run;

robertrao
Quartz | Level 8

Tom,

Could you explain the code.

if its the first of the account number(nothing but from book2)  we are giving it a value of 1 for keep variable.

if its not coming from book2(if its not there in book2 ie its from book1) then we give a value of 0 for keep variable;

if keep gives us only the ones which have a value of 1(from book2).....if the first of the bank number is from book1(0) its eliminated!!!!

Lastly ,What is the use of retain keep and drop keep statements at the top????

why are we dropping it at the top? and again creating at the bottom????

Please explain??

Tom
Super User Tom
Super User

First RETAIN and DROP are examples of  non executable statements. That is they are used to compile the data step but do not do anything on each iteration. So it doesn't matter (other than for easy of reading) where in the data step they are coded.

When it starts a new account (IF FIRST.ACCOUNT_NUMBER) it assigns a value to the variable KEEP. The RETAIN statement insures that the value stays the same until it is changed.  The first record must be from either BOOK2 (in2=1) or BOOK1(in1=1) but cannot be from both as we are using a SET statement instead of MERGE statement.  So every record for the same account will have the same value of the KEEP variable.

IF KEEP; is the same as IF NOT KEEP THEN DELETE;.  Thus only the accounts we have flagged to keep will be output to the new dataset.

AUTigers
Calcite | Level 5

it works for your sample data.

proc sort data=ds.BOOK1 out=file_A;

   by bank_number Account_number ;

run;

proc sort data=ds.BOOK2  Out=file_B;

     by bank_number account_number ;

run;

data merged_file;

retain lastacct;

  set file_A(in=ina)

      file_B (in=inb)

           ;

by bank_number account_number ;

sortord=Ina*2+inb*1;

if ina and first.account_number then lastacct=account_number;

if ina or ( inb and lastacct=account_number) then output;

run;

proc sort data=merged_file;

by  bank_number account_number sortord ;

run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1523 views
  • 5 likes
  • 5 in conversation