Hi SAS Forum,
I need to stack the attached two SAS tables so that I will get a table like below. In below table, the first 5 records are coming from attahced Book2 table.
The "bottom-most bolded record" of below table is coming from the attahced Book1 table.
It is the corresponding counterpart account in Book1 data set where arrerars_band = writoff (I find difficult to explain this).
Bank_number | Current_date | Account_number | Short_name | Currency_code | Product_code | Currency_short_name | Country | Transit | Balance | Product | Arrears_Band |
10 | 28-Feb-10 | 1000000000 | OOOOOOOOO RRRRRRRR | 0 | 1 | VVV | DDDDDDDD | Retail | 1700 | Personal Loan | NPNA |
10 | 31-Mar-10 | 1000000000 | OOOOOOOOO RRRRRRRR | 0 | 1 | VVV | DDDDDDDD | Retail | 1700 | Personal Loan | NPNA |
10 | 30-Apr-10 | 1000000000 | OOOOOOOOO RRRRRRRR | 0 | 1 | VVV | DDDDDDDD | Retail | 1700 | Personal Loan | NPNA |
10 | 31-May-10 | 1000000000 | OOOOOOOOO RRRRRRRR | 0 | 1 | VVV | DDDDDDDD | Retail | 1700 | Personal Loan | NPNA |
10 | 30-Jun-10 | 1000000000 | OOOOOOOOO RRRRRRRR | 0 | 1 | VVV | DDDDDDDD | Retail | 1700 | Personal Loan | NPNA |
10 | 6-Jul-10 | 1000000000 | missing | missing | missing | missing | missing | missing | 2216 | Personal Loan | writoff |
The account_name called ‘8000000000’in attahced Book2 data set does not have a corresponding counterpart account in Book1 data set where arrerars_band = writoff.
So, I do not want the account_number called 8000000000 to show up in my final stacked output file.
AUTiger and Haikuo helped me to do the below code to tackle a "kind of similar situation" but the current situation in my hand is bit different.
So, this code is not capable of doing it.
proc sort data=A.BOOK1 out=file_A;
by bank_number Account_number ;
run;
proc sort data=A.BOOK2 Out=file_B;
by bank_number account_number ;
run;
data merged_file;
set file_B
file_A
;
by bank_number account_number ;
run;
Would there be way to handle this new situation?
Thanks
Mirisage
it works for your sample data.
proc sort data=ds.BOOK1 out=file_A;
by bank_number Account_number ;
run;
proc sort data=ds.BOOK2 Out=file_B;
by bank_number account_number ;
run;
data merged_file;
retain lastacct;
set file_A(in=ina)
file_B (in=inb)
;
by bank_number account_number ;
sortord=Ina*2+inb*1;
if ina and first.account_number then lastacct=account_number;
if ina or ( inb and lastacct=account_number) then output;
run;
proc sort data=merged_file;
by bank_number account_number sortord ;
run;
One way if what you want is records that come from both data sets is :
data merged_file;
set file_B (in= inb)
file_A (in= ina)
;
by bank_number account_number ;
if inb and ina;
run;
Sounds like you only want to append records from BOOK1 when the account is also in BOOK2. But that you want to keep all records from BOOK2 whether there is a record in BOOK1 or not.
In this case the accounts you want will always have their first record coming from BOOK2. Note that this code is dependent on the order the datasets are listed in the SET statement.
data want ;
retain keep;
drop keep;
set book2(in=in2) book1(in=in1);
by bank_number account_number;
if first.account_number then do;
if in2 then keep=1;
else keep=0;
end;
if keep;
run;
Tom,
Could you explain the code.
if its the first of the account number(nothing but from book2) we are giving it a value of 1 for keep variable.
if its not coming from book2(if its not there in book2 ie its from book1) then we give a value of 0 for keep variable;
if keep gives us only the ones which have a value of 1(from book2).....if the first of the bank number is from book1(0) its eliminated!!!!
Lastly ,What is the use of retain keep and drop keep statements at the top????
why are we dropping it at the top? and again creating at the bottom????
Please explain??
First RETAIN and DROP are examples of non executable statements. That is they are used to compile the data step but do not do anything on each iteration. So it doesn't matter (other than for easy of reading) where in the data step they are coded.
When it starts a new account (IF FIRST.ACCOUNT_NUMBER) it assigns a value to the variable KEEP. The RETAIN statement insures that the value stays the same until it is changed. The first record must be from either BOOK2 (in2=1) or BOOK1(in1=1) but cannot be from both as we are using a SET statement instead of MERGE statement. So every record for the same account will have the same value of the KEEP variable.
IF KEEP; is the same as IF NOT KEEP THEN DELETE;. Thus only the accounts we have flagged to keep will be output to the new dataset.
it works for your sample data.
proc sort data=ds.BOOK1 out=file_A;
by bank_number Account_number ;
run;
proc sort data=ds.BOOK2 Out=file_B;
by bank_number account_number ;
run;
data merged_file;
retain lastacct;
set file_A(in=ina)
file_B (in=inb)
;
by bank_number account_number ;
sortord=Ina*2+inb*1;
if ina and first.account_number then lastacct=account_number;
if ina or ( inb and lastacct=account_number) then output;
run;
proc sort data=merged_file;
by bank_number account_number sortord ;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.