I have 2 dataset x and y. X is my master dataset and has id and date and Y is quarterly dataset having balance and bal.
Y table is getting created every Mar, Jun, Sep, and Dec while X is refreshed Monthly.
I am retaining the value populated in Mar to Apr and May. Once the data available in jun. I am retaining the Jun data to Jul and Aug. Till we get the data in next qtr. If any Id doesn't get the Balance and Bal in next qtr I am trying to keep the missing value for that Id.
Currently previous qtrs non missing value is populated in my final table where i want missing value.
Table X | |
id | date |
1 | 01-12-2017 |
2 | 01-12-2017 |
3 | 01-12-2017 |
4 | 01-12-2017 |
5 | 01-12-2017 |
6 | 01-12-2017 |
7 | 01-12-2017 |
1 | 01-01-2018 |
2 | 01-01-2018 |
3 | 01-01-2018 |
4 | 01-01-2018 |
5 | 01-01-2018 |
6 | 01-01-2018 |
7 | 01-01-2018 |
1 | 01-02-2018 |
2 | 01-02-2018 |
3 | 01-02-2018 |
4 | 01-02-2018 |
5 | 01-02-2018 |
6 | 01-02-2018 |
7 | 01-02-2018 |
1 | 01-03-2018 |
2 | 01-03-2018 |
3 | 01-03-2018 |
4 | 01-03-2018 |
5 | 01-03-2018 |
6 | 01-03-2018 |
7 | 01-03-2018 |
Table Y | |||
Id | date | Balance | Bal |
1 | 01-12-2017 | 200 | A |
2 | 01-12-2017 | 500 | B |
3 | 01-12-2017 | 700 | C |
4 | 01-12-2017 | 800 | C |
1 | 01-03-2018 | 200 | A |
2 | 01-03-2018 | 500 | B |
5 | 01-03-2018 | 700 | C |
6 | 01-03-2018 | 800 | C |
Final table(Expected) | |||
id | date | balance | bal |
1 | 01-12-2017 | 200 | A |
2 | 01-12-2017 | 500 | B |
3 | 01-12-2017 | 700 | C |
4 | 01-12-2017 | 800 | C |
5 | 01-12-2017 | ||
6 | 01-12-2017 | ||
7 | 01-12-2017 | ||
1 | 01-01-2018 | 200 | A |
2 | 01-01-2018 | 500 | B |
3 | 01-01-2018 | 700 | C |
4 | 01-01-2018 | 800 | C |
5 | 01-01-2018 | ||
6 | 01-01-2018 | ||
7 | 01-01-2018 | ||
1 | 01-02-2018 | 200 | A |
2 | 01-02-2018 | 500 | B |
3 | 01-02-2018 | 700 | C |
4 | 01-02-2018 | 800 | C |
5 | 01-02-2018 | ||
6 | 01-02-2018 | ||
7 | 01-02-2018 | ||
1 | 01-03-2018 | 200 | A |
2 | 01-03-2018 | 500 | B |
3 | 01-03-2018 | ||
4 | 01-03-2018 | ||
5 | 01-03-2018 | 700 | C |
6 | 01-03-2018 | 800 | C |
7 | 01-03-2018 |
Code:
proc sql;
create table final1 as select a.*, b.balance, b.bal
from x a left join y b
on a.id=b.id
and a.date=b.date;
quit;
proc sort data=final1;
by id date;
run;
data final;
set final1;
retain balance1 bal2;
by id date;
if first.id then do;
balance1=balance;
bal2=bal
end;
if month(date) in (3,6,9,12) then do;
balance1=balance;
if bal ne '' then bal2=bal;
end;
drop bal balance;
run;
Hi @Ishaan Can you please check your table Y dates that has
1 | 01-03-2017 | 200 | A |
2 | 01-03-2017 | 500 | B |
5 | 01-03-2017 | 700 | C |
6 | 01-03-2017 | 800 | C |
as opposed to the final
1 | 01-03-2018 | 200 | A |
2 | 01-03-2018 | 500 | B |
3 | 01-03-2018 | ||
4 | 01-03-2018 | ||
5 | 01-03-2018 | 700 | C |
6 | 01-03-2018 | 800 | C |
Assuming I understand , my attempt:
data x;
input id date :ddmmyy10.;
format date ddmmyy10.;
cards;
1 01-12-2017
2 01-12-2017
3 01-12-2017
4 01-12-2017
5 01-12-2017
6 01-12-2017
7 01-12-2017
1 01-01-2018
2 01-01-2018
3 01-01-2018
4 01-01-2018
5 01-01-2018
6 01-01-2018
7 01-01-2018
1 01-02-2018
2 01-02-2018
3 01-02-2018
4 01-02-2018
5 01-02-2018
6 01-02-2018
7 01-02-2018
1 01-03-2018
2 01-03-2018
3 01-03-2018
4 01-03-2018
5 01-03-2018
6 01-03-2018
7 01-03-2018
;
data y;
infile cards truncover;
input Id date :ddmmyy10. Balance Bal $;
format date ddmmyy10.;
cards;
1 01-12-2017 200 A
2 01-12-2017 500 B
3 01-12-2017 700 C
4 01-12-2017 800 C
1 01-03-2017 200 A
2 01-03-2017 500 B
5 01-03-2017 700 C
6 01-03-2017 800
;
data want;
if _n_=1 then do;
if 0 then set x y(rename=date=_date);
dcl hash H (dataset:'y(rename=date=_date)') ;
h.definekey ("id","_date") ;
h.definedata ("_date","balance", "bal") ;
h.definedone () ;
end;
set x;
rc=h.find(key:id,key:date);
if rc ne 0 then do;
call missing(balance, bal);
if intck('month',_Date,date)<=2 then rc1=h.find();
end;
drop rc: _:;
run;
So assuming your Y table is in sequence
data y;
infile cards truncover;
input Id date :ddmmyy10. Balance Bal $;
format date ddmmyy10.;
cards;
1 01-12-2017 200 A
2 01-12-2017 500 B
3 01-12-2017 700 C
4 01-12-2017 800 C
1 01-03-2018 200 A
2 01-03-2018 500 B
5 01-03-2018 700 C
6 01-03-2018 800 C
;
where March,2018 would follow Dec2017 i.e of Dec2017--Mar2018--Jun2018 and beyond
The above code will match your expected result.
This does what you want though I am unsure why you'd want the 2018 entries to take the 2017 data.
proc sql;
select a.ID, b.DATE, Y.BALANCE, Y.BAL
from (select unique ID from X) a
cross join
(select unique DATE from X) b
left join
Y
on Y.ID=a.ID and intck('qtr.3',Y.DATE,b.DATE) in(0,4)
order 2,1;
quit;
ID | DATE | BALANCE | BAL |
---|---|---|---|
1 | 01/12/2017 | 200 | A |
2 | 01/12/2017 | 500 | B |
3 | 01/12/2017 | 700 | C |
4 | 01/12/2017 | 800 | C |
5 | 01/12/2017 | . | |
6 | 01/12/2017 | . | |
7 | 01/12/2017 | . | |
1 | 01/01/2018 | 200 | A |
2 | 01/01/2018 | 500 | B |
3 | 01/01/2018 | 700 | C |
4 | 01/01/2018 | 800 | C |
5 | 01/01/2018 | . | |
6 | 01/01/2018 | . | |
7 | 01/01/2018 | . | |
1 | 01/02/2018 | 200 | A |
2 | 01/02/2018 | 500 | B |
3 | 01/02/2018 | 700 | C |
4 | 01/02/2018 | 800 | C |
5 | 01/02/2018 | . | |
6 | 01/02/2018 | . | |
7 | 01/02/2018 | . | |
1 | 01/03/2018 | 200 | A |
2 | 01/03/2018 | 500 | B |
3 | 01/03/2018 | . | |
4 | 01/03/2018 | . | |
5 | 01/03/2018 | 700 | C |
6 | 01/03/2018 | 800 | C |
7 | 01/03/2018 | . |
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.