This is another classical example for bad data modeling forcing crap (expletive used intentionally!) code.
The first thing you do is transpose those unusable wide datasets to a long layout:
data have;
input acct2018_1 acct2018_2 acct2018_3 acct2018_4 acct2018_5 bal2018_1 bal2018_2 bal2018_3 bal2018_4 bal2018_5
acct2020_1 acct2020_2 acct2020_3 acct2020_4 acct2020_5 bal2020_1 bal2020_2 bal2020_3 bal2020_4 bal2020_5;
datalines;
101 407 103 . . 40 60 80 . . 101 604 505 . . 10 19 20 . .
303 203 . . . 70 80 . . . 507 205 406 907 . 14 19 89 99 .
901 602 801 . . 10 14 24 . . 404 901 505 802 . 20 90 99 87 .
301 501 904 905 . 84 90 20 95 . 808 . . . . 34 . . . . .
;
data account_balance2019;
input acct2019_1 acct2019_2 acct2019_3 acct2019_4 acct2019_5 ;
datalines;
101 . . . .
907 203 . . .
801 505 901 . .
808 301 . . .
;
data have_wide;
set have;
n = _n_;
run;
proc transpose data=have_wide out=_have_long_a (rename=(col1=account));
by n;
var acct:;
run;
proc transpose data=have_wide out=_have_long_b (rename=(col1=balance));
by n;
var bal:;
run;
data have_long_a;
set _have_long_a;
year = substr(_name_,5,4);
seq = substr(_name_,10);
drop _name_;
run;
data have_long_b;
set _have_long_b;
year = substr(_name_,4,4);
seq = substr(_name_,9);
drop _name_;
run;
proc sql;
create table have_long as
select a.account, a.year, b.balance
from have_long_a a, have_long_b b
where a.n = b.n and a.year = b.year and a.seq = b.seq
and a.account ne . and b.balance ne .
order by a.account, a.year
;
quit;
data ab_wide;
set account_balance2019;
n = _n_;
run;
proc transpose data=ab_wide out=_ac_2019 (rename=(col1=account) where=(account ne .));
by n;
var acct:;
run;
proc sort
data=_ac_2019 (keep=account)
out=ac_2019
;
by account;
run;
(original data step code kept for reference)
The datasets now look like this:
account year balance
101 2018 40
101 2020 10
103 2018 80
203 2018 80
205 2020 19
301 2018 84
303 2018 70
404 2020 20
406 2020 89
407 2018 60
501 2018 90
505 2020 20
505 2020 99
507 2020 14
602 2018 14
604 2020 19
801 2018 24
802 2020 87
808 2020 34
901 2018 10
901 2020 90
904 2018 20
905 2018 95
907 2020 99
and this:
account
101
203
301
505
801
808
901
907
Note that all processing up to this point is only done to bring the data into a shape it should have had from the beginning; either avoid any transposition to wide altogether, or if you get data in such a format, do the transpose when you load it into SAS.
(Hint: if you want to display data in a wide format for human consumption, SAS procedures (TABULATE or REPORT) can do that for you on the fly)
Now, the code is very simple:
data want;
merge
have_long (in=h)
have_long (
in=ac2020
keep=account year
rename=(year=_y)
where=(_y = "2020")
)
ac_2019 (in=ac2019)
;
by account;
if h;
output;
if year = '2018' and ac2019 and ac2020
then do;
year = "2019";
output;
end;
drop _y;
run;
Also note that the whole code makes no assumptions about the number of variables in your wide datasets, only that the names start with "acct" and "bal".
If you need to do this for different year constellations, you just need to replace the "2019" and "2020" in the code with macro variables.
... View more