I want to make a table by combing the following two tables.
dy | dm | age_group | counts |
1997 | 1 | 2 | 20 |
1997 | 1 | 3 | 30 |
1997 | 1 | 4 | 20 |
1997 | 2 | 2 | 5 |
1997 | 2 | 3 | 14 |
1997 | 2 | 4 | 23 |
1997 | 3 | 1 | 23 |
1997 | 3 | 2 | 22 |
1997 | 3 | 3 | 11 |
1997 | 3 | 4 | 5 |
1997 | 4 | 1 | 52 |
1997 | 4 | 2 | 34 |
1997 | 4 | 3 | 25 |
1997 | 4 | 4 | 23 |
1998 | 1 | 1 | 53 |
1998 | 1 | 2 | 5 |
1998 | 1 | 3 | 32 |
1998 | 1 | 4 | 20 |
1998 | 2 | 1 | 44 |
1998 | 2 | 2 | 42 |
1998 | 2 | 3 | 11 |
1998 | 2 | 4 | 5 |
1998 | 3 | 1 | 34 |
1998 | 3 | 2 | 20 |
1998 | 3 | 3 | 30 |
1998 | 3 | 4 | 32 |
1998 | 4 | 1 | 10 |
1998 | 4 | 2 | 23 |
1998 | 4 | 3 | 10 |
1998 | 4 | 4 | 20 |
table a
year | age_group | total | std |
1997 | 1 | 3500 | 28 |
1997 | 2 | 6500 | 25 |
1997 | 3 | 7600 | 17 |
1997 | 4 | 24300 | 30 |
1998 | 1 | 3600 | 28 |
1998 | 2 | 6500 | 25 |
1998 | 3 | 7800 | 17 |
1998 | 4 | 24300 | 30 |
table B
What I want looks like this.
ddate | age_group | counts | total | std |
1997/01 | 1 | 0 | 3500 | 28 |
1997/01 | 2 | 20 | 6500 | 25 |
1997/01 | 3 | 30 | 7600 | 17 |
1997/01 | 4 | 20 | 24300 | 30 |
1997/02 | 1 | 0 | 3500 | 28 |
1997/02 | 2 | 5 | 6500 | 25 |
1997/02 | 3 | 14 | 7600 | 17 |
1997/02 | 4 | 23 | 24300 | 30 |
1997/03 | 1 | 23 | 3500 | 28 |
1997/03 | 2 | 22 | 6500 | 25 |
1997/03 | 3 | 11 | 7600 | 17 |
1997/03 | 4 | 5 | 24300 | 30 |
1997/04 | 1 | 52 | 3500 | 28 |
1997/04 | 2 | 34 | 6500 | 25 |
1997/04 | 3 | 25 | 7600 | 17 |
1997/04 | 4 | 23 | 24300 | 30 |
1998/01 | 1 | 53 | 3600 | 28 |
1998/01 | 2 | 5 | 6500 | 25 |
1998/01 | 3 | 32 | 7800 | 17 |
1998/01 | 4 | 20 | 24300 | 30 |
1998/02 | 1 | 44 | 3600 | 28 |
1998/02 | 2 | 42 | 6500 | 25 |
1998/02 | 3 | 11 | 7800 | 17 |
1998/02 | 4 | 5 | 24300 | 30 |
1998/03 | 1 | 34 | 3600 | 28 |
1998/03 | 2 | 20 | 6500 | 25 |
1998/03 | 3 | 30 | 7800 | 17 |
1998/03 | 4 | 32 | 24300 | 30 |
1998/04 | 1 | 10 | 3600 | 28 |
1998/04 | 2 | 23 | 6500 | 25 |
1998/04 | 3 | 10 | 7800 | 17 |
1998/04 | 4 | 20 | 24300 | 30 |
should I change the year and month variable to the calendar format?
How can I join these tables with proc sql?
Thanks for taking the time to look at it in advance.
Post your data as a data step.
This will allow us to know for example if you have SAS dates or strings.
I couldn't really get the missing values in there. It shouldn't be any problems but I guess I'm tired. So, I added a step in the middle to add the missing age_groups.
data TableA;
input dy dm age_group counts;
datalines;
1997 1 2 20
1997 1 3 30
1997 1 4 20
1997 2 2 5
1997 2 3 14
1997 2 4 23
1997 3 1 23
1997 3 2 22
1997 3 3 11
1997 3 4 5
1997 4 1 52
1997 4 2 34
1997 4 3 25
1997 4 4 23
1998 1 1 53
1998 1 2 5
1998 1 3 32
1998 1 4 20
1998 2 1 44
1998 2 2 42
1998 2 3 11
1998 2 4 5
1998 3 1 34
1998 3 2 20
1998 3 3 30
1998 3 4 32
1998 4 1 10
1998 4 2 23
1998 4 3 10
1998 4 4 20
;;;
run;
data tableB;
input year age_group total std;
datalines;
1997 1 3500 28
1997 2 6500 25
1997 3 7600 17
1997 4 24300 30
1998 1 3600 28
1998 2 6500 25
1998 3 7800 17
1998 4 24300 30
;;;
run;
data tableb;
set tableb;
do _i = 1 to 4;
dm =_i;
output;
end;
drop _i;
run;
proc sort data=tablea (rename=(dy=year));
by year age_group dm;
run;
proc sort data=tableb;
by year age_group dm;
run;
data want;
retain ddate age_group counts total std;
length ddate $ 10;
update tableb tablea ;
by year age_group dm;
ddate =cat(year,"/",dm);
if counts = . then counts = 0;
drop year dm;
run;
proc sort data=want;
by ddate age_group;
run;
Stupid but it works? 🙂
Do something like:
proc sql;
create table C as
select
mdy(dm,1,dy) as ddate format=yymms.,
a.age_group,
counts,
total,
std
from A inner join B on dy=year and A.age_group=B.age_group
order by ddate, age_group;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.