DATA Step, Macro, Functions and more

Merge 2 datasets

Reply
Frequent Contributor
Posts: 109

Merge 2 datasets

Hi,

 

I wanted to do age/sex-adjusted standardized rates. So I've a standard population in 8 strata (2 sexes and 4 age groups) and I've a study cohort . The problem is some groups in the study cohort don't have all strata and I just wanted to make all the possible strata shown. I tried both SQL (including full outer join) and data step; both save me the right  and none of them, understandably, look right. How should write the code?

 

/*standard population count by sesx and age*/ 
data stdpop; input  strata $ POPCOUNT	POPTOTAL;
datalines;
F_1849 3003188 10393957
F_5064 1319163 10393957
F_65up 1024958 10393957
M_1849 2964946 10393957
M_5064 1276806 10393957
M_65up  804896 10393957
;
proc print data=stdpop ; run;
/*Study cohort, with just one group to demonsrate - the group has only 2 age/sex groups*/ 
data have ; input group strata $ amount;
datalines;
1 F_1849 1000
1 M_65up 2000
2 F_5064 300
2 M_65up 400
;
*The SQL way;
proc sql; create table want1 as select b.*, a.* from stdpop a left join have as b on  a.strata=b.strata; quit;
proc print; run;
proc sort data=have; by strata;
*The data step way;
data want2; merge have stdpop; by strata; run;
proc print; run;

The want data should look like the following, with Group value is filled for the missing strata; as to amount, either missing (as seen) or 0 would do.

 

Group Strata Amount POPCOUNT POPTOTAL
1 F_1849 1000 3003188 10393957
1 F_5064   1319163 10393957
1 F_65up   1024958 10393957
1 M_1849   2964946 10393957
1 M_5064   1276806 10393957
1 M_65up 2000 804896 10393957
2 F_1849   3003188 10393957
2 F_5064 300 1319163 10393957
2 F_65up   1024958 10393957
2 M_1849   2964946 10393957
2 M_5064   1276806 10393957
2 M_65up 400 804896 10393957

 

Thanks.

Moderator
Posts: 329

Re: Merge 2 datasets

Have data set: contains Group & Amount fields

 

Stdpop data set contains records, some of which have strata values which are not in the Have data set - F)65up, M_849, M_5064

 

So for these values, how is are the Group and Amount fields to be evaluated?

 

Your response will determine how best to proceed, but I suspect neither SQL or SAS data merge will suffice, so this will most likely be a hash table solution. (Exact solution will depend on the business rules for calculation of the Group value.

Ask a Question
Discussion stats
  • 1 reply
  • 103 views
  • 0 likes
  • 2 in conversation