Hi,
I wanted to do age/sex-adjusted standardized rates. So I've a standard population in 8 strata (2 sexes and 4 age groups) and I've a study cohort . The problem is some groups in the study cohort don't have all strata and I just wanted to make all the possible strata shown. I tried both SQL (including full outer join) and data step; both save me the right and none of them, understandably, look right. How should write the code?
/*standard population count by sesx and age*/
data stdpop; input strata $ POPCOUNT POPTOTAL;
datalines;
F_1849 3003188 10393957
F_5064 1319163 10393957
F_65up 1024958 10393957
M_1849 2964946 10393957
M_5064 1276806 10393957
M_65up 804896 10393957
;
proc print data=stdpop ; run;
/*Study cohort, with just one group to demonsrate - the group has only 2 age/sex groups*/
data have ; input group strata $ amount;
datalines;
1 F_1849 1000
1 M_65up 2000
2 F_5064 300
2 M_65up 400
;
*The SQL way;
proc sql; create table want1 as select b.*, a.* from stdpop a left join have as b on a.strata=b.strata; quit;
proc print; run;
proc sort data=have; by strata;
*The data step way;
data want2; merge have stdpop; by strata; run;
proc print; run;
The want data should look like the following, with Group value is filled for the missing strata; as to amount, either missing (as seen) or 0 would do.
Group | Strata | Amount | POPCOUNT | POPTOTAL |
1 | F_1849 | 1000 | 3003188 | 10393957 |
1 | F_5064 | 1319163 | 10393957 | |
1 | F_65up | 1024958 | 10393957 | |
1 | M_1849 | 2964946 | 10393957 | |
1 | M_5064 | 1276806 | 10393957 | |
1 | M_65up | 2000 | 804896 | 10393957 |
2 | F_1849 | 3003188 | 10393957 | |
2 | F_5064 | 300 | 1319163 | 10393957 |
2 | F_65up | 1024958 | 10393957 | |
2 | M_1849 | 2964946 | 10393957 | |
2 | M_5064 | 1276806 | 10393957 | |
2 | M_65up | 400 | 804896 | 10393957 |
Thanks.
Have data set: contains Group & Amount fields
Stdpop data set contains records, some of which have strata values which are not in the Have data set - F)65up, M_849, M_5064
So for these values, how is are the Group and Amount fields to be evaluated?
Your response will determine how best to proceed, but I suspect neither SQL or SAS data merge will suffice, so this will most likely be a hash table solution. (Exact solution will depend on the business rules for calculation of the Group value.
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.