I have two datasets that I am merging - one with population data and one with case count data. The population dataset has data for all counties by gender and age group. The case dataset has cases for counties by gender and age group only where there were cases (does not include all states (or age/gender stratifications in population dataset). When I combine these datasets (by state, county, gender, and age), cases is set to missing anywhere there were no cases. This is fine for states where no case data were available (I'd want to delete those where all cases missing), but I want to change those missing values to 0s only for states where case data are available.
Example of merged dataset:
State County Gender Age_group Pop Cases
01 01111 M 1 1500 .
01 01111 M 2 2000 .
01 01111 M 3 2300 .
01 01111 Through M ages 1-18, F ages 1-18 .
08 08244 M 14 1498 28
08 08244 M 15 1247 21
08 08244 M 16 2736 .
08 08244 M 17 1932 .
08 08244 M 18 1140 11
So for the example above - all cases under state 01 were set to missing because the population dataset had data, but the case dataset had no data for that state. If I could somehow delete observations where all values for case are missing for all values within a state, that would be good (or have them so they don't merge). Then I would want to insert 0s where a state has data, but values are missing (e.g., above I'd want 0s for state 08 where they are missing as I know the state has data, so 0 cases for those stratifications).
I'd want to end up with:
State County Gender Age_group Pop Cases
08 08244 M 14 1498 28
08 08244 M 15 1247 21
08 08244 M 16 2736 0
08 08244 M 17 1932 0
08 08244 M 18 1140 11
Not sure if my data step with merge or proc sql code can be modified to include the above, but here is my code for merging (data step returns missing values as shown in example 1 above, proc sql just returns non-missing obs, but that's not what I want as I want to set missing to 0 where a state has case data):
data casespop; merge final.pop final.case (drop=YEAR); by State County Gender Age_group; run;
proc sql; create table merged_casepop as select case.State, case.County, case.Gender, pop.county, pop.Age_group, cases, pop from final.case, final.pop
where case.State eq pop.State AND case.County eq pop.County AND case.Age_group eq pop.Age_group AND case.Gender eq pop.Gender; quit;
Thanks!
... View more