Can anybody help me with many to many merge?
I have two datasets A and B, each has six variables (5 explanatory variables: agegroups, sex, service type, service group, and year) and one outcome variable (hospital admissions rate). I want to calculate age-sex-service type-service group-year specific rate ratio but don't know how to merge these two files together as there is no unique matching ID and two files have difference number of rows.
Provide data sample(s) and the desired result, in your post, for a more effective and timely response to your question. Also, to accomplish a MERGE in SAS, there are some data input requirements, depending on the desired results.
Below, you will find links to some SAS support website http://support.sas.com/ technical paper references for you to consider and review.
With multiple observations for a unique BY variable set, you will need to use some SAS procedure (MEANS, SUMMARY, UNIVARIATE) to generate a reduced observation set before using a DATA step MERGE process.
Suggest you start out with getting your data into SAS members using a SAS DATA step, then explore what "data summarization or analysis" procedure will work for you, and then review the DATA step process for merging using a BY statement, so you can perform your calculation.
I encourage you to use the DOC and technical paper references cited previously.
What you have suggested is absolutely right. I have done proc summary to generate maximally one record for each age-sex-service specific group. The problems are that (1) I need to match the two datasets by a set of variables rather than one and (2) there are many unmatched missing groups bwteen dataset1 and dataset2.
Okay. A SAS DATA step MERGE with a BY statement, which supports multiple SAS variables. It's unclear to me what you're expecting from SAS software, frankly? Again, I encourage you to show us *ALL* of your sample data, including the summarized level, and also the outcome of some MERGE process, for additional feedback.
Looking at the example data you give us the combination of the variables agegrp,sex,stay_type (composite key) is unique. The relation between the two tables is 1:1.
Have a look at the example below.
I used Proc SQL - you could do the same by first sorting the tables A and B by agegrp sex stay and then merge the sorted tables (merge A B; by agegrp sex stay.
/* create table CombinedAB as*/
select coalesce(l.agegrp,r.agegrp) as agegrp
,coalesce(l.sex,r.sex) as sex
,coalesce(l.stay_type,r.stay_type) as stay_type
from work.a l full join work.b r
on l.agegrp=r.agegrp and l.sex=r.sex and l.stay_type=r.stay_type
order by agegrp,sex,stay_type;