Solved: paired data merging

Archibald2 · Posted 07-18-2018 01:42 PM

Hello,

I need to merge the DATASET 1 with DATASET 2 to get to a final dataset "DATA WANT" and I need some guidance.

please note that dataset 1 is a matched pair data of students and dataset 2 contains additional variables on the matched pairs data.

DATASET 1

groupA_id	groupB_id	groupA_grade	groupB_grade	groupA_sex	groupB_sex
00000-000001	00000-000200	2	2	1	1
00000-000001	00000-000201	2	2	1	1
00000-000001	00000-000202	2	2	1	1
00000-000002	00000-000203	2	2	1	1
00000-000002	00000-000204	2	2	1	1
00000-000002	00000-000205	2	2	1	1
00000-000003	00000-000206	2	2	1	1
00000-000003	00000-000207	2	2	1	1
00000-000003	00000-000208	2	2	1	1

DATASET 2

id	Siblings	economic status
00000-000001	Yes	low
00000-000001	yes	low
00000-000001	yes	low
00000-000002	yes	low
00000-000002	yes	low
00000-000002	yes	low
00000-000003	no	high
00000-000003	no	high
00000-000003	no	high
00000-000200	yes	low
00000-000201	no	low
00000-000202	no	high
00000-000203	no	medium
00000-000204	no	high
00000-000205	yes	low
00000-000206	no	medium
00000-000207	yes	high
00000-000208	yes	high

DATA WANT

id	grade	sex	Siblings	economic status
00000-000001	2	1	Yes	low
00000-000002	2	1	yes	low
00000-000003	2	1	no	high
00000-000002	2	1	yes	low
00000-000200	2	1	yes	low
00000-000201	2	1	no	low
00000-000202	2	1	no	high
00000-000203	2	1	no	medium
00000-000204	2	1	no	high
00000-000205	2	1	yes	low
00000-000206	2	1	no	medium
00000-000207	2	1	yes	high
00000-000208	2	1	yes	high

Any help would be appreciated. Thanks

Astounding · Posted 07-18-2018 02:33 PM

If I understand the problem correctly, this should work:

data dataset1_revised;

set dataset1;

id = groupA_id;

grade = groupA_grade;

sex = groupA_sex;

output;

id = groupB_id;

grade = groupB_grade;

sex = groupB_sex;

output;

keep id grade sex;

run;

The rest becomes a more commonplace sort + merge:

proc sort data=dataset1_revised nodupkey;

by id;

run;

proc sort data=dataset2 out=dataset2_revised nodupkey;

by id;

run;

data want;

merge dataset1_revised dataset2_revised;

by id;

run;

View solution in original post

Astounding · Posted 07-18-2018 02:33 PM

If I understand the problem correctly, this should work:

data dataset1_revised;

set dataset1;

id = groupA_id;

grade = groupA_grade;

sex = groupA_sex;

output;

id = groupB_id;

grade = groupB_grade;

sex = groupB_sex;

output;

keep id grade sex;

run;

The rest becomes a more commonplace sort + merge:

proc sort data=dataset1_revised nodupkey;

by id;

run;

proc sort data=dataset2 out=dataset2_revised nodupkey;

by id;

run;

data want;

merge dataset1_revised dataset2_revised;

by id;

run;

Archibald2 · Posted 07-18-2018 05:06 PM

Many thanks Astounding!

This is what I wanted to accomplish. Note- I added a keep statement for dataset1_revised since the last statement did work only for dataset2_revised.

paired data merging

Re: paired data merging

Re: paired data merging

Re: paired data merging

paired data merging

Re: paired data merging

Re: paired data merging

Re: paired data merging

Registration is open

SAS Training: Just a Click Away