Hello,
I need to merge the DATASET 1 with DATASET 2 to get to a final dataset "DATA WANT" and I need some guidance.
please note that dataset 1 is a matched pair data of students and dataset 2 contains additional variables on the matched pairs data.
DATASET 1
groupA_id | groupB_id | groupA_grade | groupB_grade | groupA_sex | groupB_sex |
00000-000001 | 00000-000200 | 2 | 2 | 1 | 1 |
00000-000001 | 00000-000201 | 2 | 2 | 1 | 1 |
00000-000001 | 00000-000202 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000203 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000204 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000205 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000206 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000207 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000208 | 2 | 2 | 1 | 1 |
id | Siblings | economic status |
00000-000001 | Yes | low |
00000-000001 | yes | low |
00000-000001 | yes | low |
00000-000002 | yes | low |
00000-000002 | yes | low |
00000-000002 | yes | low |
00000-000003 | no | high |
00000-000003 | no | high |
00000-000003 | no | high |
00000-000200 | yes | low |
00000-000201 | no | low |
00000-000202 | no | high |
00000-000203 | no | medium |
00000-000204 | no | high |
00000-000205 | yes | low |
00000-000206 | no | medium |
00000-000207 | yes | high |
00000-000208 | yes | high |
DATA WANT
id | grade | sex | Siblings | economic status |
00000-000001 | 2 | 1 | Yes | low |
00000-000002 | 2 | 1 | yes | low |
00000-000003 | 2 | 1 | no | high |
00000-000002 | 2 | 1 | yes | low |
00000-000200 | 2 | 1 | yes | low |
00000-000201 | 2 | 1 | no | low |
00000-000202 | 2 | 1 | no | high |
00000-000203 | 2 | 1 | no | medium |
00000-000204 | 2 | 1 | no | high |
00000-000205 | 2 | 1 | yes | low |
00000-000206 | 2 | 1 | no | medium |
00000-000207 | 2 | 1 | yes | high |
00000-000208 | 2 | 1 | yes | high |
Any help would be appreciated. Thanks
If I understand the problem correctly, this should work:
data dataset1_revised;
set dataset1;
id = groupA_id;
grade = groupA_grade;
sex = groupA_sex;
output;
id = groupB_id;
grade = groupB_grade;
sex = groupB_sex;
output;
keep id grade sex;
run;
The rest becomes a more commonplace sort + merge:
proc sort data=dataset1_revised nodupkey;
by id;
run;
proc sort data=dataset2 out=dataset2_revised nodupkey;
by id;
run;
data want;
merge dataset1_revised dataset2_revised;
by id;
run;
If I understand the problem correctly, this should work:
data dataset1_revised;
set dataset1;
id = groupA_id;
grade = groupA_grade;
sex = groupA_sex;
output;
id = groupB_id;
grade = groupB_grade;
sex = groupB_sex;
output;
keep id grade sex;
run;
The rest becomes a more commonplace sort + merge:
proc sort data=dataset1_revised nodupkey;
by id;
run;
proc sort data=dataset2 out=dataset2_revised nodupkey;
by id;
run;
data want;
merge dataset1_revised dataset2_revised;
by id;
run;
Many thanks Astounding!
This is what I wanted to accomplish. Note- I added a keep statement for dataset1_revised since the last statement did work only for dataset2_revised.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: