Hello,
I need to merge the DATASET 1 with DATASET 2 to get to a final dataset "DATA WANT" and I need some guidance.
please note that dataset 1 is a matched pair data of students and dataset 2 contains additional variables on the matched pairs data.
DATASET 1
groupA_id | groupB_id | groupA_grade | groupB_grade | groupA_sex | groupB_sex |
00000-000001 | 00000-000200 | 2 | 2 | 1 | 1 |
00000-000001 | 00000-000201 | 2 | 2 | 1 | 1 |
00000-000001 | 00000-000202 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000203 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000204 | 2 | 2 | 1 | 1 |
00000-000002 | 00000-000205 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000206 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000207 | 2 | 2 | 1 | 1 |
00000-000003 | 00000-000208 | 2 | 2 | 1 | 1 |
id | Siblings | economic status |
00000-000001 | Yes | low |
00000-000001 | yes | low |
00000-000001 | yes | low |
00000-000002 | yes | low |
00000-000002 | yes | low |
00000-000002 | yes | low |
00000-000003 | no | high |
00000-000003 | no | high |
00000-000003 | no | high |
00000-000200 | yes | low |
00000-000201 | no | low |
00000-000202 | no | high |
00000-000203 | no | medium |
00000-000204 | no | high |
00000-000205 | yes | low |
00000-000206 | no | medium |
00000-000207 | yes | high |
00000-000208 | yes | high |
DATA WANT
id | grade | sex | Siblings | economic status |
00000-000001 | 2 | 1 | Yes | low |
00000-000002 | 2 | 1 | yes | low |
00000-000003 | 2 | 1 | no | high |
00000-000002 | 2 | 1 | yes | low |
00000-000200 | 2 | 1 | yes | low |
00000-000201 | 2 | 1 | no | low |
00000-000202 | 2 | 1 | no | high |
00000-000203 | 2 | 1 | no | medium |
00000-000204 | 2 | 1 | no | high |
00000-000205 | 2 | 1 | yes | low |
00000-000206 | 2 | 1 | no | medium |
00000-000207 | 2 | 1 | yes | high |
00000-000208 | 2 | 1 | yes | high |
Any help would be appreciated. Thanks
If I understand the problem correctly, this should work:
data dataset1_revised;
set dataset1;
id = groupA_id;
grade = groupA_grade;
sex = groupA_sex;
output;
id = groupB_id;
grade = groupB_grade;
sex = groupB_sex;
output;
keep id grade sex;
run;
The rest becomes a more commonplace sort + merge:
proc sort data=dataset1_revised nodupkey;
by id;
run;
proc sort data=dataset2 out=dataset2_revised nodupkey;
by id;
run;
data want;
merge dataset1_revised dataset2_revised;
by id;
run;
If I understand the problem correctly, this should work:
data dataset1_revised;
set dataset1;
id = groupA_id;
grade = groupA_grade;
sex = groupA_sex;
output;
id = groupB_id;
grade = groupB_grade;
sex = groupB_sex;
output;
keep id grade sex;
run;
The rest becomes a more commonplace sort + merge:
proc sort data=dataset1_revised nodupkey;
by id;
run;
proc sort data=dataset2 out=dataset2_revised nodupkey;
by id;
run;
data want;
merge dataset1_revised dataset2_revised;
by id;
run;
Many thanks Astounding!
This is what I wanted to accomplish. Note- I added a keep statement for dataset1_revised since the last statement did work only for dataset2_revised.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.