Hello All,
I have a somewhat basic question about many-to-few merge in SAS. I am trying to merge two datasets in SAS with unequal number of by-groups in the datasets. I am trying to merge by id without the last records in dataset2 getting carried over to the output dataset. The two input datsets and the planned output dataset are as follows:
Dataset1:
id | day | computer_score |
1 | 1 | 19 |
1 | 2 | 22 |
1 | 3 | 13 |
1 | 4 | 14 |
1 | 5 | 17 |
2 | 1 | 9 |
2 | 2 | 8 |
2 | 3 | 7 |
2 | 4 | 8 |
2 | 5 | 9 |
Dataset2:
id | verbal_score |
1 | 15 |
1 | 17 |
1 | 19 |
2 | 7 |
2 | 9 |
Output dataset:
id | day | computer_score | verbal_score |
1 | 1 | 19 | 15 |
1 | 2 | 22 | 17 |
1 | 3 | 13 | 19 |
1 | 4 | 14 | |
1 | 5 | 17 | |
2 | 1 | 9 | |
2 | 2 | 8 | 9 |
2 | 3 | 7 | |
2 | 4 | 8 | |
2 | 5 | 9 |
Any help will be much appreciated.
Thanks,
Pronabesh
If that 9 for the first record for id=2 should be a seven, then I think that your really only have a one-to-one match.
Try:
data Dataset2;
set Dataset2;
by id;
if first.id then day=1;
else day+1;
run;
data want;
merge dataset1 (in=a) dataset2;
by id day;
if a;
run;
One correction:
The output dataset should look like this:
id | day | computer_score | verbal_score |
1 | 1 | 19 | 15 |
1 | 2 | 22 | 17 |
1 | 3 | 13 | 19 |
1 | 4 | 14 | |
1 | 5 | 17 | |
2 | 1 | 9 | 9 |
2 | 2 | 8 | 8 |
2 | 3 | 7 | |
2 | 4 | 8 | |
2 | 5 | 9 |
If that 9 for the first record for id=2 should be a seven, then I think that your really only have a one-to-one match.
Try:
data Dataset2;
set Dataset2;
by id;
if first.id then day=1;
else day+1;
run;
data want;
merge dataset1 (in=a) dataset2;
by id day;
if a;
run;
Hello Arthur,
Thank for the response. This worked perfectly for the mock datasets. However, there is another layer to my problem. The day variable in dataet1 is actually a date variable. There is no date for dataset2 ( not the same examination date as dataset1 anyway). However, I need to merge the scores from dataset2 to dataset1 one-to-one and not carry it forward if there are less than 5 observations per subject. Also the dates can change between and within subjects. The datasets are as follows:
Dataset1:
id | date | computer_score |
1 | 1/1/13 | 19 |
1 | 2/1/13 | 22 |
1 | 3/1/13 | 13 |
1 | 4/1/13 | 14 |
1 | 5/1/13 | 17 |
2 | 1/1/13 | 9 |
2 | 2/1/13 | 8 |
2 | 3/1/13 | 7 |
2 | 4/1/13 | 8 |
2 | 5/1/13 | 9 |
Dataset2:
id | verbal_score |
1 | 15 |
1 | 17 |
1 | 19 |
2 | 7 |
2 | 9 |
Output dataset:
id | day | computer_score | verbal_score |
1 | 1/1/13 | 19 | 15 |
1 | 2/1/13 | 22 | 17 |
1 | 3/1/13 | 13 | 19 |
1 | 4/1/13 | 14 | |
1 | 5/1/13 | 17 | |
2 | 1/1/13 | 9 | 7 |
2 | 2/1/13 | 8 | 9 |
2 | 3/1/13 | 7 | |
2 | 4/1/13 | 8 | |
2 | 5/1/13 | 9 |
Your example data and shown desired output file was the same as the original one, which is what my previous offered code accomplished, thus you have to be clearer regarding what you want.
Checking for number of records is easy, as is sorting the file by date within id. The questions are checking in which file\(s), what to do when there are less than five records, whehter you do need to carry forward any of the scores, and whether the dates themselves are relevant to your needed solution?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.