Hello experts! I need help in merging this data set. Everything comes out fine expect the 2nd column "visit" please see pictures below. Any guidance will be much appreciated.
Here is the directions for my problem:
Patients underwent knee replacement surgeries for one or both knees. The data record patient id, the replaced knee number (1 or 2), and satisfaction scores pre-operatively, one day, one week, and one month after the surgery. The data are
Patient Knee Score Score at Score at Score at ID Number Pre-op one day one week one month 01 1 0 5 7 10 02 1 0 10 15 15 02 2 3 5 8 10 03 1 0 3 3 3 03 2 0 6 9 9 04 1 0 4 10 10 |
C) Use the MERGE statement to create the merged file
Here is my code for part C:
data examEX4;
input id $ KneeNumber ScorePre Score1Day Score1Week Score1Month;
cards;
01 1 0 5 7 10
02 1 0 10 15 15
02 2 3 5 8 10
03 1 0 3 3 3
03 2 0 6 9 9
04 1 0 4 10 10
;
data A(keep=id visit score_knee1);
set examEX4;
array x[4] ScorePre Score1Day Score1Week Score1Month;
do i=1 to 4;
visit=i;
score_knee1=x[i];
if KneeNumber=1;
output;
end;
run;
data B(keep=id visit score_knee2);
set examEX4;
array x[4] ScorePre Score1Day Score1Week Score1Month;
do i=1 to 4;
visit=i;
score_knee2=x[i];
if KneeNumber=2;
output;
end;
run;
proc sort data=A;
by id;
proc sort data=B;
by id;
run;
data A_B_merged;
merge A B;
by id;
run;
proc print noobs;
run;
However my output looks like this:
when it needs to look like this:
Thank you for any help.
You need to create a format for your numeric variable which displays the texts instead of the numbers.
proc format;
value visit
1 = "pre op"
2 = "day 1"
/* and so on */
;
run;
proc print data=a_b_merged noobs;
format visit visit.;
run;
You need to create a format for your numeric variable which displays the texts instead of the numbers.
proc format;
value visit
1 = "pre op"
2 = "day 1"
/* and so on */
;
run;
proc print data=a_b_merged noobs;
format visit visit.;
run;
PS Kudos for posting example data in a working data step. Much appreciated.
Thanks for posting data in usable form.
I know this is an exam task, but:
The need to use a merge is artificially generated by creating multiple datasets. Proc transpose seems to be better suited to solve the problem. Or have failed to see something?
proc sort data=examEX4;
by id KneeNumber;
run;
proc transpose data=examEX4 out=t_want prefix=score_knee name= Visit ;
by id;
var ScorePre Score1Day Score1Week Score1Month;
run;
Hi,
Just a note... see this NOTE in your log:
40 data A_B_merged; 41 merge A B; 42 by id; 43 run; NOTE: MERGE statement has more than one data set with repeats of BY values <--- bad Note! NOTE: There were 16 observations read from the data set WORK.A. NOTE: There were 8 observations read from the data set WORK.B. NOTE: The data set WORK.A_B_MERGED has 16 observations and 4 variables.
That NOTE means you have done a many to many merge. Typically I treat that note as an error message, because the MERGE statement implements a many-to-many merge in a manner that is not intuitive and can lead to surprising results if you don't thoroughly understand the DATA step language. It does not give you a cartesian product. I don't think I've ever had a situation where I intentionally did a many-to-many merge.
In this case, I think your result could be trusted to be correct, because you know that there are always four records per ID in both work.A and work.B, or there might be 0 records per ID. You also know that the four records are sorted in the same order within ID (visit=1 to visit=4), because you created the data that way. Your merge essentially merges by ID, and when there are multiple records with the same ID, it merges by them sequentially based on order.
Still, as a general practice I would change the BY statement to be:
by id Visit ;
Because that way the BY statement is defining a unique key. My rule is in a MERGE of two data sets, you always need at least one data set to be unique by the BY variables. This will avoid that note in the log.
40 data A_B_merged; 41 merge A B; 42 by id visit; 43 run; NOTE: There were 16 observations read from the data set WORK.A. NOTE: There were 8 observations read from the data set WORK.B. NOTE: The data set WORK.A_B_MERGED has 16 observations and 4 variables.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.