BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
BeEyeSea
Calcite | Level 5

Hello experts! I need help in merging this data set. Everything comes out fine expect the 2nd column "visit" please see pictures below. Any guidance will be much appreciated. 

 

Here is the directions for my problem:

 

Patients underwent knee replacement surgeries for one or both knees. The data record patient id, the replaced knee number (1 or 2), and satisfaction scores pre-operatively, one day, one week, and one month after the surgery. The data are

Patient         Knee       Score     Score at     Score at     Score at

   ID       Number   Pre-op   one day     one week   one month

   01          1              0            5              7                10

   02          1              0            10              15               15

   02          2              3             5              8                10

   03          1             0             3               3                   3

   03          2             0             6               9                   9

   04          1             0             4              10                10

 

C) Use the MERGE statement to create the merged file

 

Here is my code for part C:

data examEX4;
input id $ KneeNumber ScorePre Score1Day Score1Week Score1Month;
cards;
    01           1               0              5                7                 10
    02           1               0             10               15                 15
    02           2               3              5                8                 10
    03           1               0              3                3                  3
    03           2               0              6                9                  9
    04           1               0              4               10                 10
;

data A(keep=id visit score_knee1);
set examEX4;
array x[4] ScorePre Score1Day Score1Week Score1Month;
	do i=1 to 4;
		visit=i;
		score_knee1=x[i]; 
if KneeNumber=1;
		output;
	end;
run;
data B(keep=id visit score_knee2);
set examEX4;
array x[4] ScorePre Score1Day Score1Week Score1Month;
	do i=1 to 4;
		visit=i;
		score_knee2=x[i]; 
if KneeNumber=2;
		output;
	end;
run;

proc sort data=A;
by id;

proc sort data=B;
by id;
run;

data A_B_merged;
	merge A B;
	by id;
run; 

proc print noobs;
run;

However my output looks like this: 

 

My Output.png

when it needs to look like this:

Required Output.png

Thank you for any help.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You need to create a format for your numeric variable which displays the texts instead of the numbers.

proc format;
value visit
  1 = "pre op"
  2 = "day 1"
  /* and so on */
;
run;

proc print data=a_b_merged noobs;
format visit visit.;
run;

View solution in original post

4 REPLIES 4
Kurt_Bremser
Super User

You need to create a format for your numeric variable which displays the texts instead of the numbers.

proc format;
value visit
  1 = "pre op"
  2 = "day 1"
  /* and so on */
;
run;

proc print data=a_b_merged noobs;
format visit visit.;
run;
andreas_lds
Jade | Level 19

Thanks for posting data in usable form.

 

I know this is an exam task, but:

The need to use a merge is artificially generated by creating multiple datasets. Proc transpose seems to be better suited to solve the problem. Or have failed to see something?

 

proc sort data=examEX4;
   by id KneeNumber;
run;

proc transpose data=examEX4 out=t_want prefix=score_knee name= Visit ;
   by id;
   var ScorePre Score1Day Score1Week Score1Month;
run;

 

Quentin
Super User

Hi,

 

Just a note... see this NOTE in your log:

40   data A_B_merged;
41     merge A B;
42     by id;
43   run;

NOTE: MERGE statement has more than one data set with repeats of BY values <--- bad Note!
NOTE: There were 16 observations read from the data set WORK.A.
NOTE: There were 8 observations read from the data set WORK.B.
NOTE: The data set WORK.A_B_MERGED has 16 observations and 4 variables.

That NOTE means you have done a many to many merge.  Typically I treat that note as an error message, because the MERGE statement implements a many-to-many merge in a manner that is not intuitive and can lead to surprising results if you don't thoroughly understand the DATA step language.  It does not give you a cartesian product.  I don't think I've ever had a situation where I intentionally did a many-to-many merge.

 

In this case, I think your result could be trusted to be correct, because you know that there are always four records per ID in both work.A and work.B, or there might be 0 records per ID.  You also know that the four records are sorted in the same order within ID (visit=1 to visit=4), because you created the data that way.  Your merge essentially merges by ID, and when there are multiple records with the same ID, it merges by them sequentially based on order.

 

Still, as a general practice I would change the BY statement to be:

	by id Visit ;

Because that way the BY statement is defining a unique key.  My rule is in a MERGE of two data sets, you always need at least one data set to be unique by the BY variables.  This will avoid that note in the log.

40   data A_B_merged;
41     merge A B;
42     by id visit;
43   run;

NOTE: There were 16 observations read from the data set WORK.A.
NOTE: There were 8 observations read from the data set WORK.B.
NOTE: The data set WORK.A_B_MERGED has 16 observations and 4 variables.

 

The Boston Area SAS Users Group is hosting free webinars!
Next up: Joe Madden & Joseph Henry present Putting Power into the Hands of the Programmer with SAS Viya Workbench on Wednesday Nov 6.
Register now at https://www.basug.org/events.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 777 views
  • 8 likes
  • 4 in conversation