BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
newtriks
Obsidian | Level 7

I'm trying to merge two datasets in which the same patient may have been tested once for TEST1 and twice  for TEST2. 

 

So if the observations/variables are as follows:

FIRST DATASET

Obs     ID    TEST1 

 1          A     POS 

 

SECOND DATASET

Obs     ID    TEST2 

1           A     POS

2          A      POS

 

The resulting merged set returns this:

Obs     ID   TEST1    TEST2

1          A     POS        POS

2          A     POS        POS

 

How do I keep the second observation for this patient blank for TEST1?

1 ACCEPTED SOLUTION
9 REPLIES 9
PeterClemmensen
Tourmaline | Level 20

A simple Merge will do 🙂

 

EDIT: Did not catch the missing test1 part first.

 

data one;
input ID $ TEST1 $;
datalines;
A POS 
;

data two;
input ID $ TEST2 $;
datalines;
A POS 
A POS 
;

data want;
   if 0 then set one two;
   test1 = '';
   merge one two;
   by ID;
run;

 

Result:

 

ID TEST1 TEST2
A  POS   POS
A        POS

 

newtriks
Obsidian | Level 7

Thanks. Perhaps I should have been clearer. These datasets have been imported from Excel, and each has over 2,000 observations. Is there a method which doesn't require manual data entry? I am much obliged.

Kurt_Bremser
Super User

@PeterClemmensen two initial data steps are just there to create data for the third data step. Instead of the two fake datasets created here, use your datasets in the MERGE statement.

newtriks
Obsidian | Level 7

That did the trick. Thanks!

newtriks
Obsidian | Level 7

Sorry to be responding after 2 months but it appears that this bit of code results in the deletion of any observations with both tests. Final dataset shows only those exclusive to one test variable or the other. I'm lost as to how to fix it. Thanks!

newtriks
Obsidian | Level 7

I should rephrase: I'm not sure what happened to these observations. The listed variables are common to both datasets with one exception: TESTA had no Btest variable, and TESTB had no Atest variable. The resulting dataset does have observations with entries for both tests, as well as those with entries in only one test or the other, but other observations were dropped entirely and I've no idea how to do the detective work. Here's the code:

 

DATA TESTCHECK;
RETAIN Client_ID Site_Type Reporting_Month Reporting_Quarter Reporting_Year Year_of_Birth Race Ethnicity Gender Atest Btest;
IF 0 THEN SET TESTA TESTB;
CALL missing (of _all_);
MERGE TESTA TESTB;
BY Client_ID Reporting_Month;
RUN;

 

I appreciate your help!

newtriks
Obsidian | Level 7

UPDATE - got it to work. All good here. Thanks for responding.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1865 views
  • 1 like
  • 3 in conversation