BookmarkSubscribeRSS Feed
robertrao
Quartz | Level 8

HI,

i am merging two data sets by the common key variable-MRN

DAta dataset1 dataset2;

merge have1(in=a) have2(in=b);

BY mrn ;

if a and(not b) then output dataset1;

run;

unfortunately in dataset1 i am getting MRN's which are present (common)in both data sets

i tried compress and strip for mrn variable. it Dont work

MRN is a char variable

COuld you tell where I went wrong???

thanks

6 REPLIES 6
UrvishShah
Fluorite | Level 6

Hi,

Try with Proc Sql joins...Left Join, Right Join...

LinusH
Tourmaline | Level 20

Just a join won't do this, you need to do stuff like intersect to make this happen, or some kind of where on the join result.

So for this case, I prefer the chosen method...

Data never sleeps
Anotherdream
Quartz | Level 8

I'd be willing to bet money your strings are not 100% the same due to being case sensitive. remember sas is case sensitive, this includes its joins.

Example

Data person;

     infile datalines delimiter=',';

input namekey $ firstvar;

datalines;

John,1

Mary,2

;

run;

Data person1;

     infile datalines delimiter=',';

input namekey $ secondvar;

datalines;

JOHN,1

;

run;

proc sort data=person;

by namekey;

run;

proc sort data=person1;

by namekey;

run;

data didntwork;

merge person(in=a) person1(in=b);

by namekey;

if a=1 and b=0;

run;

This process will return a dataset that has BOTH John and Mary, because neither of these values exist in the other table (John doesn't exist, but JOHN does).

Do an upcase on your fields before you merge and see if that fixes it. If you are 100% positive that your cases are the same, then what you are describing cannot happen in sas by the definition of merges. If it is, append some of your actual data for us to confirm, and then follow along with SAS support.

Brandon

LinusH
Tourmaline | Level 20

It's hard to without the actual data.

It seems that the program you attached is not the one you are actually using, it should produce an error (no space between "and" and the "(".

Data never sleeps
Amir
PROC Star

Hi,

I tried your code as follows:

data have1;

  input mrn $;

  datalines;

aaa

bbb

ccc

ddd

eee

;

data have2;

  input mrn $;

  datalines;

aaa

ccc

eee

;

DAta dataset1 dataset2;

merge have1(in=a) have2(in=b);

BY mrn ;

if a and(not b) then output dataset1;

run;

and it worked as expected, giving:

NOTE: There were 5 observations read from the data set WORK.HAVE1. 

NOTE: There were 3 observations read from the data set WORK.HAVE2. 

NOTE: The data set WORK.DATASET1 has 2 observations and 1 variables.

NOTE: The data set WORK.DATASET2 has 0 observations and 1 variables.

The "and(" was not a problem, token handling takes care of that, but generally speaking it would be clearer if there was a space.

Please provide some example input and the faulty output results.

Regards,

Amir.

Astounding
PROC Star

Not possible, most of the time.  If you are sure you are really getting this result, it could indicate that one of your data sets actually contains a variable named B with values of 0 or missing, which overrides the value generated by  HAVE2 (IN=B).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 905 views
  • 0 likes
  • 6 in conversation