
merging

New Contributor
Posts: 3

I ran into a strange merge situation and am hoping someone can help me understand it. Here's a simplified version (the real one involved more datasets and more variables):

Dataset ONE, created by Proc Import from an Excel file, contained 2 variables: barcode and id.

Dataset TWO, created from other SAS datasets, contained 3 variables: BARCODE, ID, and TRIAL.

Dataset THREE, created by merging ONE and TWO by barcode (after sorting both datasets by barcode), had all 3 variables, as expected, but the values of TRIAL and ID were missing.

After confirming that the type, length, label and format of all variables were identical, I changed the case of the variable names in ONE to match those in TWO (i.e., all upper case rather than all lower case). I did not change any values, only variable names. When I did this, the merge worked and no values were missing.

If SAS variable names are case insensitive, how can this be explained?

Super User
Posts: 23,776

Re: merging

Posted in reply to datamgr85

If SAS variable names are case insensitive, how can this be explained?

Variable names are case insensitive, but variable values are case sensitive. Also, if you're working with an RDBMS, i.e., connecting to a SQL server, that may be case sensitive.
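One way to rule out a case or whitespace difference in the BY values is to normalize them before merging. This is only a minimal sketch: it assumes barcode is a character variable in both datasets, and the names one_norm, two_norm and three_check are made up for illustration.

```sas
/* Sketch only: assumes barcode is a character variable in both datasets */
data one_norm;
   set one;
   /* drop leading/trailing blanks and fold to upper case */
   barcode = upcase(strip(barcode));
run;

data two_norm;
   set two;
   barcode = upcase(strip(barcode));
run;

/* re-sort, because normalizing can change the sort order */
proc sort data=one_norm; by barcode; run;
proc sort data=two_norm; by barcode; run;

data three_check;
   merge one_norm two_norm;
   by barcode;
run;
```

If three_check has fewer missing values than the original merge, the barcode values themselves differed between the two datasets.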

New Contributor
Posts: 3

Re: merging

Thanks, Reeza. I didn't change any values, only variable names. I will look into whether a SQL server could be the issue.

Super User
Posts: 13,583

Re: merging

Posted in reply to datamgr85

There are a number of possible data combinations that would cause this.

One is if a record in the first set has a barcode, is missing id, and has no corresponding barcode in the other data set.

Another is if the barcode matches, but the matching record in the second set has missing values for id and trial.

A third is if the second set has a barcode that doesn't have a match in the first set and is missing id and trial.

Please see this example code:

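/* First set: barcode 3333 has a missing id and no match in TWO */
data one;
   input barcode id;
datalines;
1234   567
3333   .
6666   123
;
run;

/* Second set: 6666 matches ONE but has missing id and trial; */
/* 7777 has no match in ONE and is missing id and trial       */
data two;
   input barcode id trial;
datalines;
1234 567  8910
5555 888  9999
6666 .    .
7777 .    .
;
run;

/* Both sets are already in barcode order, so no PROC SORT is needed. */
/* For matching barcodes, id from TWO overwrites id from ONE.         */
data merged;
   merge
      one
      two
   ;
   by barcode;
run;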

I would suspect that your result is most likely something related to data such as these three cases.
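To see which of these cases applies, it can help to flag where each merged record came from and to keep the two id variables apart so one does not overwrite the other. The following is a sketch against the example datasets above; the names merged_check, in_one, in_two, id_one, id_two and source are made up for illustration.

```sas
data merged_check;
   merge
      one (in=in_one rename=(id=id_one))
      two (in=in_two rename=(id=id_two))
   ;
   by barcode;
   length source $8;
   /* in_one and in_two are 1 when that dataset contributed to the record */
   if in_one and in_two then source = 'both';
   else if in_one then source = 'one only';
   else source = 'two only';
run;

proc print data=merged_check;
run;
```

With the example data, barcode 6666 shows source 'both' with id_one = 123 and id_two missing, which is the overwrite that produces a missing id in the plain merge.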
