Merge Problems

Accepted Solution Solved
Reply
Contributor
Posts: 56
Accepted Solution

Merge Problems

Dear all,

 

I have 2 datasets that I would like to merge.

dataset1 has 137 observations and 7 variables.

dataset2 has 417 observations and 165 variables.

 

after sorting both dataset, I put:

 

data hpvmerge;

merge dataset1 (in=a) dataset2 (in=b);

by study_id;

if a=1;

run;

 

SAS did not give any errors and there was 137 observations and 171 variables in hpvmerge dataset.

however, when I opened this merged dataset, all values for the variables from the larger dataset (dataset2) are missing.

I am not sure what went wrong and would like to get some help from you guys.

 

Thanks in advance.

 


Accepted Solutions
Solution
‎10-05-2017 09:22 PM
SAS Super FREQ
Posts: 479

Re: Merge Problems

use the left function to left justify in a separate step before the merge.

 

 

x = left(x);

 

substitute your var name for x.

View solution in original post


All Replies
SAS Super FREQ
Posts: 479

Re: Merge Problems

Check to see if study_id has the same values in the two data sets. I think you will find they do not.

Also, set b2 = b;  see if b2 has any nonzero values.  b is going to be automatically dropped.

Super User
Posts: 12,148

Re: Merge Problems

With this bit in your code:

if a=1;

you only records that have a value for study_id in dataset1. If there are no matching values for study_id in dataset2 then none of the variables would get values from dataset2 but the variables will exist.

 

You should run proc freq or some other summary on the variable study_id in both sets. Then see if there should be matches. If they aren't matching it may be a matter of 1) no matches or presumptive matches

2) letter case: "Study a" does not match "Study A",

3) extra characters: "Study  A" does not match "Study A" (the first has two spaces)

4) different characters: "Study_A" does not match "Study A"

5) the insidious leading blank: " Study A" does not match "Study A"

If the study id is numeric it there may be a formatting issue such that when you look at them they appear identical but there is an actual difference in the underlying value. 4.3 displayed with an F1 format will look like 4 but not match the actual value 4 in the other set.

Contributor
Posts: 56

Re: Merge Problems

Than you so much for the suggestion.

So I have ran proc freq on both datasets and there are definitely matching values.

Study id in dataset1 have both numbers only ID and numbers+letters ID, and it is a character variable with a length of 5.

but study id in dataset2 have only numbers ID, (numerical variable with length 8, but actually it ranges from only 1 digit to 4 digit ID), so I had to convert to a character variable with a length of $5 before merging. 

If I understood your suggestions correctly, did this cause the missing values in the merged dataset??

SAS Super FREQ
Posts: 479

Re: Merge Problems

Right.  If the values do not match, you will not get any observations from dataset2.

Contributor
Posts: 56

Re: Merge Problems

Posted in reply to WarrenKuhfeld

Thank you.

Can you give some suggestions on how I should fix that?

I can't think of another way of doing this (e.g. convert to character via the put function and make it the same length as dataset1 study ID). I need to keep those IDs with numbers+letters too in dataset1 as I might be linking to a third dataset with this kind of ID format.

 

SAS Super FREQ
Posts: 479

Re: Merge Problems

You can keep the original variables, but you must make a variable in each data set that has the same names and values and then merge on it.

Contributor
Posts: 56

Re: Merge Problems

Posted in reply to WarrenKuhfeld

Sorry I did not make this clear before. The character variable I converted to in dataset2 have the same variable name as study_id in dataset1 (both are named study_id). The original numeric ID variable was named something else.

Should I make a new variable in dataset1 as well and just have the values equal to study_id and link this to dataset2?

SAS Super FREQ
Posts: 479

Re: Merge Problems

If the merge is still not working the way you want, run proc freq on the BY variable for both data sets.  Ensure that the values in each data set that should match do in fact match.  If they don't transform them in whatever way is necessary to get the desired matches.

Contributor
Posts: 56

Re: Merge Problems

Posted in reply to WarrenKuhfeld

After comparing side by side, I am finally seeing what's wrong:

dataset1 study ID (already a character variable, just picked a few for example)

1

10

112

 

 

and dataset2 study ID (after converting to character)

    1

  10

112

 

Why is this happening and how should I fix??

Solution
‎10-05-2017 09:22 PM
SAS Super FREQ
Posts: 479

Re: Merge Problems

use the left function to left justify in a separate step before the merge.

 

 

x = left(x);

 

substitute your var name for x.

Contributor
Posts: 56

Re: Merge Problems

Posted in reply to WarrenKuhfeld

It worked!! Thank you so much I have been trying to figure this out for the whole day!

SAS Super FREQ
Posts: 479

Re: Merge Problems

Glad I could help!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 268 views
  • 2 likes
  • 3 in conversation