BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ihtishamsultan
Obsidian | Level 7

Hello All,

 

I have two datasets with a common variable -number, I am trying to merge them.

 

Datset1-

Id Number place

1    20         A

2    25         B

3    30         C

 

Dataset2-

Number  country

20            US

20            US  

20            US

21            CA

22            US

25            UK

 

What I would like to have  a data set where the number is common 

 

Dataset - Want

Id Number place Country

1    20         A         US

1    20                    US

1    20                    US

2    25         B         UK

 

I tried using the following statement and got the following output

 

Statement:

data want;

set dataset1(in=a) dataset2(in=b);

by number;

if a and b then output;

run;

 

Output -Want

Id Number place Country

1    20         A         US

1    20         A          US

1    20         A          US

1    20         A          US

2    25         B         UK

 

Can anyone help?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Please watch the spelling of things like data set names.

If at all possible provide data in the form of a data step and paste code into a code box opened on the forum with the </> icon. Text pasted into the main message boxes will get reformatted by the forum and may result in "code" that no longer works when someone else copies and tries to use it, such as testing code against your data.

Consider:

data Dataset1;
   input Id Number place $;
datalines;
1    20         A
2    25         B
3    30         C
;
 
data Dataset2;
   input number  country $;
datalines;
20            US
20            US  
20            US
21            CA
22            US
25            UK
;

data want;
   merge dataset1 (in=in1)
         dataset2 (in=in2)
   ;
   by number;
   if not first.number then call missing(place);
   if in1 and in2;
run;

You didn't want a SET statement, which appends or stacks data sets, MERGE combines them row by row.

The BY number now tries to align values based on the value of number.

The "if in1 and in2" only keeps records that appear with values from both sets (i.e. matching Number).

Since you have a peculiar requirement to have Place missing when multiple number matches occur the that is done with the call missing to set the Place to missing except for the first value of each number match.

When you use BY in a data step SAS provides automatic variables First. and Last. (please notice the dots) that are used to tell if a specific row has the First or last of each by group value. They are true/false boolean values that can be used as shown.

 

If you have multiple values of number in both sets then you have many-to-many merge and it is very likely that a data step merge won't get what you want.

View solution in original post

4 REPLIES 4
ballardw
Super User

Please watch the spelling of things like data set names.

If at all possible provide data in the form of a data step and paste code into a code box opened on the forum with the </> icon. Text pasted into the main message boxes will get reformatted by the forum and may result in "code" that no longer works when someone else copies and tries to use it, such as testing code against your data.

Consider:

data Dataset1;
   input Id Number place $;
datalines;
1    20         A
2    25         B
3    30         C
;
 
data Dataset2;
   input number  country $;
datalines;
20            US
20            US  
20            US
21            CA
22            US
25            UK
;

data want;
   merge dataset1 (in=in1)
         dataset2 (in=in2)
   ;
   by number;
   if not first.number then call missing(place);
   if in1 and in2;
run;

You didn't want a SET statement, which appends or stacks data sets, MERGE combines them row by row.

The BY number now tries to align values based on the value of number.

The "if in1 and in2" only keeps records that appear with values from both sets (i.e. matching Number).

Since you have a peculiar requirement to have Place missing when multiple number matches occur the that is done with the call missing to set the Place to missing except for the first value of each number match.

When you use BY in a data step SAS provides automatic variables First. and Last. (please notice the dots) that are used to tell if a specific row has the First or last of each by group value. They are true/false boolean values that can be used as shown.

 

If you have multiple values of number in both sets then you have many-to-many merge and it is very likely that a data step merge won't get what you want.

ihtishamsultan
Obsidian | Level 7

Thank you for your reply. 

 

I will keep in mind and use the </> icon.

 

In my real dataset, I have many variables like "place". Can you please advice how should I go about that?

ballardw
Super User

@ihtishamsultan wrote:

Thank you for your reply. 

 

I will keep in mind and use the </> icon.

 

In my real dataset, I have many variables like "place". Can you please advice how should I go about that?


Are they all supposed to be set missing the same way? Then place all the variable names separated by commas like Call Missing(place, othervar, thatvar).

Call missing is one of the few functions that will handle multiple variables of different types, so you can mix numeric and character functions.

If you need something else done with other variables then you will need to provide example data and rules for what is to be done.

 

ihtishamsultan
Obsidian | Level 7
Thank you, yes I wanted all of them to be missing. I'll separate them by commas as you advised.

Thanks again.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 727 views
  • 1 like
  • 2 in conversation