BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ihtishamsultan
Obsidian | Level 7

Hello All,

 

I have two datasets with a common variable -number, I am trying to merge them.

 

Datset1-

Id Number place

1    20         A

2    25         B

3    30         C

 

Dataset2-

Number  country

20            US

20            US  

20            US

21            CA

22            US

25            UK

 

What I would like to have  a data set where the number is common 

 

Dataset - Want

Id Number place Country

1    20         A         US

1    20                    US

1    20                    US

2    25         B         UK

 

I tried using the following statement and got the following output

 

Statement:

data want;

set dataset1(in=a) dataset2(in=b);

by number;

if a and b then output;

run;

 

Output -Want

Id Number place Country

1    20         A         US

1    20         A          US

1    20         A          US

1    20         A          US

2    25         B         UK

 

Can anyone help?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Please watch the spelling of things like data set names.

If at all possible provide data in the form of a data step and paste code into a code box opened on the forum with the </> icon. Text pasted into the main message boxes will get reformatted by the forum and may result in "code" that no longer works when someone else copies and tries to use it, such as testing code against your data.

Consider:

data Dataset1;
   input Id Number place $;
datalines;
1    20         A
2    25         B
3    30         C
;
 
data Dataset2;
   input number  country $;
datalines;
20            US
20            US  
20            US
21            CA
22            US
25            UK
;

data want;
   merge dataset1 (in=in1)
         dataset2 (in=in2)
   ;
   by number;
   if not first.number then call missing(place);
   if in1 and in2;
run;

You didn't want a SET statement, which appends or stacks data sets, MERGE combines them row by row.

The BY number now tries to align values based on the value of number.

The "if in1 and in2" only keeps records that appear with values from both sets (i.e. matching Number).

Since you have a peculiar requirement to have Place missing when multiple number matches occur the that is done with the call missing to set the Place to missing except for the first value of each number match.

When you use BY in a data step SAS provides automatic variables First. and Last. (please notice the dots) that are used to tell if a specific row has the First or last of each by group value. They are true/false boolean values that can be used as shown.

 

If you have multiple values of number in both sets then you have many-to-many merge and it is very likely that a data step merge won't get what you want.

View solution in original post

4 REPLIES 4
ballardw
Super User

Please watch the spelling of things like data set names.

If at all possible provide data in the form of a data step and paste code into a code box opened on the forum with the </> icon. Text pasted into the main message boxes will get reformatted by the forum and may result in "code" that no longer works when someone else copies and tries to use it, such as testing code against your data.

Consider:

data Dataset1;
   input Id Number place $;
datalines;
1    20         A
2    25         B
3    30         C
;
 
data Dataset2;
   input number  country $;
datalines;
20            US
20            US  
20            US
21            CA
22            US
25            UK
;

data want;
   merge dataset1 (in=in1)
         dataset2 (in=in2)
   ;
   by number;
   if not first.number then call missing(place);
   if in1 and in2;
run;

You didn't want a SET statement, which appends or stacks data sets, MERGE combines them row by row.

The BY number now tries to align values based on the value of number.

The "if in1 and in2" only keeps records that appear with values from both sets (i.e. matching Number).

Since you have a peculiar requirement to have Place missing when multiple number matches occur the that is done with the call missing to set the Place to missing except for the first value of each number match.

When you use BY in a data step SAS provides automatic variables First. and Last. (please notice the dots) that are used to tell if a specific row has the First or last of each by group value. They are true/false boolean values that can be used as shown.

 

If you have multiple values of number in both sets then you have many-to-many merge and it is very likely that a data step merge won't get what you want.

ihtishamsultan
Obsidian | Level 7

Thank you for your reply. 

 

I will keep in mind and use the </> icon.

 

In my real dataset, I have many variables like "place". Can you please advice how should I go about that?

ballardw
Super User

@ihtishamsultan wrote:

Thank you for your reply. 

 

I will keep in mind and use the </> icon.

 

In my real dataset, I have many variables like "place". Can you please advice how should I go about that?


Are they all supposed to be set missing the same way? Then place all the variable names separated by commas like Call Missing(place, othervar, thatvar).

Call missing is one of the few functions that will handle multiple variables of different types, so you can mix numeric and character functions.

If you need something else done with other variables then you will need to provide example data and rules for what is to be done.

 

ihtishamsultan
Obsidian | Level 7
Thank you, yes I wanted all of them to be missing. I'll separate them by commas as you advised.

Thanks again.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 478 views
  • 1 like
  • 2 in conversation