BookmarkSubscribeRSS Feed
anshul_900
Calcite | Level 5

Hello,

 

I acme across a very basic problem with merging dataset.

 

I have two datsets where one data frame has more observations than other and the by variable value for non matched observation is also different.

 

data  _1;

input subject $ marks;

datalines;

Physics 24

Chem 34

Maths 87

Biology 64

;

run;

 

data _2;

input subject $ name $;

datalines;

Physics Maria

Chem John

Biology Jane;

;

run;

 

data merg;

merge _2 _1;

by subject;

run;

 

The problem I am encountering is because a merge statement is an executable statement it automatically retains the value of previous observation after the first iteration and does not set it to missing which is why for Maths also the name is coming as Jane because it is not allowing a missing value here.

 

Current Output

 

Subject Name Marks 

Physics Maria 24

Chem  John 34

Maths  Jane 87

Biology Jane 64

 

Expected output:

Subject Name Marks 

Physics Maria 24

Chem  John 34

Maths . 87

Biology Jane 64

 

To put the question in one line, How to get all observations from one dataset when merging even if the by variable does not have the same values in two dataset for some observations?

 

7 REPLIES 7
ScottBass
Rhodochrosite | Level 12

Edit your post using "Insert SAS Code", so I can cut-and-paste it into SAS.  Don't make us do your work in creating syntactically correct code.


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
anshul_900
Calcite | Level 5

Sorry about that!

 

data _1;

input id subject $ marks;

datalines;
101 Physics 24

101 Chem 34

101 Maths 87

101 Biology 64

;

run;

 

data _2;

input id subject $ name $;

datalines;

101 Physics Maria

101 Chem John
101 Biology Jane

;

run;

proc sort data= _1;
by id;
run;

proc sort data= _2;
by id;
run;

data _3;
merge _1 _2;
by id;
run;

 

Current output

101 Physics 24 Maria

101 Chem 34 John

101 Biology 87 Jane

101 Biology 64 Jane

 

Expected output 

 

101 Physics 24 Maria

101 Chem 34 John

101 Maths 87  .

101 Biology 64 Jane

 

 

ScottBass
Rhodochrosite | Level 12

Still not posted using "Insert SAS code".  When I cut-and-paste your code into SAS I don't get the correct results, I get missing rows.

 


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.
anshul_900
Calcite | Level 5

rephrasing the question again and copying the code

 

data _1;

input id subject $ marks;

datalines;
101 Physics 24

101 Chem 34

101 Maths 87

101 Biology 64

;

run;

 

data _2;

input id subject $ name $;

datalines;

101 Physics Maria

101 Chem John
101 Biology Jane

;

run;

proc sort data= _1;
by id;
run;

proc sort data= _2;
by id;
run;

data _3;
merge _1 _2;
by id;
run;

 

Current output

101 Physics 24 Maria

101 Chem 34 John

101 Biology 87 Jane

101 Biology 64 Jane

 

Expected output 

 

101 Physics 24 Maria

101 Chem 34 John

101 Maths 87  .

101 Biology 64 Jane

Reeza
Super User

You need to merge by ID and SUBJECT. Replace your BY statements with 'BY ID SUBJECT' and you should get your expected results.

 


@anshul_900 wrote:

rephrasing the question again and copying the code

 

data _1;

input id subject $ marks;

datalines;
101 Physics 24

101 Chem 34

101 Maths 87

101 Biology 64

;

run;

 

data _2;

input id subject $ name $;

datalines;

101 Physics Maria

101 Chem John
101 Biology Jane

;

run;

proc sort data= _1;
by id;
run;

proc sort data= _2;
by id;
run;

data _3;
merge _1 _2;
by id;
run;

 

Current output

101 Physics 24 Maria

101 Chem 34 John

101 Biology 87 Jane

101 Biology 64 Jane

 

Expected output 

 

101 Physics 24 Maria

101 Chem 34 John

101 Maths 87  .

101 Biology 64 Jane


 

Kurt_Bremser
Super User
data _1;
input id subject $ marks;
datalines;
101 Physics 24
101 Chem 34
101 Maths 87
101 Biology 64
;
run;

data _2;
input id subject $ name $;
datalines;
101 Physics Maria
101 Chem John
101 Biology Jane
;
run;

proc sort data= _1;
by id subject;
run;

proc sort data= _2;
by id subject;
run;

data _3;
merge _1 _2;
by id subject;
run;

proc print data=_3 noobs;
run;

Result:

 id    subject    marks    name

101    Biology      64     Jane 
101    Chem         34     John 
101    Maths        87          
101    Physics      24     Maria

Please take note of the proper methods for posting code, textual data and "listing" results, and the fact that missing character values are not displayed as dots.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1434 views
  • 0 likes
  • 4 in conversation