Help using Base SAS procedures

call missing

Accepted Solution Solved
Reply
Super Contributor
Posts: 1,041
Accepted Solution

call missing

Hi,

I have two datasets like shown

i want to merge them by code

i have the code which produces the desired output

can someone explain to me why the code has output prior to the call missing?????and why is the output statement at all necessary???

why is the data from the second dataset not repeated for all values of code from the first???

Thanks

TEMP2

ID   CODE     LAB

102  45.77    LAB1

102  45.77   LAB2

102  45.77   LAB3

103  45.88   LAB1

103  45.88  LAB2

WANT1

CODE  COUNT  VALUE1   VALUE2

45.77       1            84            45.39

45.88       2            19            50.00   

WANT2

ID   CODE     LAB     COUNT  VALUE1   VALUE2

102 45.77    LAB1        1                 84            45.39

102  45.77   LAB2        .                   .               .

102  45.77   LAB3       .                    .               .

103  45.88   LAB1         2               19            50.00  

103  45.88  LAB2           .                    .               .

    

CODE

data want2;

merge temp2 want1;

  by CODE;

output;

call missing(of _all_);

run;


Accepted Solutions
Solution
‎11-12-2013 04:13 PM
Super Contributor
Posts: 253

Re: call missing

Posted in reply to robertrao

I couldn't explain why something was there without knowing who put it there and why Smiley Happy

I would guess it's because the merge was specifically written to have the desired effect, namely, to merge the right-hand dataset on only once even if there are duplicates of CODE.  Take away the call missing and the output, and you get it merged to all same-values of CODE, if that's your desire.

The reason it might be there is related to the way MERGE works.  MERGE identifies the first matching observation on the left with the observation on the right, and appends the right dataset's values.  Then it goes on through both datasets until it finds the next matching set of observations - it doesn't keep merging on the right-side dataset's values for the same value of CODE.  The reason the effect is (normally) to do so, is a function of how variables that come in on a MERGE/SET/UPDATE statement work.  They are automatically not set to missing (ie, RETAINed).  So the MERGE basically just outputs all of the left-side dataset rows, and they all keep the right-side dataset's values.  Nothing is set to missing until you pass a by group boundary (where, if a right-hand observation is missing, missing values are appended). 

View solution in original post


All Replies
Super User
Posts: 5,431

Re: call missing

Posted in reply to robertrao

Thwe simplest way is to try running the data step without the output, and then without the call missing.

What you will see is that because of repeated values of Code in temp2, the input from want1 will stay in the PDV, and merged into the repeated rows of temp2.

Data never sleeps
Super Contributor
Posts: 1,041

Re: call missing

Hi,

Thanks for the reply..

I dint undertand it well enough

Regards

Solution
‎11-12-2013 04:13 PM
Super Contributor
Posts: 253

Re: call missing

Posted in reply to robertrao

I couldn't explain why something was there without knowing who put it there and why Smiley Happy

I would guess it's because the merge was specifically written to have the desired effect, namely, to merge the right-hand dataset on only once even if there are duplicates of CODE.  Take away the call missing and the output, and you get it merged to all same-values of CODE, if that's your desire.

The reason it might be there is related to the way MERGE works.  MERGE identifies the first matching observation on the left with the observation on the right, and appends the right dataset's values.  Then it goes on through both datasets until it finds the next matching set of observations - it doesn't keep merging on the right-side dataset's values for the same value of CODE.  The reason the effect is (normally) to do so, is a function of how variables that come in on a MERGE/SET/UPDATE statement work.  They are automatically not set to missing (ie, RETAINed).  So the MERGE basically just outputs all of the left-side dataset rows, and they all keep the right-side dataset's values.  Nothing is set to missing until you pass a by group boundary (where, if a right-hand observation is missing, missing values are appended). 

Super Contributor
Posts: 1,041

Re: call missing

Posted in reply to snoopy369

My desire is to have it the way I had it in WANT2

But i dint understand the code??

Thanks

Super Contributor
Posts: 1,041

Re: call missing

Posted in reply to snoopy369

Hi,

My question is also the same:

Which lines of code is merging the right-hand dataset on only once even if there are duplicates of CODE on the left????

Thanks

Super Contributor
Posts: 253

Re: call missing

Posted in reply to robertrao

That was the second half of my answer.  In brief, the CALL MISSING is setting the values back to missing, and since MERGE doesn't actually read in the values from the right-hand dataset again, it won't retain the values if they're set to missing.  Read a detailed explanation of the PDV and merge statement if you want to understand better, or turn on the data step debugger and run it both ways.

Super Contributor
Posts: 1,041

Re: call missing

Posted in reply to snoopy369

OK.

If i understand the merge correctly

in a many to one merge the right side values are RETAINEd for all the records on the left by the variable we are merging with????

Thanks

Super Contributor
Posts: 253

Re: call missing

Posted in reply to robertrao

That's the gist of it, yes.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 496 views
  • 3 likes
  • 3 in conversation