Hi,

I encountered a really scary behavior when combining two datasets in one SET statement while the data step includes an IF statement that uses a variable not present in one of the datasets. Here is an example case with dummy data:

    data one;
      var1=1;
      var2=.;
    run;

    data two;
      var1=1; output;
      var1=2; output;
    run;

    data three;
      set one two;
      if var1 ne . and var2 = . then var2=var1;
    run;

As you can see, in the final dataset three the value of var2 is retained across the rows coming from dataset two: both of those rows end up with var2=1, even though the second one has var1=2.

I am not a PDV expert, but my assumption is that SAS reads the data into the PDV and populates the values. Then it realizes that it needs to populate the missing variable for this dataset, since it is used in the later IF statement. It does this by creating an 'internal' RETAIN statement for var2. So for the first row of dataset two the PDV would look like this:

    var1=1 var2(retain)=.

Then the IF statement is processed and the condition changes the value of var2 to 1, so the PDV looks like this:

    var1=1 var2(retain)=1

In the next iteration the condition is never triggered, because var2 is no longer missing due to the retain. This leads to a situation where all following var2 values are populated as 1.

I tried this with real data, and also tested my hypothesis by setting var1=. in the first row of dataset two. The hypothesis indeed seems to hold: the first row then gets a missing var2, the second row gets var2 populated from var1, and that value is retained until the end.

To me this seems like an extremely critical bug, and it does not write any note to the log either. So saying that this is simply the way PDV processing is designed to operate is not enough.

I am running SAS EG 7.13 with SAS 9.4.5 model version 16.01
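For comparison, here is a minimal sketch of one way to get the row-by-row result I expected. It assumes the retention comes from the documented DATA step rule that variables read with a SET statement are not reset to missing at each iteration (they are only reset when SET starts reading a new dataset), rather than from anything specific to the IF statement. The flag name from_two is just an illustrative name for the IN= variable, not something from my real code.

```sas
data one;
  var1=1;
  var2=.;
run;

data two;
  var1=1; output;
  var1=2; output;
run;

data three;
  /* IN= creates a temporary 0/1 flag; from_two is a hypothetical name */
  set one two(in=from_two);

  /* var2 does not exist in dataset two, so nothing overwrites it there. */
  /* Reset it explicitly on every row that comes from two.               */
  if from_two then call missing(var2);

  if var1 ne . and var2 = . then var2=var1;
run;

proc print data=three;
run;
```

With the explicit reset, the condition is evaluated against a missing var2 on every row from dataset two, so var2 follows var1 on each of those rows instead of keeping the value assigned on the first one. The IN= flag is temporary and is not written to dataset three.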