Either I've missed something simple here, or something very odd is going on.
I'm merging two files. The log below shows the contents of the two files then the results of two versions of the merge.
In the first version, I end up with most cases of the variable urban5k=9 even though the assignment that sets urban5k to 9 comes after the put statement and doesn't always meet the 'if' condition. In the second version, the results are as expected when the subsequent if statement is removed.
(The problem doesn't arise because there is only one value for the merge variable - this problem is present in my live data with many values).
1003 * Input files;
1004 data _null_;set temp1x;put _all_;run;
sla=705051004 extractdate=06JUN2008 _ERROR_=0 _N_=1
sla=705051004 extractdate=05SEP2008 _ERROR_=0 _N_=2
sla=705051004 extractdate=05DEC2008 _ERROR_=0 _N_=3
sla=705051004 extractdate=13MAR2009 _ERROR_=0 _N_=4
sla=705051004 extractdate=05JUN2009 _ERROR_=0 _N_=5
sla=705051004 extractdate=12MAR2010 _ERROR_=0 _N_=6
sla=705051004 extractdate=04JUN2010 _ERROR_=0 _N_=7
sla=705051004 extractdate=10SEP2010 _ERROR_=0 _N_=8
NOTE: There were 8 observations read from the data set WORK.TEMP1X.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1005 data _null_;set dist2x;put _all_;run;
urban5k=1 sla=705051004 _ERROR_=0 _N_=1
NOTE: There were 1 observations read from the data set WORK.DIST2X.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1006 * Code 1 (urban5k should equal 1 at the put statement);
1007 data temp2x;
1008 merge temp1x (in=intemp1) dist2x (in=indist2 );
1009 by sla;
1010 if intemp1;
1011 put extractdate= sla= urban5k= indist2=;
1012 if extractdate<'01Dec2009'd then urban5k=9;
1013 run;
extractdate=06JUN2008 sla=705051004 urban5k=1 indist2=1
extractdate=05SEP2008 sla=705051004 urban5k=9 indist2=1
extractdate=05DEC2008 sla=705051004 urban5k=9 indist2=1
extractdate=13MAR2009 sla=705051004 urban5k=9 indist2=1
extractdate=05JUN2009 sla=705051004 urban5k=9 indist2=1
extractdate=12MAR2010 sla=705051004 urban5k=9 indist2=1
extractdate=04JUN2010 sla=705051004 urban5k=9 indist2=1
extractdate=10SEP2010 sla=705051004 urban5k=9 indist2=1
NOTE: There were 8 observations read from the data set WORK.TEMP1X.
NOTE: There were 1 observations read from the data set WORK.DIST2X.
NOTE: The data set WORK.TEMP2X has 8 observations and 3 variables.
NOTE: Compressing data set WORK.TEMP2X increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
1014 * Code 2 (last line removed - works as expected);
1015 data temp2x;
1016 merge temp1x (in=intemp1) dist2x (in=indist2 );
1017 by sla;
1018 if intemp1;
1019 put extractdate= sla= urban5k= indist2=;
1020 run;
extractdate=06JUN2008 sla=705051004 urban5k=1 indist2=1
extractdate=05SEP2008 sla=705051004 urban5k=1 indist2=1
extractdate=05DEC2008 sla=705051004 urban5k=1 indist2=1
extractdate=13MAR2009 sla=705051004 urban5k=1 indist2=1
extractdate=05JUN2009 sla=705051004 urban5k=1 indist2=1
extractdate=12MAR2010 sla=705051004 urban5k=1 indist2=1
extractdate=04JUN2010 sla=705051004 urban5k=1 indist2=1
extractdate=10SEP2010 sla=705051004 urban5k=1 indist2=1
NOTE: There were 8 observations read from the data set WORK.TEMP1X.
NOTE: There were 1 observations read from the data set WORK.DIST2X.
NOTE: The data set WORK.TEMP2X has 8 observations and 3 variables.
NOTE: Compressing data set WORK.TEMP2X increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
Should have tried searching the notes first. There is a note exactly on this topic
http://support.sas.com/kb/24/453.html
In short, I think we can summarise the message as
"When merging many-to-one don't change the variables from the 'one' file in the same datastep that inclues the merge"
Should have tried searching the notes first. There is a note exactly on this topic
http://support.sas.com/kb/24/453.html
In short, I think we can summarise the message as
"When merging many-to-one don't change the variables from the 'one' file in the same datastep that inclues the merge"
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.