Hi,

data wide;
  set A;
  set B;
run;

data wide;
  merge A B;
run;

These two steps are not the same. With the data you provided they happen to produce the same output, but they behave differently. I tried them with different data, and the results are below.

data a;
  do i=1 to 12;
    output;
  end;
run;

data b;
  do j=2 to 20 by 2;
    output;
  end;
run;

data long;
  set a b;
run;

data wide;
  set a;
  set b;
run;

data wide_;
  merge a b;
run;

Please find the log details below:

25   data a;
26     do i=1 to 12;
27       output;
28     end;
29   run;

NOTE: The data set WORK.A has 12 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.03 seconds

30
31   data b;
32     do j=2 to 20 by 2;
33       output;
34     end;
35   run;

NOTE: The data set WORK.B has 10 observations and 1 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

36
37   data long;
38     set a b;
39   run;

NOTE: There were 12 observations read from the data set WORK.A.
NOTE: There were 10 observations read from the data set WORK.B.
NOTE: The data set WORK.LONG has 22 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

40
41   data wide;
42     set a;
43     set b;
44   run;

NOTE: There were 11 observations read from the data set WORK.A.
NOTE: There were 10 observations read from the data set WORK.B.
NOTE: The data set WORK.WIDE has 10 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.04 seconds
      cpu time            0.01 seconds

45
46   data wide_;
47     merge a b;
48   run;

NOTE: There were 12 observations read from the data set WORK.A.
NOTE: There were 10 observations read from the data set WORK.B.
NOTE: The data set WORK.WIDE_ has 12 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.03 seconds

If you run the code above, you will find that dataset A has 12 observations and dataset B has 10 observations.
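The log for the step with two SET statements reports only 11 observations read from A. A minimal sketch for tracing where that step stops is shown here. The PUT statements and the dataset name wide_trace are added purely for illustration and are not part of the original code.

```sas
/* Recreate the two test datasets used in this post */
data a;
  do i = 1 to 12; output; end;
run;

data b;
  do j = 2 to 20 by 2; output; end;
run;

/* wide_trace is a hypothetical name; the PUT lines only write to the log */
data wide_trace;
  set a;                          /* read the next observation from A */
  put 'after SET A: ' _n_= i=;
  set b;                          /* read the next observation from B */
  put 'after SET B: ' _n_= j=;
run;
```

In the log, iterations 1 to 10 should write both lines, while iteration 11 writes only the "after SET A" line. The step ends when set b; tries to read an 11th observation that does not exist, so that iteration is never output.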
With set A B; you get 22 observations, because the two datasets are concatenated vertically.

With set A; set B; you get a wide dataset with only 10 observations. There is another thing to notice: however many observations the first dataset has, SAS reads at most one more than the second dataset contains. See the log: dataset A has 12 observations, but only 11 were read, because B has only 10. This is not fixed at compile time. It happens at execution: a DATA step stops as soon as any SET statement tries to read past the end of its dataset. On the 11th iteration, set A; reads observation 11, then set B; finds nothing left to read, so the step ends before that iteration is written out. I read the article below on this:

http://www2.sas.com/proceedings/forum2008/167-2008.pdf

and found that set A; set B; overlays one observation on the other within the same iteration, which is why we get this output.

Coming to merge A B; the output has 12 observations. With no BY statement this is a one-to-one merge: the datasets are joined side by side, and the number of output observations equals the number in the larger dataset, so nothing is dropped. For observations 11 and 12, j is missing because B has no more rows.

So the two pieces of code really do behave differently, as shown above.

Hope this helps.

Thanks,
Jagadish