I didn't plan on testing it, but since it's Friday, here you go. See how this runs in your environment.

    * create 2 identically valued datasets that only ;
    * differ by variable attributes;
    data dschar (keep=varchar) dsnum (keep=varnum);
      length varchar $1;
      * 2 million obs - values 0-9 uniformly dist;
      do i = 1 to 2e6;
        varnum = int(ranuni(5) * 10);
        varchar = put( varnum, 2.-l );
        output;
      end;
    run;

    options fullstimer;

    * the following data steps select ~ 40% of obs;
    data testchar;
      set dschar;
      where varchar in ('1','2','3','4');
    run;

    data testnum;
      set dsnum;
      where varnum in (1,2,3,4);
    run;

I ran the 2 data steps several times and got similar results. Part of the log is pasted below. This ran on my laptop with Windows XP on an Intel Core 2 Duo.

    118  data testchar;
    119    set dschar;
    120    where varchar in ('1','2','3','4');
    121  run;

    NOTE: There were 799284 observations read from the data set WORK.DSCHAR.
          WHERE varchar in ('1', '2', '3', '4');
    NOTE: The data set WORK.TESTCHAR has 799284 observations and 1 variables.
    NOTE: DATA statement used (Total process time):
          real time           0.39 seconds
          user cpu time       0.15 seconds
          system cpu time     0.04 seconds
          Memory              206k
          OS Memory           8304k
          Timestamp           5/11/2012 10:38:39 AM

    122
    123  data testnum;
    124    set dsnum;
    125    where varnum in (1,2,3,4);
    126  run;

    NOTE: There were 799284 observations read from the data set WORK.DSNUM.
          WHERE varnum in (1, 2, 3, 4);
    NOTE: The data set WORK.TESTNUM has 799284 observations and 1 variables.
    NOTE: DATA statement used (Total process time):
          real time           2.53 seconds
          user cpu time       0.21 seconds
          system cpu time     0.09 seconds
          Memory              206k
          OS Memory           8304k
          Timestamp           5/11/2012 10:38:42 AM
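One thing that may be worth checking before reading too much into the real time gap: the two datasets do not store the same number of bytes per observation. varchar is declared as $1, while varnum gets the default numeric length (8 bytes unless a LENGTH statement says otherwise), so dsnum should be several times larger on disk and I/O could account for part of the difference. A minimal sketch, assuming both datasets are still in WORK; the query is my addition and was not part of the timed run:

```sas
* list type and stored length of each variable in the two test datasets;
proc sql;
  select memname, name, type, length
    from dictionary.columns
    where libname = 'WORK'
      and memname in ('DSCHAR', 'DSNUM');
quit;
```

If the lengths differ as expected, a fairer comparison might give varnum a shorter length or widen varchar, then rerun the two steps in both orders.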