Hi all,
The title may not be clear. I want to tell SAS to pick all observations whose numeric value starts with 6, something like:
proc sort data=have;
where var=6***;
Apparently, this *** wildcard does not work with numbers. Does anyone have a better solution?
I know I can first calculate the first digit and then work with that, but I thought maybe there is a shorter way, so I don't need to convert numbers to chars.
Thanks a lot!
Here is a timing comparison for three methods:
NOTE: AUTOEXEC processing completed.
1 data have;
2 do i = 1 to 1E6;
3 number = 10000 * rand("UNIFORM");
4 output;
5 end;
6 keep number;
7 run;
NOTE: The data set WORK.HAVE has 1000000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.20 seconds
cpu time 0.20 seconds
8
9 option fullstimer;
10
11 data _null_;
12 set have end=done;
13 n = intz(number);
14 do while (n >= 10);
15 n = intz(n/10);
16 end;
17 if n = 6 then m + 1;
18 if done then put m;
19 run;
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.55 seconds
user cpu time 0.54 seconds
system cpu time 0.01 seconds
memory 405.50k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:36 PM
Step Count 2 Switch Count 0
20
21 data _null_;
22 set have end=done;
23 if int(divide(number,10**int(log10(number)))) = 6 then m + 1;
24 if done then put m;
25 run;
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.59 seconds
user cpu time 0.57 seconds
system cpu time 0.01 seconds
memory 404.50k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:37 PM
Step Count 3 Switch Count 0
26
27 data _null_;
28 set have end=done;
29 if SUBSTRN(number,1,1) = 6 then m + 1;
30 if done then put m;
31 run;
NOTE: Character values have been converted to numeric
values at the places given by: (Line):(Column).
29:4
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.87 seconds
user cpu time 0.87 seconds
system cpu time 0.00 seconds
memory 403.00k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:38 PM
Step Count 4 Switch Count 0
Here the numbers range from 0 to 10000. The advantage of the first method vanishes for larger numbers, because its loop needs one more iteration for each additional digit.
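To check the remark about larger numbers on your own system, the first two steps can be rerun on a wider range. This is only a sketch: the macro variable `scale`, the data set name `have_big`, and the seed are assumptions made for illustration, and no timings are claimed here.

```sas
%let scale = 1E12;  /* hypothetical upper bound, not from the log above */

data have_big;
  call streaminit(12345);  /* arbitrary seed, only for repeatability */
  do i = 1 to 1E6;
    number = &scale * rand("UNIFORM");
    output;
  end;
  keep number;
run;

options fullstimer;

/* Method 1: strip trailing digits until one digit is left */
data _null_;
  set have_big end=done;
  n = intz(number);
  do while (n >= 10);
    n = intz(n/10);
  end;
  if n = 6 then m + 1;
  if done then put m=;
run;

/* Method 2: scale by the power of ten given by LOG10 */
data _null_;
  set have_big end=done;
  if int(divide(number, 10**int(log10(number)))) = 6 then m + 1;
  if done then put m=;
run;
```

Comparing the "real time" lines of the two steps in the resulting log shows how the gap changes as `scale` grows.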
PG
Will your numbers have a specific range, i.e., all in the 6000s, or can they be 600, 6000, 60000?
Otherwise I think you're stuck converting to character.
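If the range is known, a plain numeric comparison is enough. A minimal sketch, assuming every value of `var` lies between 1000 and 9999 (the output name `want` and the BY variable are also assumptions, since the original PROC SORT showed neither):

```sas
/* Assumption: all values of var are four-digit numbers (1000 to 9999) */
proc sort data=have out=want;
  where 6000 <= var < 7000;  /* keeps exactly the values whose first digit is 6 */
  by var;
run;
```

This does not generalize to mixed lengths such as 600, 6000 and 60000, where one range per length would be needed.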
Try something like this.
data have;
input numbers;
datalines;
654
456
657
676
453
;
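data want;
set have;
/* SUBSTRN accepts a numeric argument; comparing its character result with 6 */
/* makes SAS write a character-to-numeric conversion note to the log */
if SUBSTRN(numbers,1,1)=6;
run;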
proc sort data=want;
by numbers;
run;
You don't need to convert it into character.
data have;
input numbers;
datalines;
654
456
657
676
453
;
run;
data want;
set have;
if int(divide(numbers,10**int(log10(numbers)))) = 6;
run;
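The same expression should also work in a WHERE statement, which is closer to the PROC SORT in the original question. A sketch, assuming all values of `numbers` are at least 1 (LOG10 is undefined for zero and negative values, and values below 1 give a first digit of 0 here):

```sas
/* Assumption: numbers >= 1 for every observation */
proc sort data=have out=want;
  where int(numbers / 10**int(log10(numbers))) = 6;
  by numbers;
run;
```

With the five sample values above, this keeps 654, 657 and 676.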
Xia Keshan