Hi all,
The title may not be clear, so let me explain. I want to tell SAS to pick all observations whose value starts with 6, something like:
proc sort data=have;
where var=6***;
Apparently, this *** wildcard does not work with numbers. Does anyone have a better solution?
I know I can first calculate the first digit and then work with that, but I thought maybe there is a shorter way, so that I don't need to convert numbers to characters.
Thanks a lot!
Here is a timing comparison for three methods:
NOTE: AUTOEXEC processing completed.
1 data have;
2 do i = 1 to 1E6;
3 number = 10000 * rand("UNIFORM");
4 output;
5 end;
6 keep number;
7 run;
NOTE: The data set WORK.HAVE has 1000000 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.20 seconds
cpu time 0.20 seconds
8
9 option fullstimer;
10
11 data _null_;
12 set have end=done;
13 n = intz(number);
14 do while (n >= 10);
15 n = intz(n/10);
16 end;
17 if n = 6 then m + 1;
18 if done then put m;
19 run;
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.55 seconds
user cpu time 0.54 seconds
system cpu time 0.01 seconds
memory 405.50k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:36 PM
Step Count 2 Switch Count 0
20
21 data _null_;
22 set have end=done;
23 if int(divide(number,10**int(log10(number)))) = 6 then m + 1;
24 if done then put m;
25 run;
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.59 seconds
user cpu time 0.57 seconds
system cpu time 0.01 seconds
memory 404.50k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:37 PM
Step Count 3 Switch Count 0
26
27 data _null_;
28 set have end=done;
29 if SUBSTRN(number,1,1) = 6 then m + 1;
30 if done then put m;
31 run;
NOTE: Character values have been converted to numeric
values at the places given by: (Line):(Column).
29:4
111182
NOTE: There were 1000000 observations read from the data set WORK.HAVE.
NOTE: DATA statement used (Total process time):
real time 0.87 seconds
user cpu time 0.87 seconds
system cpu time 0.00 seconds
memory 403.00k
OS Memory 7808.00k
Timestamp 2014-10-18 11:44:38 PM
Step Count 4 Switch Count 0
Here the numbers range from 0 to 10000. The advantage of the first method (the divide-by-10 loop) vanishes for larger numbers, because the loop runs once more for every extra digit.
PG
Will your numbers have a specific range, i.e. all in the 6000s, or can they be 600, 6000, 60000?
Otherwise I think you're stuck converting to character.
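If the values do fall in one known range, the selection can be written as a plain numeric comparison with no conversion at all. A minimal sketch, assuming a data set have with a numeric variable numbers in which every value has exactly four integer digits (that range, and the out=want option, are assumptions for illustration):

```sas
proc sort data=have out=want;
  /* Assumption: all values have four integer digits, so "starts with 6" */
  /* means the value lies in the interval [6000, 7000)                   */
  where numbers >= 6000 and numbers < 7000;
  by numbers;
run;
```

This only works when the number of digits is fixed. For mixed lengths such as 600, 6000 and 60000, one range per length would be needed.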
Try something like this.
data have;
input numbers;
datalines;
654
456
657
676
453
;
data want;
set have;
if SUBSTRN(numbers,1,1)='6';
run;
proc sort data=want;
by numbers;
run;
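The same character test can also go directly into a WHERE statement on PROC SORT, which is closer to the wildcard syntax in the original question. A minimal sketch, assuming the data set have and the numeric variable numbers from the example above; the explicit PUT with the BEST32. format and the out=want option are my choices, not part of the posted code:

```sas
proc sort data=have out=want;
  /* PUT converts the number to text; LEFT moves the leading blanks to */
  /* the end, so LIKE '6%' matches any value whose first character is 6 */
  where left(put(numbers, best32.)) like '6%';
  by numbers;
run;
```

With the five sample values this should keep 654, 657 and 676, already sorted, without a separate DATA step.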
You don't need to convert it into character.
data have;
input numbers;
datalines;
654
456
657
676
453
;
run;
data want;
set have;
if int(divide(numbers,10**int(log10(numbers)))) = 6;
run;
Xia Keshan
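One caveat with the arithmetic approach is that LOG10 is not defined for zero, negative, or missing values. A minimal sketch of a guarded version, assuming the same have data set and numbers variable; the extra subsetting IF is an addition for illustration, not part of the posted code:

```sas
data want;
  set have;
  /* Skip zero, negative, and missing values before calling LOG10 */
  if numbers > 0;
  /* Scale the value down to its leading digit and compare with 6 */
  if int(numbers / 10**int(log10(numbers))) = 6;
run;
```

Because INT truncates toward zero, values between 0 and 1 (for example 0.65) yield 0 here and are not selected, which matches what the character-based test does with a value printed as 0.65.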