Hi all,
I’m trying to figure out how to create the output below:
Sample data:
A 1 2 4 7
B 0 2 3 8
C 4 2 6 3
D 7 4 5 2
E 1 9 6 3
Number of observations that have a 4 or a 7? (3)
What I’m NOT looking for is the frequency of a 4 or 7 (5)
Bonus:
How to produce a single output that will provide data on the above case, but for multple criteria at a time, in one function. E.g. -
Number of observations that have a 4 or a 7? (3)
Number of observations that have a 2? (4)
Number of observations that have a 3 or 8? (3)
Etc.
Much thanks!!
Create a array with boolean (0 or 1) values for the possible occurences, in your case
array has{10} has0-has9;
Run a do loop through your numeric variables (define another array for all numeric variables for that) and set your boolean values accordingly.
After that, you can use freq or tabulate on your hasX variables, as needed.
You could get a much better answer if you spell out what the conditions are. For example:
If you are starting with a SAS data set, what are the variable names?
Should the result be a SAS data set with just one observation in it? A report? What exactly are you hoping to get?
The programming isn't hard. But don't ask us to imagine what you are thinking the result would be.
If it's only every one or two values you're checking for, look at WHICHN function.
data Sample_data;
input ID$ var1-var3;
datalines;
A 1 2 4 7
B 0 2 3 8
C 4 2 6 3
D 7 4 5 2
E 1 9 6 3
;
data _null_;
set Sample_data end=eof;
array AnArray{*} var1-var3;
do i=1 to dim(AnArray);
if AnArray[i] in (4,7) then do;
c+1;
return;
end;
end;
if eof then put "Number of observations that have a 4 or a 7: " c;
run;
Convert your set of variables to a string, and then use the indexc function on that string to search for characters represent each of your criteria:
data have;
input id $ var1-var4;
datalines;
A 1 2 4 7
B 0 2 3 8
C 4 2 6 3
D 7 4 5 2
E 1 9 6 3
run;
data _null_;
set have end=last_case;
array n {3} n_47 n_2 n_3;
array c {3} $2 _temporary_ ('47', '2', '3');
cat_vars=cats(of var1-var4);
do v=1 to dim(n);
n{v}+(indexc(cat_vars,trim(c{v}))>0);
end;
if last_case then put (N_:) (=);
run;
Of course, INDEXC works only when your search criteria are collections of single characters.
Edited addition: I started thinking about search for collections of multi-character values, using the same general strategy. Here's an example using the FINDW function. Instead of looking for (a) 4 or 7, (b) 2, or c(3) 3, it looks for 2-digit numbers 14 or 17, 12, or 13:
data have;
input id $ var1-var4;
datalines;
A 11 12 14 17
B 10 12 13 18
C 14 12 16 13
D 17 14 15 12
E 11 19 16 13
run;
data _null_;
set have end=last_case;
array n {3} n_1417 n_12 n_13;
array c {3} $5 _temporary_ ('14 17', '12', '13');
length cat_vars $20;
cat_vars=catx(' ',of var1-var4);
do I=1 to dim(n);
found=0;
do J=1 to countw(c{I}) until (found^=0);
found=findw(cat_vars,scan(c{I},J));
end;
n{I}+(found>0);
end;
if last_case then put (N_:) (=);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.