Hi all,
It has been a while. I am having issues trying to count the number of values across a row that are greater than or equal to 6. The issue arises when I put the variables into the array: it seems that SAS doesn't want to count the leading 0 in the variable name. Rather than renaming each variable by hand, I was wondering if there is an easy way to fix my problem. Alternatively, is there an easy way to rename all of the variables to something easier for the array to deal with? The code I am attempting to use is a simple array.
data Count;
set Employment2;
array a{*} E0011701-E001752;
values_gt_6=0;
do _n_=1 to dim(a);
if a{_n_}>5 then values_gt_6+1;
end;
run;
What this does is create new variables E011701 through E001752, in that order, with all values null. It also leaves values_gt_6 null.
Any help would be appreciated.
Why is the first name 8 characters long and the second only 7?
E0011701 E001752
That doesn't look like a set of variables with a numeric suffix to me.
SAS is interpreting that as a request for 9,950 variables, numbered from E001752 up to E011701:
95   data employment2 ;
96     length E0011701-E001752 8;
97   run;

NOTE: Variable E011701 is uninitialized.
NOTE: Variable E011700 is uninitialized.
...
NOTE: Variable E001753 is uninitialized.
NOTE: Variable E001752 is uninitialized.
NOTE: The data set WORK.EMPLOYMENT2 has 1 observations and 9950 variables.
You can use variables that start with a prefix:
array a E00: ;
Or select them by position in the dataset:
array a E0011701 -- E001752;
Or take just the numeric variables in that position range:
array a E0011701 -numeric- E001752;
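To make that concrete, here is a minimal sketch of the prefix-list version applied to the original counting step. The sample data is invented purely for illustration (the real Employment2 is not shown in this thread), and it assumes that every variable whose name starts with E00 is numeric and belongs in the count.

```sas
/* Hypothetical stand-in for Employment2; the values are made up */
data Employment2;
  input id E0011701 E0011702 E001752;
  datalines;
1 7 3 6
2 . . .
3 5 9 2
;
run;

data Count;
  set Employment2;
  /* Prefix list: every existing variable whose name starts with E00 */
  array a{*} E00: ;
  values_gt_6 = 0;
  do i = 1 to dim(a);
    /* A missing value sorts below any number, so it is never counted */
    if a{i} >= 6 then values_gt_6 + 1;
  end;
  drop i;
run;
/* With the sample rows above, values_gt_6 is 2, 0 and 1 */
```

Because the list only matches variables that already exist, the array statement no longer creates thousands of empty variables.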
@Tom A follow up question:
If all the values in the array are missing, will it output the value of "values_gt_6" as missing? If not, is there a way to do that within the same data step?
Probably, but it would be simpler to just test for that separately.
if 0=n(of A[*]) then values_gt_6=.;
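Put together, the whole step might look like the sketch below. The sample data is hypothetical, and the array again assumes that the E00 prefix picks up exactly the variables of interest.

```sas
/* Hypothetical sample data: the second row is entirely missing */
data Employment2;
  input E0011701 E0011702 E001752;
  datalines;
7 3 6
. . .
;
run;

data Count;
  set Employment2;
  array a{*} E00: ;
  /* N() returns the number of non-missing arguments */
  if n(of a{*}) = 0 then values_gt_6 = .;
  else do;
    values_gt_6 = 0;
    do i = 1 to dim(a);
      if a{i} >= 6 then values_gt_6 + 1;
    end;
  end;
  drop i;
run;
/* Row 1 gets values_gt_6 = 2; row 2 gets a missing value */
```

Without the N() test, the all-missing row would come out as 0, because the counter is set to 0 before the loop and nothing is ever added to it.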
First thing, what are the actual names of your existing variables?
Since you say "create new variable E011701 through E001752", that tells me those variables did not exist. (I suspect a typo: either that should be E0011701, or the typo is in your posted data step code.)
If you mention variable names on an Array statement that do not exist, then the Array statement will create the variables. It has been that way for a long time.
If your actual variables are adjacent in the data set, you may be able to use the -- (two dash) list creator. You place the leftmost variable (the one with the lowest variable number as reported by Proc Contents) before the -- and the last adjacent variable after it.
Example: If you have variables of the same type that are in positions 15 through 30 in your data set and the one at 15 is named ABC and the 30th is named PDQ then use: Array a{*} ABC -- PDQ; to have all of those variables in the array.
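A small sketch of how to check the positions first and then build the array. The data set names have and want, and the variables id, MID and extra, are hypothetical and exist only to show which variables fall inside the range.

```sas
/* Hypothetical data: ABC, MID and PDQ sit next to each other */
data have;
  input id ABC MID PDQ extra;
  datalines;
1 7 3 6 99
;
run;

/* VARNUM lists the variables in position order instead of alphabetically */
proc contents data=have varnum;
run;

data want;
  set have;
  /* Picks up ABC, MID and PDQ; id and extra lie outside the range */
  array a{*} ABC -- PDQ;
  n_in_array = dim(a);   /* 3 for this sample data */
run;
```

If character variables might sit between the two endpoints, the -numeric- form of the list keeps them out of the array.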