Hello SAS community,
I am new to SAS programming. I have a data set with, say, 20 variables. Suppose that I want to set up a one-to-one correspondence between these variables and variables called x_1, x_2, ..., x_20. I have more of a mathematical background and find it much easier to perform calculations and other procedures when the ith variable is referred to as x_i rather than by its name, which in my view streamlines subsequent programming. From what I understand, these data sets are essentially matrices. In something like MATLAB one can specify a column (variable) simply as x(:,j), which is where my intuition comes from, but I am struggling to find a similar analogy in SAS.
Is there a straightforward way to map k non-indexed variables {var,bar,car,....,zar} to {x1,x2,...,xk} and back again?
Another question: is there a function that I can apply to a data set and that returns the number of variables?
Thank you for taking the time to read this.
Name range lists (the double-dash syntax and its -numeric- and -character- variants) might help you make arrays fully dynamic.
Consider this:
data NEW;
  FIRST_VAR=1;
  set SASHELP.CLASS(obs=1);
  LAST_VAR=1;
  array NUM_VARS[*] FIRST_VAR - numeric - LAST_VAR;
  array CHR_VARS[*] FIRST_VAR - character - LAST_VAR;
  putlog '***NUM VARS***';
  do I=2 to dim(NUM_VARS)-1; /* ignore first and last num as they are not in the data set */
    putlog NUM_VARS[I]=;
  end;
  putlog '***CHAR VARS***';
  do I=1 to dim(CHR_VARS);
    putlog CHR_VARS[I]=;
  end;
run;
***NUM VARS***
Age=14
Height=69
Weight=112.5
***CHAR VARS***
Name=Alfred
Sex=M
Note how the array only picks up variables that are already in the program data vector (PDV) when the array is defined. Variable I is created later, so it is not in the array.
It sounds like arrays will help you a lot.
data _null_;
array x{*} var bar car zar;
run;
In the code above, x[1]=var, x[2]=bar and so on. Here, I assume that var, bar, car and zar are all variables of the same type, i.e. either all character or all numeric.
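If you would rather not type the names at all, a minimal sketch along the same lines uses the special list _numeric_, which stands for every numeric variable known at that point in the step. The data set SASHELP.CLASS and the helper variables j and varname are only there for illustration.

```sas
data _null_;
   set sashelp.class(obs=1);
   length varname $32;
   array x{*} _numeric_;        /* all numeric variables read so far; j is created later */
   do j = 1 to dim(x);
      varname = vname(x[j]);    /* original name behind the j-th element */
      putlog j= varname= x[j]=;
   end;
run;
```

This gives you the x[j] style of indexing while the original names stay recoverable through vname(). For character variables the analogous list is _character_.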
Regarding your last question:
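data _null_;
dsid=open("sashelp.class"); /* open the data set and keep its identifier */
nvars=attrn(dsid,"NVARS"); /* number of variables */
rc=close(dsid); /* release the data set again */
put nvars=;
run;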
You need to explain what you are trying to do so that we can see whether there are easier ways to do it.
For example, there might not be any need to create meaningful names for the variables if you don't need them. Just name the variables var1, var2, var3, etc.
Or you might have made a wide-format data set, say with one measure per week (week1, week2, week3, ...), when really your data is more naturally modeled as two variables (Week, Value) and multiple observations.
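To illustrate the wide versus long point, here is a rough sketch with a made-up data set WORK.WIDE (the names id, week1-week3, WeekVar, Week and Value are all assumptions for the example). PROC TRANSPOSE turns the week columns into rows, and a short data step extracts the week number from the former column name.

```sas
/* Hypothetical wide data: one row per subject, one column per week */
data work.wide;
   input id week1 week2 week3;
   datalines;
1 10 12 15
2 8 9 11
;
run;

/* One row per subject and week; the old column name lands in WeekVar */
proc transpose data=work.wide
               out=work.long(rename=(col1=Value))
               name=WeekVar;
   by id;
   var week1-week3;
run;

data work.long;
   set work.long;
   Week = input(substr(WeekVar, 5), 8.);   /* "week2" becomes 2 */
   drop WeekVar;
run;
```

In the long layout the week is data rather than part of a variable name, so BY-group processing can replace most array logic.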
My apologies for not making my problem clearer. Suppose that I would like to perform logistic regression on some variables in a data set. Ideally I would like to avoid the brute-force method of typing each variable individually. I am following the method described in the following blog post: https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html. The variables which I am considering do not have the convenient names x1,x2,..,xk as in the blog post; they have names like 'aar', 'bar', 'car', etc. Suppose, for example, that I would like to drop the first 5 variables without writing a drop statement that identifies them by name. Instead, I would prefer to write something in the spirit of "drop x1-x5", since the position of the variable in the list is what is of interest to me. Moreover, I would also prefer not to type the names of the variables when defining the array. I would like to take the variables in the data set and feed them into an array, or even into a copy of the original data set with the variables renamed to x1 to xk, where k is the number of variables.
I would suggest renaming your variables to use the technique in the blog. Either assign a label with the original variable name or use a meaningful label to begin with.
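A minimal sketch of that renaming, assuming a working copy WORK.CLASS_X of SASHELP.CLASS (the data set name and the macro variables rename_list and label_list are made up for illustration). It reads the variable names and positions from DICTIONARY.COLUMNS, stores each original name as the label, and then renames by position.

```sas
/* Work on a copy so the original data set is untouched */
data work.class_x;
   set sashelp.class;
run;

/* Build "oldname=xN" pairs and "oldname="oldname"" label pairs,
   ordered by the position of each variable (VARNUM) */
proc sql noprint;
   select catx('=', name, cats('x', varnum)),
          catx('=', name, quote(trim(name)))
      into :rename_list separated by ' ',
           :label_list  separated by ' '
   from dictionary.columns
   where libname = 'WORK' and memname = 'CLASS_X'
   order by varnum;
quit;

proc datasets library=work nolist;
   modify class_x;
      label &label_list;      /* keep the original name as the label */
   run;
   modify class_x;
      rename &rename_list;    /* Name=x1 Sex=x2 ... */
   run;
quit;
```

After this, numbered range lists such as x1-x3 work in DROP, KEEP and VAR statements, and the labels let you map each x back to its original name. Swapping the two sides of the pairs in rename_list would give you the reverse mapping.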
These are likely helpful for what you're trying to do and will provide some other options:
https://blogs.sas.com/content/sastraining/2017/08/29/sas-variable-lists-by-pattern/
https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html
https://blogs.sas.com/content/iml/2016/01/18/create-macro-list-values.html
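As a rough sketch of the macro-list idea from the last link, applied to selecting variables by position: the query below collects the names of the first two variables of SASHELP.CLASS into a macro variable, which can then be used in a DROP= option. The macro variable first_vars, the cutoff of 2 and the output data set name are assumptions for the example.

```sas
/* Names of the first two variables, by position in the data set */
proc sql noprint;
   select name into :first_vars separated by ' '
   from dictionary.columns
   where libname = 'SASHELP' and memname = 'CLASS' and varnum <= 2
   order by varnum;
quit;

%put first_vars= &first_vars;

/* Drop them without typing their names */
data work.class_trimmed;
   set sashelp.class(drop=&first_vars);
run;
```

Note that the LIBNAME and MEMNAME values in DICTIONARY.COLUMNS are stored in uppercase, so the WHERE clause has to match that.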
You should learn the SAS/IML language; it is very similar to MATLAB.
Reading @Rick_SAS's blog would also help you a lot.
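To give a flavor of it, here is a small sketch, assuming SAS/IML is licensed at your site. It reads all numeric variables of SASHELP.CLASS into a matrix and picks out one column, much like x(:,j) in MATLAB. The names X, varNames, k, col2 and secondName are just illustrative.

```sas
proc iml;
   use sashelp.class;
   read all var _NUM_ into X[colname=varNames];   /* numeric variables only */
   close sashelp.class;

   k = ncol(X);                  /* number of numeric variables */
   col2 = X[, 2];                /* second column, like x(:,2) in MATLAB */
   secondName = varNames[2];     /* original name of that column */
   print k secondName;
quit;
```

The vector varNames keeps the original variable names, so you can go back and forth between column index and name.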
For your last question:
%let dsid=%sysfunc(open(sashelp.class));
%let nvar=%sysfunc(attrn(&dsid,nvars));
%let dsid=%sysfunc(close(&dsid));
%put nvar= &nvar;