Jack_Sabbath
Calcite | Level 5

Hello SAS community,

 

I am new to SAS programming. I have a data set with, say, 20 variables, and I want to set up a one-to-one correspondence between these variables and variables called x_1, x_2, ..., x_20. I have more of a mathematical background and find it much easier to perform calculations and other procedures when the ith variable is referred to as x_i rather than by its name, which in my view streamlines subsequent programming. From what I understand, these data sets are essentially matrices. In something like MATLAB, one can specify a column (variable) simply as x(:,j). That is where my intuition comes from, but I am struggling to find a similar construct in SAS.

 

Is there a straightforward way to map k non-indexed variables {var,bar,car,....,zar} to {x1,x2,...,xk} and back again?

Another question: is there a function that returns the number of variables in a data set?

 

Thank you for taking the time to read this

Accepted Solution
ChrisNZ
Tourmaline | Level 20

The double dash syntax might help you to make arrays perfectly dynamic.

Consider this:

 

data NEW;
  FIRST_VAR=1;
  set SASHELP.CLASS(obs=1);
  LAST_VAR=1;
  array NUM_VARS[*] FIRST_VAR - numeric   - LAST_VAR;
  array CHR_VARS[*] FIRST_VAR - character - LAST_VAR;
  putlog '***NUM VARS***';
  do I=2 to dim(NUM_VARS)-1;  /* skip FIRST_VAR and LAST_VAR: they are markers, not variables from the source data set */
    putlog NUM_VARS[I]=;
  end;
  putlog '***CHAR VARS***';
  do I=1 to dim(CHR_VARS);
    putlog CHR_VARS[I]=;
  end;
run;

***NUM VARS***
Age=14
Height=69
Weight=112.5
***CHAR VARS***
Name=Alfred
Sex=M

  

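Note that an array picks up only the variables that are already in the PDV (program data vector) at the point where the ARRAY statement appears. That is why variable I, which is first used after the ARRAY statements, is not in the array.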


8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

It sounds like arrays will help you a lot.

data _null_;
array x{*} var bar car zar;
run;

 

In the code above, x[1] refers to var, x[2] to bar, and so on. Here, I assume that var, bar, car and zar are all of the same type, i.e. either all character or all numeric.
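To go from position back to name, the VNAME function can be applied to an array element. The following is only a sketch: it uses SASHELP.CLASS as a stand-in data set and the _NUMERIC_ keyword instead of typed names, and the variable names i and orig_name are made up for illustration.

```sas
data _null_;
  set sashelp.class(obs=1);
  /* x{1}, x{2}, ... are the numeric variables in data set order */
  array x{*} _numeric_;
  length orig_name $32;
  do i = 1 to dim(x);
    /* VNAME returns the name of the variable behind x{i} */
    orig_name = vname(x{i});
    put i= orig_name= x{i}=;
  end;
run;
```

Because i and orig_name are created after the ARRAY statement, they are not part of x.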

 

 

Regarding your last question:

 

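data _null_;
dsid=open("sashelp.class");   /* open the data set and keep its identifier */
nvars=attrn(dsid,"NVARS");    /* number of variables in the data set */
rc=close(dsid);               /* close it again so it is not left open */
put nvars;
run;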

 

Tom
Super User

You need to explain what you are trying to do to see if there are easier ways to do it. 

 

For example there might not be any need to create meaningful names for the variables if you don't need them.  Just name the variables var1, var2, var3, etc.

 

Or you might have built a wide data set of weekly measures (week1, week2, week3, ...) when your data is more naturally modeled as two variables (Week, Value) with multiple observations.
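As a rough illustration of that wide-to-long idea, here is a sketch using a made-up data set. The names wide, long, id and value and the sample numbers are assumptions, not anything from the original question.

```sas
/* Hypothetical wide data: one row per id, one column per week */
data wide;
  input id week1-week3;
  datalines;
1 10 12 11
2 20 19 23
;

/* One output row per (id, week) pair */
data long(keep=id week value);
  set wide;
  array w{*} week1-week3;
  do week = 1 to dim(w);
    value = w{week};
    output;
  end;
run;
```

In the long layout the week number is ordinary data, so selecting "weeks 1 to 5" becomes a WHERE condition instead of a variable list.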

Jack_Sabbath
Calcite | Level 5

My apologies for not making my problem clearer. Suppose that I would like to perform logistic regression on some variables in a data set. Ideally I would like to avoid the brute-force method of typing each variable individually. I am following the method described in this blog post: https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html. The variables I am considering do not have the convenient names x1,x2,..,xk as in the blog post; they have names like 'aar', 'bar', 'car', etc. Suppose, for example, that I would like to drop the first 5 variables without writing a drop statement that names them. Instead, I would prefer to write something in the spirit of "drop x1-x5", because it is the position of the variable in the list that interests me. I would also prefer not to type the variable names when defining the array. I would like to take the variables in the data set and feed them into an array, or even into a copy of the original data set with the variables renamed x1 to xk, where k is the number of variables.

ballardw
Super User


I would suggest renaming your variables to use the technique in the blog. Either assign a label with the original variable name or use a meaningful label to begin with.
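One possible way to do that renaming without typing any names is to generate the RENAME and LABEL pairs from DICTIONARY.COLUMNS. This is a sketch under assumptions: SASHELP.CLASS stands in for the real data, only numeric variables are kept, and the names work.class_x, rename_list and label_list are made up for illustration.

```sas
/* Work on a copy so the source data set is left untouched */
data work.class_x;
  set sashelp.class(keep=_numeric_);
run;

/* Build "oldname=x<position>" pairs and "oldname="oldname"" labels */
proc sql noprint;
  select catx('=', name, cats('x', varnum)),
         catx('=', name, quote(trim(name)))
    into :rename_list separated by ' ',
         :label_list  separated by ' '
    from dictionary.columns
    where libname = 'WORK' and memname = 'CLASS_X'
    order by varnum;
quit;

proc datasets library=work nolist;
  modify class_x;
    label &label_list;    /* store the original name as the label */
  run;
  modify class_x;
    rename &rename_list;  /* then rename to x1, x2, ... */
  run;
quit;
```

After this, a numbered range list such as x1-x3 can be used in a VAR, KEEP or DROP statement, and the labels still show which original variable each x_i came from.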

 

 


Reeza
Super User

 

These are likely helpful for what you're trying to do and will provide some other options:

 

https://blogs.sas.com/content/sastraining/2017/08/29/sas-variable-lists-by-pattern/

 

https://blogs.sas.com/content/sastraining/2011/11/18/jedi-sas-tricks-building-a-name-suffix-variable...

 

https://blogs.sas.com/content/iml/2018/05/29/6-easy-ways-to-specify-a-list-of-variables-in-sas.html

 

https://blogs.sas.com/content/iml/2016/01/18/create-macro-list-values.html

 



Reeza
Super User
SAS doesn't work that way, and if you try to work that way you'll run into some issues. If you really want to work that way, you can drop into PROC IML, which is matrix-oriented and similar to R or MATLAB in that respect.

SAS data steps, on the other hand, simply process a single row of data at a time. Think of it as loading a line, doing whatever you tell it to do with that line, then moving to the next line automatically and processing that one. A data step doesn't deal with an entire column at once as you might expect, and you can't use shortcut references to variables or rows in that manner either. That is why I'd suggest not going down that route.
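For comparison, here is a minimal sketch of the matrix style in PROC IML. It assumes SAS/IML is licensed and uses SASHELP.CLASS as a stand-in; the names X, varNames, j, colName and colMean are illustrative only.

```sas
proc iml;
  use sashelp.class;
  /* Read every numeric variable into matrix X, keeping the names */
  read all var _NUM_ into X[colname=varNames];
  close sashelp.class;

  k = ncol(X);             /* number of columns (variables) read */
  j = 2;
  colName = varNames[j];   /* original name of column j */
  colMean = mean(X[, j]);  /* X[, j] plays the role of x(:,j) in MATLAB */
  print k colName colMean;
quit;
```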
Ksharp
Super User

You should learn the SAS/IML language; it is very much like MATLAB.

Reading @Rick_SAS's blog would also help you a lot.

 

 

For your last question:

 

%let dsid=%sysfunc(open(sashelp.class));
%let nvar=%sysfunc(attrn(&dsid,nvars));   /* the ATTRN attribute name is NVARS */
%let rc=%sysfunc(close(&dsid));

%put nvar= &nvar;

 

 

 
