DATA Step, Macro, Functions and more

selecting a range of columns in sas

Reply
Occasional Contributor
Posts: 8

selecting a range of columns in sas

Hello everyone, I have a data set with about 900 columns in it. I want to export this data set into access, since access cannot handle more than 255 variables, I want to split the data set into four data sets with less than 255 columns. I was wondering is there a way to select a range of columns in sas? Thanks!

Super User
Posts: 17,823

Re: selecting a range of columns in sas

Please don't post the same question in multiple forums.

You can look at the Dictionary tables, the sashelp.vcolumn for your dataset and pick up the variables required. Here's a sample, untested.

proc sql;
select nameinto :var_list1 separated by " "
where libname="MYLIB" and memname='MYDATA'
and varnum between 1 and 255;
quit;

data want1;
set have1 (keep = &var_list1);
run;
Occasional Contributor
Posts: 8

Re: selecting a range of columns in sas

Sorry about that, thank you
Contributor
Posts: 45

Re: selecting a range of columns in sas

Sure, you can use SAS variable lists to refer to a list of variable names, as long as they are similarly named, e.g. var1:var255

 

https://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a000695105.htm

 

Occasional Contributor
Posts: 8

Re: selecting a range of columns in sas

Thank you Smiley Happy
Super User
Posts: 10,500

Re: selecting a range of columns in sas

You can use variable lists. If the variables have names that are related and start with the same string of characters then use: CommonText: and would select all of the variables that start with that text. You can also use a - to indicate a range of variables suchas Avarname - ThatVarname. Use the list(s) in a keep statement. Recommend that you have any identification variables in each set.

 

You may need to experiment with the different form of lists to see how many you get.

Example: Suppose a data set has variables named VarName1 to VarName100 such as 100 similar measurements and another 20 variables whose names start with Col the dataset want below would have 45 variables, the 25 sepecified with VarName and all 20 of the Col.

 

 

data want;

   set have (keep= VarName1 - VarName25 Col: )

Occasional Contributor
Posts: 8

Re: selecting a range of columns in sas

Thank you
Trusted Advisor
Posts: 1,115

Re: selecting a range of columns in sas

[ Edited ]

Another approach (similar to Reeza's) is to use ranges of variables according to their varnum (i.e. their internal variable number). As a preparation you would run a PROC CONTENTS on your dataset with the VARNUM option:

 

proc contents data=yourdata varnum;
run;

In section "Variables in Creation Order" of the resulting output you can see (in column with header "#") which variable is no. 1, which one is no. 254, which one is no. 255, no. 508 (=2*254), no. 509, no. 762 (=3*254), no. 763 and which one is last (say, no. 900).

 

Then you can easily form the appropriate ranges in KEEP= options of data steps like the following:

data part1;
set yourdata(keep=idvar numberone--twofivefour);
run;

data part2;
set yourdata(keep=idvar twofivefive--fivezeroeight);
run;

/* and so on */

Of course, the funny names numberone, twofivefour etc. need to be replaced by the corresponding variable names (of no. 1, no. 254 etc.). Dataset PART1 will then contain the first 254 variables, PART2 the second block of 254, PART3 (not shown above) the third and PART4 the remaining ones (about 900-762=138), without any requirements on how your variables are actually named or what their types are. The disadvantage of this approach is that the variables will be grouped just as they happen to be arranged in your dataset, disregarding contents.

 

I strongly support ballardw's recommendation of keeping an additional (e.g. 255th if that is possible) variable in each of the datasets PART1 - PART4 as a common record identifier (named IDVAR in the example code above). Depending on your data, you may even need more than one record-identifying variable (in this case reduce the number of variables in the ranges accordingly to less than 254).

Super User
Super User
Posts: 7,401

Re: selecting a range of columns in sas

Well, you have some good answers above.  I would like to take another vien however.  The question I would ask is Why do you have 900 columns?  If you move this data into an RDBMs strcuture you will find your work is multiplied endlessly.  RDBMs are strcutured to have long tables, with few variables, where duplication is eraditcated.  I would start by first learning this.  Next, once you understand Relational DataBase Model strcucture, then examine your data.  Look at normalising the data, ie. take a wide dataset, and make log dataset, for example:

HAVE

ID  VAR1  VAR2   VAR3 ...

1    abc     def       efg

 

WANT

ID    VARNUM    RESULT

1      1                 abc

1      2                 def

1      3                 efg

...

 

Once you have normalised your data, then start to identify where duplication appears and split this up into different tables:

HAVE

ID    COMPANY    COMPANY_ADDRESS VARNUM   RESULT

1      1                   abc road                        1               abc

1      1                   abc road                         2              def

1      1                   abc road                         3              efg

...

 

WANT1

ID   COMPANY_ID  VARNUM   RESULT

1     1                       1                abc

1     1                       2                def

1     1                       3                efg

...

WANT2

COMPANY_ID  COMPANY_ADDR

1                       abc road

 

What you are doing is changing technologies, so you would want to change the way you think about your data.  In the above, the data is smaller, and can easily be merged, but also specifying columns and such like is far easier which will help you write SQL.  Also, as a word of recomendation, if you need a database, I would look at one of the free databases rather than Access.  Whilst Access is simple to start with a proper database will serve you better in the long run.  Depending on what you need there is SQLlite for embedded databases, MySQL etc.

Occasional Contributor
Posts: 8

Re: selecting a range of columns in sas

Thank you, that was really helpful
Ask a Question
Discussion stats
  • 9 replies
  • 526 views
  • 6 likes
  • 6 in conversation