The SAS Output Delivery System and reporting techniques

KEEP function in data step

Hi,
I have a question regarding the KEEP function in the DATA step.
Let's say there are 10 variables in the dataset and 5 of them are called temp_a, temp_b, temp_c and so on, and they are the only variables that start with temp_. Now I want to create a subset dataset with only those 5 variables, and my usual way is:

Data ****;
set ********;
Keep temp_a temp_b temp_c temp_d temp_e;
run;

My question is: is there any faster way to code it? 5 variables is OK, but what if I need 200 variables from a dataset with 1000 variables? It would be quite cumbersome to do it that way.

thanks
ben

Re: KEEP function in data step

Posted in reply to deleted_user
I guess there are some options you might try to see which one works for you.

1) If your variables have a numeric suffix, then you might try using a single hyphen (a numbered range list) in your keep statement, e.g.:

data b; set a;
keep temp_1 - temp_200;
run;

2) If your variables don't have a numeric suffix but were created consecutively, then you might try using a double hyphen (a name range list), e.g.:

data c; set a;
keep temp_a -- temp_xad;
run;

You can check whether your variables were created consecutively by using proc contents with the varnum option:

proc contents data = a varnum;
run;

These 2 options are the easy ones. If these do not work, I have another - but more complex - option. I will save that one for later.

Re: KEEP function in data step

Posted in reply to deleted_user
Hi Maaldijk,
thank you very much. Unfortunately, my situation doesn't fall into either of those categories. Please tell me how to do it the complex way.

thanks
ben

Re: KEEP function in data step

Posted in reply to deleted_user
The following code might clarify Maaldijk's explanation and also show you another method to keep the variables you need.

The use of the double hyphen is discussed in the SAS Language Manual, which is available through your SAS Help. A double hyphen specifies that all variables positioned between the first and last specified in the Program Data Vector will be kept. That is why Maaldijk showed you the VarNum option in the contents procedure to show their sequence. It is hugely useful to understand because when datasets are merged together, it becomes possible to select large groups of variables contributed by one of the merged datasets. Browse the table PDVVARS after you have run the code I produced for you below.

Then look at the table AVARS and note that four columns with names beginning with A, but created out of sequence in the source table, are kept. I think this solution, using the colon modifier on the name as a wild card, will solve your problem. I also suggest you search the SAS Help files for "colon modifier" and explore the uses for this syntax.

Kind regards

David


Data TEST;
AA = 1;
AB = 1;
AC = 1;
DF = 1;
AD = 1;
Run;

Data AVARS;
Set TEST( Keep = A: );
Run;

Data PDVVARS;
Set TEST( Keep = AA -- AD);
Run;
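
Applied to the original question, a minimal sketch might look like the following. It assumes the source table is called A, as in Maaldijk's examples; the output name TEMPVARS is made up for illustration.

```sas
/* Keep every variable whose name starts with temp_,
   regardless of where it sits in the table. */
Data TEMPVARS;
  Set A( Keep = temp_: );
Run;
```

The colon acts on the name prefix only, so this relies on no unwanted variable also starting with temp_.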

Re: KEEP function in data step

Posted in reply to deleted_user
Hi,
thank you very much. The colon modifier helps me a lot, and I should spend some time reading the help manual.

thanks
ben

Re: KEEP function in data step

Posted in reply to deleted_user
I meant to clarify one other point as well. "Keep" is not a function. It is either a data set option (KEEP=) or a DATA step statement (KEEP).
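
To illustrate the difference, here is a minimal sketch that reuses the TEST table from the earlier post. The output table names OPT_IN, OPT_OUT and STMT are made up for this example.

```sas
/* KEEP= as a data set option on the input table:
   only the A-prefixed variables are read into the step. */
Data OPT_IN;
  Set TEST( Keep = A: );
Run;

/* KEEP= as a data set option on the output table:
   every variable is available during the step,
   but only the A-prefixed ones are written. */
Data OPT_OUT( Keep = A: );
  Set TEST;
Run;

/* KEEP as a statement: applies to the output of the step. */
Data STMT;
  Set TEST;
  Keep A: ;
Run;
```

All three should produce the same columns here; the input option is usually the one to prefer on wide tables, since the unwanted variables are never read.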

I'm going to be fussy about that because I want to comment that reading the Online help is wonderfully helpful, but because it is screen based it is often harder to do than flipping through the printed manual. While the structures largely mirror each other, I can find something in a manual faster than I can online. Part of that is because I used them for so long that the structure and hierarchy of the chapters are almost second nature. On that basis, looking up KEEP in the functions part of the reference will have you looking in the wrong place.

Incidentally, while the word search part of the online help is supposed to find these words quickly, I usually find the result set that is returned is too large and it is more of a hindrance. If I had any suggestion on how to do it otherwise, I'd tell the SAS Docs people, but I don't know. So I stick with my tried and trusted approach of viewing the online docs as a series of books and "virtually flick through" the chapters to get where I want. Now if I could only make bookmarking work for me!!!

Kind regards

David

Re: KEEP function in data step

Posted in reply to deleted_user
You can also use (keep=_:) or (drop=_:) (underscore colon) if the variables are named this way. I think this would also work for keeping/dropping all variables that start with any particular character, e.g. (drop=A:) would drop all variables starting with the letter A - not tested, but logically safe.
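
For anyone who wants to try it, a small sketch follows. The table HAVE and its variables are hypothetical and exist only to exercise the two prefixes.

```sas
/* Hypothetical table with two underscore-prefixed work variables. */
Data HAVE;
  _i = 1;
  _tmp = 2;
  AA = 3;
  DF = 4;
Run;

/* Drop every variable whose name starts with an underscore. */
Data NO_UNDERSCORE;
  Set HAVE( Drop = _: );
Run;

/* Drop every variable whose name starts with the letter A. */
Data NO_A;
  Set HAVE( Drop = A: );
Run;
```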