deleted_user
Not applicable
Hi,
I have a question regarding the KEEP function in the DATA step.
Let's say there are 10 variables in the dataset and 5 of them are called temp_a, temp_b, temp_c and so on, and they are the only variables whose names start with temp_. Now I want to create a subset dataset with only those 5 variables, and my usual way is:

Data ****;
set ********;
Keep temp_a temp_b temp_c temp_d temp_e;
run;

My question is: is there a faster way to code it? 5 variables is OK, but what if I need 200 variables from a dataset with 1000 variables? It would be quite cumbersome to do it that way.

thanks
ben
6 REPLIES
deleted_user
Not applicable
I guess there are some options you might try to see which one works for you.

1) If your variables share a prefix and end in consecutive numbers, you can use a single hyphen (a numbered range list) in your keep statement, e.g.:

data b; set a;
keep temp_1 - temp_200;
run;

2) If your variables don't have a numeric suffix but were created one after another, you can use a double hyphen (a name range list), e.g.:

data c; set a;
keep temp_a -- temp_xad;
run;

You can check the order in which your variables were created by running proc contents with the varnum option:

proc contents data = a varnum;
run;

These 2 options are the easy ones. If they do not work, I have another, more complex option. I will save that one for later.
deleted_user
Not applicable
Hi Maaldijk,
Thank you very much. Unfortunately, my situation doesn't fall into either of those categories. Please tell me how to do it the complex way.

thanks
ben
deleted_user
Not applicable
The following code might clarify Maaldijk's explanation and also show you another method to retain the variables you need.

The use of the double hyphen is discussed in the SAS Language Manual, which is available through your SAS Help. A double hyphen specifies that all variables positioned between the first and the last one named, in the order they appear in the Program Data Vector, will be kept. That is why Maaldijk showed you the VarNum option in the contents procedure: it shows their sequence. It is hugely useful to understand, because when datasets are merged together it becomes possible to select large groups of variables contributed by one of the merged datasets. Browse the table PDVVARS after you have run the code I produced for you below; you will see that DF is kept too, because it sits between AA and AD in the sequence.

Then look at the table AVARS and note that the four columns with names beginning with A, although created out of sequence in the source table, are all kept. I think this solution, using the colon modifier on the name as a wild card, will solve your problem. I also suggest you search the SAS Help files for "colon modifier" and explore the uses for this syntax.

Kind regards

David


Data TEST;
AA = 1;
AB = 1;
AC = 1;
DF = 1;
AD = 1;
Run;

Data AVARS;
Set TEST( Keep = A:);
Run;

Data PDVVARS;
Set TEST( Keep = AA -- AD);
Run;
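Applied to the original question, the same colon modifier might look like the sketch below. The dataset names HAVE and WANT and the variables id, height and weight are hypothetical, made up only so the example runs on its own; only the temp_ names come from the question.

```sas
/* Hypothetical input: five temp_ variables mixed in with other variables */
data have;
  id     = 1;
  temp_a = 10;
  height = 170;
  temp_b = 20;
  temp_c = 30;
  weight = 65;
  temp_d = 40;
  temp_e = 50;
run;

/* Keep every variable whose name begins with temp_ */
data want;
  set have(keep=temp_:);
run;

/* List the variables of WANT in creation order to check the result */
proc contents data=want varnum;
run;
```

The prefix list selects variables by name only, so it does not matter that the temp_ variables were created out of sequence.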
deleted_user
Not applicable
Hi,
Thank you very much. The colon modifier helps me a lot, and I should spend some time reading the help manual.

thanks
ben
deleted_user
Not applicable
I meant to clarify one other point as well. "Keep" is not a function. It is either a data set option or a DATA step statement.

I'm going to be fussy about that because I want to comment that reading the Online help is wonderfully helpful, but because it is screen based it is often harder to do than flipping through the printed manual. While the structures largely mirror each other, I can find something in a manual faster than I can online. Part of that is because I used them for so long that the structure and hierarchy of the chapters are almost second nature. On that basis, looking up KEEP in the functions part of the reference will have you looking in the wrong place.
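As a rough illustration of the two forms, here is a minimal sketch. The dataset names HAVE, WANT1 and WANT2 and the variables OTHER and TOTAL are hypothetical and not taken from the thread.

```sas
/* Hypothetical input data set */
data have;
  temp_a = 1;
  temp_b = 2;
  other  = 3;
run;

/* KEEP= data set option on the input:
   OTHER is never read into the DATA step */
data want1;
  set have(keep=temp_a temp_b);
run;

/* KEEP statement:
   all variables are read and can be used during the step,
   but only the listed ones are written to the output */
data want2;
  set have;
  total = temp_a + other;   /* OTHER is still available here */
  keep temp_a temp_b total;
run;
```

WANT1 ends up with temp_a and temp_b, while WANT2 also carries TOTAL, which could only be computed because OTHER was read in.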

Incidentally, while the word search part of the online help is supposed to find these words quickly, I usually find the result set that is returned is too large and it is more of a hindrance. If I had any suggestion on how to do it otherwise, I'd tell the SAS Docs people, but I don't. So I stick with my tried and trusted approach of viewing the online docs as a series of books and "virtually flicking through" the chapters to get where I want. Now if I could only make bookmarking work for me!!!

Kind regards

David
Bill
Quartz | Level 8
You can also use (keep=_:) or (drop=_:) (underscore colon) if the variables are named this way. I think this would also work for keeping or dropping all variables whose names start with any particular character, e.g. (drop=A:) would drop all variables starting with the letter A. Not tested, but logically sound.
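A quick way to try the (drop=A:) idea is sketched below. It reuses the TEST table from David's post; the output name NO_A is a hypothetical name chosen for this example.

```sas
/* Same test table as in David's post */
data test;
  AA = 1;
  AB = 1;
  AC = 1;
  DF = 1;
  AD = 1;
run;

/* Drop every variable whose name begins with A */
data no_a;
  set test(drop=A:);
run;

/* DF should be the only variable left in NO_A */
proc contents data=no_a varnum;
run;
```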
