Dear all,
I'm trying to subset from a dataset all columns (variables) containing a pattern (named "Q1"). I tried the following:
data myout;
set mydata;
keep Q1;
run;
but it keeps also Q11,Q12, ...
Can anyone help me please?
But Q11 and Q12 contain Q1: Q11 and Q12
Does your pattern is xxQ1xx wher "x"es mean letters or _?
So you need "abcQ1efg" but not "123Q1456"?
Bart
If it's like I wrote, you may try the basePlus' package macro %getVars():
data test;
q1=1;
abcQ1efg=2;
_123Q1456=3;
q11=4;
Q12=5;
xQ1=6;
run;
%loadPackage(basePlus)
/*
%helpPackage(baseplus, '%getVars()' )
*/
%put #%getVars(test,pattern=Q1(?!\d))#;
data want;
set test;
keep %getVars(test,pattern=Q1(?!\d));
run;
Bart
You can use dictionary tables and create a macro variable of the list of desired variable names.
proc sql noprint;
select name into :varlist separated by ' '
from dictionary.columns
where libname='WORK' and memname='MYDATA' /* must be uppercase */
and upcase(name) like "Q%";
quit;
%put &varlist;
data myout;
set mydata;
keep &varlist;
run;
proc print data=myout;
run;
@NewUsrStat wrote:
Hi Bart, thank you for your support. All columns (variables) start with Q*. No Q*in the middle or at the end of a name.
Examples will help make your question clearer.
If the variables are all defined with simple numeric suffixes perhaps just using an iterative variable list will work?
keep q1-q9 ;
would keep variables Q1, Q2, .... Q9 but not keep variable Q12 or Q21 for example.
OK, so if it suppose to be "Q1xyz...", where x is not a digit (to keep "Q1A", but not "Q17") the pattern should be:
data test;
q1=1;
abcQ1efg=2;
_123Q1456=3;
q11=4;
Q12=5;
xQ1=6;
q1abc=7;
q1_efg=8;
run;
%loadPackage(basePlus)
/*
%helpPackage(baseplus, '%getVars()' )
*/
%put #%getVars(test,pattern=^Q1(?!\d))#;
data want;
set test;
keep %getVars(test,pattern=^Q1(?!\d));
run;
It will select only: q1, q1abc, and q1_efg.
Bart
The code you posted would only keep one variable. Perhaps you instead used this keep statement?
keep q1: ;
That will keep any variable whose name starts with the letter Q followed by the digit 1.
Can you provide a simple example that demonstrates the issue? And expound on any variations that you need to support.
For your simple example you could use a combination of KEEP and DROP.
data test;
length q1 q11 q12 q1test x y z 8;
keep q1: ;
drop q11: q12: ;
run;
NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
16
17 data _null_;
18 set test;
19 put (_all_) (=/);
20 run;
q1=.
q1test=.
NOTE: There were 1 observations read from the data set WORK.TEST.
But I suspect your issue is more complex than that?
But Q11 does contain Q1, so what's the issue?
@NewUsrStat wrote:
Dear all,
I'm trying to subset from a dataset all columns (variables) containing a pattern (named "Q1"). I tried the following:
data myout; set mydata; keep Q1; run;but it keeps also Q11,Q12, ...
Can anyone help me please?
No, the keep statement selects only Q1. Nothing else.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.