BookmarkSubscribeRSS Feed
NewUsrStat
Lapis Lazuli | Level 10

Dear all,

I'm trying to subset from a dataset all columns (variables) containing a pattern (named "Q1"). I tried the following:

 

data myout;
set mydata;
    keep Q1;
run;

but it keeps also Q11,Q12, ...

Can anyone help me please?

10 REPLIES 10
yabwon
Amethyst | Level 16

But Q11 and Q12 contain Q1: Q11 and Q1

Does your pattern is xxQ1xx wher "x"es mean letters or _?

So you need "abcQ1efg" but not "123Q1456"?

 

Bart

 

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



yabwon
Amethyst | Level 16

If it's like I wrote, you may try the basePlus' package macro %getVars():

data test;
  q1=1;
  abcQ1efg=2;
  _123Q1456=3;
  q11=4;
  Q12=5;
  xQ1=6;
run;

%loadPackage(basePlus)
/*
%helpPackage(baseplus, '%getVars()' )
*/

%put #%getVars(test,pattern=Q1(?!\d))#;

data want;
  set test;
  keep %getVars(test,pattern=Q1(?!\d));
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



NewUsrStat
Lapis Lazuli | Level 10
Hi Bart, thank you for your support. All columns (variables) start with Q*. No Q*in the middle or at the end of a name.
Kathryn_SAS
SAS Employee

You can use dictionary tables and create a macro variable of the list of desired variable names.

proc sql noprint;
 select name into :varlist separated by ' '
 from dictionary.columns
 where libname='WORK' and memname='MYDATA' /* must be uppercase */
  and upcase(name) like "Q%";
quit;
%put &varlist;

data myout;
set mydata;
    keep &varlist;
run;

proc print data=myout;
run;
NewUsrStat
Lapis Lazuli | Level 10
Thank you very much for your support but Q1 and for example Q17 are still subsetted together that is what is not needed. I would like to separate Q1 from Q17 and so on.
Tom
Super User Tom
Super User

@NewUsrStat wrote:
Hi Bart, thank you for your support. All columns (variables) start with Q*. No Q*in the middle or at the end of a name.

Examples will help make your question clearer.

If the variables are all defined with simple numeric suffixes perhaps just using an iterative variable list will work?

keep q1-q9 ;

would keep variables Q1, Q2, .... Q9 but not keep variable Q12 or Q21 for example.

yabwon
Amethyst | Level 16

OK, so if it suppose to be "Q1xyz...", where x is not a digit (to keep "Q1A", but not "Q17") the pattern should be:

data test;
  q1=1;
  abcQ1efg=2;
  _123Q1456=3;
  q11=4;
  Q12=5;
  xQ1=6;
  q1abc=7;
  q1_efg=8;
run;

%loadPackage(basePlus)
/*
%helpPackage(baseplus, '%getVars()' )
*/

%put #%getVars(test,pattern=^Q1(?!\d))#;

data want;
  set test;
  keep %getVars(test,pattern=^Q1(?!\d));
run;

It will select only: q1, q1abc, and q1_efg.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tom
Super User Tom
Super User

The code you posted would only keep one variable.  Perhaps you instead used this keep statement?

keep q1: ;

That will keep any variable whose name starts with the letter Q followed by the digit 1.

 

Can you provide a simple example that demonstrates the issue?  And expound on any variations that you need to support.

 

For your simple example you could use a combination of KEEP and DROP.

data test;
  length q1 q11 q12 q1test x y z 8;
  keep q1: ;
  drop q11: q12: ;
run;
NOTE: The data set WORK.TEST has 1 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds


16
17   data _null_;
18     set test;
19     put (_all_) (=/);
20   run;


q1=.
q1test=.
NOTE: There were 1 observations read from the data set WORK.TEST.

But I suspect your issue is more complex than that?

 

 

andreas_lds
Jade | Level 19

@NewUsrStat wrote:

Dear all,

I'm trying to subset from a dataset all columns (variables) containing a pattern (named "Q1"). I tried the following:

 

data myout;
set mydata;
    keep Q1;
run;

but it keeps also Q11,Q12, ...

Can anyone help me please?


No, the keep statement selects only Q1. Nothing else.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 1327 views
  • 4 likes
  • 6 in conversation