I'm using the code below to filter only the underscore values and without any special characters using WHERE caluse, But with the following code I'm not getting the desired output;
data have;
input TBL $20.;
datalines;
(INS_CON_GRP)
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2;
run;
Desired Output is:
H_EAL_YE_A
INS_CON_GROUP_DETL
You test for 2 or more underscores, but you didn't create a where condition to apply "without any special characters".
Let's say special characters are anything except letters, numbers, or underscores. Then:
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and findc(trim(TBL),' ','ADFK')=0;
run;
The FINDC function is the interesting element here.
It ordinarily searches the first argument, trim(TBL), for any character in the second argument and returns the position of the first qualifying character in TBL. If none is found it returns a zero.
But
Edited additional note: @Ksharp contributed an even simpler solution:
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
The NOTNAME function detects any character not eligible for a version 7 sas variable name, i.e. not a letter, number, or underscore.
Change GE by GT.
No, it is not resolving the issue if I change the example data. I ran the code below.
data have;
input TBL $20.;
datalines;
(INS_CON_GRP)
INS_GRT
INS_GR_T
INS
H_EAL_YE_A
(INS_CON_GROUP_DETL)
;
data want;
set have;
where countc(TBL, '_') gt 2;
run;
Output which I got is,
H_EAL_YE_A
(INS_CON_GROUP_DETL)
Desired Output is,
H_EAL_YE_A
INS_GR_T
You test for 2 or more underscores, but you didn't create a where condition to apply "without any special characters".
Let's say special characters are anything except letters, numbers, or underscores. Then:
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and findc(trim(TBL),' ','ADFK')=0;
run;
The FINDC function is the interesting element here.
It ordinarily searches the first argument, trim(TBL), for any character in the second argument and returns the position of the first qualifying character in TBL. If none is found it returns a zero.
But
Edited additional note: @Ksharp contributed an even simpler solution:
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
The NOTNAME function detects any character not eligible for a version 7 sas variable name, i.e. not a letter, number, or underscore.
Excellent!
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.