- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I'm using the code below to filter only the underscore values and without any special characters using WHERE caluse, But with the following code I'm not getting the desired output;
data have;
input TBL $20.;
datalines;
(INS_CON_GRP)
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2;
run;
Desired Output is:
H_EAL_YE_A
INS_CON_GROUP_DETL
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You test for 2 or more underscores, but you didn't create a where condition to apply "without any special characters".
Let's say special characters are anything except letters, numbers, or underscores. Then:
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and findc(trim(TBL),' ','ADFK')=0;
run;
The FINDC function is the interesting element here.
It ordinarily searches the first argument, trim(TBL), for any character in the second argument and returns the position of the first qualifying character in TBL. If none is found it returns a zero.
But
- When the 2nd argument is blank it is ignored
- The third argument has modifiers
- A - search for any letter
- D - search for any digit
- F - search for any underscore
- But what is wanted is not the position of the first letter, digit, or underscore, but the position of the first character that is NOT one of those. That is done by use of the K in the third argument. So it finds the position of the first non-letter, non-digit, non-underscore. If the function returns a zero there are no special characters.
Edited additional note: @Ksharp contributed an even simpler solution:
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
The NOTNAME function detects any character not eligible for a version 7 sas variable name, i.e. not a letter, number, or underscore.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Change GE by GT.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
No, it is not resolving the issue if I change the example data. I ran the code below.
data have;
input TBL $20.;
datalines;
(INS_CON_GRP)
INS_GRT
INS_GR_T
INS
H_EAL_YE_A
(INS_CON_GROUP_DETL)
;
data want;
set have;
where countc(TBL, '_') gt 2;
run;
Output which I got is,
H_EAL_YE_A
(INS_CON_GROUP_DETL)
Desired Output is,
H_EAL_YE_A
INS_GR_T
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You test for 2 or more underscores, but you didn't create a where condition to apply "without any special characters".
Let's say special characters are anything except letters, numbers, or underscores. Then:
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and findc(trim(TBL),' ','ADFK')=0;
run;
The FINDC function is the interesting element here.
It ordinarily searches the first argument, trim(TBL), for any character in the second argument and returns the position of the first qualifying character in TBL. If none is found it returns a zero.
But
- When the 2nd argument is blank it is ignored
- The third argument has modifiers
- A - search for any letter
- D - search for any digit
- F - search for any underscore
- But what is wanted is not the position of the first letter, digit, or underscore, but the position of the first character that is NOT one of those. That is done by use of the K in the third argument. So it finds the position of the first non-letter, non-digit, non-underscore. If the function returns a zero there are no special characters.
Edited additional note: @Ksharp contributed an even simpler solution:
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
The NOTNAME function detects any character not eligible for a version 7 sas variable name, i.e. not a letter, number, or underscore.
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Excellent!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data have;
input TBL $20.;
datalines;
INS_GRT
INS
H_EAL_YE_A
INS_CON_GROUP_DETL
;
data want;
set have;
where countc(TBL, '_') ge 2 and notname(strip(TBL))=0;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set
Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets
--------------------------