Hi All,
I want to find special characters in the variable name, but I am missing a few records where name contains a space before the letters, a space between letters, or a space after the letters.
How can I use a regular expression to capture the records that contain spaces, like pid 105, 106 and 109?
I use the approach below.
data find;
input pid name $;
cards;
101 acbd
102 !and
103 X.Y
104 1TVN
105 A BCD
106 bd  
107 ANKR
108 K@234
109  KRS
110 235
;
run;
data find1;
set find;
found= prxchange('s/[A-Za-z.]//i',-1,name);
run;
data find2;
set find1;
where found ne '';
run;
Thanks
Use compress:
data find2;
set find;
where lengthn(compress(name," ","ad"))>0;
run;
This removes all blanks, letters and digits from the string; if the length of what is left is greater than zero, the value contains special characters.
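A minimal sketch of how the COMPRESS modifiers differ, using one value from the sample data; the variable names no_letters and no_alnum are made up for illustration. The "a" modifier removes letters only, while "ad" also removes digits, so a digits-only value such as 235 would leave nothing behind and would not be flagged:

```sas
data _null_;
name = 'K@234';
/* "a" modifier: remove letters only */
no_letters = compress(name, , 'a');
/* blank in the character list plus "ad": remove blanks, letters and digits */
no_alnum = compress(name, ' ', 'ad');
/* writes no_letters=@234 no_alnum=@ to the log */
put no_letters= no_alnum=;
run;
```

Which modifier fits depends on whether digits should count as "special" for the check.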
Thanks RW09.
But I need to find every PID whose name contains any special character, including a space (before the text, in between the text, or after the text).
My output would be:
pid  name   found
102  !and   !
104  1TVN   1
108  K@234  @234
110  235    235
105  A BCD
106  bd
109  KRS
You just need to fiddle with it a bit:
data have;
infile datalines dlm=",";
input pid name $;
datalines;
102,!and
104,1TVN
108,K@234
105,A BCD
;
run;
data want;
set have;
find=compress(name,"","a");
run;
This works for everything except the spaces. Those are more of an issue, because a character variable is padded with blanks up to its defined length, so every value shorter than that length has blanks after the text. Say name is $8:
!and
is then followed by four blanks. So checking for a space after the text doesn't make much sense. If it's before or within the text, then possibly.
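The padding can be made visible with a small sketch like the following. It assumes name is declared as $8; the variable names stored, visible and hex are made up for illustration:

```sas
data _null_;
length name $8;
name = '!and';
/* LENGTHC returns the declared length, trailing blanks included */
stored = lengthc(name);
/* LENGTHN returns the length without trailing blanks */
visible = lengthn(name);
/* the hex dump shows the padding as four 20 bytes */
hex = put(name, $hex16.);
/* writes stored=8 visible=4 hex=21616E6420202020 to the log */
put stored= visible= hex=;
run;
```

Because the four trailing blanks are always there, a blank typed after the text cannot be told apart from the padding.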
Thank you very much for your support.
Is it possible to capture the same records using the prxchange function? I would like to learn it. I searched on Google and found that /s should be used, but it is not working.
Any suggestions will be highly appreciated.
In your original dataset, the "leading" and "included" spaces never end up in the variables, as they are considered to be delimiters:
data find;
input pid name $;
cards;
101 acbd
102 !and
103 X.Y
104 1TVN
105 A BCD
106 bd  
107 ANKR
108 K@234
109  KRS
110 235
;
run;
data check;
set find;
check = put(name,$hex16.);
run;
proc print data=check noobs;
run;
Result:
pid  name   check
101  acbd   6163626420202020
102  !and   21616E6420202020
103  X.Y    582E592020202020
104  1TVN   3154564E20202020
105  A      4120202020202020
106  bd     6264202020202020
107  ANKR   414E4B5220202020
108  K@234  4B40323334202020
109  KRS    4B52532020202020
110  235    3233352020202020
To read leading and inserted spaces successfully, use the dsd and dlm= options, and the $CHAR informat:
data find;
infile cards dlm=' ' dsd;
input pid name :$char8.;
cards;
101 "acbd"
102 "!and"
103 "X.Y"
104 "1TVN"
105 "A BCD"
106 "bd  "
107 "ANKR"
108 "K@234"
109 " KRS"
110 "235"
;
run;
Thanks once again for your guidance.
I would like to learn whether there is any option I can add to the regular expression in the prxchange function to capture the PIDs containing a space, like 105 and 109, or any alternative regular expression.
prxchange('s/[A-Za-z.]//i',-1,name);
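One possible approach, offered as a sketch rather than a definitive answer: keep the prxchange call as it is and add a separate prxmatch test for blanks. This assumes find was read with the dsd option and the $char8. informat, so that leading and embedded blanks are actually stored in name. The data set name flagged and the variable has_space are made up for illustration:

```sas
data flagged;
set find;
/* remove letters and dots; everything else, blanks included, is kept */
found = prxchange('s/[A-Za-z.]//i', -1, name);
/* 1 when a blank starts the value or sits between two non-blank characters */
has_space = prxmatch('/^\s|\S\s+\S/', name) > 0;
/* keep rows with a leftover special character or a meaningful blank */
if has_space or not missing(found);
run;
```

The blank test is needed because a found value made up only of blanks is indistinguishable from an empty string in SAS, so found ne '' alone drops 105 and 109. Blanks after the text (pid 106) still cannot be detected, since they look the same as the padding of the variable.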
