I've had a quick search for any previous questions along these lines, but couldn't find any. Apologies if this has been asked before.
I'm currently trying to search data for occurrences of free text that includes the asterisk (*) symbol. I need to extract the records where the asterisk is present, but I am concerned about what happens when I search for something like "A*,X,Y,Z"...
e.g.
index ([column label], ".A*,X,Y,Z")
...the asterisk might be treated as a wildcard, so that this search would also pick up entries like "Az,X,Y,Z" and "A5,X,Y,Z", etc.
1. Are my concerns valid?
2. Is there any way around this?
Thanks in advance for any help.
Cheers
Nick
I don't think this is a concern. For example, the step below gives the output I would expect:
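data dsn;
    /* Declare str as character, length 50, so list input does not truncate it */
    length str $50;
    infile datalines dsd;
    input str $;
    /* INDEX returns the position of the first match, or 0 if there is none */
    i = index(str, ".A*,X,Y,Z");
    datalines;
"zzz.A*,X,Y,Zzzz"
"zzz.Azzzzzzzzz,X,Y,Zzzz"
;
run;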
Only the first entry is picked up by INDEX (i is nonzero for that row only). If the asterisk (or a percent sign, etc.) were treated as a wildcard, it would pick up both rows.
If you do want wildcard matching, you might consider the Perl regular expression (PRX) functions, such as PRXMATCH().
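As a rough sketch of the PRXMATCH() suggestion, the step below contrasts a literal search with a wildcard-style one. The data set name prx_demo and the variables literal_pos and wildcard_pos are made up for illustration. In a Perl regular expression the asterisk and the period are special characters, so they are escaped with a backslash when they should be matched literally.

```sas
data prx_demo;                /* hypothetical data set name */
    length str $50;
    infile datalines dsd;
    input str $;

    /* Literal match: \. and \* stand for an actual period and asterisk */
    literal_pos = prxmatch('/\.A\*,X,Y,Z/', str);

    /* Wildcard-style match: the unescaped . accepts any single character after the A */
    wildcard_pos = prxmatch('/\.A.,X,Y,Z/', str);
    datalines;
"zzz.A*,X,Y,Zzzz"
"zzz.A5,X,Y,Zzzz"
"zzz.Azzzzzzzzz,X,Y,Zzzz"
;
run;
```

Like INDEX, PRXMATCH returns the position where the match starts, or 0 when there is none. The first row should match both patterns, the second only the wildcard pattern, and the third neither.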