Help using Base SAS procedures

prxmatch

Contributor
Posts: 33

prxmatch

Hi all,
I need some help writing PRXMATCH code. I need to extract some data from a string and insert it into a new column. The piece I want is two letters followed by one number, with two spaces in front of it and two spaces behind it, i.e. "space/space/letter/letter/number/space/space". I am currently using PRXPARSE and PRXNEXT, having looked at the SAS help files, but I can't get it to work. Any advice would be much appreciated. Thank you.
SAS Super FREQ
Posts: 8,743

Re: prxmatch

Hi:
Does your data look like this:
[pre]
AB123
CD456
EF789
[/pre]

or like this:
[pre]
AB123 WOMBAT
CD456 KOALA
EF789 EUCALYPTUS
[/pre]

What piece of the string do you want in the new variable? For example, if your data looks like the first example, would you want:
[pre]
var1 var2
AB 123
CD 456
EF 789
[/pre]

or if your data looked like the second example, would you want:
[pre]
var1 var2
AB123 WOMBAT
CD456 KOALA
EF789 EUCALYPTUS
[/pre]

You may not need PRXMATCH at all. You might be able to use the ANYDIGIT, SUBSTR and/or SCAN functions. If you do need help with PRXMATCH, a concrete idea of what your data looks like and the code you've already tried would be useful.
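
As a rough sketch of the non-PRX route, and assuming your data looks like the first example (letters followed by digits, nothing else), ANYDIGIT and SUBSTR could split the value. The data set name SPLIT and the variable name CODE are made up for illustration:
[pre]
data split;
   length code var1 var2 $8;
   input code $;
   /* ANYDIGIT returns the position of the first digit, or 0 if there is none */
   pos = anydigit(code);
   /* split only when at least one letter precedes the first digit */
   if pos > 1 then do;
      var1 = substr(code, 1, pos - 1);
      var2 = substr(code, pos);
   end;
   datalines;
AB123
CD456
EF789
;
run;
[/pre]
For these three rows, var1 should hold the letters (AB, CD, EF) and var2 the digits (123, 456, 789).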

In the meantime, here are some papers about using PRX functions:
http://viergever.net/SVSUG/BasicsPDF_Cassell.pdf
http://www2.sas.com/proceedings/sugi29/129-29.pdf
http://www2.sas.com/proceedings/sugi30/138-30.pdf
http://analytics.ncsu.edu/sesug/2006/AP09_06.PDF

cynthia
Contributor
Posts: 33

Re: prxmatch

Hi, my data looks like this:

/column 1/extracted string/
/hhfgjsjbshbsbgsbk ab1 hhdjdjdn/ab1/

So I need to scan column 1, extract ab1, and put it into the extracted string column.
SAS Super FREQ
Posts: 8,743

Re: prxmatch

Hi:
If you always want the second "chunk" delimited by spaces, as shown in your snapshot of data, then the SCAN function will do that. Although you could use PRX functions, SCAN breaks a text string into "chunks" or "words" based on a delimiter. If, as you describe, you want the second "chunk" delimited by spaces, then for data like this:
[pre]
column1
hhfgjsjbshbsbgsbk ab1 hhdjdjdn
xyxyxyxyxyxy cd2 xyxyxyxy
abababababababababab ef3 ababab
123456789 hi4 abcdefghijklmnopqrstuvwxyz
abcdefghijklmn   xx5   nopqrstuvwxyz
[/pre]

A simple SCAN function will do the job:
[pre]
** parse string;
chunk1 = scan(column1, 1, ' ');
chunk2 = scan(column1, 2, ' ');
chunk3 = scan(column1, 3, ' ');
[/pre]

The SCAN function treats multiple delimiters, such as the multiple spaces around "xx5" as one delimiter. As long as the rest of your string does not have spaces, SCAN might be a simpler approach. (Remember to use the LENGTH statement for the "chunk" variables or extracted variables so that you set the length you need for the maximum possible value.)
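
Put together, a complete step might look something like this. It is only a sketch: the data set name CHUNKS and the lengths ($60 and $30) are assumptions, so adjust them to your real data:
[pre]
data chunks;
   /* set lengths up front so the extracted values are not truncated */
   length column1 $60 chunk1 chunk2 chunk3 $30;
   infile datalines truncover;
   input column1 $60.;
   /* the blank delimiter: runs of spaces count as one delimiter */
   chunk1 = scan(column1, 1, ' ');
   chunk2 = scan(column1, 2, ' ');
   chunk3 = scan(column1, 3, ' ');
   datalines;
hhfgjsjbshbsbgsbk ab1 hhdjdjdn
abcdefghijklmn   xx5   nopqrstuvwxyz
;
run;
[/pre]
Here chunk2 is the piece you asked about, but only because it is always the second word in these rows.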

cynthia
Contributor
Posts: 33

Re: prxmatch

Hi, that was just a selection of my data. My data strings have multiple spaces, for example: gahahshsj hshsuhsh shsusjsj ab1 hjdjjjjjffjkk a. I would like the ab1. If it helps, it is always two characters and one number.
SAS Super FREQ
Posts: 8,743

Re: prxmatch

Ah, I wondered if that might be the situation. In that case, reading the papers on PRX functions should be most beneficial to you.

cynthia
Contributor
Posts: 33

Re: prxmatch

Does anyone know the regular expression I should be using? I've tried many times and can't get it right. Thanks.
Respected Advisor
Posts: 3,777

Re: prxmatch

In your first post you mentioned:

space/space/letter/letter/number/space/space

[pre]
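data _null_;
input target $32.;
retain rx;
/* compile the pattern once: a space, two letters, one digit, a space */
if _n_ eq 1 then rx= prxparse('/( [A-Za-z]{2}\d )/');
f = prxmatch(rx,target); /* position of the match, 0 if none */
x = prxposn(rx,1,target); /* text captured by the first group, spaces included */
put _all_;
put x $char32.;
cards;
a LL3 b
a LL4 B
a LL5 B
a LL3 c
a bb3 b
a ca4 B

;;;;
run;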
[/pre]
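
If you want only the three characters in the new column, without the surrounding spaces, one possible variation is to move the capture group inside the delimiters. The sketch below is an assumption on my part rather than something tested against your real data: it uses \b (word boundary) instead of literal spaces, and the names EXTRACTED and COLUMN1 are only illustrative.
[pre]
data extracted;
   length column1 $80 extracted $3;
   retain rx;
   /* compile once: two letters and one digit standing alone as a word */
   if _n_ = 1 then rx = prxparse('/\b([A-Za-z]{2}\d)\b/');
   input column1 $char80.;
   /* copy the captured group only when the pattern matches */
   if prxmatch(rx, column1) then extracted = prxposn(rx, 1, column1);
   drop rx;
   datalines;
gahahshsj hshsuhsh shsusjsj ab1 hjdjjjjjffjkk a
hhfgjsjbshbsbgsbk ab1 hhdjdjdn
;
run;
[/pre]
Note that \b also treats punctuation as a boundary, so a value such as (ab1) would match as well. If the token must be surrounded by spaces, keep the literal spaces outside the parentheses instead.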