SASPhile
Quartz | Level 8


X20l,p5,op3
cX3z,p4,op2(1)

 

I need help extracting the value that comes after the letter X and before the first comma.

 

For X20l,p5,op3 the result should be 20l.
For cX3z,p4,op2(1) the result should be 3z.

 

5 REPLIES
andreas_lds
Jade | Level 19
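Such problems are best solved with a regular expression. I can't post working code right now, but you need the functions prxmatch and prxposn. The regular expression should be "X(.+?),". The quantifier has to be non-greedy, otherwise the capture runs up to the last comma instead of the first.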
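A minimal sketch of that approach, assuming the non-greedy pattern X(.+?), and column input for the two sample strings. The names temp, char_str, extracted_value and rx are illustrative only and not taken from the question:

```sas
data temp;
   length extracted_value $ 10;
   retain rx;
   /* Compile the pattern once; (.+?) is non-greedy, so it stops at the first comma */
   if _n_ = 1 then rx = prxparse('/X(.+?),/');
   input char_str $ 1-20;
   /* prxmatch returns the position of the match, or 0 when there is none */
   if prxmatch(rx, char_str) > 0 then
      extracted_value = prxposn(rx, 1, char_str);
   drop rx;
   datalines;
X20l,p5,op3
cX3z,p4,op2(1)
;
run;
```

With the two sample rows this should give 20l and 3z. A row without an X followed by a comma leaves extracted_value blank.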
Shemp
Obsidian | Level 7

Try this:

 

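data temp;
   length extracted_value $ 10;
   input char_str $ 1-20;
   /* Skip everything up to the first X, capture up to the first comma (\x2C), */
   /* and replace the whole string with the captured group.                   */
   extracted_value = prxchange('s/^[^X]*X(.*?)\x2C.*$/$1/',1,char_str);
   datalines;
X20l,p5,op3
cX3z,p4,op2(1)
;
run;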

PGStats
Opal | Level 21

As suggested, a regular expression will do the job:

 

data temp;
if _n_ = 1 then prxId + prxParse("/X(\w+?),/i");
length extracted_value $ 10;
input char_str $ 1-20;
if prxMatch(prxId, char_str) then
    extracted_value = prxPosn(prxId, 1, char_str);
drop prxId;
datalines;
X20l,p5,op3
cX3z,p4,op2(1)
;

proc print data=temp noobs; run;

Remove the "i" in the pattern if the letter X at the beginning cannot be lowercase.

PG
Ksharp
Super User

If the X appears only once in the string, this is easy.

 

data have;
input x $40.;
cards;
X20l,p5,op3
cX3z,p4,op2(1)
;
run;
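data want;
 set have;
 length want $ 40;
 /* the lookbehind and lookahead keep the X and the comma out of the match */
 pid=prxparse('/(?<=x).+?(?=,)/i');
 call prxsubstr(pid,x,p,l);
 /* p is 0 when nothing matches, so guard the substr call */
 if p>0 then want=substr(x,p,l);
run;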
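For the single-occurrence case, the same result can also be sketched without a regular expression, using find and scan. This is an alternative technique rather than the method above, and it assumes the have data set from this reply. The names want_noregex and pos are hypothetical:

```sas
data want_noregex;
   set have;
   length want $ 40;
   /* find with the 'i' modifier locates the first X or x; it returns 0 if none exists */
   pos = find(x, 'x', 'i');
   /* Take the text after the X and keep the part before the first comma.   */
   /* The 'm' modifier makes scan return an empty word when a comma follows */
   /* the X directly, instead of skipping ahead to the next word.           */
   if 0 < pos < length(x) then
      want = scan(substr(x, pos + 1), 1, ',', 'm');
   drop pos;
run;
```

Unlike the regular expression, this version does not require a comma to be present: if there is none, it returns everything after the X.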
Ksharp
Super User

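data want;
 set have;
 length want $ 40;
 /* [^,]+ matches everything up to, but not including, the first comma */
 pid=prxparse('/(?<=x)[^,]+(?=,)/i');
 call prxsubstr(pid,x,p,l);
 /* p is 0 when nothing matches, so guard the substr call */
 if p>0 then want=substr(x,p,l);
run;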