I want to extract only some characters from a string, but I'm not getting the function to get it done. Assume I've a following values in one variable.
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
Desired result is,
271835678
371845679
221865778
One hint which I can say is, I want to identify the second occurance of first five characters seperated by underscore.
Hi @David_Billa,
@David_Billa wrote:
Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?
data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;
try this code.
data have;
length char $200;
input char;
datalines;
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
;
run;
data want;
set have;
length get $200;
cnt=countw(char,'_');
do i=2 to cnt;
get=scan(char,i,'_');
if INDEX(get,trim(scan(char,1,'_')))=1 then output;
end;
keep get;
run;
Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?
I don't know if this will fit in DI Studio, but if you want to extract without using do loop, there is a way to use substr. How about this?
data want;
set have;
length keyword get $200;
keyword=substr(char,1,index(char,'_')-1);
pos1=find(char,trim(keyword),2);
pos2=find(char,'_',pos1);
get=substr(char,pos1,pos2-pos1);
drop pos1 pos2;
run;
Hi @David_Billa,
@David_Billa wrote:
Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?
data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;
So you want to keep 9 or more consecutive digits?
This should do it:
VAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, VAR);
It's not working. Can we do this with other function instead of PRXCHANGE?
26 data want; 27 set have; 28 CHAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, CHAR); 29 run; ERROR: Invalid characters "\1/" after end delimiter "/" of regular expression "/.*?(\d{9,}).*/\1/". ERROR: The regular expression passed to the function PRXCHANGE contains a syntax error.
Oh I forgot the s
data T;
VAR1 = prxchange('s/.*(\d{9,}).*/\1/', -1, '27183_ASD_271835678_CDNTN_0004_DE_2020-09_');
put VAR1=;
run;
VAR1=271835678
data have; length char $200; input char; datalines; 27183_ASD_271835678_CDNTN_0004_DE_2020-09_ 37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_ 22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_ ; run; data want; set have; want=scan(substr(char,prxmatch('/\d{9,}/',char)),1,'_'); run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.