BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

I want to extract only some characters from a string, but I'm not getting the function to get it done. Assume I've a following values in one variable.

 

27183_ASD_271835678_CDNTN_0004_DE_2020-09_

37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_

22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_

 

Desired result is,

 

271835678

371845679

221865778

 

One hint which I can  say is, I want to identify the second occurance of first five characters seperated by underscore.

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @David_Billa,


@David_Billa wrote:

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?


data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;

View solution in original post

8 REPLIES 8
japelin
Rhodochrosite | Level 12

try this code.

data have;
  length char $200;
  input char;
datalines;
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
;
run;

data want;
  set have;
  length get $200;
  cnt=countw(char,'_');
  do i=2 to cnt;
    get=scan(char,i,'_');
    if INDEX(get,trim(scan(char,1,'_')))=1 then output;
  end;
  keep get;
run;
 
David_Billa
Rhodochrosite | Level 12

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?

japelin
Rhodochrosite | Level 12

I don't know if this will fit in DI Studio, but if you want to extract without using do loop, there is a way to use substr. How about this?

 

data want;
  set have;
  length keyword get $200;
  keyword=substr(char,1,index(char,'_')-1);
  pos1=find(char,trim(keyword),2);
  pos2=find(char,'_',pos1);
  get=substr(char,pos1,pos2-pos1);
  drop pos1 pos2;
run;
FreelanceReinh
Jade | Level 19

Hi @David_Billa,


@David_Billa wrote:

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?


data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;
ChrisNZ
Tourmaline | Level 20

So you want to keep 9 or more consecutive digits?

This should do it:

VAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, VAR);

 

David_Billa
Rhodochrosite | Level 12

It's not working. Can we do this with other function instead of PRXCHANGE?

 

26         data want;
27         set have;
28         CHAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, CHAR);
29         run;

ERROR: Invalid characters "\1/" after end delimiter "/" of regular expression "/.*?(\d{9,}).*/\1/".
ERROR: The regular expression passed to the function PRXCHANGE contains a syntax error.
ChrisNZ
Tourmaline | Level 20

 

Oh I forgot the s

data T;
  VAR1 = prxchange('s/.*(\d{9,}).*/\1/', -1, '27183_ASD_271835678_CDNTN_0004_DE_2020-09_');
  put VAR1=;
run;

VAR1=271835678

 

Ksharp
Super User
data have;
  length char $200;
  input char;
datalines;
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
;
run;
data want;
 set have;
 want=scan(substr(char,prxmatch('/\d{9,}/',char)),1,'_');
run;

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2397 views
  • 5 likes
  • 5 in conversation