BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
David_Billa
Rhodochrosite | Level 12

I want to extract only some characters from a string, but I'm not getting the function to get it done. Assume I've a following values in one variable.

 

27183_ASD_271835678_CDNTN_0004_DE_2020-09_

37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_

22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_

 

Desired result is,

 

271835678

371845679

221865778

 

One hint which I can  say is, I want to identify the second occurance of first five characters seperated by underscore.

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @David_Billa,


@David_Billa wrote:

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?


data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;

View solution in original post

8 REPLIES 8
japelin
Rhodochrosite | Level 12

try this code.

data have;
  length char $200;
  input char;
datalines;
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
;
run;

data want;
  set have;
  length get $200;
  cnt=countw(char,'_');
  do i=2 to cnt;
    get=scan(char,i,'_');
    if INDEX(get,trim(scan(char,1,'_')))=1 then output;
  end;
  keep get;
run;
 
David_Billa
Rhodochrosite | Level 12

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?

japelin
Rhodochrosite | Level 12

I don't know if this will fit in DI Studio, but if you want to extract without using do loop, there is a way to use substr. How about this?

 

data want;
  set have;
  length keyword get $200;
  keyword=substr(char,1,index(char,'_')-1);
  pos1=find(char,trim(keyword),2);
  pos2=find(char,'_',pos1);
  get=substr(char,pos1,pos2-pos1);
  drop pos1 pos2;
run;
FreelanceReinh
Jade | Level 19

Hi @David_Billa,


@David_Billa wrote:

Thanks! Can we also do this without do loop as I want to incorporate this logic in DI Studio expression?


data want;
set have;
want=scan(char,countc(substr(char,1,find(char,substr(char,1,5),6)-1),'_')+1,'_');
run;
ChrisNZ
Tourmaline | Level 20

So you want to keep 9 or more consecutive digits?

This should do it:

VAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, VAR);

 

David_Billa
Rhodochrosite | Level 12

It's not working. Can we do this with other function instead of PRXCHANGE?

 

26         data want;
27         set have;
28         CHAR1 = prxchange('/.*?(\d{9,}).*/\1/', -1, CHAR);
29         run;

ERROR: Invalid characters "\1/" after end delimiter "/" of regular expression "/.*?(\d{9,}).*/\1/".
ERROR: The regular expression passed to the function PRXCHANGE contains a syntax error.
ChrisNZ
Tourmaline | Level 20

 

Oh I forgot the s

data T;
  VAR1 = prxchange('s/.*(\d{9,}).*/\1/', -1, '27183_ASD_271835678_CDNTN_0004_DE_2020-09_');
  put VAR1=;
run;

VAR1=271835678

 

Ksharp
Super User
data have;
  length char $200;
  input char;
datalines;
27183_ASD_271835678_CDNTN_0004_DE_2020-09_
37184_ASD_ASG_371845679_CDSNN_0004_DN_2020-09_
22186_ASD_ASD_ASH_221865778_CKNNN_0004_CT_2020-09_
;
run;
data want;
 set have;
 want=scan(substr(char,prxmatch('/\d{9,}/',char)),1,'_');
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 3463 views
  • 5 likes
  • 5 in conversation