Solved: Re: Find a character in a string from reverse

chandan_mishra · Posted 10-16-2017 03:56 PM

Hello

I have a raw file whose name is Client_Product_Ash_HCP_Address_20171006030056_2803.txt. I read the file into SAS but only want the part of the file name that comes before the second "_" counting from the end. So, basically, the output file name should look like this:

Client_Product_Ash_HCP_Address

Another example would be Client_Product_Ash_HCP_Contact_20171006030058_2870.txt. The output file name should look like this:

Client_Product_Ash_HCP_Contact

Any ideas on how to use a combination of the SUBSTR and INDEX functions to achieve this?

Thanks

Chandan Mishra

data_null__ · Posted 10-16-2017 04:10 PM

You can use CALL SCAN to find the position of the second-to-last underscore-delimited word, then use P to substring everything from column 1 up to the character before the underscore that precedes that word (column P-2).

data _null_;
   file = 'Client_Product_Ash_HCP_Contact_20171006030058_2870.txt';
   call scan(file,-2,p,l,'_');   /* p = start, l = length of the 2nd-to-last word */
   part = substrn(file,1,p-2);   /* drop that word and the underscore before it */
   put _all_;
run;

file=Client_Product_Ash_HCP_Contact_20171006030058_2870.txt p=32 l=14 part=Client_Product_Ash_HCP_Contact _ERROR_=0 _N_=1
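
The same idea, locating the second underscore from the right and keeping what precedes it, can also be sketched with the FIND function, which searches right to left when its start position is negative. This is only an illustration closer to the SUBSTR-style approach asked about in the question; the variable names txt, p1, p2 and want are made up for the example.

```sas
data _null_;
   length txt want $60;
   txt = 'Client_Product_Ash_HCP_Address_20171006030056_2803.txt';
   /* a negative start position makes FIND search from right to left */
   p1 = find(txt, '_', -length(txt));   /* last underscore */
   p2 = find(txt, '_', -(p1 - 1));      /* second underscore from the end */
   /* keep everything before that underscore, if one was found */
   if p2 > 1 then want = substrn(txt, 1, p2 - 1);
   else want = txt;
   put p1= p2= want=;
run;
```

Unlike the ANYDIGIT idea, this does not depend on where digits appear in the name, only on the number of underscores at the end.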


Reeza · Posted 10-16-2017 04:02 PM

Use ANYDIGIT to find the location of the first digit.

Use SUBSTR to then extract the part desired.

index = anydigit(string);

string_want = substr(string, 1, index-2); *subtracting 2 drops the first digit and the underscore before it, assuming no digits appear earlier in the name, so test it on your data;
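
As a minimal sketch, the two statements above can be wrapped in a complete DATA step and tried on both file names from the question. The variable name pos and the fallback branch are illustrative additions, and the step assumes the first digit in the name is the start of the timestamp.

```sas
data _null_;
   length string string_want $60;
   do string = 'Client_Product_Ash_HCP_Address_20171006030056_2803.txt',
               'Client_Product_Ash_HCP_Contact_20171006030058_2870.txt';
      /* position of the first digit, taken to be the start of the timestamp */
      pos = anydigit(string);
      /* keep everything up to, but not including, the preceding underscore */
      if pos > 2 then string_want = substrn(string, 1, pos - 2);
      else string_want = string;   /* no usable digit found: keep the name as is */
      put string= / string_want=;
   end;
run;
```

If a client or product name could itself contain a digit, this approach would cut the name too early, so one of the delimiter-based solutions in this thread would be safer.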

mkeintz · Posted 10-16-2017 04:16 PM

You could use the SCAN function to build the text made up of the second-to-last and last "words" (where '_' is the word delimiter), then use TRANWRD to replace that text with a blank.

data _null_;
  txt="Client_Product_Ash_HCP_Address_20171006030056_2803.txt";
  drop_text='_'||catx('_',scan(txt,-2,'_'),scan(txt,-1,'_'));
  want=tranwrd(txt,trim(drop_text),' ');
  put (_all_) (= /);
run;

Ksharp · Posted 10-17-2017 08:19 AM

@data_null__ (John King) has already posted a wonderful solution, so I would like to offer another one.

data _null_;
   file = 'Client_Product_Ash_HCP_Contact_20171006030058_2870.txt';
   part = prxchange('s/[\d_]+$//',1,scan(file,1,'.'));
   put _all_;
run;

chandan_mishra · Posted 10-17-2017 03:19 PM

Hi @Ksharp

I have been trying to find out how PRXCHANGE works, but it seems really complicated. The syntax is:

prxchange( regular expression|id, occurrence, source )

I understand the occurrence and source parts, but how do I define the regular expression|id part?

Thanks

Chandan Mishra

Reeza · Posted 10-17-2017 03:23 PM

The regular expression syntax used by the PRX functions comes from Perl rather than from SAS. It is so useful that many languages now implement it.

The Perl documentation and an online regex builder are here:

https://perldoc.perl.org/perlre.html

https://regex101.com/

Ksharp · Posted 10-18-2017 08:37 AM

's/[\d_]+$//'

This means: replace a run of one or more digit or underscore characters, [\d_]+, at the end of the string (the $ anchor) with nothing, that is, an empty string.
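
On the "regular expression|id" part of the syntax: the first argument can be either the pattern text itself, as in the solution above, or a numeric id returned by PRXPARSE for a pattern compiled earlier. A minimal sketch of the id form follows; the variable name re is made up for the example.

```sas
data _null_;
   length file part $60;
   retain re;
   /* compile the substitution once and keep its numeric id */
   if _n_ = 1 then re = prxparse('s/[\d_]+$//');
   file = 'Client_Product_Ash_HCP_Contact_20171006030058_2870.txt';
   /* pass the id instead of the pattern text; 1 = apply the change once */
   part = prxchange(re, 1, scan(file, 1, '.'));
   put part=;
run;
```

Compiling with PRXPARSE mainly pays off when the same pattern is applied to many observations. For a one-off call, passing the pattern text directly is simpler.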