Hello:
I have the following data. I would like to remove all of the underscores ('_') and move the number in each string to the end. Please help. Thanks.
data have;
input Names$100.;
cards;
lab_den_pcr_perf_1
mhh_cyto_tests_1__cfdna
nad_cert_30
nad_imag_find2___abn_cort_gyr
;
run;
The result I am looking for is
labdenpcrperf1
mhhcytotestscfdna1
nadcert30
nadimagfindabncortgyr2
You can use the COMPRESS function to remove certain characters from a string. I'm not sure what you mean by "assign what the number in the string at the end". Do you want the number of occurrences of "_" at the end of each string?
Providing a "desired result" would help us to better understand what you're after.
The following code will populate a numeric variable with a number if digits only are found as the last element of your source string (with underscore separating the "elements").
data have;
input Names :$100.;
cards;
lab_den_pcr_perf_1
mhh_cyto_tests_1__cfdna
nad_cert_30
nad_imag_find2___abn_cort_gyr
;
run;
data want;
set have;
no_at_end_of_str=input(scan(names,-1,'_'),?? best32.);
run;
Something like:
data want;
set have;
names=cats(compress(names,"_","ka"),compress(names,"_","kd"));
run;
Specifying the k-modifier in the compress function keeps the "_" instead of removing it.
I also think he wants the numbers at the end of the string instead of the beginning.
So modifying your code, I think this will work:
data want2;
set have;
names=cats(compress(names,"_","d"), compress(names,"_","a"));
run;
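To see the difference the k modifier makes, here is a minimal sketch that runs COMPRESS on one of the sample values with and without it. The variable names (s, remove_us, and so on) are made up for illustration and are not part of anyone's posted code.

```sas
/* Sketch: compare COMPRESS with and without the k (keep) modifier */
data _null_;
length s remove_us keep_alpha keep_digits us_and_alpha $40;
s = 'nad_imag_find2___abn_cort_gyr';
remove_us    = compress(s, '_');        /* no modifier: remove the listed character */
keep_alpha   = compress(s, , 'ka');     /* k + a: keep letters only */
keep_digits  = compress(s, , 'kd');     /* k + d: keep digits only */
us_and_alpha = compress(s, '_', 'ka');  /* k + a with '_' listed: keep letters AND underscores */
put remove_us= / keep_alpha= / keep_digits= / us_and_alpha=;
run;
```

The last assignment is the case discussed above: with k in effect, the characters listed in the second argument are added to the set that is kept, so the underscores survive.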
Actually, my code is fine; just remove the "_", which I added in haste:
data want;
set have;
names=cats(compress(names,"","ka"),compress(names,"","kd"));
run;
Here's a way:
data want;
set have;
prefix = compress(names, '_0123456789');
suffix = compress(names, , 'kd' );
names = strip(prefix) || suffix;
drop prefix suffix;
run;
One caution: If there are multiple sets of digits, all of them get combined and put at the end. For example:
abc1_xyz2_def becomes abcxyzdef12
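A quick way to check that caution is to run the same prefix/suffix logic on the single value from the example. This is only a sketch; the literal test value and the result variable name are chosen here for illustration.

```sas
/* Sketch: a value with two separate digit runs */
data _null_;
length names result $100;
names  = 'abc1_xyz2_def';
prefix = compress(names, '_0123456789');  /* letters only: underscores and digits removed */
suffix = compress(names, , 'kd');         /* every digit in the value, in original order */
result = strip(prefix) || suffix;
put result=;                              /* both digit runs end up together at the end */
run;
```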
Like this?
data want;
set have;
n_position= anydigit(names,1);
number= compress(substr(names,n_position),,"kd");
names_want= cats(compress(names,'_',"d"),number);
keep names_want;
run;
Something like this with a regex:
data want;
set have;
name = prxchange('s/_|([^\d]+$)//', -1, names);
run;
I'm starting to understand what prxchange does.
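Note that the pattern above deletes the underscores and any trailing run of non-digits, so for a value such as mhh_cyto_tests_1__cfdna the text after the digits is dropped rather than kept. For comparison, here is a hedged sketch of a regex that reorders instead of deleting. It assumes each value contains exactly one run of digits; the dataset name want_rx and variable names_want are illustrative.

```sas
/* Sketch: strip underscores first, then move the single digit run to the end */
data want_rx;
set have;
length names_want $100;
/* TRIM drops trailing blanks so that $ anchors at the last real character. */
/* Captures: $1 = text before the digits, $2 = the digits, $3 = text after. */
names_want = prxchange('s/^(\D*)(\d+)(\D*)$/$1$3$2/', 1, trim(compress(names, '_')));
run;
```

If a value has no digits, or more than one separate run of digits, the pattern does not match and the value comes back with only its underscores removed.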
The following code returns the desired output as you've posted.
Based on your narrative I'm not sure if this really is what you're after. If not then please post some additional sample data where it's not working for you.
data have;
input Names :$100.;
cards;
lab_den_pcr_perf_1
mhh_cyto_tests_1__cfdna
nad_cert_30
nad_imag_find2___abn_cort_gyr
;
run;
data want;
set have;
length names_want $100;
names_want=cats(compress(Names,'_','d'),compress(Names,,'kd'));
run;