pavank
Quartz | Level 8
data dsn;
input ID $ ;
datalines;
aBC.L
ABCa.L
cDE.L
BDEna.L
bNE.L
HDF.L
;
run;


data dsn_output;
  set dsn;
  new_id = compress(id, , 'ku');
run;

proc print noobs;
run;

Required output, without using a REGEX method:

ABC.L
ABC.L
CDE.L
BDE.L
BNE.L
HDF.L

Accepted Solutions
Kurt_Bremser
Super User

Make sure that the first three characters are uppercase, then use COMPRESS with the correct modifier:

data dsn;
input ID $;
new_id = compress(upcase(substr(id,1,3)) !! substr(id,4),,'kup');
datalines;
aBC.L
ABCa.L
cDE.L
BDEna.L
bNE.L
HDF.L
;
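
A step-by-step sketch of the expression above may help show why the original attempt with 'ku' fell short: that modifier combination keeps only uppercase letters, so it drops both the dot and a lowercase first letter (aBC.L would become BCL). Adding p keeps punctuation as well, and uppercasing the first three characters beforehand protects them from removal. The intermediate variables head, tail, and joined are illustrative names, not part of the original answer, and the sketch assumes every ID starts with at least three letters.

```sas
/* Sketch only: the same logic as the one-line expression, split into steps. */
/* head, tail, and joined are hypothetical helper variables.                 */
data steps;
  input ID $;
  length head $ 3 tail $ 8 joined $ 8 new_id $ 8;
  head   = upcase(substr(id, 1, 3));    /* first three characters, uppercased   */
  tail   = substr(id, 4);               /* everything from position 4 onward    */
  joined = head !! tail;                /* same concatenation as in the answer  */
  new_id = compress(joined, , 'kup');   /* k = keep, u = uppercase letters,     */
                                        /* p = punctuation (keeps the dot)      */
datalines;
aBC.L
ABCa.L
cDE.L
BDEna.L
bNE.L
HDF.L
;
run;

proc print data=steps noobs;
run;
```

For aBC.L this gives head=ABC, tail=.L, and new_id=ABC.L; for BDEna.L it gives head=BDE, tail=na.L, and new_id=BDE.L.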

6 Replies
PaigeMiller
Diamond | Level 26

You are not using REGEX. I don't understand what you want.

--
Paige Miller
pavank
Quartz | Level 8

Hi @PaigeMiller  

I am trying to get the output below using ordinary SAS functions:

ABC.L
ABC.L
CDE.L
BDE.L
BNE.L
HDF.L

PaigeMiller
Diamond | Level 26

I showed in your last post how to pull strings apart and make some of it uppercase. It is essentially the same here, except you seem to want (but you haven't explicitly said so) to find lowercase letters. How can you do that? You would use the FINDC function with the modifier 'L'. See if you can combine what I did in your last post with the FINDC function.

Regarding my phrase above, "but you haven't explicitly said so": I'd like to see you explain more of the problem, rather than writing as few words as possible. Write complete, clear explanations. Be generous with words, be generous with your explanations, be generous with information. It's not enough to show us the input and the output that you want; you need to explain in words what you want. Please start doing that.

Still unexplained is: are there always 3 letters that will be capitalized, then a dot, and then another capital letter? Or can there be more than 3 letters before the dot? Please explain that part as well.

--
Paige Miller
Tom
Super User

What is the rule?  This rule works for your example data.  Take the first three characters, uppercase them and append .L.

data dsn;
  input ID $ ;
  new_id=catx('.',upcase(substr(id,1,3)),'L');
datalines;
aBC.L
ABCa.L
cDE.L
BDEna.L
bNE.L
HDF.L
;

Result

OBS      ID       new_id

 1     aBC.L      ABC.L
 2     ABCa.L     ABC.L
 3     cDE.L      CDE.L
 4     BDEna.L    BDE.L
 5     bNE.L      BNE.L
 6     HDF.L      HDF.L

pavank
Quartz | Level 8

Hi @Tom

Thank you very much for your solution.
