BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
RohanaBruker
Fluorite | Level 6

I have a data set with initials in the following format: last,firstmiddle (also pictured below)

 

 

I am trying to format the "Initials" variable so that the initials are displayed first middle last (FML). For example, the second observation should show DNJ instead of J,DN. I also want to leave the "-" symbol for any missing middle initials ( as seen in observation 1).

 

Any feedback would be appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
dthompsonada
Obsidian | Level 7

Hi Rohana-

Based on what we have actually covered in class, I adapted this from the Cancer RCT code example. It appears to get what we want for the initials (FML). I can't find an actual FML function:

IF MISSING(Initials) = 0 THEN
Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
SUBSTR(UPCASE(Initials), 4, 1),
SUBSTR(UPCASE(Initials), 1, 1)
);
ELSE Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
'-',
SUBSTR(UPCASE(Initials), 1, 1)
);
RUN;

Obs SSN City ZipCd Inits StateCd12345

SSN City Zip Inits  
         
         
         
         

Note: I don't have all of the code for this dataset done, so it is not sorted by SSN yet. Which means that JGS is not first. 

I hope that helps! 

 

View solution in original post

8 REPLIES 8
ballardw
Super User

If all of the values with a comma are in a L,FM layout then this would work. If they are actually in a different order you will have to provide something that lets us know what the different order would be and how to know when.

Perhaps this can get you started.

data want;
   set have;
   if index(initials,',')>0 then initials =cats(scan(initials,2),scan(initials,1));
run;

I'm sure someone will show a regular expression approach as well.

SASJedi
SAS Super FREQ

I love the SCAN solution for its simplicity and speed. But to ensure you preserve the '-' int the original values (indicating that there is no initial in that position) you need to specify comma as the only delimiter for SCAN. This should do the trick:

data want;
   set have;
   /* Split Initials into 2 "words" using SCAN with only comma as delimiter */
   /* Then concatenate the "words" in the desired order */
   initials=cats(scan(initials,-1,','),scan(initials,1,','));
run;

  

Check out my Jedi SAS Tricks for SAS Users
CurtisMackWSIPP
Lapis Lazuli | Level 10

It can be done with FCMP.

 

data have;
  infile datalines;
  length Initials $6;
  input Initials $;
datalines;
M,T-
J,DN
R,JJ
G,MG
F,BC
;
run;

proc fcmp outlib=work.functions.fun  ;
    function InitFLip(string $) $;
      length r $4;
      r = catt(scan(string,2,','),scan(string,1,','));
      return (r);
    endsub;
run;
 
options cmplib=work.functions;    
proc format;
  value $ flip (default=4) other=[InitFLip()];
run;

data want;
  set have;
  format Initials $flip.;
run;
dthompsonada
Obsidian | Level 7

Hi Rohana-

Based on what we have actually covered in class, I adapted this from the Cancer RCT code example. It appears to get what we want for the initials (FML). I can't find an actual FML function:

IF MISSING(Initials) = 0 THEN
Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
SUBSTR(UPCASE(Initials), 4, 1),
SUBSTR(UPCASE(Initials), 1, 1)
);
ELSE Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
'-',
SUBSTR(UPCASE(Initials), 1, 1)
);
RUN;

Obs SSN City ZipCd Inits StateCd12345

SSN City Zip Inits  
         
         
         
         

Note: I don't have all of the code for this dataset done, so it is not sorted by SSN yet. Which means that JGS is not first. 

I hope that helps! 

 
RohanaBruker
Fluorite | Level 6

Thank you!!

dthompsonada
Obsidian | Level 7
Confirmed this code with Laura today!
andreas_lds
Jade | Level 19

Thanks @CurtisMackWSIPP for providing data in usable form. @RohanaBruker please post data as data step using datalines, not as screenshot, not only to provide something we can actually work with, but to make it easier to read your post, too.

 

And here is a solution using a regular expression:

data want;
   set have;

   length fml $ 3;
   
   fml = prxchange('s/([A-Z]),(.+)/$2$1/', 1, trim(Initials));
run;
kirstinpruitt
Calcite | Level 5

Hello Rohana!

 

You can also use the following to code:

Inits = CATS(SUBSTR(Initials,3,2),SUBSTR(Initials,1,1));

The dashes for missing will remain in place for this dataset. 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1188 views
  • 5 likes
  • 7 in conversation