Solved: Re: How to use FML initial format?

RohanaBruker · Posted 10-26-2020 05:30 PM

I have a data set with initials in the following format: last,firstmiddle (also pictured below)

I am trying to format the "Initials" variable so that the initials are displayed first middle last (FML). For example, the second observation should show DNJ instead of J,DN. I also want to leave the "-" symbol for any missing middle initials ( as seen in observation 1).

Any feedback would be appreciated!

dthompsonada · Posted 10-26-2020 11:08 PM

Hi Rohana-

Based on what we have actually covered in class, I adapted this from the Cancer RCT code example. It appears to get what we want for the initials (FML). I can't find an actual FML function:

IF MISSING(Initials) = 0 THEN
Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
SUBSTR(UPCASE(Initials), 4, 1),
SUBSTR(UPCASE(Initials), 1, 1)
);
ELSE Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
'-',
SUBSTR(UPCASE(Initials), 1, 1)
);
RUN;

Obs SSN City ZipCd Inits StateCd12345

SSN	City	Zip	Inits

Note: I don't have all of the code for this dataset done, so it is not sorted by SSN yet. Which means that JGS is not first.

I hope that helps!

View solution in original post

ballardw · Posted 10-26-2020 06:35 PM

If all of the values with a comma are in a L,FM layout then this would work. If they are actually in a different order you will have to provide something that lets us know what the different order would be and how to know when.

Perhaps this can get you started.

data want;
   set have;
   if index(initials,',')>0 then initials =cats(scan(initials,2),scan(initials,1));
run;

I'm sure someone will show a regular expression approach as well.

SASJedi · Posted 10-27-2020 07:11 AM

I love the SCAN solution for its simplicity and speed. But to ensure you preserve the '-' int the original values (indicating that there is no initial in that position) you need to specify comma as the only delimiter for SCAN. This should do the trick:

data want;
   set have;
   /* Split Initials into 2 "words" using SCAN with only comma as delimiter */
   /* Then concatenate the "words" in the desired order */
   initials=cats(scan(initials,-1,','),scan(initials,1,','));
run;

Check out my Jedi SAS Tricks for SAS Users

CurtisMackWSIPP · Posted 10-26-2020 06:58 PM

It can be done with FCMP.

data have;
  infile datalines;
  length Initials $6;
  input Initials $;
datalines;
M,T-
J,DN
R,JJ
G,MG
F,BC
;
run;

proc fcmp outlib=work.functions.fun  ;
    function InitFLip(string $) $;
      length r $4;
      r = catt(scan(string,2,','),scan(string,1,','));
      return (r);
    endsub;
run;
 
options cmplib=work.functions;    
proc format;
  value $ flip (default=4) other=[InitFLip()];
run;

data want;
  set have;
  format Initials $flip.;
run;

dthompsonada · Posted 10-26-2020 11:08 PM

Hi Rohana-

Based on what we have actually covered in class, I adapted this from the Cancer RCT code example. It appears to get what we want for the initials (FML). I can't find an actual FML function:

IF MISSING(Initials) = 0 THEN
Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
SUBSTR(UPCASE(Initials), 4, 1),
SUBSTR(UPCASE(Initials), 1, 1)
);
ELSE Inits = CATS( SUBSTR(UPCASE(Initials), 3, 1), /* 3.3 */
'-',
SUBSTR(UPCASE(Initials), 1, 1)
);
RUN;

Obs SSN City ZipCd Inits StateCd12345

SSN	City	Zip	Inits

Note: I don't have all of the code for this dataset done, so it is not sorted by SSN yet. Which means that JGS is not first.

I hope that helps!

RohanaBruker · Posted 10-27-2020 05:13 PM

Thank you!!

dthompsonada · Posted 10-27-2020 09:10 PM

Confirmed this code with Laura today!

andreas_lds · Posted 10-27-2020 02:01 AM

Thanks @CurtisMackWSIPP for providing data in usable form. @RohanaBruker please post data as data step using datalines, not as screenshot, not only to provide something we can actually work with, but to make it easier to read your post, too.

And here is a solution using a regular expression:

data want;
   set have;

   length fml $ 3;
   
   fml = prxchange('s/([A-Z]),(.+)/$2$1/', 1, trim(Initials));
run;

kirstinpruitt · Posted 11-05-2020 11:08 AM

Hello Rohana!

You can also use the following to code:

Inits = CATS(SUBSTR(Initials,3,2),SUBSTR(Initials,1,1));

The dashes for missing will remain in place for this dataset.

Ready to join fellow brilliant minds for the SAS Hackathon?