ChrisNZ
Tourmaline | Level 20

I finally bit the bullet and wrote the concatenation function that I needed. 

This is a first draft, so there is surely plenty of room for improvement.

The function supports delimiters, formats, and quoting, can include missing values, and offers basic cleaning options such as upcase, lowcase, propcase, compbl, and removing non-printable characters.

Use as you see fit.

proc fcmp outlib=WORK.MYFUNCS.STR;
  function concat(
        DLM     $ /* Delimiter string to insert between concatenated strings                 */
                  /*   The delimiter string is only used if option D is found                */
      , QUOTE   $ /* Character(s) used to surround concatenated strings                      */
                  /*   Single and double quotes can be used                                  */
                  /*   Brackets ( [ { < > } ] ) can be used. In this case:                   */
                  /*     Opening brackets ( [ { < are inserted at the start                  */
                  /*     Closing brackets > } ] ) are inserted at the end                    */
                  /*   Multiple characters can be used, except for single and double quotes  */
                  /*   The quoting string is only used if option Q is found                  */
      , FORMAT  $ /* Name of the character format to apply                                   */
                  /*   The format is only used if option F is found                          */
      , OPTIONS $ /* Options driving concatenation. These letters are recognised:            */
                  /*   Q Use quoting character(s) if provided                                */
                  /*   F Use format if provided                                              */
                  /*   D Use delimiter in concatenated string                                */
                  /*   M Include missing string in concatenated string                       */
                  /*   T Trim string before adding to concatenated string                    */
                  /*   S Strip string before adding to concatenated string                   */
                  /*   C Compbl string before adding to concatenated string                  */
                  /*   U Upcase string before adding to concatenated string                  */
                  /*   L Lowcase string before adding to concatenated string                 */
                  /*   P Propcase string before adding to concatenated string                */
                  /*   N Clean non-printable characters before adding to concatenated string */
      , A[*]    $ /* Character array containing the strings to concatenate                   */
      )
      $ 32767;
    length RESULT S $32767 Q1 Q2 OPT FMT $32 L LF 8;
    RESULT = 'a';                         
    OPT    = upcase(OPTIONS);                                         
    %* Remove request to format if no format is supplied, and set dummy format so ifc() is happy ;
    if ^index(OPT,'F') | (index(OPT,'F') & FORMAT=' ') then do; OPT=compress(OPT,'F'); FMT='$1.'; end; else FMT=FORMAT;
    %* Set surrounding (quoting) characters ;                                              
    if index(OPT,'Q') & lengthn(QUOTE ) then do;
      if      QUOTE in:('(',')') then do; Q1=repeat('(',length(QUOTE)-1); Q2=repeat(')',length(QUOTE)-1); end;
      else if QUOTE in:('{','}') then do; Q1=repeat('{',length(QUOTE)-1); Q2=repeat('}',length(QUOTE)-1); end;
      else if QUOTE in:('[',']') then do; Q1=repeat('[',length(QUOTE)-1); Q2=repeat(']',length(QUOTE)-1); end;
      else if QUOTE in:('<','>') then do; Q1=repeat('<',length(QUOTE)-1); Q2=repeat('>',length(QUOTE)-1); end;
      else if index(QUOTE,'"')   then do; Q1='"'   ; Q2='"'   ; end; 
      else if index(QUOTE,"'")   then do; Q1="'"   ; Q2="'"   ; end; 
      else do;                            Q1=QUOTE ; Q2=QUOTE ; end;  
    end;                                                                

    %* Get variable length. Does not go under 8;
    S=A[1]||'a'; L=length(S)-1; 
    %* Loop through the values and append;
    do I = 1 to dim(A);
      S=ifc(index(OPT,'C'), compbl  (trim(A[I])  ), trim(A[I]));
      S=ifc(index(OPT,'U'), upcase  (trim(S   )  ), trim(S   ));
      S=ifc(index(OPT,'L'), lowcase (trim(S   )  ), trim(S   ));
      S=ifc(index(OPT,'P'), propcase(trim(S   )  ), trim(S   ));
      if index(OPT,'F') then do; S=putc(substrn(S,1,L),FMT) ; LF=length(S); end;
   
      if lengthn(S) | index(OPT,'M') then
      RESULT = ifc(index(OPT,'T'), trim(substrn(RESULT,1,length(RESULT)-1)), substrn(RESULT,1,length(RESULT)-1)) %* Add previous RESULT: Trim it if needed, otherwise use the length saved ;
             || ifc( I=1 | ^index(OPT,'D'), '', ifc(lengthn(DLM), DLM, ' '))                                     %* Add delimiter if needed                                                ;
             || ifc(index(OPT,'Q') & QUOTE in:('"',"'")                                                          %* Add quote character(s), only if option Q is set                        ;
               , quote( ifc(index(OPT,'S'), strip(S)                                                             %* If double or single quote, use quote function                          ;
                      , ifc(index(OPT,'T'), trim(S ), substrn(S,1,max(L,LF)) ))                                  %*    Strip or trim if needed                                             ; 
                 , QUOTE)
               , trimn(Q1)                                                                                       %* If not double or single quote, concatenate manually                    ;
               || ifc(index(OPT,'S'), strip(S)                                                                   %*    Strip or trim if needed                                             ; 
                , ifc(index(OPT,'T'), trim(S ), substrn(S,1,max(L,LF)) ))
               || trimn(Q2) 
               )
             || 'a';                                                                                             %* Add end-of-string marker                                               ; 
    end;
                    
    return (substr(RESULT,1,length(RESULT)-1));                                      
  endsub;

run;

options cmplib = WORK.MYFUNCS  ;  

data TEST;
  A = 'orange         ';
  B = 'strawberry     ';
  C = '               ';
  D = 'pear           ';
  E = 'custard   apple';
  F = '"paul''s" berry ';

  array ARR[*] A B C D E F ;

  length S $ 240;
  S = concat('|'  , '{'  , '$revers20.', 't     ', ARR); output;
  S = concat('|'  , '{'  , '$revers20.', 'd     ', ARR); output;
  S = concat('|'  , '{'  , '$revers20.', 'f     ', ARR); output;
  S = concat('|'  , '{'  , '$revers20.', 'q     ', ARR); output;
  S = concat('|'  , '{'  , '$revers20.', 'm     ', ARR); output;
  S = concat('|'  , '{'  , '$revers20.', 'u     ', ARR); output;
  S = concat('-|-', ']]]', '$hex20.   ', 'tdfqmu', ARR); output;
  S = concat(' | ', '{'  , '$8.       ', 'tdfqmp', ARR); output;
  S = concat(' | ', '"'  , '$8.       ', 'qdtu  ', ARR); output;  
run;
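
Option N is listed among the options, but the draft body above does not appear to act on it yet. A minimal sketch of what that cleaning step could look like, shown here in a plain DATA step rather than inside the function; the variable names DIRTY and CLEAN are made up for illustration, and the COMPRESS modifiers `k` (keep) and `w` (printable characters) are one possible way to do it, not necessarily what the author intended:

```sas
data _null_;
  length DIRTY CLEAN $20;
  /* Build a test string containing a tab ('09'x) and a line feed ('0A'x) */
  DIRTY = cats('ab', '09'x, 'cd', '0A'x);
  /* Keep (k) only printable characters (w); everything else is dropped */
  CLEAN = compress(DIRTY, , 'kw');
  put CLEAN=;
run;
```

Inside the function, the same call could presumably be applied to S alongside the other cleaning steps when `index(OPT,'N')` is true.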


 

Oligolas
Barite | Level 11

Hi, you've seen LOTR too often; remember, Sauron loses it all in the end.

Just joking.

Well, thank you for sharing. It's interesting since you're using FCMP.

________________________

- Cheers -

ballardw
Super User

@Oligolas wrote:

Hi, you've seen LOTR too often,


And that's a bad thing??? 😲 Perhaps just read too often. I know I read the books at least 25 times before the movies came out (annual summer activity for a while).

ChrisNZ
Tourmaline | Level 20

Of course, FCMP's limitations mean this function cannot be as convenient as a SAS Institute-issued function. The main restrictions are: inputs must be strings*, all strings must have the same length, and an array must be built before calling the function. A missing feature is an option V to enable or disable the use of the format associated with each variable, instead of applying one global format to all variables. This too could only be done by a native function.
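
Until something like that exists, one possible workaround on the calling side is to copy each variable's formatted value into same-length character variables with VVALUE() and pass those to the function. This is only a sketch under assumptions: it uses SASHELP.CLASS as sample data, the names STR1-STR3, STRS and LINE are made up, and it assumes the concat function above has already been compiled into WORK.MYFUNCS.

```sas
options cmplib = WORK.MYFUNCS;

data WANT;
  set SASHELP.CLASS;
  /* Same-length character copies, since the array argument needs them */
  length STR1-STR3 $20 LINE $100;
  array STRS[*] STR1-STR3;
  /* vvalue() returns each value formatted with the variable's own format */
  STR1 = vvalue(NAME);
  STR2 = vvalue(AGE);
  STR3 = vvalue(HEIGHT);
  /* D = insert the delimiter, S = strip each value; QUOTE and FORMAT are left blank */
  LINE = concat(',', ' ', ' ', 'ds', STRS);
  drop STR1-STR3;
run;
```

The array still has to be built by hand, so this only addresses the per-variable format and the strings-only restriction, not the convenience gap.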

 

In the meantime this serves its purpose for me. 🙂

 

* One can easily write a very similar function for numbers only, but the limitations remain. I'll write it nonetheless, since the default BEST. format of the CAT functions is not always suitable.
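
To illustrate the point about the default format, here is a small sketch comparing CATX with numeric arguments passed directly against the same values converted with explicit formats first. The variable names and the formats chosen (6.4 and COMMA20.) are arbitrary examples:

```sas
data _null_;
  X = 1/3;
  Y = 123456789012345;
  /* Numeric items are converted by CATX with its default BEST. format */
  DEFAULT = catx('|', X, Y);
  /* Converting with put() first gives control over the representation */
  CHOSEN  = catx('|', put(X, 6.4), put(Y, comma20.));
  put DEFAULT= / CHOSEN=;
run;
```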

ChrisNZ
Tourmaline | Level 20

@Kurt_Bremser  The last SASGF I attended was in San Francisco. @ScottBass has invited me to speak a few times since, but I always declined. Maybe I'll do that next time. Thank you for your encouragement. 🙂
