BookmarkSubscribeRSS Feed
lisahoward
Calcite | Level 5

Hi everyone,

Does anyone know how i can "macrofy" the following code please?  Basically I am adding zeros back on to a variable. that had them stripped out from the variable code when I had to convert it to a numeric for a prior data step.

DATA wanta;

SET want;

check_for_005=find(code,'00000');

check_for_004=find(code,'0000');

check_for_003=find(code,'000');

check_for_002=find(code,'00');

check_for_001=find(code,'0');

run;

data wantaa;

  set wanta;

if check_for_001=1 and check_for_002=0 and check_for_003=0 and check_for_004=0 and check_for_005=0 then do codefinal=substr(code,1,1.);end;else;

if check_for_002=1 and check_for_003=0 and check_for_004=0 and check_for_005=0 then do codefinal=substr(code,1,2.);end;else;

if check_for_003=1 and check_for_004=0 and check_for_005=0 then do codefinal=substr(code,1,3.);end;else;

if check_for_004=1 and check_for_005=0 then do codefinal=substr(code,1,3.);end;

run;

data wantaaa;

  set wantaa;

cat=trim(left(codefinal))||trim(left(codeall));

run;

Many thanks!!

9 REPLIES 9
Amir
PROC Star

Hi,

I'm not sure that I understand your requirement. I don't see which of the "code..." variables you're using is numeric.

If you want to convert a character value to a numeric value in a data step then you can use something like:

num_var=input(char_var,8.);

Are the zeros you're adding leading or trailing?

Please provide some example data of what you have and what you want.

Regards,

Amir.

Amir
PROC Star

Hi,

Is this a double post of ?

Regards,

Amir.

lisahoward
Calcite | Level 5

Hi Amir,

Sorry for my delay in getting back to you.  Here is my program as it stands.  I just wanted to know if I could "marcrofy" some bits of it as it quote long winded.

*****Notes ***Import Excel worksheet contain code ranges eg - Excel Column called  CODE

                                                                                                                                                          001-010

                                                                                                                                                          050-055

                                                                                                                                                          1000-1100    and so on.

libname Codes excel "Codelist_LH.xlsx"

  header = yes

  mixed = yes;

run;

data Code_Range;

  set Codes.'List$'n;

  where Lvl1Group ne 'Exclusion';

run;

*****Read the Variable containing the individual code or code range that has been imported  and list out each code individually in a new Variable called INDIV_CODE;

data Code_Range1 (drop=BeginCode EndCode);

     set Code_Range ;

     BeginCode = input(scan(Code,1),best10.);

     EndCode = input(scan(Code,2),best10.);

      if missing(EndCode) then do Indiv_Code=input(Code,best10.);output ;end;

        else do Indiv_Code = BeginCode to Endcode;

          output ;

     end;

run;

*****Identify number of leading Zeros associated with original variable CODE to later assign to numeric variable INDIV_CODE ;

Data Code_Zero;

    set Code_Range1;

     Check_for_005=find(code,'00000');

     Check_for_004=find(code,'0000');

     Check_for_003=find(code,'000');

     Check_for_002=find(code,'00');

     Check_for_001=find(code,'0');

run;

*****Take the correct number of leading zeros per code and create a variable CODEFINAL containing the correct number of zeros ;

data Code_Zero1;

    set Code_Zero;

     if Check_for_001=1 and Check_for_002=0 and Check_for_003=0 and Check_for_004=0 and Check_for_005=0 then do Codefinal=substr(Code,1,1.);end;else;

     if Check_for_002=1 and Check_for_003=0 and Check_for_004=0 and Check_for_005=0 then do Codefinal=substr(Code,1,2.);end;else;

     if Check_for_003=1 and Check_for_004=0 and Check_for_005=0 then do Codefinal=substr(Code,1,3.);end;else;

     if Check_for_004=1 and Check_for_005=0 then do Codefinal=substr(Code,1,3.);end;

run;

*****Concatenate the variable CODEFINAL (contains zeros) and INDIV_CODE contains original code minus leading zeros;

data Num_Code (drop= check_for_001 check_for_002 check_for_003 check_for_004 check_for_005 Indiv_Code codefinal) ;

    set Code_Zero1;

     Numcode=trim(left(codefinal))||trim(left(Indiv_Code));

run;

*****Concatenate Character Prefix or Suffix to the variable Numcode ;

data Char_Code;

    set Num_Code;

     if missing(Prefix_) and missing(Suffix) then do Complete_Code=Numcode;end; else;

     if Prefix_ ne " " then do Complete_Code=trim(left(Prefix_))||trim(left(Numcode));end; else

     if Suffix ne " " then do Complete_Code=trim(left(Numcode))||trim(left(Suffix));end;

Run;

*****Drop Prefix and Suffix Columns and rename Complete_Code;

Data Final_Code (drop=numcode prefix_ suffix Complete_Code);

  set Char_Code;

  code= Complete_Code;

Run;

Reeza
Super User

Are you familiar with the Z format?

If you use input with the Z8. format for example you can convert number to have a specific about of leading zeros.

ie

data have;

input number;

cards;

8

28

456

3434343

343

26

6345

665

7572

;

run;

data want;

set have;

padded_number=put(number, z8.);

run;


lisahoward
Calcite | Level 5

Unfortunately the variable code is differing in length and number of leading zeros.  Would z8. not make code 8 characters long for every instance i have a variety of codes 050 50 0050 etc and the number of leading zeros is very important it has to be exact.

Reeza
Super User

You realize your find statement won't differentiate between

0500 and 0050? 

If you know the length of the original variable you can make it more flexible by using the Z||Length in combination with a putc or putn function. 

Also, why do this, in the previous step, why not create a new variable to do the analysis on and keep the old variable?

Peter_C
Rhodochrosite | Level 12

data Code_Range1 ;

     set Code_Range ;

*    BeginCode = input(scan(Code,1),best10.);

     Begin_code = scan( code,1) ;

       end_code = scan( code, -1) ;

     len_indiv =max(length( begin_code), length(end_code) ) ;

do Indiv_C= Begin_Code to End_code;

Length indiv_code $15 ;

        Indiv_code = putn( indiv_c, 'Z', len_indiv ) ;

          output ;

     end;

*Drop begin_code end_code len_indiv indiv_c ;

run;

 

Why make life difficult needlessly? Leave begin_code and end_code as strings.

Work out what output length is appropriate.

The code above assumes the wider of the begin/end pair.

Using SCAN(code, -1) for END provides the first when there is only one code. If course you might have more than two numbers. In that case the "-1" scans for the last of the three.

The PUTN() function allows the result length (format width) to be defined by the 3rd parameter of PUTN(). There you use the length of your input strings. Ensure you allow wide enough room. The original code used 10 when making those first conversions with an INPUT() function but you could make it any useful value up to 15.

 

I know this is not polishing your macro learning but I think it might be easier to maintain and hope it helps.

peterC

p.s.

I know the automatic conversion of strings begin_code and end_code will generate NOTEs in the saslog, but leave out the input() syntax for clarity.

lisahoward
Calcite | Level 5

Thanks so much Peter for your help I have updated my program with your improved suggestions.  I have a quick question about the  code below.  Is it possible to adapt it to look for ranges that have a character in them as well say 100E-150E or V300-V400.  at the moment I am stripping the prefix and suffix out in the Excel file and running the statement below and then concatenating the Character back on in another datastep.  Many thanks again for your brilliant answer.

data Code_Range1 ;

     set Code_Range ;

     Begin_code = scan( code,1) ;

       end_code = scan( code, -1) ;

     len_indiv =max(length( begin_code), length(end_code) ) ;

do Indiv_C= Begin_Code to End_code;

Length indiv_code $15 ;

        Indiv_code = putn( indiv_c, 'Z', len_indiv ) ;

          output ;

     end;

Drop begin_code end_code len_indiv indiv_c ;

run;

Peter_C
Rhodochrosite | Level 12

Lisa

where you have the variety of naming ranges you describe, just create a code block for each variety and if-then-else-if your way through

libname your (work) ;

data your.data ;

input code $char40. ;

list ; cards;

12345a-12347a

b432 -b434

2345

;

* not exhaustive testing ;

data results ;

set your.data ;

length indiv_code $20 ;

array part(3)  $20 ; * defining array of 3 $20 string vars named part1 - part3  ;

do i = 1 to 3 ;

    part(i) = scan( code, i) ;

end ;

if part3 ne ' ' then do ;

    * code block for handling  3+ parts in the code ;

end ;

else

if part2 = ' ' then do ;

    * code block to deal with single code ;

end ;

else do ;

    * handling CODE in 2 parts ;

    * first check for numeric-only strings ;

    if not verify( part1 !! part2, ' 1234567890' ) then do ;

    * code block for handling a pair of numeric strings as developed earlier ;

    end ;

    else do ;

        * handle CODE with parts that are not entirely numeric ;

        length c1-c4 $1  ;

        c1 = part1 ;  * first char of string ;

        c2 = part2 ;

         

        If verify( c1 !! c2, '1234567890' ) then do ;

            *handle non-numeric prefix ;

            if c1 = c2  then do ;

                * handle common prefix in part1 and 2 ;

                startNpos1 = anydigit( part1 ) ;

                startNpos2 = anydigit( part2 ) ;    

                prefix1 = substr( part1, 1, startNpos1 -1 ) ;

                prefix2 = substr( part2, 1, startNpos2 -1 ) ;

                if prefix1 ne prefix2 then do ;

                  * handle dis-similar prefixes !! ;

                end ;

                else do ;

                  * handle SAME prefixes !! ;

                    begin_code = substr( part1, startNpos1 ) ;

                    end_code   = substr( part2, startNpos2 ) ;

                    str_len = max( length( part1 ), length( part2) ) - length( prefix1 ) ;

                    do indiv_c = begin_code to end_code ;

                       indiv_code =  trim( prefix1 ) !! putN( indiv_c, 'Z', str_len ) ;

                       output ;

                    end ;

                end ;

              end /* prefix starts with same char */ ;

              else do ;

                    * how do you handle cases where prefix is not the same? ;

              end ;

        end /* part starts non-numeric */ ;

        else do ;

            * check for parts with non-numeric after the start

                       like 12345axx-12348axx ;

            * establish where suffix starts ;

            sufStart1 = verify( part1, '1234567890' ) ;

            sufStart2 = verify( part2, '1234567890' ) ;

            * establish suffix ;

            suf1 = substr( part1, sufStart1 ) ;

            suf2 = substr( part2, sufStart2 ) ;

            if suf1 ne suf2 then do ;

              * handle differing suffixes ;

            end ;

            else do ;

              * handle where suffixes are equal ;

                begin_code = substr( part1, 1, sufStart1 -1 ) ;

                end_code   = substr( part2, 1, sufStart1 -1 ) ;

                str_len = max( length( part1 ), length( part2) ) - length( suf1 ) ;

                do indiv_c = begin_code to end_code ;

                   indiv_code =  putN( indiv_c, 'Z', str_len ) !! suf1 ;

                   output ;

                end ;

            end ;

        end /* part contains non-numeric after the start */;

    end /* CODE not entirely numeric */ ;

end /* HANDLING CODE IN 2 PARTS */;

run ;   

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1614 views
  • 0 likes
  • 4 in conversation