BookmarkSubscribeRSS Feed
Alex_Peterson
Calcite | Level 5

I am creating variables within a macro %do loop inside a data step, kinda like this:

 

 

     %DO I = 1 %TO 5;
            NEW_VAR&I. = "Hello";
     %END;

 

This successfully creates variables named new_var1 through new_var5, all filled with “Hello.” What I’m trying to figure out is if you can do conditional formatting. For example:

 

 

      %DO I = 1 %TO 5;
            IF &I. < 5 THEN DO;
                  NEW_VAR&I. = "12/12/2010";  * This automatically makes the variables type CHAR;
            END;

            IF &I. = 5 THEN DO;
                  NEW_VAR&I. = 18608; * the number equivalent of the date I want;
                  FORMAT NEW_VAR&I. MMDDYY10.; * I want new_var5 to be a different format than the others;
            END;
      %END;

 

This gives me the error: “The format $MMDDYY was not found or could not be loaded.”

 

This seemingly would only arise if NEW_VAR5 was somehow already defined as CHAR.  It appears that NEW_VAR5 is already initialized deep in SAS somewhere with a CHAR format, which is causing me issues. I tried doing MACRO %IF statements instead, which actually does solve this problem, but the reason I don’t want to switch to MACRO IF statements is because functions like %ANYALPHA don’t exist in natural SAS macro functions (unlike %UPCASE(), %SUBSTR(), etc).

 

Any ideas for how to solve this problem? Thanks!

12 REPLIES 12
mkeintz
PROC Star

 

You have an ordinary data step IF test that you intend to dynamically assign formats.  But FORMAT statements are not "executable" in a data step.  So they are honored regardless of whether they are in an IF ... THEN DO ... structure.  So you can assign conflicting formats to a variable.

 

You need to use MACRO %IF ... %THEN %DO, as in:

%macro chk;
%DO I = 1 %TO 5;
            %IF &I. < 5 %THEN %DO;
                  NEW_VAR&I. = "12/12/2010";  * This automatically makes the variables type CHAR;
            %END;

            %IF &I. = 5 %THEN %DO;
                  NEW_VAR&I. = 18608; * the number equivalent of the date I want;
                  FORMAT NEW_VAR&I. MMDDYY10.; * I want new_var5 to be a different format than the others;
            %END;
%END;
%mend chk;
dm 'clear log';
options mprint;
data new;
  %chk ;
run;

If you set OPTIONS MPRINT prior to your data step, you would see relevant notes in the sas log.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Alex_Peterson
Calcite | Level 5

Thank you for this answer! Is there any other method to accomplish this (no matter how inefficient) ? As I mentioned, I am trying to avoid using macro %IF statements because I cannot do something like %IF ANYALPHA(variable) %THEN %DO;.  ANYALPHA is not a compatible function for MACRO %IF.

mkeintz
PROC Star

Show us how you want to use the ANYALPHA function.   Perhaps it can be successfully used (like most normal sas functions) when embedded in the macro %SYSFUNC() function.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Alex_Peterson
Calcite | Level 5

This is ultimately the type of thing I'm hoping to accomplish:

 

I have a data set called A1, with the variables VAR1 - VAR500. The variables are either dates or names, all in CHAR format.  

%MACRO test;
       DATA A2;
              set A1;
              %DO I = 1 %TO 500;
                     IF ANYALPHA(VAR&I.) THEN DO;
                            NEW_VAR&I. = VAR&I.;
                     END;
                     ELSE DO;
                            NEW_VAR&I. = input(VAR&I., MMDDYY10.);
                            FORMAT NEW_VAR&I. MMDDYY10.;
                     END;
              %END;
       RUN;
%MEND test;
Tom
Super User Tom
Super User

That doesn't make sense.  If you attach a format to a variable it is attached to the VARIABLE, not any particular VALUE that the variable might have on some individual observation. 

 

The only reason to try to do what it looks like your code is trying to do would to standardize the way the dates a represented in the character variable NEW_VAR&i. So values like '1/5/19' would get converted to '01/05/2019'.

 

new_var&i = var&i;
if 0 < lengthn(new_var&i) <=10 then
  if not missing(input(new_var&i,??mmddyy10.)) then
  new_var&i = put(input(new_var&i,mmddyy10.),mmddyy10.)
;

 

 

Alex_Peterson
Calcite | Level 5

Tom, I do see what you are saying, but I will need to make calculations on the variables that are dates, so they will need to be some numeric date format. For example, my data set could look like:

 

var1        var2              var3             var4          var5

Bill          12/12/2011    01/03/1987   Sean         Matt

Mike        01/01/1991   09/29/2017   Carol        Deborah

 

 

I want my program to go through each of these variables in a macro (with the var&I. in a %DO loop method), and when there is no letter detected in one of the values in that variable, to assign that variable the DDMMYY10 format. 

 

 

Tom
Super User Tom
Super User

@Alex_Peterson wrote:

Tom, I do see what you are saying, but I will need to make calculations on the variables that are dates, so they will need to be some numeric date format. For example, my data set could look like:

 

var1        var2              var3             var4          var5

Bill          12/12/2011    01/03/1987   Sean         Matt

Mike        01/01/1991   09/29/2017   Carol        Deborah

 

 

I want my program to go through each of these variables in a macro (with the var&I. in a %DO loop method), and when there is no letter detected in one of the values in that variable, to assign that variable the DDMMYY10 format. 

 

 


That is a totally different problem.  You will need to read the ENTIRE dataset and derive a flag for each variable, as a whole, to indicate what type you want to convert it into. Then use that information to generate the code to make the conversion.  You will need to make a decision what to do when 9 out of 10 observations for VAR2 look like dates.  Should the variable become a date? Or stay as a character string? What about when only 60% look like dates? 40%?

 

So before trying to make a MACRO to generate the code you want to run you need to figure what code you need to run.  

 

It might be easier to just dump the data to delimited text file and let PROC IMPORT do the guessing for you.

 

filename csv temp;
data _null_;
  set have ;
  file csv ;
  put var1-var5;
run;
proc import datafile=csv dbms=csv 
  out=want replace 
;
  getnames=no;
run;

 

 

 

Alex_Peterson
Calcite | Level 5

Tom, thanks for your reply. Assume the data is for sure fully date in DD/MM/YYYY style or fully names. So no checking required. In fact, assume I'm only reading the first observation to determine format assignments. The fundamental question is whether, in a MACRO, you can assign a different format to a dynamically created variable conditioned on whether or not the value in that variable contains a letter.

 

I know there are other ways I could change the format of these variables. For example, with PROC IMPORT, or just doing a new data step and writing something like FORMAT NEWVAR1-NEWVAR15  NEWVAR43-NEWVAR65 MMDDYY10.;

 

The question is whether you can accomplish this using IF THEN logic, and having the ANYALPHA function as the condition. The reason it needs to be in a MACRO is because I need the ability to create variables with the form NEWVAR&I. in a %DO loop.  

Tom
Super User Tom
Super User

Before you make a macro you need to make a program. Then generalize it so you can generate the program with a macro.

So if you are ok with only testing one observation it will be easier.

%let var1=0;
data _null_;
  set have (obs=1);
  if anyalpha(VAR1) then call symputx('VAR1',1);
run;

data want;
  set have ;
%if &var1 = 1 %then %do;
  new_var1 = var1 ;
%end;
%else %do;
  new_var1 = input(var1,mmddyy10.);
  format new_var1 mmddyy10.;
%end;
run;

I will leave it as an exercise for you how to extend this to multiple variables.

Reeza
Super User

Sounds like you're cleaning up data. 

 

What types are you current variables? What makes you think you need macro's at all here by the way, have data step functions not worked so far?

 

You cannot just assign a variable format by the way, you need to first convert it into a new variable and then apply a format to that data set. 

 

If your data is as clean as shown any reason to not use the / as a delimiter?

 


@Alex_Peterson wrote:

Tom, I do see what you are saying, but I will need to make calculations on the variables that are dates, so they will need to be some numeric date format. For example, my data set could look like:

 

var1        var2              var3             var4          var5

Bill          12/12/2011    01/03/1987   Sean         Matt

Mike        01/01/1991   09/29/2017   Carol        Deborah

 

 

I want my program to go through each of these variables in a macro (with the var&I. in a %DO loop method), and when there is no letter detected in one of the values in that variable, to assign that variable the DDMMYY10 format. 

 

 


 

Astounding
PROC Star

One possible key to resolving this:  what is in the data set A1?  If it already contains new_var5 (defined as character) that would cause a problem.  You may need to get rid of any existing variable names that begin with "new_var":

 

set A1 (drop=new_var: );

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 4500 views
  • 0 likes
  • 6 in conversation