DATA Step, Macro, Functions and more

formats for variables with all missing values

Reply
Regular Contributor
Posts: 196

formats for variables with all missing values

I have a macro that reads in a group of files.

However, in some files all observations for 1 or more variables are missing.

If I try to apply a format (eg MMDDYY10.) to a variable with all missing values I receive an error.

Does anyone know how I can apply a format only when the variable's observations are not completely missing?

Super User
Posts: 10,474

Re: formats for variables with all missing values

Does the error relate to data type such as trying to use a numeric function with a character variable? Missing values are usually not a problem but type mismatches are. In which case you would need to know whether the variable(s) were actually created as numeric or character.

Regular Contributor
Posts: 196

Re: formats for variables with all missing values

For files where the variable is completely missing it's being stored as Char with Format $1.

In files where it's not completely missing it's stored as Num.

For the variable where I'm receiving an error I'm trying to apply MMDDY10.

Thank you.

PROC Star
Posts: 7,357

Re: formats for variables with all missing values

Post your macro.  I would guess that the problem is in your use of the format.

Super User
Posts: 5,076

Re: formats for variables with all missing values

The best solution would be finding a way to make the variable numeric, even when it is always missing.

You could modify your format statement, but you run the risk of formatting variables that you don't want formatted.  For example:

format colE - numeric - colH mmddyy10.;

That will format all numeric variables only, starting with colE and ending with colH.  But there may be other numeric variables in that range that aren't dates.  It all depends on whether this becomes easy or not when specifying ranges of variables.

Good luck.

PROC Star
Posts: 7,357

Re: formats for variables with all missing values

Sounds like a proc import problem.  You cannot apply a mmddyy10. format to a character variable.

Are the ONLY variables in your file that are char and have a format of $1 variables that were imported incorrectly?

If that is the case, it would be easy to identify them and replace them with same named variables with the format you want to apply.

Regular Contributor
Posts: 196

Re: formats for variables with all missing values

The format should be char for the variable when it is completely missing. In some files the variable's observations are completely missing (this isn't an error). What I'm asking is how do I tell SAS not to apply a format (eg MMDDYY10.) for files where the variable's observations are completely missing.

Super Contributor
Posts: 1,636

Re: formats for variables with all missing values

You need to find out which variable has completely missing value first.

Regular Contributor
Posts: 196

Re: formats for variables with all missing values

There are only two variables where this is an issue (ie date_returned and date_left).

How do I tell SAS to not apply MMDDYY10 when either date_returned or date_left is completely missing?

Super User
Posts: 17,771

Re: formats for variables with all missing values

Do you plan to ever combine these files into one?

You won't be able to if you leave one character in one dataset and one numeric in the other.

Its also much easier to correct the code to convert date_returned and date_left (which are dates into numbers) to a number than it is to check if all is missing.

You need to have the dataset created first. Then need to check if all are missing or if the variable is character. If it isn't character then apply the format, if it is don't. But this will have to be macro code because formats aren't conditional statements so you can't put it in a standard dataset.

Super User
Posts: 5,076

Re: formats for variables with all missing values

Reeza raises a good point about whether you need to combine the files.

But if you need to apply this to an individual file, take two steps.  First, remove date_returned and date_left from the FORMAT statement that you have now.  Then add these two statements:

format date_returned - numeric - date_returned mmddyy10.;

format date_left - numeric - date_left mmddyy10.;

That will apply the format when they are numeric, and leave the variables untouched when they are character.

PROC Star
Posts: 7,357

Re: formats for variables with all missing values

Here is another way to approach the problem, using some code stolen from one of Mike Zdeb's papers (see: http://www.nesug.org/Proceedings/nesug11/ds/ds12.pdf )

/*create some test data*/

data class;

  set sashelp.class;

  call missing(weight);

  if ranuni(987) le .2 then

   call missing(sex, age);

run;

proc format;

  value nm

  . = 'MISSING'

  other = 'OK'

  ;

  value $ch

  ' ' = 'MISSING'

  other = 'OK'

  ;

run;

ods output onewayfreqs=tables (keep=table f_: frequency percent);

proc freq data=class;

  tables _all_ / missing;

  format _numeric_ nm.

  _character_ $ch.;

run;

ods output close;

ods listing;

data need;

  set tables;

  table=substr(table,7);

  if strip(coalescec(of f_Smiley Happy) eq "MISSING" ad percent ge 100;

run;

proc sql noprint;

  select catt(table,"=_",table),

         catx(" ","format",table,"mmddyy10.;")

         into :renames separated by " ",

              :formats separated by " "

    from need

  ;

quit;

data want (drop=_Smiley Happy;

  set class (rename=(&renames.));

  &formats.

run;

Ask a Question
Discussion stats
  • 11 replies
  • 361 views
  • 3 likes
  • 6 in conversation