BookmarkSubscribeRSS Feed
Anotherdream
Quartz | Level 8

Hello everyone. Is it possible to set up an array on multi type columns and use a missing function?

what I am trying to do is flag any row in which ANY variables are null between Variables starting in Column D and ending in Column Y (out of 50+ columns). However there is a wide mix of data between these columns (both character and numeric) so I was wondering if this could be done with a fancy trick using an array statement?

I could always write multiple where statments, such as "where missing(VARH)=0 OR missing(VARI)=o.......", but it seems like there should be a much easier way to do this.

Thanks so much!

Brandon

5 REPLIES 5
ballardw
Super User

IF you wanted all variable instead of arrays use the NMISS and CMISS functions with the speciall variable lists _NUMERIC_ and _CHARACTER_;

    flag = nmiss (of _numeric_) or cmiss (of _character_);

but since you need specific variable you would need to put lists of the variable names in place of _numeric_ or character.

Something like:

proc sql;

   select name into : clist separated by ','

   from dictionary.columns

   where libname='WORK' and memname = 'DATASETNAME' and TYPE='char' and

      varnum between 1 and 50;

   select name into : nlist separated by ','

   from dictionary.columns

   where libname='WORK' and memname = 'DATASETNAME' and TYPE='num' and

      varnum between 1 and 50;

quit;

NOTE: the library and dataset names above must be in upper case when compared with the dictionary value. Replace with your data set libarary and dataset name. You will need to determine the minimum and maximum values for VARNUM by counting across your columns.

Then in your data step:

    flag = nmiss (&nlist) or cmiss (&clist);

Flag should have a value of 1 for any row with a missing value, 0 otherwise.

Vince28_Statcan
Quartz | Level 8

You can't quite have an array for both char and num. If your table structure is fairly static, you can use one-dimensional arrays using _character_ and _numeric_ to load all variables into an array at each step and use array to loop. However, to do this, you need to know the index of columnD in the subset of ordered character or numeric variables. If your table structure is fairly static, it is probably simpler to hard code it.

However, otherwise, you can use a if _n_=1 then do; end; block to read into sashelp.vcolumn to figure out how how many total columns are numeric (resp char), what's the relative index of the first columnX that is num (resp char) and index of last one, store those in a 2*2 array _temporary_ (so that you have it availible through each further iteration of the data step without recalculating) and use this array of indexes to control your loops over the _character_ and _numeric_ array.

The painful part is coding the do;end; block to generate the indexes, hence why I suggest that you hard code it if your table structure is static.

Vince

Haikuo
Onyx | Level 15

I suspect this is one of the reasons why SAS came up with Hash().

data_null__
Jade | Level 19

Easy cmiss counts missing values and you can use the name range list.

data cmiss;
   set sashelp.heart;
   cmiss = cmiss(of deathcause--cholesterol);
  
run;
Vince28_Statcan
Quartz | Level 8

Interesting, I didn't know cmiss worked on both character and numeric variables (even though it's actually written in sas documentation).

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1975 views
  • 1 like
  • 5 in conversation