Hello everyone. Is it possible to set up an array on multi type columns and use a missing function?
what I am trying to do is flag any row in which ANY variables are null between Variables starting in Column D and ending in Column Y (out of 50+ columns). However there is a wide mix of data between these columns (both character and numeric) so I was wondering if this could be done with a fancy trick using an array statement?
I could always write multiple where statments, such as "where missing(VARH)=0 OR missing(VARI)=o.......", but it seems like there should be a much easier way to do this.
Thanks so much!
Brandon
IF you wanted all variable instead of arrays use the NMISS and CMISS functions with the speciall variable lists _NUMERIC_ and _CHARACTER_;
flag = nmiss (of _numeric_) or cmiss (of _character_);
but since you need specific variable you would need to put lists of the variable names in place of _numeric_ or character.
Something like:
proc sql;
select name into : clist separated by ','
from dictionary.columns
where libname='WORK' and memname = 'DATASETNAME' and TYPE='char' and
varnum between 1 and 50;
select name into : nlist separated by ','
from dictionary.columns
where libname='WORK' and memname = 'DATASETNAME' and TYPE='num' and
varnum between 1 and 50;
quit;
NOTE: the library and dataset names above must be in upper case when compared with the dictionary value. Replace with your data set libarary and dataset name. You will need to determine the minimum and maximum values for VARNUM by counting across your columns.
Then in your data step:
flag = nmiss (&nlist) or cmiss (&clist);
Flag should have a value of 1 for any row with a missing value, 0 otherwise.
You can't quite have an array for both char and num. If your table structure is fairly static, you can use one-dimensional arrays using _character_ and _numeric_ to load all variables into an array at each step and use array to loop. However, to do this, you need to know the index of columnD in the subset of ordered character or numeric variables. If your table structure is fairly static, it is probably simpler to hard code it.
However, otherwise, you can use a if _n_=1 then do; end; block to read into sashelp.vcolumn to figure out how how many total columns are numeric (resp char), what's the relative index of the first columnX that is num (resp char) and index of last one, store those in a 2*2 array _temporary_ (so that you have it availible through each further iteration of the data step without recalculating) and use this array of indexes to control your loops over the _character_ and _numeric_ array.
The painful part is coding the do;end; block to generate the indexes, hence why I suggest that you hard code it if your table structure is static.
Vince
I suspect this is one of the reasons why SAS came up with Hash().
Easy cmiss counts missing values and you can use the name range list.
Interesting, I didn't know cmiss worked on both character and numeric variables (even though it's actually written in sas documentation).
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.