- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am working with a very big dataset (more than 10k rows) and am trying to find all the columns that have no data in them. I ran a proc freq but is there a way to put all those columns that are completely empty into a list or a table?
Thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Have you tried any of the solutions found on here? There are several answered questions relating to this:
https://communities.sas.com/t5/SAS-Programming/deleting-empty-columns/m-p/308045
https://communities.sas.com/t5/SAS-IML-Software-and-Matrix/Delete-quot-zero-columns-quot/m-p/164261
If none of those work, please indicate how your situation is different so we can help you find a customized solution.
@marleeakerson wrote:
Hello,
I am working with a very big dataset (more than 10k rows) and am trying to find all the columns that have no data in them. I ran a proc freq but is there a way to put all those columns that are completely empty into a list or a table?
Thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Then just don't drop them.
You'll get a list of your variables this way for example, slightly modified from first link:
ods select none;
ods output nlevels=temp;
proc freq data=your_data nlevels;
tables _all_;
run;
proc sql;
select tablevar into : drop separated by ','
from temp
where NNonMissLevels=0;
quit;
%put &drop;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You might try the option NLEVELS on proc freq with the ODS Select to restrict the output to that table only.
Quick example:
data example; input x y z q; datalines; 1 . 3 . 2 2 2 . 3 . 3 . ; ods select nlevels; proc freq data=example nlevels; run;
The output only shows the "levels" information, which is number of different values.
The key for your part is in the result for Q that shows Nonmissing levels of 0. That is what appears if a variable only has missing values.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Ha. Check paper of mine and @Kurt_Bremser
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2020/4737-2020.pdf