BookmarkSubscribeRSS Feed
Rik
Calcite | Level 5 Rik
Calcite | Level 5
Hi,

I'm looking for a simple solution to retrieve all variables from a data set which are empty for all observations.

Anyone?
6 REPLIES 6
deleted_user
Not applicable
SAS-L provides an excellent source for such challenges. It is accessible either as comp.soft-sys.sas in Google Groups or at http://www.listserv.uga.edu/archives/sas-l.html. These loosely mirror each other.

A quick search in SAS-L archives or Google Group comp.soft-sys.sas for "empty columns" will return a thread from earlier this year.
From discussions on SAS-L the most common method uses SQL to acquire the list of entirely-missing columns.
The following code creates syntax to create a table with like-named columns and just one row, holding the count of rows where that column is non-missing. When the count is zero, the column is entirely missing.
SQL does not support the missing() function because it already has {column} is null .
I have created a test table in work.demo in which some character columns are entirely missing.
In the code that follows the first SQL statement creates the "counting syntax". The second SQL statement then uses that to create the table of summary counts [pre]proc sql noprint ;
select 'sum( not ' !! trim(name) !! ' is null ) as ' !! name
into :counters separated by ', '
from dictionary.columns
where libname='WORK' and memname= 'DEMO'
;
create table emptys as
select &counters
from demo
;
quit; [/pre]
A common reason for creating this information is to build a fresh table excluding the empty columns. The following code achieves that. [pre]proc transpose ; run;
proc sql noprint ;
select _name_ into :wanted separated by ', '
from _last_ where col1 gt 0
;
create table reduced as select &wanted from demo;
quit; [/pre]
Here proc transpose has taken the default options. It reads the latest dataset work.emptys, which has just come out of proc sql. By Default it writes to the next dataset in the work.data{n} sequence. In the proc SQL that follows, the output from proc transpose is referred as _last_.

Although such extended use of defaults may be deprecated by some, I find it helpful in reducing code to the more-relevant issues.

Another $0.02
from

PeterC
Olivier
Pyrite | Level 9
Some shorter code if you have SAS 9 : proc Freq does count how many non-missing distinct value each variable has, so just re-read these countings and build a list of unwanted variables. Then use the list in an ALTER TABLE ... DROP statement.
[pre]
DATA work.test ;
DO i = 1 TO 30 ;
x1 = i*2 ;
x2 = . ;
x3 = " " ;
x4 = PUT(i,Z2.) ;
OUTPUT ;
END ;
RUN ;
ODS OUTPUT nLevels = work.nValues ;
PROC FREQ DATA = work.test NLEVELS ;
TABLES _ALL_ / NOPRINT ;
RUN ;
PROC SQL NOPRINT ;
SELECT tableVar
INTO : dropThoseVars SEPARATED BY ","
FROM work.nValues
WHERE nNonMissLevels = 0
;
ALTER TABLE work.test
DROP &dropThoseVars ;
QUIT ;
[/pre]
Regards.
Olivier
deleted_user
Not applicable
Thank you Oliver

That provides a very much simpler method of obtaining nMiss counts for character ( as well as numeric ) columns.

Regards

PeterC

P.S.
my watch of the message returned the syntax in "fixed pitch" which made reading and understanding the code, much easier. White space and indentation are also maintained on the posting.

We can achieve the same effect on messages in the Forum. Place ["pre"] before code and ["/pre"] after ~~~ but without those quotes.
Like: [pre]ODS OUTPUT nLevels= work.nValues ;
PROC FREQ DATA= work.test NLEVELS ;
TABLES _ALL_ / NOPRINT ;
run;[/pre]
Rik
Calcite | Level 5 Rik
Calcite | Level 5
Thanks Olivier,

This is what I needed.
There is 1 problem left with the solution that you propose and that is the output that’s created by the FREQ procedure. I receive hundreds of pages in my output window

Any suggestion on how to avoid having this output?
Specify the NOPRINT option in the PROC statement will result in terminating the FREQ procedure.

Thanks again for your help.

Message was edited by: Rik Message was edited by: Rik
deleted_user
Not applicable
Put an [pre] ODS Listing close ;[/pre] statement before the PROC FREQ statement.
Rik
Calcite | Level 5 Rik
Calcite | Level 5
Problem solved!

Thanks both for your help!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Health and Life Sciences Learning

 

Need courses to help you with SAS Life Sciences Analytics Framework, SAS Health Cohort Builder, or other topics? Check out the Health and Life Sciences learning path for all of the offerings.

LEARN MORE

Discussion stats
  • 6 replies
  • 1818 views
  • 0 likes
  • 3 in conversation