Hi,
I used PROC IMPORT in a macro loop and get datasets auto1 and auto2:
PROC IMPORT OUT=WORK.auto&i DATAFILE="C:\auto\&&price&i..xlsx"
     DBMS=xlsx REPLACE;
     SHEET="auto";
     GETNAMES=YES;
RUN;
I get the two datasets shown below. How can I delete the columns that are empty, like C, D, E, F, G, and H in auto1, and column D in auto2?
Auto1
Zone   | pricing | C | D | E | F | G | H
876231 | 1428    |   |   |   |   |   |
650123 | 1556    |   |   |   |   |   |
754258 | 1235    |   |   |   |   |   |
Auto2
Safety | measures | dimension | D
floor  | 1        | 2         |
lock   | 0.95     | 3         |
alarm  | 0.85     | 5         |
alarm  | 0.98     | 7         |
Welcome to the wonderful world of Excel and poor data results.
You can drop the variables with a DROP= data set option wherever you use the data. Or, if you really want a new data set:
data work.want;
set work.auto1 (drop=C D E F G H);
run;
Hi,
I am running this in a loop. If I drop column C, then the loop will drop C from auto2 as well, but I want to keep that column in auto2.
I need to write the code in such a way that it drops the empty columns automatically from each dataset while creating the output,
so that the same code drops C, D, E, F, G, H for auto1
and drops only D for auto2.
Can anyone please help?
Thanks
If one really wants to automate a process, then using Excel as a data source and PROC IMPORT to read the data are two suboptimal choices.
Excel in the form of XLSX or XLS files has no enforced structure, and manipulation of the files can create "phantom" variables and rows of data, as you are experiencing.
PROC IMPORT has to guess the data types and characteristics every single time a file is read. With the default settings for PROC IMPORT and Excel spreadsheets, you can get different data types just by changing the sort order of the data before the import.
If these files are supposed to contain the same data, it may be worth the effort to convert the XLSX files to CSV and then write a custom program (or modify the data step program PROC IMPORT creates for one file) to read them in a consistent manner.
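As a rough sketch of that approach (the file name, variable names, and informats here are hypothetical — adjust them to your actual layout), a data step reading the converted CSV might look like:

```sas
/* Read auto1.csv with an explicit column definition.        */
/* Only the variables named in the INPUT statement are read, */
/* so any "phantom" Excel columns simply never appear.       */
data work.auto1;
    infile "C:\auto\auto1.csv" dsd firstobs=2 truncover;
    input Zone :8. pricing :8.;
run;
```

Because the data step, not the file, defines the variables, the same program produces identical types and lengths for every file it reads.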
The simplest way is PROC FREQ with the NLEVELS option; if you want more flexibility, try SQL.
data have;
    set sashelp.class;
    call missing(sex,age);
    if _n_=3 then call missing(weight);
run;

ods select none;
ods output nlevels=temp;
proc freq data=have nlevels;
    tables _all_;
run;

proc sql;
    select tablevar into :drop separated by ','
        from temp where NNonMissLevels=0;
    alter table have drop &drop;
quit;
I have tried the proc freq approach yet encountered the following error
ERROR: Not enough memory for all variables.
context: We have to use a standardized template table which is large, 138 columns, and I want to remove unused columns for a specific export.
is there another approach to this?
@KoVa wrote:
I have tried the proc freq approach yet encountered the following error
ERROR: Not enough memory for all variables.
How many rows in the data set? What is the PROC FREQ code you are using?
Just a quick test table with about 8,000-9,000 rows, but some columns contain unique IDs, so I think that's why the PROC FREQ runs out of memory.
I used the exact code from above, only changed it to my table.
Update: I found this macro, also based on PROC FREQ but a bit different, and it worked fine.
%macro findmiss(ds,macvar);
    /* ds is the data set to parse for missing values */
    /* macvar is the macro variable that will store the list of empty columns */
    %local noteopt;
    %let noteopt=%sysfunc(getoption(notes));
    option nonotes;
    %global &macvar;
    proc format;
        value nmis .-.z=' ' other='1';
        value $nmis ' '=' ' other='1';
    run;
    ods listing close;
    ods output OneWayFreqs=OneValue(where=(frequency=cumfrequency AND CumPercent=100));
    proc freq data=&ds;
        table _All_ / Missing;
        format _numeric_ nmis. _character_ $nmis.;
    run;
    ods listing;
    data missing(keep=var);
        length var $32.;
        set OneValue end=eof;
        if percent eq 100 AND sum(of F_:) < 1;
        var=scan(Table,-1,' ');
    run;
    proc sql noprint;
        select var into :&macvar separated by " " from missing;
    quit;
    option &noteopt.;
%mend;

%findmiss(mydata,droplist); /* generate the list of empty columns */

data new;
    set mydata(drop=&droplist);
run;
I think you must have too many levels/values in some variable.
You could try the PROC SQL way; that should be faster.
data have;
set sashelp.class;
call missing(sex,age);
if _n_=3 then call missing(weight);
run;
proc transpose data=have(obs=0) out=vnames;
var _all_;
run;
proc sql noprint;
select cat('n(',_NAME_,') as ',_NAME_) into :vnames separated by ',' from vnames;
create table temp as
select &vnames. from have;
quit;
proc transpose data=temp out=temp2;
var _all_;
run;
proc sql noprint;
select _NAME_ into :drops separated by ',' from temp2 where col1=0;
alter table have
drop &drops.;
quit;
If you use a textual file format and a data step for the transfer, the data step specifies the columns. Any Excel-typical "extras" are automatically dropped.
Proc Import is nice for tests, but should never be used in production-stage programs.