Hello everyone and thank you in advance for your help! This is my first post, so please forgive me if this is confusing or redundant.
Each month, I receive a large dataset with around 4000+ variables and 3000+ observations. The data is from 30 different centers, but I am only interested in the data from 3 centers. Some variables are the same for all centers, but variables are only linked to some of the centers. I would like to get rid of the variables that have no values for the centers that I am interested in. I currently have a code that works, but I have to split my file into 6 lists of variables. I feel like there must be a way to do this in one step without getting the messae "ERROR: The text expression length (65639) exceeds maximum length (65534). The text expression has been truncated to 65534" because I'm SURE people work with even larger datasets than mine!
I have this current setup:
/* To split up my dataset, I am using this code to find the names of the variables so I can create a list. The SAS Macro is able to handle only 800 variable names at a time. Otherwise I receive an error about macro variable size and truncation*/ proc contents data= &month noprint out=_contents_ order=varnum; run; proc sort data=_contents_; by varnum; run; quit; proc print data=_contents_ ; where varnum in (1, 2, 800, 801, 1600, 1601, 2400, 2401, 3200, 3201, 4000, 4386); var name varnum; run; /*I change the LET statement to reflect the part of the list I am at */ %LET listnum0=CallReportNum ReportVersion--Zip_Code_Information___Did_you_a; %LET set1= iteration1; %LET set0= &month; /*this is the main part of the code that gets rid of the variables with no observations */ proc transpose data= &set0(obs=0) out= vname ; var &listnum0; run; proc contents data= vname order= varnum; run; quit; proc sql; select catx(' ','n(',_name_,') as ',_name_) into : list separated by ',' from vname; create table temp as select &list from &set0 ;quit; proc transpose data= temp out= drop ; var _all_; run; proc sql; select _name_ into : drop separated by ' ' from drop where col1= 0; quit; data &set1; set &set0 (drop= &drop); run;
Is there a simple fix or creative workaround to this issue that I am unaware of?
Thanks again!
The limitation of that approach is the requirement to use macro variables. So don't use macro variables. It is probably easier to just generate code a text file instead.
* Get list of variables ;
proc transpose data=&month (obs=0) out=names ;
var _all_;
run;
* Generate code to count non-missing values ;
filename code temp;
data _null_;
file code ;
set names end=eof;
if _n_=1 then put
'proc sql noprint;'
/'create table _counts as select'
/' ' @
; else put ',' @ ;
put 'sum(not missing(' _name_ ')) as ' _name_ ;
if eof then put
'from &month'
/';'
/'quit;'
;
run;
* Run generated code ;
%include code / source2 ;
* Generate DROP statement ;
filename code temp;
data _null_;
set _counts ;
array c _numeric_;
file code lrecl=80 ;
length _name_ $32 ;
put 'drop ' @;
do _n_=1 to dim(c);
if c(_n_)=0 then do ;
_name_ = vname(c(_n_));
put _name_ @ ;
end;
end;
put ';' ;
run;
* Make version of data without empty variables ;
data &set1 ;
set &month ;
%include code / source2;
run;
@sabotaged Welcome to SAS forum. Would be nice to follow if you could state your objective/requirement before your code like
My objective is to ...............
The following code does it............
The problem I am encountering....
Any solutions plz....................
Are the missing variables likely to be character or numeric?
This is actually asked pretty frequently, so a search generates several ready made solutions:
User group paper:
http://support.sas.com/resources/papers/proceedings10/048-2010.pdf
Previous question here:
SAS Note:
The limitation of that approach is the requirement to use macro variables. So don't use macro variables. It is probably easier to just generate code a text file instead.
* Get list of variables ;
proc transpose data=&month (obs=0) out=names ;
var _all_;
run;
* Generate code to count non-missing values ;
filename code temp;
data _null_;
file code ;
set names end=eof;
if _n_=1 then put
'proc sql noprint;'
/'create table _counts as select'
/' ' @
; else put ',' @ ;
put 'sum(not missing(' _name_ ')) as ' _name_ ;
if eof then put
'from &month'
/';'
/'quit;'
;
run;
* Run generated code ;
%include code / source2 ;
* Generate DROP statement ;
filename code temp;
data _null_;
set _counts ;
array c _numeric_;
file code lrecl=80 ;
length _name_ $32 ;
put 'drop ' @;
do _n_=1 to dim(c);
if c(_n_)=0 then do ;
_name_ = vname(c(_n_));
put _name_ @ ;
end;
end;
put ';' ;
run;
* Make version of data without empty variables ;
data &set1 ;
set &month ;
%include code / source2;
run;
Tom, you are a genius! Thank you SO much for your help! This is exactly what I needed. Sending you many, many positive vibes. 🙂
Observation is the word used for one record (or row) in a SAS dataset. So every variable in a dataset has the same number of observations.
Do you mean that you want find out which variables have only missing values?
Do you have any numeric variables? If so does it matter if the variable has some observations with special missing values, like .A or .Z?
Also why do you want eliminate them?
Sounds like missing status has nothing to do with the request. Perhaps I misunderstood your original code.
So you just want to keep variables related to the selected centers?
How do you know which variables are related to which center?
How do you know which centers you want to keep?
It does not need to be real data. Make fake data that has the same structure and issues as the real data. Yes this takes work but you'll get an answer much faster so in the long run you save time.
Do you have a CENTER variable you can use to find the centers you want to keep?
If so you could just add a first step to subset the observations (rows) to just the centers of interest.
data step1 ;
set &month ;
where center in (......);
run;
Then use that dataset as the input to the code I posted and it will eliminate the columns that are all empty for just that subset of the rows.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.