I wanted to drop a list of variables from my inputds. This list itself is present as observations in another dataset. After doing some googling, I found this excellent paper on the topic.
http://www2.sas.com/proceedings/sugi30/028-30.pdf
So i used the following code to make a list in a macro variable:
/*make a list of variables as a macro variable */
data _null_;
length allvars $1000;
retain allvars ' ';
set to_drop end=eof;
allvars = trim(left(allvars))||' '||left(_name_);
if eof then call symput('varlist', allvars);
run;
I am facing three problems now:
1) When I %PUT &VARLIST , the log displays only 31 of the variables whereas my list is actually 2000+ variables.
2) I don't clearly understand what the statement: trim(left(allvars)) || ' ' || left(_name_); is doing. I know trim removes leading spaces and left is to align left character strings but cannot understand the full statement.
3) Then I try to drop it from my inputds using the following code, I get a warning message and the drop doesn't happen:
data inputds2 (drop = &varlist);
set inputds;
run;
WARNING: The variable avg_weighted in the DROP, KEEP, or RENAME list has never been referenced.
NOTE: There were 43662 observations read from the data set WORK.INPUTDS.
NOTE: The data set WORK.INPUTDS2 has 43662 observations and 3465 variables.
In reality my variable name reads like : avg_weighted_minutes_view_3739 avg_weighted_minutes_view_7963 avg_weighted_minutes_view_(XXXX) The last 4 digits are random. The are SAS generated names since my labels contain spaces.
EDIT: Tried using another code which is working partially - it makes a bigger list- around 1000 of the 2000+ variables in the &VARLIST macro variable.
data _null_;
set to_drop;
call symput('varlist',trim(
resolve('&varlist')
)||' '||trim(_name_));
run;
%put &varlist;
Total beginner at SAS .
Thanks,
nikhil
Message was edited by: Nikhil Goyal
Nikhil,
The problems you are noticing come about because $1000 is not long enough to hold 2000 variable names. While you could increase it, there are easier ways to get the full set of 2000 names into a macro variable:
proc sql noprint;
select strip(_name_) into : varlist separated by ' ' from to_drop;
quit;
Finally, move the DROP to the SET statement to speed up processing:
data inputds2;
set inputds (drop=&varlist);
run;
Good luck.
Nikhil,
The problems you are noticing come about because $1000 is not long enough to hold 2000 variable names. While you could increase it, there are easier ways to get the full set of 2000 names into a macro variable:
proc sql noprint;
select strip(_name_) into : varlist separated by ' ' from to_drop;
quit;
Finally, move the DROP to the SET statement to speed up processing:
data inputds2;
set inputds (drop=&varlist);
run;
Good luck.
Hi Astounding,
The proc sql creates the list very fast. However when I drop the &varlist, I get an error message:
: The variable _TYPE_ in the DROP, KEEP, or RENAME list has never been referenced.
289
290 run;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.INPUTDS2 may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.INPUTDS2 was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 1.03 seconds
cpu time 0.87 seconds
there is an observation called _TYPE_ in the to_drop list. I think that is causing the problem. Also I tried to increase the legth of allvars but i think there are limits - the 2000 variables have a length which exceeds the limit hence my &varlist is getting truncated !
data _null_;
length allvars $100000;
Sounds like you have a variable in the to_drop dataset not in the data set you want to modify. I would guess that you have some sort of summary data set generated by a procedure that adds _type_ to identify the combination of variables represented by the record.
proc sql noprint;
select strip(_name_) into : varlist separated by ' '
from to_drop
where _name_ ne '_TYPE_';
quit;
I agree with ballardw's diagnosis. It might be safer to try:
where upcase(_name_) ne '_TYPE_'
Also note, SAS places a limit on the length of character variables. 100,000 exceeds that ... I believe the limit is roughly 64,000. Macro variables have a similar size restriction as well. So if you had 5,000 variable names to drop, even a single macro variable might not be able to hold them all.
Looks like you're most of the way there.
You are spot on ! that worked and fixed it . I am thinking the best practice is to first create a new to_drop2 (matched list) from to_drop and inputds containing only common varnames. something along the line of :
proc sql;
create table to_drop2 as
select distinct a._name_
from to_drop as a
inner join dictionary.columns as b
on a._name_ = b.name
where b.memname = 'INPUTDS';
quit;
but it's returning a 0 obs dataset . . . if it works then I wont have to bother with the manual entering the "where _name_ . . . . "
Hi,
Memname is always stored in upper case, so if you used:
where b.memname = 'INPUTDS';
it should work.
Character values in SAS are case-sensitive, so when joining two datasets, you have to remember that and consider how that might affect your join.
"Note that many character values in the DICTIONARY tables are stored as all-uppercase characters; you should design your queries accordingly." That is from http://support.sas.com/documentation/cdl/en/lrcon/62955/HTML/default/viewer.htm#a002300185.htm
So I would suggest that you use the UpCase function as below:
on UpCase(a._name_) = b.name
Seems to be a growing number of these type of questions regarding using macro variables to hold large lists of items. IMO its really not a good idea. There are limits on the lengths of the them to start with, and the spaghetti %&&&&%& nonsense needed to handle them just gets out of hand. I would recommend you look at the metadata for your dataset and work from there:
/* I assue your dataset with variables to drop is have and is in work, and you have a dataset called drop_list which has one column called name */
data _null_;
set drop_list end=last;
if _n_=1 then call execute('data have; set have (drop='||strip(name));
else call execute(' '||strip(name));
if last then call execute('); run;');
run;
This will create a dataset code which will have a drop statement for each row in drop_list.
Furthermore, SAS itself keeps plenty of metadata on the datasets in SASHELP.VCOLUMN/VTABLE. From there you can query columns, types, obs etc.
Repeatedly getting error message:
155-205: Line generated by the CALL EXECUTE routine.
2076 + .
-
23
ERROR 23-7: Invalid value for the DROP option.
The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.have may be incomplete. When this step was stopped there were 0
observations and 0 variables.
WARNING: Data set WORK.have was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 3.66 seconds
cpu time 3.29 seconds
Since you are going for a "best practice", I would use the approach suggested by .
The following uses that approach, combined with your latest code:
data to_drop;
informat _name_ $32.;
input _name_;
cards;
_type_
name
age
height
;
proc sql;
create table to_drop2 as
select distinct a._name_
from to_drop as a
inner join dictionary.columns as b
on upcase(a._name_) = upcase(b.name)
where b.libname='SASHELP' and
b.memname = 'CLASS'
;
quit;
data _null_;
set to_drop2 end=last;
if _n_=1 then call execute('data want; set sashelp.class (drop=');
call execute(' '||strip(_name_));
if last then call execute('); run;');
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.