BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Birgithj
Fluorite | Level 6

Hi, 

 

I am hoping you can help me. I have a large set of variables, where some of the variables have missing values (small example shown below). I only want to keep variables where there are no missing values, e.g. I need to delete variable B below. 

How can I delete all variables with missing values?

 

Var AVar BVar C
123
123
123
123
123
1missing3
1missing3

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

One way

 

data have;
input VarA VarB VarC;
datalines;
1 2 3
1 2 3
1 2 3
1 2 3
1 2 3
1 . 3
1 . 3
;

proc transpose data=have out=temp;
    var _numeric_;
run;

data temp2;
    set temp;
    if nmiss(of col:)=0;
run;

proc transpose data=temp2 out=want(drop=_:);
    id _name_;
    var col:;
run;

View solution in original post

8 REPLIES 8
PeterClemmensen
Tourmaline | Level 20

One way

 

data have;
input VarA VarB VarC;
datalines;
1 2 3
1 2 3
1 2 3
1 2 3
1 2 3
1 . 3
1 . 3
;

proc transpose data=have out=temp;
    var _numeric_;
run;

data temp2;
    set temp;
    if nmiss(of col:)=0;
run;

proc transpose data=temp2 out=want(drop=_:);
    id _name_;
    var col:;
run;
Birgithj
Fluorite | Level 6

Thank you so much! This solved my problem 🙂

FreelanceReinh
Jade | Level 19

Hi @Birgithj,

 

Alternatively, you can adapt a technique that has been suggested by @yabwon for a similar question yesterday, e.g., like this:

ods output nlevels=_nlev(keep=tablevar nmisslevels);
proc freq data=have nlevels;
tables _all_ / noprint;
run;

data _null_;
call execute('data want; set have; drop _n_');
do until(last);
  set _nlev end=last;
  if nmisslevels then call execute(tablevar);
end;
call execute('; run;');
stop;
run;

(This works for numeric and character variables.)

s_lassen
Meteorite | Level 14

Here is a way to do it:

data have;
  input Var_A	Var_B	Var_C Var_D;
cards;
1	2	3  1
1	2	3  .
1	2	3  .
1	2	3  .
1	2	3  .
1	.	3  .
1	.	3  .
run;

%let inds=WORK.HAVE;
data _NULL_;
  length missvars $2000;
  dsid=open("&inds");
  do i=1 to attrn(dsid,'nvars');
    var=varname(dsid,i);
	dsid2=open(catx(' ',"&inds(where=(",var,'is null))'));
	if fetch(dsid2)=0 then
	  call catx(' ',missvars,var);
	dsid2=close(dsid2);
	end;
  call symputx('missvars',missvars);
run;
%put &missvars;

Data want;
  set have(drop=&missvars);
run;
hashman
Ammonite | Level 13

@Birgithj:

For a task as simple as this, I see no reason to rewrite the original data set even once, let alone twice. Just auto-define a view VHAVE with the unneeded variables dropped and refer to it instead of HAVE in the processing downstream:

data have ;                                                                            
  input VarA VarB VarC VarD VarE VarF ;                                                
  cards ;                                                                              
1 2 3 1 2 3                                                                            
1 2 3 1 2 3                                                                            
1 2 3 1 2 .                                                                            
1 2 3 . 2 .                                                                            
1 2 3 1 2 3                                                                            
1 . 3 1 . 3                                                                            
1 . 3 1 . 3                                                                            
;                                                                                      
run ;                                                                                  
                                                                                       
data _null_ ;                                                                          
  do until (z) ;                                                                       
    set have end = z ;                                                                 
    array v var: ;                                                                     
    length _d $ 32767 ;                                                                
    do over v ;                                                                        
      if missing (v) and findw (_d, vname(v)) = 0 then _d = catx (" ", _d, vname (v)) ;
    end ;                                                                              
  end ;                                                                                
  call execute (cats ("data vhave/view=vhave; set have(drop=", _d, "); run;")) ;       
run ;                                                                                  

Now if you read VHAVE instead of HAVE, you'll have what you need. Note that the entire resource consumption with this method amounts to a single read through HAVE.

 

Kind regards

Paul D. 

Ksharp
Super User

Do not use PROC TRANSPOSE if you have a big table. ( EDITED )

data have;
input VarA VarB VarC;
datalines;
1 2 3
1 2 3
1 2 3
1 2 3
1 2 3
1 . 3
1 . 3
;
ods select none;
ods output nlevels=temp;
proc freq data=have nlevels ;
table _all_;
run;
ods select all;

proc sql;
select tablevar into : drop separated by ' '
 from temp
  where nmisslevels>0;
quit;

data want;
 set have;
 drop &drop;
run;
proc print;run;

 

hashman
Ammonite | Level 13

@Ksharp:

 "Do not use PROC TRANSPOSE if you have a big table."

 

Agree; TRANSPOSE is quite memory-hungry. Also, if better performance is desired, MEANS does much better than FREQ with NLEVELS (and that without the burden of using ODS), particularly if the VAR variables are numerous; e,g.:

proc means noprint data = have ;                                               
  var var: ;                                                                   
  output out = temp (drop = _:) nmiss=;                                        
run ;                                                                          
                                                                               
data _null_ ;                                                                  
  set temp ;                                                                   
  array nn _numeric_ ;                                                         
  call symputx ("drop", "") ;                                                  
  do over nn ;                                                                 
    if nn then call symputx ("drop", catx (" ", symget ("drop"), vname (nn))) ;
  end ;                                                                        
run ;                                                                          

Kind regards

Paul D.

 

p.s. As a side note, in your SQL, TableVar should be used instead of NLevels, or else the DATA step consuming the DROP macro variable won't produce the desired result (and bomb to boot on the attempt to use an invalid SAS name in the DROP list).  

 

 

As a side note: Your last step will bomb on the DROP statement because in the preceding SQL you select NLevels, while it should be TableVar instead.

Ksharp
Super User

Hi @hashman ,

Thanks point out that error. it has been fixed.

 

As you said PROC MEANS can't handle character variable.

Another Alternative way is using PROC SQL, but need some more code (I did't post it ), I believe SQL would have faster speed .

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 4304 views
  • 3 likes
  • 6 in conversation