Hi All.
I am having around 30 datasets and every data set containing more then 20 variables. In 20 variables,6 variables are header variables(I.e Every dataset having this variables)
i want to delete the row if all the variables(other then header variables) contains null values in each dataset.
Could any one please help on this...
Thanks in Advance..
Regards..
Sanjeev.K
hi ... if all the non-header variables are contiguous, this works (consider ID and AGE as two header variables) ...
data x;
input id age a b c (d e f) (:$1.);
datalines;
12 10 . . . . . .
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3 . . .
. . 6 7 8 o o o
. . . . . . . .
;
options missing=' ';
data y;
set x;
if ^missing(catt(of a--f));
run;
options missing='.';
id age a b c d e f
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3
. . 6 7 8 o o o
if the non-header variables are not contiguous, you just have to modify the arguments to the CATT function to include all the non-header variables
hi ... if all the non-header variables are contiguous, this works (consider ID and AGE as two header variables) ...
data x;
input id age a b c (d e f) (:$1.);
datalines;
12 10 . . . . . .
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3 . . .
. . 6 7 8 o o o
. . . . . . . .
;
options missing=' ';
data y;
set x;
if ^missing(catt(of a--f));
run;
options missing='.';
id age a b c d e f
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3
. . 6 7 8 o o o
if the non-header variables are not contiguous, you just have to modify the arguments to the CATT function to include all the non-header variables
Hi...
My non-header variables containing both numeric and char variables..
Is this logic works on my datasets???
Thanks..
Sanjeev.K
Hi .. in my example, the non-header variables are a mix of numeric (A B C) and character (D E F). Yes, it works with that combination.
another way:
data x;
input id age a b c (d e f) (:$1.);
datalines;
12 10 . . . . . .
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3 . . .
. . 6 7 8 o o o
. . . . . . . .
;
data y;
set x;
if cmiss(of a--f)<6;
proc print;run;
Not Sure if this is the right way.
Because In my datasets non-header variables are not contiguous, and the count of non header variables are vary from one dataset to another.
So for each dataset i have to count non header variables and have to put that number in the condition.It may create additional work...
Thanks
Sanjeev.K
Hi,
Please check out my generalized approach. It does not have the restriction on variable position.
All you need to do:
1. know your common head variable names.
2. put all and only your old data sets under a certain library.
Haikuo
You can try the modified code:
data x;
input id age a b c (d e f) (:$1.);
datalines;
12 10 . . . . . .
13 . 1 2 3 q w e
14 24 . . . c v b
15 30 1 2 3 . . .
. . 6 7 8 o o o
. . . . . . . .
;
%let dsn=x; /* your dataset */
%let header=id age; /* your header variables */
proc sql noprint;
select CATS(max(varnum)) into :max from dictionary.columns where libname='WORK' AND MEMNAME="%upcase(&dsn)";
select CATS(NAME) INTO :VNAME SEPARATED BY ' ' FROM DICTIONARY.COLUMNS where libname='WORK' AND MEMNAME="%upcase(&dsn)";
quit;
data new_&dsn.;
set &dsn;
if cmiss(of &vname)-cmiss(of &header)<&max-%sysfunc(countw(&header));
run;
proc print;run;
Hi,
You have got great suggestions on single data set, while it would still be a chore to do 30 data set if they have different variable structures. Macro has been introduced below to help you on generalization of your problem.
1. Suppose your head variables are known, and in this example they are 'id' and 'age'.
2. You will output your cleaned data set to another library, in this case 'out'
3. Your original data sets are located in library 'work'.
4. The number of your header variable is 2 , in this case. So do replace it with 6, which is supposed to be your real case?.
libname out 'c:\temp\';
proc sql noprint;
select distinct cats(memname) into :dname separated by ' ' from dictionary.columns where libname='WORK';
quit;
%macro del_row;
%do i=1 %to %sysfunc(countw(&dname));
%let _dname=%scan(&dname,&i);
proc sql noprint;
select CATS(max(varnum)) into :max from dictionary.columns where libname='WORK' AND MEMNAME="&_DNAME";
SELECT DISTINCT CATS(NAME) INTO :VNAME SEPARATED BY ',' FROM DICTIONARY.COLUMNS where libname='WORK' AND MEMNAME="&_DNAME";
QUIT;
data out.&_dname._new;
set &_dname;
if cmiss(&vname)-cmiss(id,age)<&max-2;
run;
%end;
%mend;
%del_row
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.