I am trying to dynamically build an output file depending on if data is present in the input file.
For example I have the following input file:
Name DOB Add1 Add2 Addr3
1 John 10/10/80 data data
2 Alan 11/11/81 data
3 Paul 10/10/79 data
In the ouput file I would not want Addr3 to be included as there is no data, obviously this is easy to see in a small file but I have a file with 1000's of records how can I write a progam to check each varaiable and only include variables that have data against at least 1 observation in the new output file.
Thanks
you can run the code below to drop the variables with all missing values:
data one;
infile datalines missover;
input name $ name1 name2 name3 name4 $;
datalines;
A . . . .
B . . . .
C 1 . . .
D 2 . . .
;
options missing= ' ';
proc transpose data=one out=two;
var _all_;
run;
options missing='.';
data _null_;
length drop $500;
if 0 then set one nobs=nvar;
do until (done);
set two end=done;
if cmiss(of col:) eq nvar then drop=catx(' ',drop,_name_);
end;
call symputx('drop',drop);
run;
data want;
set one (drop=&drop);
run;
proc print data=want;run;
Linlin
sorry little confused, do you have to transpose the data? an ideal ouput would be
Name Name 1
A .
B .
C 1
D 2
Yes. the output from my code is
Obs name name1
1 A .
2 B .
3 C 1
4 D 2
The purpose of the code is to drop the variables that have no value. what SAS version do you use? I think 'cmiss' is only available for sas 9.2 and 9.3.
I am using WPS Workbench at the moment and unfortunately it does not seem to support all the latest functions
There must be better approaches, but it is already 10pm and I could barely open my eye lips. So here is my 'brute force' solution:
data have;
infile cards missover;
input Name $ DOB :$10. (Add1 Add2 Addr3) (:$);
cards;
John 10/10/80 data data
Alan 11/11/81 data
Paul 10/10/79 data
;
data _null_;
length vdrop $300.;
set have nobs=nobs end=done;
array v (5) $ name--addr3;
array m (5) m1-m5;
do i=1 to 5;
if missing(v(i)) then m(i)+1;
end;
if (done) then do i=1 to 5;
if m(i)=nobs then do;
vdrop=catx(' ',vdrop,vname(v(i)));
end;
end;
call symput ('vdrop', vdrop);
call execute ('data want; set have (drop=&vdrop);run;');
run;
Regards,
Haikuo
Edit: You should go for Linlin's solution, since you only have 1k rows, unless you have more than 32,767 rows.
Hi Haikuo
Will try your solution now, having just checked my files I found some of them have over 200000 rows so your solution might be best.
Thanks, Robbie
Unfortunately and I can only assume that this is because I am using WPS workbench that output still has blank observations in.
Obs Name DOB Add1 Add2
1
2 John 10/10/80 data data
3
4 Alan 11/11/81 data
5
6 Paul 10/10/79 data
7
I have coded it before .
data have; infile cards missover; input Name $ DOB :$10. (Add1 Add2 Addr3) (:$); cards; John 10/10/80 data data Alan 11/11/81 data Paul 10/10/79 data ; run; proc sql noprint; select catx(' ','nmiss(',name,') as',name) into : list separated by ',' from dictionary.columns where libname='WORK' and memname='HAVE'; select count(*) into : nobs from have; create table temp as select &list from have; quit; data _null_; set temp; length _list $ 4000; array _x{*} _numeric_; do i=1 to dim(_x); if _x{i} eq &nobs then _list=catx(' ',_list,vname(_x{i})); end; call symputx('drop',_list); run; data want; set have(drop=&drop); run;
Ksharp
Hooray! I knew that we deserve to see something of this quality.
Two things learned:
1. nmiss() can be applied on character variables.
2. nmiss() can be used as 'summary functions' in SQL.
One question: Please explain the construct of nmiss(',name,'). I know this is the only way that works, but how and why?
Thank you, Ksharp!
Haikuo
HaiKuo,
I just make a series of string like:
nmiss(sex) as sex ,nmiss(name) as name,nmiss(weight) as weight ..........
and feed them to SELECT statement in SQL to get the number of missing value for all the variables.
Ksharp
Ksharp,
Thanks for the answer. I was just wondering why it has to be like: nmiss(',name,') instead of nmiss(name) or nmiss(,name,) or nmiss(,'name',) etc. I know this is the only way that works, but I can't figure it out how this construct be translated to nmiss(sex) as sex ,nmiss(name) as name,nmiss(weight) as weight ....
Any reference link will also be appreciated.
Haikuo
Is this helpful?
data have;
infile cards missover;
input Name $ DOB :$10. (Add1 Add2 Addr3) (:$);
cards;
John 10/10/80 data data
Alan 11/11/81 data
Paul 10/10/79 data
;
run;
proc sql noprint;
select catx(' ','nmiss(',name,') as',cat('n_missing_',name)) into : list separated by ','
from dictionary.columns
where libname='WORK' and memname='HAVE';
quit;
%put &list;
105 %put &list;
nmiss( Name ) as n_missing_Name,nmiss( DOB ) as n_missing_DOB,nmiss( Add1 ) as
n_missing_Add1,nmiss( Add2 ) as n_missing_Add2,nmiss( Addr3 ) as n_missing_Addr3
Thanks, LinLin. I know my brain stops working on Friday , so bear with me:
I am aware of the final outcome of &list, I am just could not figure out why nmiss(name) not working, while you have to use nmiss(',name,')?
Thanks,
Haikuo
Hi Haikuo,
catx(' ','nmiss(',name,')as ' ) will create a string "nmiss(variable-name)".
nmiss(variable-name) will create a number.
maybe you treat "'nmiss(',name,') as' "
"'nmiss(',name,') as'"
It should be
"'nmiss(',name,') as'"
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.