I've got this table.
I want to know which variables has only missing values after a specific date (like September 2013 or any other date I will put) e.g var910001_t0 or var91059_t0 will be associated with September 2013. I'm showing you here, the variables that have only missing values after September 2013 but it can be June 2014.
I thought about doing a proc sql but it does not seem to do what I want. I thought about doing an array but I'm not that familiar with Array yet.
Any help will be welcome as I've never done this before.
Sorry, yes I should have replaced that:
options noquotelenmax;
data _null_;
set WORK.LOOP;
call execute('proc sql;
insert into WORK.RESULTS
set VARIABLE="'||strip(name)||'",
FIRST_DATE=(select MIN(BASE.NOUV_DATE) format=date9.
from WORK.HAVE BASE
where BASE.'||strip(name)||' is null
and not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));
quit;');
run;
options quotelenmax;
Eew, horrible data. The below will create a results table which, for each variable mentioned in the vname arrays, create one row with variable name and the date where missings started and didn't stop.
data have;
attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;
proc sql;
create table WORK.RESULTS
(
VARIABLE char(20),
FIRST_DATE num format=date9.
);
quit;
options noquotelenmax;
data _null_;
array vnames{2} $200. ("VAR91001_T0","VAR91059_T0");
do i=1 to 2;
call execute('proc sql;
insert into WORK.RESULTS
set VARIABLE="'||strip(vnames{i})||'",
FIRST_DATE=(select MIN(BASE.NOUV_DATE) format=date9.
from WORK.HAVE BASE
where BASE.VAR91001_t0 is null
and not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
quit;');
end;
run;
options quotelenmax;
Hi RW9,
Don't let me start on that. Thinking about drinking in the morning. Just kidding.
I have a try.
RW9,
Definitely do what I want but I need to have all the variables (around 5000). Is there a way to have them all?
Cheers.
Whoah, it gets worse, 5000 variables! Anyways, just create a dataset with the list of variables (from sashelp.vcolumn) and then use that in your data _null_ step as the loop instead of the do + array:
data have;
attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;
proc sql;
create table WORK.RESULTS
(
VARIABLE char(20),
FIRST_DATE num format=date9.
);
create table WORK.LOOP as
select NAME
from SASHELP.VCOLUMN
where LIBNAME="WORK"
and MEMNAME="HAVE"
and NAME ne "NOUV_DATE";
quit;
options noquotelenmax;
data _null_;
set WORK.LOOP;
call execute('proc sql;
insert into WORK.RESULTS
set VARIABLE="'||strip(name)||'",
FIRST_DATE=(select MIN(BASE.NOUV_DATE) format=date9.
from WORK.HAVE BASE
where BASE.VAR91001_t0 is null
and not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
quit;');
run;
options quotelenmax;
One more question RW9.
In your query
where BASE.VAR91001_t0 is null
and not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
How do you use specific variable name instead of VAR91001_t0 ?
Sorry, yes I should have replaced that:
options noquotelenmax;
data _null_;
set WORK.LOOP;
call execute('proc sql;
insert into WORK.RESULTS
set VARIABLE="'||strip(name)||'",
FIRST_DATE=(select MIN(BASE.NOUV_DATE) format=date9.
from WORK.HAVE BASE
where BASE.'||strip(name)||' is null
and not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));
quit;');
run;
options quotelenmax;
What is wrong with a datastep and using then NMISS function?
SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (nmiss)
I believe that function operates across variables, this problem is across rows per variable. Also there are > 5000 variables so couldn't dynamically put them into a macro variable. Plus the possibility that just one row may be missing then the next not, so only min of last block of missing.
Thanks RW9.
It is working fine. This is the first time I'm seeing something like what you did. Definitely useful.
data have; attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.; nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output; nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output; nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output; nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output; nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output; nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output; nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output; run; data temp(keep=var:); set have(where=(nouv_date ge '01may13'd)); call symputx('n',_n_); run; %put &n; data _null_; set sashelp.vcolumn(where=(libname='WORK' and memname='TEMP' )) end=last; if _n_ eq 1 then call execute('proc sql;create table x as select '); call execute('nmiss('||strip(name)||') as '||name ); if not last then call execute(','); else call execute('from temp;quit;'); run; proc transpose data=x out=want(where=(col1 = &n )); run;
Xia Keshan
Thanks Xia.
Runnin RW9 sql at the moment. Will run yours when everything will be completed.
Cheers.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.