Help using Base SAS procedures

knowing when there are only missing values

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 75
Accepted Solution

knowing when there are only missing values

I've got this table.

Capture1.PNG

I want to know which variables has only missing values after a specific date (like September 2013 or any other date I will put) e.g var910001_t0 or var91059_t0 will be associated with September 2013. I'm showing you here, the variables that have only missing values after September 2013 but it can be June 2014.

I thought about doing a proc sql but it does not seem to do what I want. I thought about doing an array but I'm not that familiar with Array yet.

Any help will be welcome as I've never done this before.


Accepted Solutions
Solution
‎07-11-2014 08:17 AM
Super User
Super User
Posts: 7,405

Re: knowing when there are only missing values

Sorry, yes I should have replaced that:

options noquotelenmax;

data _null_;

  set WORK.LOOP;

  call execute('proc sql;

                  insert into WORK.RESULTS

                  set VARIABLE="'||strip(name)||'",

                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.

                                  from    WORK.HAVE BASE

                                  where   BASE.'||strip(name)||' is null

                                    and   not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));

                quit;');

run;

options quotelenmax;

View solution in original post


All Replies
Super User
Super User
Posts: 7,405

Re: knowing when there are only missing values

Eew, horrible data.  The below will create a results table which, for each variable mentioned in the vname arrays, create one row with variable name and the date where missings started and didn't stop.

data have; 
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;

proc sql;
  create table WORK.RESULTS
  (
    VARIABLE char(20),
    FIRST_DATE num format=date9.
  );
quit;

options noquotelenmax;
data _null_;
  array vnames{2} $200. ("VAR91001_T0","VAR91059_T0");
  do i=1 to 2;
    call execute('proc sql;
                    insert into WORK.RESULTS
                    set VARIABLE="'||strip(vnames{i})||'",
                        FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.
                                    from    WORK.HAVE BASE
                                    where   BASE.VAR91001_t0 is null
                                      and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
                  quit;');
  end;
run;
options quotelenmax;

Frequent Contributor
Posts: 75

Re: knowing when there are only missing values

Hi RW9,

Don't let me start on that. Thinking about drinking in the morning. Just kidding.

I have a try.

Frequent Contributor
Posts: 75

Re: knowing when there are only missing values

RW9,

Definitely do what I want but I need to have all the variables (around 5000). Is there a way to have them all?

Cheers.

Super User
Super User
Posts: 7,405

Re: knowing when there are only missing values

Whoah, it gets worse, 5000 variables!  Anyways, just create a dataset with the list of variables (from sashelp.vcolumn) and then use that in your data _null_ step as the loop instead of the do + array:

data have; 
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;

proc sql;
  create table WORK.RESULTS
  (
    VARIABLE char(20),
    FIRST_DATE num format=date9.
  );

  create table WORK.LOOP as
  select  NAME
  from    SASHELP.VCOLUMN
  where   LIBNAME="WORK"
    and   MEMNAME="HAVE"
    and   NAME ne "NOUV_DATE";
quit;


options noquotelenmax;
data _null_;
  set WORK.LOOP;
  call execute('proc sql;
                  insert into WORK.RESULTS
                  set VARIABLE="'||strip(name)||'",
                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.
                                  from    WORK.HAVE BASE
                                  where   BASE.VAR91001_t0 is null
                                    and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
                quit;');
run;
options quotelenmax;

Frequent Contributor
Posts: 75

Re: knowing when there are only missing values

One more question  RW9.

In your query

where   BASE.VAR91001_t0 is null

                                    and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));


How do you use specific variable name instead of VAR91001_t0 ?

Solution
‎07-11-2014 08:17 AM
Super User
Super User
Posts: 7,405

Re: knowing when there are only missing values

Sorry, yes I should have replaced that:

options noquotelenmax;

data _null_;

  set WORK.LOOP;

  call execute('proc sql;

                  insert into WORK.RESULTS

                  set VARIABLE="'||strip(name)||'",

                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.

                                  from    WORK.HAVE BASE

                                  where   BASE.'||strip(name)||' is null

                                    and   not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));

                quit;');

run;

options quotelenmax;

Valued Guide
Posts: 3,208

Re: knowing when there are only missing values

What is wrong with a datastep and using then NMISS function?

SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (nmiss)

---->-- ja karman --<-----
Super User
Super User
Posts: 7,405

Re: knowing when there are only missing values

I believe that function operates across variables, this problem is across rows per variable.  Also there are > 5000 variables so couldn't dynamically put them into a macro variable.  Plus the possibility that just one row may be missing then the next not, so only min of last block of missing.

Frequent Contributor
Posts: 75

Re: knowing when there are only missing values

Thanks RW9.

It is working fine. This is the first time I'm seeing something like what you did. Definitely useful.

Super User
Posts: 9,681

Re: knowing when there are only missing values

data have;  
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;
data temp(keep=var:);
 set have(where=(nouv_date ge '01may13'd));
  call symputx('n',_n_);
run;
%put &n;


data _null_;
 set sashelp.vcolumn(where=(libname='WORK' and memname='TEMP' )) end=last;
 if _n_ eq 1 then call execute('proc sql;create table x as select ');
 call execute('nmiss('||strip(name)||') as '||name );
 if not last then call execute(',');
  else call execute('from temp;quit;');
run;
proc transpose data=x out=want(where=(col1 = &n ));
run;

Xia Keshan

Frequent Contributor
Posts: 75

Re: knowing when there are only missing values

Thanks Xia.

Runnin RW9 sql at the moment. Will run yours when everything will be completed.

Cheers.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 261 views
  • 6 likes
  • 4 in conversation