BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
andy_wk
Calcite | Level 5

I've got this table.

Capture1.PNG

I want to know which variables has only missing values after a specific date (like September 2013 or any other date I will put) e.g var910001_t0 or var91059_t0 will be associated with September 2013. I'm showing you here, the variables that have only missing values after September 2013 but it can be June 2014.

I thought about doing a proc sql but it does not seem to do what I want. I thought about doing an array but I'm not that familiar with Array yet.

Any help will be welcome as I've never done this before.

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, yes I should have replaced that:

options noquotelenmax;

data _null_;

  set WORK.LOOP;

  call execute('proc sql;

                  insert into WORK.RESULTS

                  set VARIABLE="'||strip(name)||'",

                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.

                                  from    WORK.HAVE BASE

                                  where   BASE.'||strip(name)||' is null

                                    and   not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));

                quit;');

run;

options quotelenmax;

View solution in original post

11 REPLIES 11
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Eew, horrible data.  The below will create a results table which, for each variable mentioned in the vname arrays, create one row with variable name and the date where missings started and didn't stop.

data have; 
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;

proc sql;
  create table WORK.RESULTS
  (
    VARIABLE char(20),
    FIRST_DATE num format=date9.
  );
quit;

options noquotelenmax;
data _null_;
  array vnames{2} $200. ("VAR91001_T0","VAR91059_T0");
  do i=1 to 2;
    call execute('proc sql;
                    insert into WORK.RESULTS
                    set VARIABLE="'||strip(vnames{i})||'",
                        FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.
                                    from    WORK.HAVE BASE
                                    where   BASE.VAR91001_t0 is null
                                      and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
                  quit;');
  end;
run;
options quotelenmax;

andy_wk
Calcite | Level 5

Hi RW9,

Don't let me start on that. Thinking about drinking in the morning. Just kidding.

I have a try.

andy_wk
Calcite | Level 5

RW9,

Definitely do what I want but I need to have all the variables (around 5000). Is there a way to have them all?

Cheers.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Whoah, it gets worse, 5000 variables!  Anyways, just create a dataset with the list of variables (from sashelp.vcolumn) and then use that in your data _null_ step as the loop instead of the do + array:

data have; 
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;

proc sql;
  create table WORK.RESULTS
  (
    VARIABLE char(20),
    FIRST_DATE num format=date9.
  );

  create table WORK.LOOP as
  select  NAME
  from    SASHELP.VCOLUMN
  where   LIBNAME="WORK"
    and   MEMNAME="HAVE"
    and   NAME ne "NOUV_DATE";
quit;


options noquotelenmax;
data _null_;
  set WORK.LOOP;
  call execute('proc sql;
                  insert into WORK.RESULTS
                  set VARIABLE="'||strip(name)||'",
                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.
                                  from    WORK.HAVE BASE
                                  where   BASE.VAR91001_t0 is null
                                    and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));
                quit;');
run;
options quotelenmax;

andy_wk
Calcite | Level 5

One more question  RW9.

In your query

where   BASE.VAR91001_t0 is null

                                    and   not exists(select distinct THIS.VAR91001_t0 from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.VAR91001_t0 ne .));


How do you use specific variable name instead of VAR91001_t0 ?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Sorry, yes I should have replaced that:

options noquotelenmax;

data _null_;

  set WORK.LOOP;

  call execute('proc sql;

                  insert into WORK.RESULTS

                  set VARIABLE="'||strip(name)||'",

                      FIRST_DATE=(select  MIN(BASE.NOUV_DATE) format=date9.

                                  from    WORK.HAVE BASE

                                  where   BASE.'||strip(name)||' is null

                                    and   not exists(select distinct THIS.'||strip(name)||' from WORK.HAVE THIS where THIS.NOUV_DATE > BASE.NOUV_DATE and THIS.'||strip(name)||' ne .));

                quit;');

run;

options quotelenmax;

jakarman
Barite | Level 11

What is wrong with a datastep and using then NMISS function?

SAS(R) 9.4 Functions and CALL Routines: Reference, Second Edition (nmiss)

---->-- ja karman --<-----
RW9
Diamond | Level 26 RW9
Diamond | Level 26

I believe that function operates across variables, this problem is across rows per variable.  Also there are > 5000 variables so couldn't dynamically put them into a macro variable.  Plus the possibility that just one row may be missing then the next not, so only min of last block of missing.

andy_wk
Calcite | Level 5

Thanks RW9.

It is working fine. This is the first time I'm seeing something like what you did. Definitely useful.

Ksharp
Super User
data have;  
  attrib nouv_date format=date9. var91001_t0 var91059_t0 format=best.;
  nouv_date='01jan13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01feb13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01mar13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01apr13'd; var91001_t0=8900; var91059_t0=9200; output;
  nouv_date='01may13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jun13'd; var91001_t0=.; var91059_t0=.; output;
  nouv_date='01jul13'd; var91001_t0=.; var91059_t0=.; output;
run;
data temp(keep=var:);
 set have(where=(nouv_date ge '01may13'd));
  call symputx('n',_n_);
run;
%put &n;


data _null_;
 set sashelp.vcolumn(where=(libname='WORK' and memname='TEMP' )) end=last;
 if _n_ eq 1 then call execute('proc sql;create table x as select ');
 call execute('nmiss('||strip(name)||') as '||name );
 if not last then call execute(',');
  else call execute('from temp;quit;');
run;
proc transpose data=x out=want(where=(col1 = &n ));
run;

Xia Keshan

andy_wk
Calcite | Level 5

Thanks Xia.

Runnin RW9 sql at the moment. Will run yours when everything will be completed.

Cheers.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1119 views
  • 6 likes
  • 4 in conversation