KMHbiol
Calcite | Level 5

This is an example of the data I have.

ID   date       obs
100  01Jan2014  45
100  02Jan2014  74
100  03Jan2014  .
100  04Jan2014  .
100  05Jan2014  20
100  06Jan2014  43
100  07Jan2014  23
100  08Jan2014  .
100  09Jan2014  456
100  10Jan2014  23
200  01Jan2014  .
200  02Jan2014  .
200  03Jan2014  .
200  04Jan2014  .
200  05Jan2014  42
200  06Jan2014  56
200  07Jan2014  32
200  08Jan2014  .
200  09Jan2014  32
200  10Jan2014  .
300  01Jan2014  556
300  02Jan2014  .
300  03Jan2014  .
300  04Jan2014  .
300  05Jan2014  .
300  06Jan2014  .
300  07Jan2014  .
300  08Jan2014  52
300  09Jan2014  12
300  10Jan2014  45

I want output that reports each occasion of missing values (coded as '.') for the variable obs. Each observation in the output dataset should represent an occasion when obs is missing, for either a single day or a run of consecutive days. Each observation needs to include the ID variable, a variable that identifies the occasion of missing data, and the number of consecutive days that obs was missing. So the new count variable should equal 1 for a single day of missing data or >1 for consecutive days of missing data.

Thank you in advance for any help you can offer on this problem.

6 REPLIES
mohamed_zaki
Barite | Level 11

Based on the data you posted, please give an example of the output data set you want.

KMHbiol
Calcite | Level 5

I am looking for output like this:

ID   first_date  count
100  03Jan2014   2
100  08Jan2014   1
200  01Jan2014   4
200  08Jan2014   1
200  10Jan2014   1
300  02Jan2014   6

The first_date variable represents the date of missing data for a single day or the first day of consecutive days of missing data.  Basically the output should let me estimate a mean length of consecutive days of missing values.

ballardw
Super User

Assuming your data are sorted  by id and date:

data want (where=(misscount>0));
   set have;
   by id date;
   retain misscount;
   if first.id then misscount= .;      /* reset the counter for each ID */
   if missing(obs) then misscount+1;   /* running count within a run of missing obs */
   else misscount= .;
run;

Or if you want 0 instead of . for the missing count use 0.
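
The step above keeps every row where obs is missing, with a running count, rather than one row per run. A minimal sketch of one way to collapse it to one row per run, assuming `have` is sorted by id and date; the names `first_date` and `count` are borrowed from the requested output and are not part of the code above:

```sas
/* Sketch only: assumes HAVE is sorted by id and date */
data want (keep=id first_date count);
   set have;
   by id;
   retain first_date;
   format first_date date9.;
   if first.id then count = 0;               /* never carry a run across IDs */
   if missing(obs) then do;
      if count = 0 then first_date = date;   /* first day of a new run */
      count + 1;                             /* sum statement: count is retained */
   end;
   /* a run ends when obs is present again or when the ID ends */
   if count > 0 and (not missing(obs) or last.id) then do;
      output;
      count = 0;
   end;
run;
```

The row is written when a run closes, so the check for `last.id` is what catches a run that ends on the final date of an ID (for example ID 200 on 10Jan2014).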

data_null__
Jade | Level 19
data missdate;
   infile cards dsd firstobs=2;
   input ID :$3. date :date. obs;
   format date date11.;
   cards;
ID,date,obs
100,01Jan2014,45
100,02Jan2014,74
100,03Jan2014,.
100,04Jan2014,.
100,05Jan2014,20
100,06Jan2014,43
100,07Jan2014,23
100,08Jan2014,.
100,09Jan2014,456
100,10Jan2014,23
200,01Jan2014,.
200,02Jan2014,.
200,03Jan2014,.
200,04Jan2014,.
200,05Jan2014,42
200,06Jan2014,56
200,07Jan2014,32
200,08Jan2014,.
200,09Jan2014,32
200,10Jan2014,.
300,01Jan2014,556
300,02Jan2014,.
300,03Jan2014,.
300,04Jan2014,.
300,05Jan2014,.
300,06Jan2014,.
300,07Jan2014,.
300,08Jan2014,52
300,09Jan2014,12
300,10Jan2014,45
;;;;
   run;
proc print;
   run;
proc summary data=missdate nway;
   by id obs notsorted;
   format obs 1.;
   var date;
   output out=missreport(drop=_: where=(missing(obs))) min=start max=end n=duration;
   run;
proc print;
   run;


Message was edited by: data _null_ Added format statement format obs 1.;
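
Since the stated goal is a mean length of the missing runs, a short follow-up sketch may help. It assumes the `missreport` data set and its `duration` variable come from the PROC SUMMARY step above; the choice of statistics and `maxdec=2` is only illustrative:

```sas
/* Sketch: overall mean run length of missing obs */
proc means data=missreport n mean min max maxdec=2;
   var duration;
run;

/* Same statistics, broken out by ID */
proc means data=missreport n mean min max maxdec=2;
   class id;
   var duration;
run;
```

For the six runs listed in the requested output (lengths 2, 1, 4, 1, 1 and 6), the overall mean works out to 15/6 = 2.5 days.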

KMHbiol
Calcite | Level 5

This is what I needed.  Thank you.

Ksharp
Super User
data missdate;
   infile cards dsd firstobs=2; 
   input ID :$3. date :date. obs;
   format date date11.; 
   cards; 
ID,date,obs
100,01Jan2014,45
100,02Jan2014,74
100,03Jan2014,.
100,04Jan2014,.
100,05Jan2014,20
100,06Jan2014,43
100,07Jan2014,23
100,08Jan2014,.
100,09Jan2014,456
100,10Jan2014,23
200,01Jan2014,.
200,02Jan2014,.
200,03Jan2014,.
200,04Jan2014,.
200,05Jan2014,42
200,06Jan2014,56
200,07Jan2014,32
200,08Jan2014,.
200,09Jan2014,32
200,10Jan2014,.
300,01Jan2014,556
300,02Jan2014,.
300,03Jan2014,.
300,04Jan2014,.
300,05Jan2014,.
300,06Jan2014,.
300,07Jan2014,.
300,08Jan2014,52
300,09Jan2014,12
300,10Jan2014,45
;;;;
   run; 
data want(drop=date);
 set missdate;
 by id obs notsorted;
 retain _date;
 if first.obs then do;_date=date;n=0; end;
 n+1;
 if last.obs and missing(obs) then output;
 format _date date9.;
 run;

Xia Keshan

