Help using Base SAS procedures

Handle missing when outputing data for every 365 days

Reply
Frequent Contributor
Posts: 107

Handle missing when outputing data for every 365 days

Hi

I have some daily data but some days may have missing data. I'd like to output only records for every 365 days, and if the data is missing for that day, I'd like to choose the nearest day point to output - either before or after. Is there an easy and efficient way to do it? Thanks in advance.

data data; input days;

datalines;

1

364

365

729

732

1094

1096

1099

3285

3630

3675

;

Desire output:

365

729

1096  (either 1094 or 1096 is fine, preferably the latter day though)

3285

3630

Trusted Advisor
Posts: 1,301

Re: Handle missing when outputing data for every 365 days

This code is ugly:

data foo;

input days;

cards;

1

364

365

729

732

1094

1096

1099

3285

3630

3675

;

run;

data bar(where=(c>0));

set foo;

c=round(days/(365));

_c=lag(c);

if c=_c then _j=0; else _j=1;

m=abs((365*_j)-(mod(days,365)));

drop _:;

run;

proc sort data=bar; by c m descending days; run;

data want;

set bar;

by c m descending days;

if first.c then output;

keep days;

run;

365

729

1096

3285

3630

Super Contributor
Posts: 1,636

Re: Handle missing when outputing data for every 365 days

another approach:

data foo;

input days;

cards;

1

364

365

729

732

1094

1096

1099

3285

3630

3675

;

run;

data bar;

set foo(where=(days>300));

c=round(days/(365));

diff=abs(days-c*365);

proc sort;

  by c diff;

data want(keep=days);

  set bar;

  by c diff;

  if first.c;

proc print;run;

Super User
Posts: 10,028

Re: Handle missing when outputing data for every 365 days

Assuming you don't have a group missing data for a whole year.

data foo;
input days;
cards;
1
364
365
729
732
1094
1096
1099
3285
3630
3675
;
run;
proc sql noprint;
 select max(days) into : max from foo;
quit;
data x(drop=n);
 do n=1 to int(&max/365);
  _days=n*365;output;
 end;
run;
proc sql;
 create table want as
  select  _days label='365th date',days
   from x,foo
    group by _days
     having abs(_days-days)=min(abs(_days-days));
quit;
proc sort data=want out=want1 ;by _days descending days;run;
proc sort data=want1 out=want2 nodupkey;by _days ;run;

Ksharp

Frequent Contributor
Posts: 107

Re: Handle missing when outputing data for every 365 days

Thanks everyone. You guys are so helpful and the solutions so clever. Tried all and worked. With Linlin's proc sort statement, I just need to add Days in and in a descending order to get 1096 outputted when both 1096 and 1094 have the same nearest distance.

     proc sort; by c diff descending days;

Ask a Question
Discussion stats
  • 4 replies
  • 620 views
  • 0 likes
  • 4 in conversation