BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi, I have a dataset as follows and I need to retrieve two things: 1) the sum of VALUE between (date-1) and (date-3) for each date and 2) whether, during the 5 days, there are >= two days where the VALUE is 0. I think PROC SQL should be used but I'm not sure how to implement this.

INPUT DATASET:

ID DATE VALUE
1 2011/01/01 0
1 2011/01/02 0
1 2011/01/03 1
1 2011/01/04 2
2 2011/01/01 1
2 2011/01/02 2
2 2011/01/03 3
2 2011/01/04 4

Output should be 1) 1 (0+0+1) for ID1, 2011/01/04 and 6 (1+2+3) for ID2, 20110104. and 2) a mark for ID1, 2011/01/04, since there are 2 days with a value of 0 during the 3-day window.

Any help is greatly appreciated!
5 REPLIES 5
chang_y_chung_hotmail_com
Obsidian | Level 7
Here is a way, which is rather straight-forward.



   /* test data */


   data zero;


     input id date :yymmdd10. value;


     format date e8601da.;


   cards;


   1 2011/01/01 0


   1 2011/01/02 0


   1 2011/01/03 1


   1 2011/01/04 2


   2 2011/01/01 1


   2 2011/01/02 2


   2 2011/01/03 3


   2 2011/01/04 4


   ;


   run;


 


   proc sql;


 


     /* sum of value between (date-1) and (date-3) for each date */


     select zero.id, zero.date, sum(one.value) as sum3


     from   zero left join zero as one on zero.id = one.id and


            one.date between zero.date-1 and zero.date-3


     group by zero.id, zero.date;


     /* on lst


            id        date      sum3


      ------------------------------


             1  2011-01-01         .


             1  2011-01-02         0


             1  2011-01-03         0


             1  2011-01-04         1


             2  2011-01-01         .


             2  2011-01-02         1


             2  2011-01-03         3


             2  2011-01-04         6


     */


 


     /* if there are two or more days with value=0


        between (date-1) and (date-5) */


     select zero.id, zero.date, sum(one.value=0)>=2 as zero2


     from   zero left join zero as one on zero.id = one.id and


            one.date between zero.date-1 and zero.date-5


     group by zero.id, zero.date;


     /* on lst


            id        date     zero2


      ------------------------------


             1  2011-01-01         0


             1  2011-01-02         0


             1  2011-01-03         1


             1  2011-01-04         1


             2  2011-01-01         0


             2  2011-01-02         0


             2  2011-01-03         0


             2  2011-01-04         0


     */


   quit;

Ksharp
Super User
Hi.
For your situation.I think proc sql is not a good idea.
The biggest virtue of proc sql is cartesian product.
The data step is much more suitable.
One question : you said 'during the 5 days' but your origin data has only 4 days.That is right?


[pre]
data temp;
input ID DATE yymmdd10. VALUE;
format date yymmdd10.;
datalines;
1 2011/01/01 0
1 2011/01/02 0
1 2011/01/03 1
1 2011/01/04 2
2 2011/01/01 1
2 2011/01/02 2
2 2011/01/03 3
2 2011/01/04 4
;
run;
proc sort data=temp;
by id date;
run;
data result;
set temp;
by id;
retain count_three sum_three count_zero .;
if first.id then do;
count_three=0;sum_three=0;count_zero=0;
end;
count_three+1;
if count_three le 3 then sum_three+value;
if value eq 0 then count_zero+1;
if last.id then do;
if count_zero ge 2 then flag=1;
else flag=0;
output;
end;
keep id date sum_three flag;
run;
proc print;run;
[/pre]



Ksharp
ChrisNZ
Tourmaline | Level 20
I think the OP meant 3 days, not 5, so only one sql statement is necessary.

Another virtue of proc sql is to allow merge on non-matching values like:

where ... one.date between zero.date-1 and zero.date-3

chang_y_chung's code works for rolling dates so you can have a full year in the dataset and still calculate the previous 3 days' sums/counts.

One can do this in a data step also using lag() for example, but sql is much simpler in this case imho.
Ksharp
Super User
Hi.


>Another virtue of proc sql is to allow merge on non-matching values


That is just because of proc sql 's Cartesian Product.


Ksharp
deleted_user
Not applicable
Sorry I meant 3 days...
Problem solved by using the simple PROC SQL statement above. Thanks for all your help!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 748 views
  • 0 likes
  • 4 in conversation