Solved
New Contributor
Posts: 3

# Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Hi Guys,

I would need your help on the problem that I'm currently into...

Basically I have the data below.

I would like to extract the data having criteria:

• (# of data occurences under same Group >= 2 OR total dollar under same Group >= 25) , AND
• time range is less than 3 hours

• field "delta time" in "data wants" is row N - row N-1 on field "time" under same Group. For example: "delta time" = 5:51 in row 2 is coming from "time" in row 2 substracted by time in row 1.
• sample remarks:
• row 4 & 5 got remarks Yes because within 3 hrs (delta time for both rows is 1:43) and total "dollar" is 32 (16+18) or total count is 2 (count for row 4 & 5).
• row 9,10,11 got remarks Yes because within 3 hrs (sum delta time from 9-10 & 10-11 is 2 minutes, coming from 0:01+0:01) and total count is 3 (count for row 9,10,11).

Thank you in advance

Data Have

:

 row Group dollar time 1 a 10 2:43 AM 2 a 10 8:34 AM 3 a 11 11:52 AM 4 b 16 5:32 AM 5 b 18 7:16 AM 6 b 12 2:08 PM 7 b 11 8:34 PM 8 b 13 9:31 PM 9 c 11 2:18 AM 10 c 5 2:19 AM 11 c 5 2:20 AM 12 c 12 8:34 AM 13 d 18 2:12 PM 14 e 19 2:43 AM 15 e 17 5:30 AM 16 e 11 12:07 PM 17 e 14 8:40 PM 18 e 16 8:46 PM 19 f 11 8:24 AM 20 f 12 11:55 AM

Data Wants:

 row Group dollar time delta time Remarks 1 a 10 2:43 AM 2 a 10 8:34 AM 5:51 3 a 11 11:52 AM 3:18 4 b 16 5:32 AM Yes 5 b 18 7:16 AM 1:43 Yes 6 b 12 2:08 PM 6:51 7 b 11 8:34 PM 6:26 8 b 13 9:31 PM 0:57 9 c 11 2:18 AM Yes 10 c 5 2:19 AM 0:01 Yes 11 c 5 2:20 AM 0:01 Yes 12 c 12 8:34 AM 6:14 13 d 18 2:12 PM 5:38 14 e 19 2:43 AM Yes 15 e 17 5:30 AM 2:47 Yes 16 e 11 12:07 PM 6:37 17 e 14 8:40 PM 8:32 Yes 18 e 16 8:46 PM 0:05 Yes 19 f 11 8:24 AM 20 f 12 11:55 AM 3:31

Message was edited by: feira h --> to provide better explanation.

Accepted Solutions
Solution
‎03-16-2015 01:35 AM
Super User
Posts: 10,787

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

```data have;
input Group     \$ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
create table want as
select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
(select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25
then 'YES'
else 'NO ' end as falg
from have as a;
quit;
```

Xia Keshan

All Replies
Super User
Posts: 10,787

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

You didn't post the output you want yet .

```data have;
input Group     \$ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
create table want as
select *
from have
group by group
having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

```

Xia Keshan

Super User
Posts: 10,787

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

You didn't post the output you want yet .

```data have;
input Group     \$ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
create table want as
select *
from have
group by group
having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

```

Xia Keshan

Super User
Posts: 10,787

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

It is more complicated than I imaged . What you gonna do if you have data :

time      dif

...............

08:00    04:20

09:00    01:00

11:30    02:30

You want

08:00    04:20   YES

09:00    01:00   YES

or want ?

09:00    01:00  YES

11:30    02:30  YES

New Contributor
Posts: 3

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Hi Xia,

Assuming the criteria of total dollar >=25 are met, then the output should be all of them, because:

• no 1 and 2 data are < 3 hrs. & count >= 2
• no 2 and 3 data are also < 3 hrs & count >= 2

Thanks for your assistance...

Solution
‎03-16-2015 01:35 AM
Super User
Posts: 10,787

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

```data have;
input Group     \$ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
create table want as
select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
(select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25
then 'YES'
else 'NO ' end as falg
from have as a;
quit;
```

Xia Keshan

New Contributor
Posts: 3

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Thanks a lot Xia... it's very helpful... !!!

Frequent Contributor
Posts: 115

## Re: Extract Data Having # or Sum more than certain threshold, within certain Time Range.

Posted in reply to feira_feira

Please mark the question as answered and award the correct answers to extraordinary master  Xia Keshan.

I tried to play around with your requirement, but unfortunately couldn;t meet it properly. But for learning purpose, you can play and have fun

data have;
input Group     \$ dollar     time24hrs : time.;
*format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;

data want new(keep=group remarks);
set have;
by group;
k=lag(time24hrs);

if first.group then
do;
totaldollar=0;
grpcount=0;
sumtimediff=0;
end;

grpcount+1;
totaldollar+dollar;

if not first.group then
do;
deltatime=intck('hour',k,time24hrs);
sumtimediff+deltatime;
end;

if grpcount>=2 and totaldollar>=25 and sumtimediff le 3 then
do;
remarks='yes';
output new;
end;

output want;
run;

data final;
merge want new;
by group;

keep group dollar time24hrs remarks;
run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
• 7 replies
• 377 views
• 0 likes
• 3 in conversation