BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
feira_feira
Calcite | Level 5

Hi Guys,

I would need your help on the problem that I'm currently into...

Basically I have the data below.

I would like to extract the data having criteria:

  • (# of data occurences under same Group >= 2 OR total dollar under same Group >= 25) , AND
  • time range is less than 3 hours

Please note:

  • field "delta time" in "data wants" is row N - row N-1 on field "time" under same Group. For example: "delta time" = 5:51 in row 2 is coming from "time" in row 2 substracted by time in row 1.
  • sample remarks:
    • row 4 & 5 got remarks Yes because within 3 hrs (delta time for both rows is 1:43) and total "dollar" is 32 (16+18) or total count is 2 (count for row 4 & 5).
    • row 9,10,11 got remarks Yes because within 3 hrs (sum delta time from 9-10 & 10-11 is 2 minutes, coming from 0:01+0:01) and total count is 3 (count for row 9,10,11).


Thank you in advance

Data Have

:

rowGroupdollartime
1a102:43 AM
2a108:34 AM
3a1111:52 AM
4b165:32 AM
5b187:16 AM
6b122:08 PM
7b118:34 PM
8b139:31 PM
9c112:18 AM
10c52:19 AM
11c52:20 AM
12c128:34 AM
13d182:12 PM
14e192:43 AM
15e175:30 AM
16e1112:07 PM
17e148:40 PM
18e168:46 PM
19f118:24 AM
20f1211:55 AM

Data Wants:

rowGroupdollartimedelta timeRemarks
1a102:43 AM
2a108:34 AM5:51
3a1111:52 AM3:18
4b165:32 AM Yes
5b187:16 AM1:43Yes
6b122:08 PM6:51
7b118:34 PM6:26
8b139:31 PM0:57
9c112:18 AM Yes
10c52:19 AM0:01Yes
11c52:20 AM0:01Yes
12c128:34 AM6:14
13d182:12 PM5:38
14e192:43 AM Yes
15e175:30 AM2:47Yes
16e1112:07 PM6:37
17e148:40 PM8:32Yes
18e168:46 PM0:05Yes
19f118:24 AM
20f1211:55 AM3:31

Message was edited by: feira h --> to provide better explanation.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
                     (select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25 
                          then 'YES'
                    else 'NO ' end as falg
   from have as a;
quit;

Xia Keshan

View solution in original post

7 REPLIES 7
Ksharp
Super User

You didn't post the output you want yet .

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *
   from have
    group by group
      having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

Xia Keshan

Ksharp
Super User

You didn't post the output you want yet .

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *
   from have
    group by group
      having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

Xia Keshan

Ksharp
Super User

It is more complicated than I imaged . What you gonna do if you have data :

time      dif

...............

08:00    04:20

09:00    01:00

11:30    02:30

You want

08:00    04:20   YES

09:00    01:00   YES

or want ?

09:00    01:00  YES

11:30    02:30  YES

feira_feira
Calcite | Level 5

Hi Xia,

Assuming the criteria of total dollar >=25 are met, then the output should be all of them, because:

  • no 1 and 2 data are < 3 hrs. & count >= 2
  • no 2 and 3 data are also < 3 hrs & count >= 2

Thanks for your assistance...

Ksharp
Super User

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
                     (select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25 
                          then 'YES'
                    else 'NO ' end as falg
   from have as a;
quit;

Xia Keshan

feira_feira
Calcite | Level 5

Thanks a lot Xia... it's very helpful... !!!

naveen_srini
Quartz | Level 8

Please mark the question as answered and award the correct answers to extraordinary master  Xia Keshan.

I tried to play around with your requirement, but unfortunately couldn;t meet it properly. But for learning purpose, you can play and have fun

data have;
input Group     $ dollar     time24hrs : time.;
*format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;

data want new(keep=group remarks);
set have;
by group;
k=lag(time24hrs);

if first.group then
  do;
   totaldollar=0;
   grpcount=0;
   sumtimediff=0;
  end;

grpcount+1;
totaldollar+dollar;

if not first.group then
  do;
   deltatime=intck('hour',k,time24hrs);
   sumtimediff+deltatime;
  end;

if grpcount>=2 and totaldollar>=25 and sumtimediff le 3 then
  do;
   remarks='yes';
   output new;
  end;

output want;
run;

data final;
merge want new;
by group;

keep group dollar time24hrs remarks;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1155 views
  • 0 likes
  • 3 in conversation