BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
feira_feira
Calcite | Level 5

Hi Guys,

I would need your help on the problem that I'm currently into...

Basically I have the data below.

I would like to extract the data having criteria:

  • (# of data occurences under same Group >= 2 OR total dollar under same Group >= 25) , AND
  • time range is less than 3 hours

Please note:

  • field "delta time" in "data wants" is row N - row N-1 on field "time" under same Group. For example: "delta time" = 5:51 in row 2 is coming from "time" in row 2 substracted by time in row 1.
  • sample remarks:
    • row 4 & 5 got remarks Yes because within 3 hrs (delta time for both rows is 1:43) and total "dollar" is 32 (16+18) or total count is 2 (count for row 4 & 5).
    • row 9,10,11 got remarks Yes because within 3 hrs (sum delta time from 9-10 & 10-11 is 2 minutes, coming from 0:01+0:01) and total count is 3 (count for row 9,10,11).


Thank you in advance

Data Have

:

rowGroupdollartime
1a102:43 AM
2a108:34 AM
3a1111:52 AM
4b165:32 AM
5b187:16 AM
6b122:08 PM
7b118:34 PM
8b139:31 PM
9c112:18 AM
10c52:19 AM
11c52:20 AM
12c128:34 AM
13d182:12 PM
14e192:43 AM
15e175:30 AM
16e1112:07 PM
17e148:40 PM
18e168:46 PM
19f118:24 AM
20f1211:55 AM

Data Wants:

rowGroupdollartimedelta timeRemarks
1a102:43 AM
2a108:34 AM5:51
3a1111:52 AM3:18
4b165:32 AM Yes
5b187:16 AM1:43Yes
6b122:08 PM6:51
7b118:34 PM6:26
8b139:31 PM0:57
9c112:18 AM Yes
10c52:19 AM0:01Yes
11c52:20 AM0:01Yes
12c128:34 AM6:14
13d182:12 PM5:38
14e192:43 AM Yes
15e175:30 AM2:47Yes
16e1112:07 PM6:37
17e148:40 PM8:32Yes
18e168:46 PM0:05Yes
19f118:24 AM
20f1211:55 AM3:31

Message was edited by: feira h --> to provide better explanation.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
                     (select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25 
                          then 'YES'
                    else 'NO ' end as falg
   from have as a;
quit;

Xia Keshan

View solution in original post

7 REPLIES 7
Ksharp
Super User

You didn't post the output you want yet .

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *
   from have
    group by group
      having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

Xia Keshan

Ksharp
Super User

You didn't post the output you want yet .

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *
   from have
    group by group
      having range(time24hrs) le 30*60 and (count(*) ge 3 or sum(dollar) ge 25);
quit;

Xia Keshan

Ksharp
Super User

It is more complicated than I imaged . What you gonna do if you have data :

time      dif

...............

08:00    04:20

09:00    01:00

11:30    02:30

You want

08:00    04:20   YES

09:00    01:00   YES

or want ?

09:00    01:00  YES

11:30    02:30  YES

feira_feira
Calcite | Level 5

Hi Xia,

Assuming the criteria of total dollar >=25 are met, then the output should be all of them, because:

  • no 1 and 2 data are < 3 hrs. & count >= 2
  • no 2 and 3 data are also < 3 hrs & count >= 2

Thanks for your assistance...

Ksharp
Super User

So you want all of them ?

time      dif

...............

08:00    04:20  YES

09:00    01:00  YES

11:30    02:30  YES

data have;
input Group     $ dollar     time24hrs : time.;
format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;
proc sql;
 create table want as
  select *,case when (select count(*) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 2 or
                     (select sum(dollar) from have where Group=a.Group and time24hrs between a.time24hrs-3*60*60 and a.time24hrs+3*60*60) ge 25 
                          then 'YES'
                    else 'NO ' end as falg
   from have as a;
quit;

Xia Keshan

feira_feira
Calcite | Level 5

Thanks a lot Xia... it's very helpful... !!!

naveen_srini
Quartz | Level 8

Please mark the question as answered and award the correct answers to extraordinary master  Xia Keshan.

I tried to play around with your requirement, but unfortunately couldn;t meet it properly. But for learning purpose, you can play and have fun

data have;
input Group     $ dollar     time24hrs : time.;
*format      time24hrs time.;
cards;
a     10     2:43
a     10     8:34
a     11     11:52
b     16     5:32
b     18     7:16
b     12     14:08
b     11     20:34
b     13     21:31
c     11     2:18
c     12     8:34
d     18     14:12
e     19     2:43
e     17     5:30
e     11     12:07
e     14     20:40
e     16     20:46
f     11     8:24
f     12     11:55
;
run;

data want new(keep=group remarks);
set have;
by group;
k=lag(time24hrs);

if first.group then
  do;
   totaldollar=0;
   grpcount=0;
   sumtimediff=0;
  end;

grpcount+1;
totaldollar+dollar;

if not first.group then
  do;
   deltatime=intck('hour',k,time24hrs);
   sumtimediff+deltatime;
  end;

if grpcount>=2 and totaldollar>=25 and sumtimediff le 3 then
  do;
   remarks='yes';
   output new;
  end;

output want;
run;

data final;
merge want new;
by group;

keep group dollar time24hrs remarks;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 1188 views
  • 0 likes
  • 3 in conversation