## DATA Step, Macro, Functions and more

Solved
Contributor
Posts: 37

I select a month's data at a time from two production tables. There can be 40k to 50k rows per month per table. Below is a sample of the data. I have to calculate the time it takes in days for a person to respond (an activity in table2) to an assignment (table1) taking into account holidays (skip holidays). The did column is only there to help someone see how the data should match up in the two tables. It is not part of the data extracted from the two production tables. The rows should be matched on pgm_id and user and in table2 we want the first row with a time greater than the time in table1. For instance, for did = 12 in table1 there are two possible rows in table2 (did=12,13) but we use the row with did = 12 because that activity happened first. For did = 1 in table 1 there is no corresponding row in table 2 so by default the calculation is 30 days. One of the issues is that for user = Dan there are several rows in both tables so how can we process them? I was wondering if it could be done in a loop on table1 with another loop inside of that one on table2. In this case, to process did = 3 in table1, if we had ordered the rows in table2 then we could start looping through the rows in table2 and the first row we get to with user=Dan and pgm_id = 63832680 would satisfy the requirements because its tiime is later than the time in table1. Then we exit that loop and go to the next row in the outside loop with did=4. We go back to the inside loop and when we get to did=4 it again satisfies the conditions and we do the calculation and exit the inside loop and so on.

I've never done a loop within a loop so the first question is will it work for this problem? If so, how and where do I store the results of the calculation. Also, how do I deal with holidays? I've read somewhere that there is a SAS function for holidays.

data table1;

infile datalines;

input did pgm_id user \$ dttime DATETIME24.3;

format dttime DATETIME24.3;

return;

datalines;

1 60124541 BOB 14AUG2013:22:24:41.894

2 51452877 JAN 01AUG2013:19:20:58.785

3 63832680 DAN 08AUG2013:19:13:00.356

4 63832680 DAN 13AUG2013:19:29:39.578

5 63832680 DAN 16AUG2013:18:40:52.547

6 63832680 DAN 22AUG2013:13:57:12.506

7 63832680 DAN 23AUG2013:15:30:00.452

8 63832680 DAN 23AUG2013:19:42:55.160

9 63832680 DAN 26AUG2013:20:29:40.386

11 59796423 JUNE 05AUG2013:19:16:37.764

12 59796423 JUNE 14AUG2013:15:37:05.595

14 59829038 RON 27AUG2013:18:34:11.931

;

run;

data table2;

infile datalines;

input did pgm_id user \$ dttime DATETIME24.3;

format dttime DATETIME24.3;

return;

datalines;

2 51452877 JAN 01AUG2013:23:59:58.285

3 63832680 DAN 08AUG2013:19:51:53.322

4 63832680 DAN 13AUG2013:20:01:24.427

5 63832680 DAN 16AUG2013:20:52:52.136

6 63832680 DAN 22AUG2013:14:50:58.576

7 63832680 DAN 23AUG2013:19:33:19.111

8 63832680 DAN 23AUG2013:20:27:52.563

9 63832680 DAN 28AUG2013:20:28:50.595

10 63832680 SUE 11AUG2013:22:24:41.894

11 59796423 JUNE 05AUG2013:19:24:22.605

12 59796423 JUNE 14AUG2013:19:48:40.025

13 59796423 JUNE 14AUG2013:19:58:59.937

14 59829038 RON 27AUG2013:21:58:11.160

;

run;

Thanks for the help.

DanD

Accepted Solutions
Solution
‎02-07-2014 05:54 PM
Super Contributor
Posts: 307

Try this to get the start and end date times. The code below assumes you want an equijoin (only rows that have matching pgm_id and user in both tables). If you want all records in table1, even if there is no matching observation in table2, we will need to use a left join instead.

You will also need to add logic to deal with calculating the time between the two dates . . . there are plenty of examples in these forums and elsewhere how to calculate intervals excluding holidays.

proc sql;

create table want as

select t1.did

, t1.pgm_id

, t1.user

, t1.dttime as start_dtm

, t2.dttime as end_dtm

from table1 t1

, table2 t2

where t1.pgm_id = t2.pgm_id

and t1.user = t2.user

and t2.dttime =

( select min ( t3.dttime )

from table2 t3

where t1.pgm_id = t3.pgm_id

and t1.user = t3.user

and t1.dttime < t3.dttime )

order by did

;

quit

All Replies
Solution
‎02-07-2014 05:54 PM
Super Contributor
Posts: 307

Try this to get the start and end date times. The code below assumes you want an equijoin (only rows that have matching pgm_id and user in both tables). If you want all records in table1, even if there is no matching observation in table2, we will need to use a left join instead.

You will also need to add logic to deal with calculating the time between the two dates . . . there are plenty of examples in these forums and elsewhere how to calculate intervals excluding holidays.

proc sql;

create table want as

select t1.did

, t1.pgm_id

, t1.user

, t1.dttime as start_dtm

, t2.dttime as end_dtm

from table1 t1

, table2 t2

where t1.pgm_id = t2.pgm_id

and t1.user = t2.user

and t2.dttime =

( select min ( t3.dttime )

from table2 t3

where t1.pgm_id = t3.pgm_id

and t1.user = t3.user

and t1.dttime < t3.dttime )

order by did

;

quit

Contributor
Posts: 37

I thought of doing it that way but decided that it wouldn't work and I didn't even try it. But it does work and it is much simpler than working with loops. Thanks Fugue.

🔒 This topic is solved and locked.

Discussion stats
• 2 replies
• 273 views
• 1 like
• 2 in conversation