Programming question

I have a travel dataset as follows:

travelerid  orig  dest  traveltime  time_spent_at_dest  orig_stop  dest_stop
1           A     B     5           0                   Y          N
1           B     C     2           2                   N          N
1           C     D     3           1                   N          Y
1           D     E     2           1                   Y          Y
...........

 

I want to sum the time spent between each pair of consecutive stopping stations. The total should include the time spent at the intermediate (non-stopping) stations, but not the time spent at the stopping station where the segment ends. For example, the output dataset should look like:

travelerid  orig  dest  traveltime
1           A     D     12
1           D     E     2
...........

 

Can anyone help me with this question?

Thanks!


Accepted Solutions
Solution
08-30-2016 09:26 PM
Super User
Posts: 5,085

Re: Programming question

Make sure you stash the original data somewhere safe. If you ever get this data out of order, it will be a monstrous task to reassemble it. You would be better off creating new fields (TRIP_ID, LEG) that would let you put the data back in order if a problem ever arose.
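
As a rough sketch of that suggestion, the step below adds the two key fields while the data is still in its original order. The dataset name have_keyed and the exact definitions of TRIP_ID and LEG are assumptions made for illustration: a trip is taken to start at every row where orig_stop='Y', and all rows for one travelerid are assumed to be contiguous.

```sas
/* Sketch only: have_keyed, trip_id and leg are illustrative names. */
data have_keyed;
   set have;
   by travelerid notsorted;       /* assumes each traveler's rows are contiguous */
   if first.travelerid then do;
      trip_id = 0;                /* restart numbering for each traveler */
      leg = 0;
   end;
   if orig_stop = 'Y' then do;
      trip_id + 1;                /* sum statement: trip_id is retained across rows */
      leg = 0;
   end;
   leg + 1;                       /* position of this row within its trip */
run;

/* If the rows are ever shuffled, this should restore the original order. */
proc sort data=have_keyed;
   by travelerid trip_id leg;
run;
```

With the sample rows from the question, the first three rows would get trip_id=1 with leg=1, 2, 3, and the fourth row would get trip_id=2 with leg=1.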

 

All that being said, here's an approach you can try for your problem:

 

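data want;
   set have;
   /* A leg that starts at a stop begins a new segment. */
   if orig_stop='Y' then do;
      total_time=0;
      starting_point = orig;
   end;
   retain total_time starting_point;
   /* Passing through: count travel time plus the time spent at dest. */
   if dest_stop = 'N' then total_time = total_time + traveltime + time_spent_at_dest;
   /* Arriving at a stop: count travel time only, then write the segment. */
   else do;
      total_time = total_time + traveltime;
      ending_point = dest;
      output;
   end;
   keep travelerid starting_point ending_point total_time;
run;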

 

It's untested code, but should be OK.
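
One way to check the step above is to build the four sample rows from the question as a small test dataset and run the step against it. This is only a sketch: the character variables are read with the default length of 8, which is an assumption about the real data.

```sas
/* Sample rows copied from the question; lengths and informats are assumed. */
data have;
   input travelerid orig $ dest $ traveltime time_spent_at_dest
         orig_stop $ dest_stop $;
   datalines;
1 A B 5 0 Y N
1 B C 2 2 N N
1 C D 3 1 N Y
1 D E 2 1 Y Y
;
run;

/* After running the DATA WANT step, list the result for inspection. */
proc print data=want noobs;
run;
```

Tracing the logic by hand on these rows gives two output records, A to D with total_time=12 (5+0+2+2+3) and D to E with total_time=2, which matches the output requested in the question.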



All Replies
Super User
Posts: 10,516

Re: Programming question

Here's one way:

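data want (keep=travelerid orig dest traveltime);
   set have;
   /* RealOrig gets length 3 from this literal; widen it if station codes are longer. */
   retain cumtime 0 RealOrig "   ";
   /* A leg that starts at a stop begins a new segment. */
   if orig_stop='Y' then do;
      cumtime=0;
      RealOrig=Orig;
   end;
   /* Passing through: add travel time plus the time spent at dest. */
   If dest_stop = 'N' then cumtime= sum(cumtime,traveltime,time_spent_at_dest);
   /* Arriving at a stop: add travel time only, then write the segment. */
   if dest_stop='Y' then do;
      cumtime= sum(cumtime,traveltime);
      orig= RealOrig;
      TravelTime=CumTime;
      output;
   end;
run;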

This assumes well-formed data: every travelerid's first record has orig_stop='Y', and every travelerid's last record has dest_stop='Y'.

 

If your travelerid values don't behave that way, then you'll need to make some decisions about how to handle the exceptions. They might be amenable to BY travelerid processing with FIRST. and LAST., but no promises.
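
A possible shape for that BY-group handling is sketched below. The choices it makes are assumptions rather than anything settled in this thread: a new segment is forced at the first record of each travelerid even if orig_stop is not 'Y', and a segment that runs out of records before reaching a stop is still written, with a hypothetical flag variable named incomplete set to 1. The names seg_orig, seg_dest and seg_time are likewise illustrative.

```sas
/* Sketch only: variable names and exception rules are assumptions. */
data want (keep=travelerid seg_orig seg_dest seg_time incomplete);
   set have;
   by travelerid notsorted;        /* assumes each traveler's rows are contiguous */
   if first.travelerid or orig_stop = 'Y' then do;
      seg_time = 0;                /* start a new segment */
      seg_orig = orig;
   end;
   retain seg_time seg_orig;
   seg_time = sum(seg_time, traveltime);
   seg_dest = dest;
   if dest_stop = 'Y' then do;
      incomplete = 0;              /* segment ended at a real stop */
      output;
   end;
   else if last.travelerid then do;
      incomplete = 1;              /* ran out of rows before reaching a stop */
      output;
   end;
   else seg_time = sum(seg_time, time_spent_at_dest);  /* passing through */
run;
```

On the four sample rows from the question this should produce the same two segments as the steps above, both with incomplete=0. Whether incomplete segments should be kept, dropped, or reported separately is a decision for whoever owns the data.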
