BookmarkSubscribeRSS Feed
tomrvincent
Rhodochrosite | Level 12
Seven hours ago I was fast asleep.

Just to make sure, you have the following problem:
1. Origin point and departure date/time.
2. Destination point.
3. A schedule table of boarding and target stops with dates and times.

And so you want to find the fastest route between Origin and Destination points...regardless of how many stops there are between?

If this is the case, I'd suggest you provide an example from your data.

Show us the origin record and the trip records you or someone has manually determined to be the optimal route (assuming this is a process that currently exists at all).

You'll have to do QA on your logic at some point in time, otherwise you won't be able to prove to anybody that your coding is successful, right?
ulricius
Obsidian | Level 7

Hi Tom,

 

Good that you checked the problem formulation - forces me to improve the structuring of the problem!

 

It is not a simple "travelling salesman" problem, the fastest route between two points, but rather the closest match, to a target stop, along a PT route. Thus, the first table of Message #9, which consists of only one card transaction, contains, in column #8, possible (candidate) stops based on network _walk_ distances to stops surrounding to stop in the 7th column (#7400003), which is the boarding stop of the _subsequent_ (next in time) PT trip (i e subsequent card transaction in the ticketing data). Note that all the rows in ths sample of the first table have the same values of column1-6 and thus belong to the same card transaction, with the boarding of the subsequent trip appended to it. (The boarding stop of the transaction itself is thus 7401547 - column #3).

 

What about the second table then? Well, it contains the trip itinerary of line route 817, in this case identified by the line route ID of the first column in table #1, and looked up in the trip itinerary database based on column #2 (table #2 of message #9). Now, the script should loop through the stops of column #8 of table #1 and compare them to the stops of column #4 in Table #2 and indicate the stop (in column #8 in Table #1) with the smallest value of either column #9 OR column #10 in Table #1. This should result in a '1' in column #10 of Table #3, which is just a reformulation of Table #1, plus the "flag" column.

 

As to what is missing in the code of the datasteps i provided in message #9, I am still lost in oblivion...

Patrick
Opal | Level 21

As to what is missing in the code of the datasteps i provided in message #9, I am still lost in oblivion...


For datastep  data data.PossibleTargetStops7400003;

1.datalines1;  is invalid syntax. It's either datalines or datalines4

2. column trs_dt is always missing due to missing informat.

3. you mention in the narrative card id no 10643 but I can't find such a column or value in the data you've posted.

4. is column Time of any relevance? And if yes what does a value of 20772.699931 mean? Are these seconds?

 

 

ulricius
Obsidian | Level 7

Thanks, @Patrick, for some concrete comments, finally! Il be back shortly with corrections, in message #9! /Ulrik

Patrick
Opal | Level 21

Sorry to be such a pain.

 

I know from own experience how hard it can be to ask a question if the problem is not trivial. I've also made the experience that it really helped me to get to the solution by being forced to reduce the problem to its core.

Try to provide as little data (rows and columns) as you can - but these as representative as possible.

Also try in your narrative to be as short and precise as you can be. And provide examples illustrating what you describe (as you've done already).

ulricius
Obsidian | Level 7

No worries, @Patrick 🙂

Just to be able to move on, I have issues importing the "time" column, as you may have noticed. I now changed it to 'hh:mm:ss' in the 24 hour-a-day version. What informat should I define for SAS to import it from datalines?

Patrick
Opal | Level 21

@ulricius wrote:

No worries, @Patrick 🙂

Just to be able to move on, I have issues importing the "time" column, as you may have noticed. I now changed it to 'hh:mm:ss' in the 24 hour-a-day version. What informat should I define for SAS to import it from datalines?


O.K., here your datastep for the first table "fixed". Please run it and check if the output looks like what you'd expect.

data work.PossibleTargetStops7400003;
  infile datalines dsd truncover dlm=' ';
  input 
    linje:$3. dep_num stp trs_dt :date9.
    crd_num Time stp_t To Total_Mete Total_Seco;
    format trs_dt date9.;
    format time time.;
  datalines;
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415574 412.0518591 296.6773356
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415600 471.70787795 339.62966984
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415847 564.73844061 406.61167038
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415576 620.70628168 446.90851942
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415723 632.69061063 455.53724234
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415571 749.44706581 539.6018891
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415724 893.89056819 643.60119944
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415745 928.47124344 668.4992945
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415577 1011.1364702 728.01825104
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415689 1020.296277 734.61330423
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415606 1137.1403793 818.74107625
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415816 1257.7971129 905.61392546
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415728 1269.5402379 914.06896269
817 2 7401547 14NOV2016 59 20772.699931 7400003 7413039 1330.3410244 957.84552638
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415732 1392.4036034 1002.5305784
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415624 1414.5585351 1018.4821456
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415690 1437.3201032 1034.8704716
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415580 1442.9038977 1038.8907905
817 2 7401547 14NOV2016 59 20772.699931 7400003 7415747 1470.6074832 1058.8373881
;

proc print data=work.PossibleTargetStops7400003;
run;
tomrvincent
Rhodochrosite | Level 12
So describe a complete case. Start at the beginning and show the 1st leg and why it was selected. Go on to the 2nd leg, the 3rd leg and so on until the final leg is shown and explained.
ulricius
Obsidian | Level 7
@tomrvincent, I'll be right at it! Finally, I have figured out the secrets behind reproducing datasteps:) I'll edit message #9 again within minutes...
ulricius
Obsidian | Level 7
Message #9 is updated with relevant data and a clarified description of the steps desired to find the closest inferred (alighting) stop
tomrvincent
Rhodochrosite | Level 12
I think I'd suggest you start with a simpler example. From what I can tell from message 9, you're looking at 17 rows of output from a single input line, correct?

Pick a simpler example: One source record and maybe 3 output records, explaining how/why those 3 records were selected and all others not. That might lead us to figure out a general rule: Source record results in the selection of one result, that resulting record is then used as the source to find the next result record and so on. Walk us thru your logic.

For me, it's very hard to follow what you're starting with and what you want to finish with.
mkeintz
PROC Star

@ulricius wrote:

Since I don't seem to get any help in this thread, I have created a new one here:

https://communities.sas.com/t5/SAS-Programming/Creating-a-to-from-list-from-a-vector/m-p/630612#M186...


 

I think you received LOTS of help on this thread - asking for and making suggestions about problem clarification - probably the single most important skill one can learn with the help of forum members.  I have seen the other thread, which is clear to me.  But I didn't see its relationship to this one.  Good luck.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
tomrvincent
Rhodochrosite | Level 12
I agree. I don't see any connection between this problem and the new thread...nothing about times, departures, distances, etc. Oh well...nobody to blame but myself. 🙂

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 29 replies
  • 1789 views
  • 5 likes
  • 5 in conversation