Hi everyone,
I have a rather complicated task which I would love your input on. It consists of several tasks/ issues. Overall, the data look similar to this (beware: these data are fake and is only an example):
ID | Event_time | Event_east | Event_north | Poi_east | Poi_north | Notify_time | Pos_east | Pos_north | Pos_time1 | Pos_east1 | Pos_north1 | Pos_time2 | Pos_east2 | Pos_north2 | Pos_time3 | Pos_east3 | Pos_north3 |
1 | 11Feb2021:18:08:02 | 12.134963 | 55.461058 | 12.451788 | 55.421875 | 2021-02-11T18:09:55.000+01:00 | 12.124598 | 55.457985 | 2021-02-11T18:10:55.000+01:00 | 12.561334 | 55.876498 | 2021-02-11T18:11:55.000+01:00 | 12.457935 | 55.215469 | 2021-02-11T18:10:57.000+01:00 | 12.457896 | 55.897569 |
2 | 12Feb2021:16:12:00 | 12.457965 | 55.589512 | 12.021369 | 55.021369 | 2021-02-12T16:12:55.000+01:00 | 12.146026 | 55.012459 | 2021-02-12T16:12:56.000+01:00 | 12.012459 | 55.124598 | 2021-02-12T16:12:57.000+01:00 | 12.215498 | 55.124023 | 2021-02-12T16:12:58.000+01:00 | 12.459065 | 55.213026 |
I have several GPS- and timestamps (up to e.g. Pos_time60 and Pos_east60 etc). The different positions (pos_eastn and pos_northn may represent a route of a biker).
I want to:
1) Calculate the true-line distance from pos_east and pos_north (e.g. a biker's starting position) to event_east and event_north (the event). As I have several GPS stamps from 'the biker' (up to >60) I believe it could be done by straight-line calculation from pos_east to pos_east1 to pos_east2 to pos_eastn.... to event_east (the event). However, some of the last GPS-stamps from 'the biker' may be relatively close to the event GPS position. Therefore, the first position of 'the biker' within 25 meters from event_east and event_north should be used as the final position (if it makes any sense).
I do not know how to incorporate all this in the geodist function.
Thanks!
All best
/MM
Please provide an example of what you want to calculate. With 60 points it is not clear exactly what you are calculating. Do you want the distance between successive pairs of points, i.e. 60 additional values? One value of the cumulative distance as you go to point to point?
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.
If you have data that is too sensitive to share either exclude the variable or change values so that they aren't sensitive.
I suspect that you be looking at a number of arrays such as this:
data example; set have; array pe (*) Pos_east: ; array pn (*) Pos_north: ; array d (60); do i= 1 to ( dim(pe) -1); d[i] = geodist(pn[i],pe[i],pn[i+1],pe[i+1]); end; cumd = sum(of d(*)); run;
which calculates the distance between successive points, if I understand your data, and places each distance in the array D. The totals all those values.
You can get the index position in the arrays of "last" assigned values by using the N function: lastindex = n(of pe(*));
and use that index with the array to the pair of values for comparison with that "first position", not really well defined.
BTW including "the biker" in your description does not provide any information as there is no variable in the data with that designation. If you mean by ID, then use that since it is in the data.
data example;
set have;
array pe (*) Pos_east: ;
array pn (*) Pos_north: ;
array d (60);
do i= 1 to ( dim(pe) -1);
d[i] = geodist(pn[i],pe[i],pn[i+1],pe[i+1]);
*distance to event;
d2event = geodist(pn[i], pe[i], event_north, event_east);
*exit out of array loop once closes distance to event is found;
if d2event < 25 then leave;
end;
cumd = sum(of d(*));
run;
Thanks for your time. I tried my best to explain it, I am sorry if it was unclear.
It was just an example with the baker, to make it easier to explain and understand. The data is highly sensitive. I tried to make a data-step as requested, however, it would be several hundred values I had to re-arrange. Further, to test it in the right way you would need reel GPS-stamps on a given road.
Every observation can have several (more than 2 and up to around 60) GPS-stamps from pos_east and pos_north to event_east and event_north. I labelled the travelling positions wrong, they are called "pos_east_1, pos_east_2 and so on). If I understand it correctly, the method @Reeza describes in code does calculate the distance from pos_east to pos_east_1 and from pos_east_1 to pos_east_2 and finally sum the different distances. The first position within 25 meters from the event_east and event_north should be used as the last (I believe it is also done in the code). Unfortunately, the code does not seem to work. Maybe it is because of the wrong labelling?
Further, most observations goes by poi_east and poi_north on its way to event_east and event_north. Can this be taken in to account? So those observations without and poi-coordinates are done as first described, but those with poi-coordinates will first have the distance from pos_east and pos_north to poi-coordinates calculated and then again from poi-coordinates to event-coordinates?
I know I must sound like a giant newbie which is also what I am. I am very grateful for your spare full time and a huge help! Sorry for my incompetence.
Doesn't work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "<>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon or attached as text to show exactly what you have and that we can test code against.
Thanks for both of your answers. I have tried to make a datastep as requested:
data WORK.FAKE; infile datalines dsd truncover; input IncidentId:BEST12. IncidentTime:DATETIME18. AssignmentTime:DATETIME18. ConfirmedTime:DATETIME18. AcceptedTime:DATETIME18. incident_easting:$9. incident_northing:$9. _alias:$6. selected_time:$21. pos_easting:$9. pos_northing:$9. poi_id:$10. poi_eastin g:$9. poi_northing:$9. notified_time:$29. closed_time:$29. pos_time_1:$29. pos_easting_1:$9. pos_northing_1:$9. pos_time_2:$29. pos_easting_2:$9. pos_northing_2:$9. pos_time_3:$29. pos_easting_3:$9. pos_northing_3:$9. pos_time_4:$29. pos_easting_4:$9. pos_ northing_4:$9. pos_time_5:$29. pos_easting_5:$9. pos_northing_5:$9. pos_time_6:$29. pos_easting_6:$9. pos_northing_6:$9. pos_time_7:$29. pos_easting_7:$10. pos_northing_7:$9.; format IncidentId BEST12. IncidentTime DATETIME18. AssignmentTime DATETIME18. ConfirmedTime DATETIME18. AcceptedTime DATETIME18.; datalines; 1064378 12FEB18:15:43:22 12FEB18:15:59:53 12FEB18:15:59:40 12FEB18:15:59:53 12.280186 55.626429 451245 2018-02-12 15:46:56.0 12.266320 55.610988 2018-02-12T15:46:56.000+01:00 2018-02-12T12:34:20.000+01:00 12.267114 55.611986 2018-02-12T12:34:20.000+01:00 12.269278 55.614381 2018-02-12T15:59:54.000+01:00 12.270823 55616077 2018-02-12T15:59:54.000+01:00 12.272192 55.617449 2018-02-12T15:59:54.000+01:00 12.274400 55.620118 2018-02-12T15:59:54.000+01:00 12.276432 55.622264 2018-02-12T16:01:26.000+01:00 12.278331 55.624359 1064378 12FEB18:15:43:22 12FEB18:15:47:19 12FEB18:15:47:17 12FEB18:15:47:20 12.280186 55.626429 745985 2018-02-12 15:46:56.0 12.318220 55.623781 2018-02-12T15:46:56.000+01:00 2018-02-12T15:58:16.000+01:00 2018-02-12T09:42:44.000+01:00 12.311948 55.625928 2018-02-12T15:47:16.000+01:00 12.304616 55.628223 2018-02-12T15:47:16.000+01:00 12.296400 55.630943 2018-02-12T15:47:21.000+01:00 12.285620 55.632489 2018-02-12T15:47:52.000+01:00 12.278861 55.624907 2018-02-12T15:48:19.000+01:00 12.278508 55.624558 1064378 12FEB18:15:43:22 12FEB18:15:47:20 12FEB18:15:47:14 12FEB18:15:47:20 12.280186 55.626429 785489 2018-02-12 15:46:56.0 12.265386 55.633611 1000009418 12.262428 55.627251 2018-02-12T15:46:56.000+01:00 2018-02-12T15:55:16.000+01:00 2018-02-12T09:46:25.000+01:00 12.265431 55.630718 2018-02-12T15:47:13.000+01:00 12.261898 55.627226 2018-02-12T15:47:13.000+01:00 12.257615 55.622935 2018-02-12T15:47:21.000+01:00 12.267510 55.622737 2018-02-12T15:47:58.000+01:00 12.278199 55.624434 2018-02-12T15:48:15.000+01:00 12.278773 55.624833 1064378 12FEB18:15:43:22 12FEB18:15:48:30 12FEB18:15:48:17 12FEB18:15:48:30 12.280186 55.626429 547845 2018-02-12 15:46:56.0 12.314328 55.616049 2018-02-12T15:46:57.000+01:00 2018-02-12T16:04:31.000+01:00 2018-02-12T15:30:30.000+01:00 12.320076 55.622383 2018-02-12T15:47:14.000+01:00 12.308370 55.623458 2018-02-12T15:48:17.000+01:00 12.290789 55.628274 2018-02-12T15:48:17.000+01:00 12.278729 55.624733 6 The SAS System 12:54 Friday, September 17, 2021 6326548 13FEB18:08:14:26 13FEB18:08:26:36 13FEB18:08:26:22 13FEB18:08:26:36 12.298998 55.929033 741014 2018-02-13 08:15:51.0 12.321744 55.927888 2018-02-13T08:15:53.000+01:00 2018-02-13T08:27:52.000+01:00 2018-02-12T18:37:57.000+01:00 12.325474 55.925710 2018-02-12T18:37:57.000+01:00 12.326471 55.920177 2018-02-13T08:26:37.000+01:00 12.320957 55.916881 2018-02-13T08:26:38.000+01:00 12.318909 55.919059 2018-02-13T08:27:09.000+01:00 12.312553 55.923355 2018-02-13T08:45:13.000+01:00 12.306773 55.927387 2018-02-13T08:45:13.000+01:00 12.299365 55.928975 ;;;;
I hope it is correct? I have replaced all values with fake values. To ensure, that the calculations is correct, I have replaced all GPS stamps with real GPS stamps on real roads (got the GPS stamps from Google Maps). As mentioned, there can be up to around 63 GPS observations. However, in this fake dataset, I did only make up a maximum of 7 positions for each observation.
I tried the code:
data mydata; set mydata; array pe (*) Pos_easting: ; array pn (*) Pos_northing: ; array d (63); do i= 1 to ( dim(pe) -1); d[i] = geodist(pn[i],pe[i],pn[i+1],pe[i+1]); *distance to event; d2event = geodist(pn[i], pe[i], incident_northing, incident_easting); *exit out of array loop once closes distance to event is found; if d2event < 25 then leave; end; cumd = sum(of d(*)); run;
It runs without any errors. But my output data does not seem correct. I get several new variables (d1-d60) but most with missing values (value = .) except d1. However, d1 is often 0 which cumd also is (because of all the missing or the fact that d1=0, I guess).
I hope that it is more clear with the above information. Thanks once again!
/MM
When I run it, I get a few warnings that may give you a clue to what's wrong:
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 103:22 103:28 103:34 103:42 106:27 106:34 106:41 106:60 NOTE: Missing values were generated as a result of performing an operation on missing values. Each place is given by: (Number of times) at (Line):(Column). 16 at 103:14 16 at 106:19 16 at 110:11
Big hint - you cannot do math on character columns.
EDIT your input code doesn't work at all for some reason. Did you manually modify the generated file?
Northing and easting values provided to GEODIST must be numeric.
To make any use of the "time" values you will want those as actual SAS times, read with the E8601dt. format for those values with T in the middle.
You really need to describe 1) why you think the values calculated are wrong, give the actual easting and northing data pair you think are wrong and what you think would be correct.
For purposes of working with the GEODIST, you could drop all of the non easting and northing variables to simplify things.
First, you need to fix your data and read it properly. For date and time values, your data (the first observation, for example) seems to contain fields that should be read with the following informats :
DATETIME16. for "12FEB18:15:43:22"
YYMMDD10. for "2018-02-12"
TIME10.1 for "15:46:56.0"
E8601DZ29. for "2018-02-12T15:46:56.000+01:00"
Furthermore, all easting and northing values should be read as simple numbers, i.e. without informats
(untested)
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.