BookmarkSubscribeRSS Feed
dwsmith
Obsidian | Level 7

I want to find avg latitude and longitude of as set of coordinates based on the distance they are from each other. Consider the following example set:

 

data test;
     input lat long lat1 long1 date dt hdist;
     format date dt date9.;
     datalines;
     55.173874 -118.86073 55.173763 -118.85976 20222 20230 0.062882
     55.173786 -118.85968 55.173763 -118.85979 20223 20234 0.007439
     55.173702 -118.85984 55.173775 -118.85979 20225 20235 0.008716
     55.173775 -118.85976 54.685486 -118.77554 20300 20314 54.561229
     55.173763 -118.85976 54.685551 -118.77554 20301 20315 54.552708
     ;
run;

I haven't figured out much SAS cdoe to achieve the results. Here is some psuedo code I have thought up.

 

 

if abs(hdist_i - hdist_{i+1}) < 5 then sum(lat) and sum(lat1) and sum(long) and sum(long1) and cnt + 1;

 

Therefore, I would sum lat, long, lat1, long1 for lines 1-3 and have a cnt of 3 and sum lat, long, lat1, long1 for lines 4-5 and have a cnt of 2 so I can take the sums and divide by their respective counts to obtain the average.

 

From that data, I want to return:

 

avg(lat) avg(long) avg(lat1) avg(long1) 20222 20235 for rows 1-3
avg(lat) avg(long) avg(lat1) avg(long1) 20300 20315 for rows 4-5

 

6 REPLIES 6
ballardw
Super User

It sounds like you want to have a classification variable (distance) and then use Proc means or proc summary on the remaining variables to get a mean.

Easiest might be to create a custom format to group your distance variable into your desired groups. You don't provide a lot of example but something like:

 

proc format;

value DistGroup

0.005 - 0.07 = 'Group 1'

54.5 -54.6   = 'Group 2'

;

 

proc summary data=have;

   class hdist;

   format hdist DistGroup.;

   var lat long lat1 long1 dt date;

   output out= want mean( lat long lat1 long1 )=   min(date)=   max(dt)= ;

run;

 

guessing on the min and max for the date variables.

 

dwsmith
Obsidian | Level 7

Setting groups would be impossible for this set of data. I just pulled out a quick example but we are talking about 100million rows. Also, I would rather have this in a table then a proc means output.

ballardw
Super User

So, what are the actual rules for which records are used to calculate a mean? You did not provide any rules just an abitrary first 3 rows and second 2 rows.

dwsmith
Obsidian | Level 7
I did. I said if the absolute value of ith hdist - ith +1 hdist is less than 5
Reeza
Super User

@dwsmith Please recall that you're asking from free help for volunteers. If you review the code suggested, you'll notice that it also produces a dataset called, ironically, WANT. 

 

Change the proc format to categorize the distance into less than 5 or higher than 5. 

 

 

proc format;
value DistGroup
0 - 5 = 'Group 1'
5 - high  = 'Group 2'
other = 'CHECKME'
;
 
proc means data=test nway noprint;
   class hdist;
   format hdist DistGroup.;
   var lat long lat1 long1 dt date;
   output out= want mean( lat long lat1 long1 )=   min(date)=   max(dt)= ;
run;

If this doesn't work please post your code, log and explain what is not working. 

 

ballardw
Super User
Unfortunately I am using Internet Explorer and code posted with braces as the start { does not render properly from this forum. And since your example did not have a variable hdist_i I couldn't tell what impact that had on the problem.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1798 views
  • 0 likes
  • 3 in conversation