BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Garzaluis998
Calcite | Level 5

Dear Forum,

 

I have yet been able to devise a working piece of code that will create a cumulative total column that resets at each break in another variable. Have over 7k in observations.

 

For example, given the following sample data set.

 

1   NY   ZIPCODE#      long#   lat#

2   NY   ZIPCODE#      long#   lat#

3   NY    ZIPCODE #    long#  lat#

4   NY    ZIPCODE#     long#  lat#

5   NY   ZIPCODE#      long#   lat#

6   LA    ZIPCODE#      long#   lat#

7   LA    ZIPCODE#      long#   lat#

8   LA    ZIPCODE#      long#   lat#

9   LA    ZIPCODE#      long#   lat#

10  LA   ZIPCODE#      long#   lat#

 

I will like to compute the cumulative distance (measured in kilometers) between each of the ZIP codes in each state. With the total cumulative distance being shown in the last observation of the given state. 

 

Thank you for any help!

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26
data want;
    set have;
    by state;
    if first.state then cum_dist=0;
    cum_dist+geodist(lat,long,lag(lat),lag(long));
run;

Assumes your data is sorted properly by state.

--
Paige Miller

View solution in original post

4 REPLIES 4
PaigeMiller
Diamond | Level 26
data want;
    set have;
    by state;
    if first.state then cum_dist=0;
    cum_dist+geodist(lat,long,lag(lat),lag(long));
run;

Assumes your data is sorted properly by state.

--
Paige Miller
mkeintz
PROC Star

Shouldn't the two statements:

    if first.state then cum_dist=0;
    cum_dist+geodist(lat,long,lag(lat),lag(long));

be reversed?

 

Otherwise I think the 2nd through last states will start out with the distance between their first zipcode and the last zipcode of the prior state, no?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

@mkeintz wrote:

Shouldn't the two statements:

    if first.state then cum_dist=0;
    cum_dist+geodist(lat,long,lag(lat),lag(long));

be reversed?

 

Otherwise I think the 2nd through last states will start out with the distance between their first zipcode and the last zipcode of the prior state, no?


Correct! Good catch!

--
Paige Miller
ballardw
Super User

Or possibly

data want;
    set have;
    by state;
    if first.state then cum_dist=0;
    else cum_dist+zipcitydistance(zipcode,lag(zipcode));
run;

Depending on where you obtained the lat and long values the ZIPCITYDISTANCE likely varies as it uses the lat, long values from the SASHELP.ZIPCODE data set. However if you have Zipcodes that do not appear in that set you get a missing values which will screw this up significantly.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 537 views
  • 1 like
  • 4 in conversation