Dear Forum,
I have yet been able to devise a working piece of code that will create a cumulative total column that resets at each break in another variable. Have over 7k in observations.
For example, given the following sample data set.
1 NY ZIPCODE# long# lat#
2 NY ZIPCODE# long# lat#
3 NY ZIPCODE # long# lat#
4 NY ZIPCODE# long# lat#
5 NY ZIPCODE# long# lat#
6 LA ZIPCODE# long# lat#
7 LA ZIPCODE# long# lat#
8 LA ZIPCODE# long# lat#
9 LA ZIPCODE# long# lat#
10 LA ZIPCODE# long# lat#
I will like to compute the cumulative distance (measured in kilometers) between each of the ZIP codes in each state. With the total cumulative distance being shown in the last observation of the given state.
Thank you for any help!
data want;
set have;
by state;
if first.state then cum_dist=0;
cum_dist+geodist(lat,long,lag(lat),lag(long));
run;
Assumes your data is sorted properly by state.
data want;
set have;
by state;
if first.state then cum_dist=0;
cum_dist+geodist(lat,long,lag(lat),lag(long));
run;
Assumes your data is sorted properly by state.
Shouldn't the two statements:
if first.state then cum_dist=0;
cum_dist+geodist(lat,long,lag(lat),lag(long));
be reversed?
Otherwise I think the 2nd through last states will start out with the distance between their first zipcode and the last zipcode of the prior state, no?
@mkeintz wrote:
Shouldn't the two statements:
if first.state then cum_dist=0; cum_dist+geodist(lat,long,lag(lat),lag(long));
be reversed?
Otherwise I think the 2nd through last states will start out with the distance between their first zipcode and the last zipcode of the prior state, no?
Correct! Good catch!
Or possibly
data want; set have; by state; if first.state then cum_dist=0; else cum_dist+zipcitydistance(zipcode,lag(zipcode)); run;
Depending on where you obtained the lat and long values the ZIPCITYDISTANCE likely varies as it uses the lat, long values from the SASHELP.ZIPCODE data set. However if you have Zipcodes that do not appear in that set you get a missing values which will screw this up significantly.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.