I have cost information that gives me the cost of transporting between locations, for example A and B, in both directions. For example, the cost of transportation from A to B is $.05 and the cost of transportation from B to A is $0.1. I would like to add these two costs, $0.05 and $0.1 to create round trip cost.
Year Month Company Origin Destination Cost
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
I would like it to be:
Year Month Company Origin Destination Cost
2015 Jan A5 CLE CHI 0.15
2015 Jan A6 COL DCA 0.19
The data lists only routes (one-way) and average costs. I have considered sql, merge, modify. I am not sure which one would give the output I want without mistakes.
Get the round trip costs with:
data oneWayCosts;
input Year Month $ Company $ Origin $ Destination $ Cost;
datalines;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
;
proc sql;
create table roundTripCosts as
select
a.year,
a.month,
a.company,
a.origin,
a.destination,
a.cost + b.cost as roundTripCost
from
oneWayCosts as a inner join
oneWayCosts as b
on a.year=b.year and a.month=b.month and a.company=b.company and
a.origin=b.destination and a.destination=b.origin
where a.origin < a.destination;
quit;
Ok, so what do you need help with?
@larusso522 wrote:
I have cost information that gives me the cost of transporting between locations, for example A and B, in both directions. For example, the cost of transportation from A to B is $.05 and the cost of transportation from B to A is $0.1. I would like to add these two costs, $0.05 and $0.1 to create round trip cost.
How can I do that for 400000 rows of data that is listed by year, month, and company using SAS?
That depends on how your data is arranged. Without knowing that it's impossible to answer your question. It could be a lookup problem - which I suspect it is.
Can you provide some mocked up data that reflects your problem, with the desired results?
If you can provide sample data and anything you've tried so far, you'll get a lot more responses, faster responses and code.
In general, vague question = vague answer.
Is this setup a reasonable facsimile of the problem?
* one way travel costs from point1 to point2;
data travel_cost;
input point1:$1. point2:$1. cost;
datalines;
a b .05
a c .75
a d .3
a e 5
b a 1
b d 2
b e 4
c a .56
c b 45
c d 1
c e 2
e a 1
e b .24
e c .01
e d 10
;
run;
* given an starting point and a destination point, what is round trip cost?;
* assume there can be multiple destinations;
* assume trip always ends with a return to start;
* assume trips for costing are presented in a string format;
* a trip from A to B will be presented as AB;
* a trip from A to B to E will be presented as ABE;
data trips_to_calculate;
input trips $.;
datalines;
AB
ABE
EBC
BC
;
run;
??
data have;
input Year (Month Company Origin Destination) ($) Cost;
cards;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
;
data want;
if 0 then set have;
_cost=0;
do until(last.company);
set have;
by year month company;
_cost+cost;
end;
do _n_=1 by 1 until(last.company);
set have;
by year month company;
if _n_=1 then output;
end;
drop cost;
rename _cost=cost;
run;
@larusso522 if every group are sets of two, consider merge have have(firstobs=2) lookahead..There are tons of examples online and here. I am too lazy
data have;
input Year (Month Company Origin Destination) ($) Cost;
cards;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
;
data want;
set have have(in=b);
by year month company;
if first.company then do; _cost=0;_flag=0;end;
if not b then _cost+cost;
else if b then _flag+1;
if _flag=1;
drop cost _flag;
rename _cost=cost;
run;
Get the round trip costs with:
data oneWayCosts;
input Year Month $ Company $ Origin $ Destination $ Cost;
datalines;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
;
proc sql;
create table roundTripCosts as
select
a.year,
a.month,
a.company,
a.origin,
a.destination,
a.cost + b.cost as roundTripCost
from
oneWayCosts as a inner join
oneWayCosts as b
on a.year=b.year and a.month=b.month and a.company=b.company and
a.origin=b.destination and a.destination=b.origin
where a.origin < a.destination;
quit;
HI can you make me understand this please.
a.origin=b.destination and a.destination=b.origin
where a.origin < a.destination;
The query joins two copies of the cost table. The records that are joined have the origin from the first copy equal to the destination of the second, and vice-versa. The a.origin < a.destination condition is to prevent two instances of the same round trip from appearing: O -> D -> O and D -> O -> D.
Do you know what happens if there are no matches? Does this program simply delete those?
@larusso522 wrote:
Do you know what happens if there are no matches? Does this program simply delete those?
An inner join would delete any non matches. If you want everything from one table you can use a LEFT or RIGHT join.
Yes one-ways are omitted from the results. If you wanted to signal their absence, you could use a full join:
data oneWayCosts;
input Year Month $ Company $ Origin $ Destination $ Cost;
datalines;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
2016 Jan A6 DCA COL 0.11
;
proc sql;
create table roundTripCosts as
select
coalesce(a.year, b.year) as year,
coalesce(a.month, b.month) as month,
coalesce(a.company, b.company) as company,
coalesce(a.origin, b.destination) as origin,
coalesce(a.destination, b.origin) as destination,
a.cost + b.cost as roundTripCost
from
oneWayCosts as a full join
oneWayCosts as b
on a.year=b.year and a.month=b.month and a.company=b.company and
a.origin=b.destination and a.destination=b.origin
where calculated origin < calculated destination;
select * from roundTripCosts;
quit;
data oneWayCosts;
input Year Month $ Company $ Origin $ Destination $ Cost;
datalines;
2015 Jan A5 CLE CHI 0.05
2015 Jan A5 CHI CLE 0.1
2015 Jan A6 COL DCA 0.08
2015 Jan A6 DCA COL 0.11
;
data temp;
set oneWayCosts;
call sortc(Origin , Destination);
run;
proc sql;
select Year, Month, Company,Origin , Destination,sum(Cost) as Cost
from temp
group by Year, Month, Company,Origin , Destination;
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.