BookmarkSubscribeRSS Feed
BCNAV
Quartz | Level 8

I have some airport departure and destination data, along with charges.  Suppose I have:

 

depart     dest        OCU

cyyz       egll        500

cyvr       cyyz         10

egll       cyyz        500

 

 

 

What I would like to do is to sum OCU by city pair, which is the concatenation of depart and dest, hopefully in the format of depart-dest (dash in between them).

 

The price is that reverse flights should be included as one...so above the final output dataset would be:

 

City_Pair     OCU

cyyz-egll     1000

cyvr-cyyz       10

 

 

thanks for any advice!

 

 

5 REPLIES 5
yabwon
Amethyst | Level 16

Hi,

 

Do you mean something like this:

 

data have;
input depart $ dest $ OCU;
cards;
cyyz       egll        500
cyvr       cyyz         10
egll       cyyz        500
;
run;
proc print;
run;

data have2 / view = have2;
  set have;
  array dd[*] depart dest;
  call sortc(of dd[*]);
  grp = catx("-", of dd[*]);
run;

proc sql;
  select grp, sum(ocu) as sumOcu
  from have2
  group by grp
  ;
quit;

 

All the best

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



PGStats
Opal | Level 21

As a single query:

 

proc sql;
create table want as
select
	catx("-", a.depart, a.dest) as cityPair,
	sum(a.OCU, b.OCU) as OCU
from 
	have as a left join
	have as b on a.depart=b.dest and a.dest=b.depart
where b.depart < a.dest or b.depart is missing;
quit;
PG
hashman
Ammonite | Level 13

@BCNAV:

 

Yet another variation - hash-based:

data have ;                                                                                                                                                                                                                                                     
  input depart $ dest $ ocu ;                                                                                                                                                                                                                                   
  cards ;                                                                                                                                                                                                                                                       
cyyz  egll  500                                                                                                                                                                                                                                                 
cyvr  cyyz   10                                                                                                                                                                                                                                                 
egll  cyyz  500                                                                                                                                                                                                                                                 
;                                                                                                                                                                                                                                                               
data _null_ ;                                                                                                                                                                                                                                                   
  dcl hash h () ;                                                                                                                                                                                                                                               
  h.definekey  ("city_pair") ;                                                                                                                                                                                                                                  
  h.definedata ("city_pair", "ocu") ;                                                                                                                                                                                                                           
  h.definedone () ;                                                                                                                                                                                                                                             
  do until (z) ;                                                                                                                                                                                                                                                
    set have (rename=ocu=_ocu) end = z ;                                                                                                                                                                                                                        
    city_pair = put (catx ("-", depart, dest), $9.) ;                                                                                                                                                                                                           
    if dest < depart then city_pair = catx ("-", dest, depart) ;                                                                                                                                                                                                
    if h.find() ne 0 then ocu = _ocu ;                                                                                                                                                                                                                          
    else                  ocu + _ocu ;                                                                                                                                                                                                                          
    h.replace() ;                                                                                                                                                                                                                                               
  end ;                                                                                                                                                                                                                                                         
  h.output (dataset:"want") ;                                                                                                                                                                                                                                   
run ;            

Kind regards

Paul D.

mkeintz
PROC Star

@BCNAV 

 

You've had three good answer, yet none uses PROC SUMMARY, which I think of as the tool SAS intended to solve this problem. So, for completeness sake:

 

data have ;
  input depart $ dest $ ocu ;
  cards ;
cyyz  egll  500
cyvr  cyyz   10
egll  cyyz  500
;

data vneed/ view=vneed;
  set have;
  call sortc (depart,dest);
  length pair $9;
  pair=catx('-',depart,dest);
run;
proc summary data=vneed nway;
  class pair;
  var ocu;
  output out=want (drop=_type_ _freq_) sum=sumocu;
run;

The nice thing about PROC SUMMARY is that it offers keywords that generate a lot of statistics in addition to SUM (e.g. mean, std, median, other percentiles, …).  And you could simultaneously do it not only for all pairs, but also for all DEPART values, and all DEST values.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
hashman
Ammonite | Level 13

@mkeintz: Mark, a nice complement. Truth be told, though, the crux of this problem isn't which particular tool to use for producing the sums but rather how to knead the input data to enable the  summation. I absolutely agree with what you've said of the broader functionality of SUMMARY, but all by itself it's powerless to face this problem head-on. Which, of course, can be said of any other aggregation method used in this thread.A truck may be able to carry a huge tree by its sheer weight; but it cannot do that before the tree is cut into shapes that can fill its bed properly.

 

Kind regards

Paul D.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1276 views
  • 2 likes
  • 5 in conversation