BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
trphelps15
Fluorite | Level 6

I'm using SAS Enterprise Guide 7.1...I have a dataset that I am working to aggregate variables from using two methods: 1) sum and 2) weighted by observation total. A sample of the sort of data that I'm working with is below. So I have multiple IDs that are associated with a single Code variable. Ideally, I'm trying to combine the data to look like the output dataset below (there are something like 2,000 codes with anywhere from 2 to 15 IDs associated with each). Some of the variables are combined simply by summing each observation (i.e. delay) and some are combined by summing the length % times the variable (i.e. at_rur, at_sub, and postspd).

 

Is there a way to use something like proc means with a by statement to produce the new dataset, but still be able to define how the variables are combined? Thank you in advance for your help!

 

i.e. 

input dataset

IDCodeLengthLength %at_rurat_subpostspddelay
100380107-10114          1.29          0.281055              -  
100494107-10114          0.55          0.121055          0.17
100382107-10114          0.39          0.081055              -  
100341107-10114          0.59          0.131055              -  
100386107-10114          0.56          0.120155              -  
100539107-10114          0.61          0.130155              -  
103878107-10114          0.37          0.080150              -  
100551107-10114          0.18          0.040150              -  
103029107-10114          0.13          0.030150              -  
 Total          4.66                  0.61                 0.39              54.28          0.17

 

output dataset

CodeLengthat_rurat_subpostspddelay
107-10114           4.66          0.61          0.39              54.28                 0.17
1 ACCEPTED SOLUTION

Accepted Solutions
Satish_Parida
Lapis Lazuli | Level 10

 

proc sql;
create table summery_table as 
select Code, 

sum(Length) as Length,

sum(Length_percentage*at_rur) as at_rur, 

sum(Length_percentage*at_sub) as at_sub, 

sum(Length_percentage*postspd) as postspd ,

sum(delay) as delay from input_dataset group by Code; quit;

View solution in original post

3 REPLIES 3
Satish_Parida
Lapis Lazuli | Level 10
It is not clear how you want the output DS to look like, could you please elaborate.
trphelps15
Fluorite | Level 6

I've updated the post to include what the final dataset should look like using the input dataset. I also updated some of the text, so hopefully it is more clear. The input dataset has anywhere from 2 to 15 IDs associated with each Code and there are 1,992 Codes. I'm basically just trying to compile the ID variables into the Codes, but I want to use multiple methods when compiling them. I know I can use proc means with a BY group to output a new dataset that sums all of the variables, but I want some of the variables (i.e. at_rur in my example) to sum by weights. So in my example, at_rur =

 

(0.28*1)+(0.12*1)+(0.08*1)+(0.13*1)+(0.12*0)...(0.03*0) = 0.61

 

Same idea for the at_sub and postspd variables. However, the length and delay variables are just a standard sum.

 

Hopefully this helps! Thank you.

Satish_Parida
Lapis Lazuli | Level 10

 

proc sql;
create table summery_table as 
select Code, 

sum(Length) as Length,

sum(Length_percentage*at_rur) as at_rur, 

sum(Length_percentage*at_sub) as at_sub, 

sum(Length_percentage*postspd) as postspd ,

sum(delay) as delay from input_dataset group by Code; quit;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2614 views
  • 0 likes
  • 2 in conversation