BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kurt_Bremser
Super User

With some made-up data:

data games;
input team $ score;
datalines;
A 2
A 3
A 0
A 4
A 2
A 1
A 1
A 3
A 5
A 2
A 1
A 3
A 0
A 2
A 2
A 1
A 1
A 3
A 2
A 1
B 2
B 3
B 2
B 1
B 0
B 4
B 2
B 1
B 1
B 1
B 3
B 2
B 3
B 1
B 0
B 2
B 1
B 2
B 2
B 1
;

data avg;
set games;
by team;
array window{0:18} _temporary_;
if first.team
then do;
  do count = 0 to 18;
    window{count} = .;
  end;
  count = 1;
end;
else count + 1;
window{mod(count,19)} = score;
score_avg = sum(of window{*}) / min(count,19);
drop count;
run;

SCORE must be on the incoming dataset; if your source for the average is called differently, change the variable name.

Taasby
Calcite | Level 5

Now I got how it was constructed! That is a very neat way to do it!

 

My code (with my variables):

data avg;
set PL_Poisson_1;
by HomeTeam;
array window{0:18} _temporary_;
if first.team
then do;
  do count = 0 to 18;
    window{count} = .;
  end;
  count = 1;
end;
else count + 1;
window{mod(count,19)} = FTHG;
scr_avg_home = sum(of window{*}) / min(count,19);
drop count;
run;

So I am rolling the window on the "FTHG" (Finish Time Home Goals) variable. So now the "Score_avg" variable contains the average of the Homegoals for the 19 last homegames - that is great!

 

Is it possible within the same bit of code to make a "score_avg" variable in the same manner but for the goals conceded - that would be for the "FTAG" (Finish Time Away Goals) variable?

 

Now my output looks like this - and if possible I would like another column to the right (another variable) on the goals conceded..

image.png

 

Thanks

 

/Frederik

Kurt_Bremser
Super User

You need to setup a second array and use it in the same manner as the first one is used; only use the other variable to set the array values, and, of course, the new variable for receiving the averaged score.

Taasby
Calcite | Level 5

Hi Kurt

 

Got it! THANKS! I am almost there now!

This is my code:

data avg;
set PL_Poisson_1;
by HomeTeam;

array window{0:18} _temporary_;
if first.team
then do;
  do count = 0 to 18;
    window{count} = .;
  end;
  count = 1;
end;
else count + 1;
window{mod(count,19)} = FTHG;
Goals_For = sum(of window{*});
Avg_Goals_For = sum(of window{*}) / min(count,19);
drop count;

array window1{0:18} _temporary_;
if first.team
then do;
  do count = 0 to 18;
    window1{count} = .;
  end;
  count = 1;
end;
else count + 1;
window1{mod(count,19)} = FTAG;
Goals_Against = sum(of window1{*});
Avg_Goals_Against = sum(of window1{*}) / min(count,19);
drop count;

run;

Which gives me output like this:

 

image.png

However when there is no more observations for Arsenal and the HomeTeam variable takes on the next value (this would be Aston Villa) the window (up until observation number 19 for Aston Villa) will include some info on Arsenal. Is there someway to restrict the array to restart when the HomeTeam takes on a new value? 

 

Best,

Frederik

Kurt_Bremser
Super User

Maxim 2: Read the Log. It should give you a clue about your mistake.

Spoiler
You use HomeTeam in your BY statement. You need to use the same variable in the first. references.

You can also streamline that code:

data avg;
set PL_Poisson_1;
by HomeTeam;

array window{0:18} _temporary_;
array window1{0:18} _temporary_;
if first.hometeam
then do;
  do count = 0 to 18;
    window{count} = .;
    window1{count} = .;
  end;
  count = 1;
end;
else count + 1;
window{mod(count,19)} = FTHG;
Goals_For = sum(of window{*});
Avg_Goals_For = sum(of window{*}) / min(count,19);
window1{mod(count,19)} = FTAG;
Goals_Against = sum(of window1{*});
Avg_Goals_Against = sum(of window1{*}) / min(count,19);
drop count;

run;
Taasby
Calcite | Level 5

Cheers Kurt!

 

This bit of code did for me: 

data avg_2;
 do _n_=1 by 1 until(last.hometeam);
  set avg;
  by HomeTeam;
  if _n_ > 18 then output;
 end;
run;

And thanks for the suggestion on streamlining my code!

 

The only thing that is left for me to get into the table is the sum and average across all teams for the past 19 games. So when Arsenal have played 19 games and scored 36 goals how many goals are scored in total for all teams for last 19 games (a total of 380 games) in the same timeperiod? This is tricky as the data is now ordered by HomeTeam and I want to cut it by Date. I aim to get something like the last two rows of this table:

image.png

 

 

 

 

 

 

 

 

 

 

 

I need this info to obtain relative measures of the teams. For instance here Arsenal would have a "home attacking strength" of 1,89/1,57 = 1.20. This will be for a later step.

 

/Frederik

 

 

 

Taasby
Calcite | Level 5

Figured this out my self - running the code before moving any observations:

 

proc sort data=avg;
by date;
run;

data avg_1;
set avg;
by Date;

array window{0:379} _temporary_;
array window1{0:379} _temporary_;
if first.hometeam
then do;
  do count = 0 to 379;
    window{count} = .;
    window1{count} = .;
  end;
  count = 1;
end;
else count + 1;
window{mod(count,380)} = FTHG; Goals_For_Total = sum(of window{*}); Avg_Goals_For_Total = sum(of window{*}) / min(count,380); window1{mod(count,380)} = FTAG; Goals_Against_Total = sum(of window1{*}); Avg_Goals_Against_Total = sum(of window1{*}) / min(count,380); drop count; run;

/Frederik

Taasby
Calcite | Level 5

I think i solved it myself using this code:

 

data avg_2;
 do _n_=1 by 1 until(last.hometeam);
  set avg;
  by HomeTeam;
  if _n_ > 18 then output;
 end;
run;

/Frederik

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 22 replies
  • 1512 views
  • 0 likes
  • 3 in conversation