Hi everyone
I got 3800 observations on Premier League games for the last 10 seasons. For each season I want to delete the first 60 observations. To elaborate
- for season = 2011 I want to delete the first 60 obs (by Date).
- for season = 2012 I want to delete the first 60 obs (by Date).
- etc.
I've tried the following code - but this only works for the 2011-season... Proberbly because this is the first season and thus the first 60 obs. of dataset contains info on this season..
data merged_file_3200;
set merged_file_3800;
if _n_ le 60 and season=2011 then delete;
run;
My data looks like this (a subset of it):
"HomeTeam","AwayTeam","Season","Date","FTHG","FTAG","FTR","B365H","B365D","B365A","B365H_True","B365D_True","B365A_True","Score","run_score","score_before_game7","HomeTeam_score_before_game7","AwayTeam_score_before_game7"
"Aston_Villa","West_Ham","2011","08/14/2010","3","0","H","2","3.3","4","2.07329843","3.50442478","4.30434783","-0.23232323232323","-0.23232323232323",".","0","0"
"Blackburn","Everton","2011","08/14/2010","1","0","H","2.88","3.25","2.5","3.04027718","3.45557470","2.61989177","-0.38169515669515","-0.38169515669515",".","0","0"
"Bolton","Fulham","2011","08/14/2010","0","0","D","2.2","3.3","3.4","2.28668478","3.49896050","3.61158798","-0.27688651218062","-0.27688651218062",".","0","0"
"Chelsea","West_Brom","2011","08/14/2010","6","0","H","1.17","7","17","1.19630531","8.06040139","24.98145933","-0.04002968708851","-0.04002968708851",".","0","0"
"Sunderland","Birmingham","2011","08/14/2010","2","2","D","2.1","3.3","3.6","2.18726986","3.52074513","3.86431227","-0.45719095719095","-0.45719095719095",".","0","0"
"Tottenham","Man_City","2011","08/14/2010","0","0","D","2.4","3.3","3","2.50632911","3.50442478","3.16800000","-0.39898989898989","-0.39898989898989",".","0","0"
"Wigan","Blackpool","2011","08/14/2010","0","4","A","1.83","3.5","4.5","1.89279264","3.73711507","4.89970251","-0.52831988897562","-0.52831988897562",".","0","0"
"Wolves","Stoke","2011","08/14/2010","2","1","H","2.3","3.25","3.2","2.40120462","3.45581507","3.39933681","0.58354236343366","0.58354236343366",".","0","0"
"Liverpool","Arsenal","2011","08/15/2010","1","1","D","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","-0.38169515669515","-0.38169515669515",".","0","0"
"Man_United","Newcastle","2011","08/16/2010","3","0","H","1.25","5.5","15","1.27577320","6.03658537","19.80000000","-0.05050505050505","-0.05050505050505",".","0","0"
"Arsenal","Blackpool","2011","08/21/2010","6","0","H","1.17","7","19","1.19335873","7.92849864","27.85381862","-0.03590172011224","0.76000425612159","0.79590597623384","-0.32891737891737","0.79590597623384"
"Birmingham","Blackburn","2011","08/21/2010","2","1","H","2.2","3.25","3.4","2.29483952","3.46131941","3.63197211","-0.27533251062662","0.39575011045599","0.67108262108262","-0.25877825877825","0.67108262108262"
"Everton","Wolves","2011","08/21/2010","1","1","D","1.5","4","7","1.54601227","4.34482759","8.12903226","-0.12301587301587","0.46052649041779","0.58354236343366","-0.38169515669515","0.58354236343366"
"Stoke","Tottenham","2011","08/21/2010","1","2","A","3.4","3.3","2.2","3.61158798","3.49896050","2.28668478","0.56268568033273","0.16369578134284","-0.39898989898989","-0.29417502787068","-0.39898989898989"
"West_Brom","Sunderland","2011","08/21/2010","1","0","H","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","0.61830484330484","0.57827515621633","-0.04002968708851","-0.04002968708851","-0.45719095719095"
"West_Ham","Bolton","2011","08/21/2010","1","3","A","2.25","3.3","3.25","2.34711246","3.51319381","3.45658013","-0.42605542605542","-0.65837865837865","-0.23232323232323","-0.23232323232323","-0.43731431966726"
"Wigan","Chelsea","2011","08/21/2010","0","6","A","15","6","1.22","20.40892193","6.71149144","1.24687713","-0.04899817850637","-0.577318067482","-0.52831988897562","-0.52831988897562","0.16409298762239"
"Fulham","Man_United","2011","08/22/2010","2","2","D","6","4","1.57","6.72057075","4.30792683","1.61531866","-0.6190728945506","-0.40291127838898","0.21616161616161","-0.27688651218062","0.21616161616161"
"Newcastle","Aston_Villa","2011","08/22/2010","6","0","H","3.2","3.25","2.3","3.39933681","3.45581507","2.40120462","0.70582497212932","0.65531992162426","-0.05050505050505","-0.05050505050505","0.51767676767676"
"Man_City","Liverpool","2011","08/23/2010","3","0","H","2.38","3.25","3.1","2.47920939","3.43785992","3.27046441","0.59664560613327","0.2809890404767","-0.31565656565656","-0.31565656565656","-0.38169515669515"
"Blackburn","Arsenal","2011","08/28/2010","1","2","A","6","4","1.57","6.72057075","4.30792683","1.61531866","-0.14879688605803","0.24695322439796","0.39575011045599","0.39575011045599","-0.16688837478311"
"Blackpool","Fulham","2011","08/28/2010","2","2","D","3.2","3.25","2.3","3.39933681","3.45581507","2.40120462","-0.41645763656633","-0.84214103480499","-0.42568339823866","0.76000425612159","-0.42568339823866"
"Chelsea","Stoke","2011","08/28/2010","2","0","H","1.11","9","23","1.13326757","10.79746241","40.02961672","-0.02498150324237","-0.59604304329368","-0.57106154005131","0.36208934463515","-0.57106154005131"
"Man_United","West_Ham","2011","08/28/2010","3","0","H","1.14","8","19","1.16425532","9.36986301","29.10638298","-0.03435672514619","-0.69273538352485","-0.65837865837865","-0.40291127838898","-0.65837865837865"
"Tottenham","Wigan","2011","08/28/2010","0","1","A","1.25","5.75","13","1.27705011","6.37073864","16.67286245","0.94002229654403","0.36270422906203","-0.577318067482","0.16369578134284","-0.577318067482"
"Wolves","Newcastle","2011","08/28/2010","1","1","D","2.4","3.2","3.1","2.50364554","3.38694992","3.27512839","-0.39941756272401","0.06110892769378","0.46052649041779","0.46052649041779","0.65531992162426"
"Aston_Villa","Everton","2011","08/29/2010","1","0","H","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","-0.32891737891737","-1.35743793243793","-1.02852055352055","0.10121913111043","-1.02852055352055"
"Bolton","Birmingham","2011","08/29/2010","2","2","D","2.1","3.3","3.6","2.18726986","3.52074513","3.86431227","-0.45719095719095","-0.1838085661615","0.27338239102944","0.27338239102944","0.30546142310848"
"Liverpool","West_Brom","2011","08/29/2010","1","0","H","1.22","6","15","1.24687713","6.71149144","20.40892193","-0.04899817850637","0.52927697770995","0.57827515621633","-0.68746212849628","0.57827515621633"
Any help is much appreciated!
Best, Frederik
Hi @Taasby Assuming your dataset is sorted by season and each season does have more than 60 observations-
data want;
 do _n_=1 by 1 until(last.season);
  set have;
  by season;
  if _n_>60 then output;
 end;
run;Compute a within group sequence number and filter using that.
Example:
* if needed, sort data before further processing;
proc sort data=have; by season date; run;
* The BY statement includes date in order to ensure the data being processed is already sorted correctly;
* An ERROR will occur if the rows within a season are not sorted by date;
data want(drop=counter); have have; by season date; if first.season then counter=1; else counter+1; * within group counter; if counter > 60; * subsetting if; run;
Cheers Richard!
I got the answer from another suggestion but your code worked perfectly fine as well! So thank you very much!
/Frederik
Hi,
Do you need something like the following (untested):
/* sort in required order */
proc sort data = merged_file_3800
          out  = sorted;
   by season date;
run;
/* by each season delete the first 60 obs */
data merged_file_3200(drop = counter);
   counter = 0;
   do until (last.season);
      set sorted;
      by season;
      counter + 1;
      if counter le 60 then
         delete;
      else
         output;
   end;
run;
Kind regards,
Amir.
Cheers Amir!
Doesn't quite do the job unfortunately even though the code makes sense to me.. I get the following message and a dataset of 0 observations..

Best, Frederik
Hi @Taasby Assuming your dataset is sorted by season and each season does have more than 60 observations-
data want;
 do _n_=1 by 1 until(last.season);
  set have;
  by season;
  if _n_>60 then output;
 end;
run;Hi,
The delete statement returns processing back to the start of the data step, so a fix would be to change the if condition:
/* by each season delete the first 60 obs */
data merged_file_3200(drop = counter);
   counter = 0;
   do until (last.season);
      set sorted;
      by season;
      counter + 1;
      if counter gt 60 then
         output;
   end;
run;
Kind regards,
Amir.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
