BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Taasby
Calcite | Level 5

Hi everyone

 

I got 3800 observations on Premier League games for the last 10 seasons. For each season I want to delete the first 60 observations. To elaborate

- for season = 2011 I want to delete the first 60 obs (by Date). 

- for season = 2012 I want to delete the first 60 obs (by Date).

- etc.

 

I've tried the following code - but this only works for the 2011-season... Proberbly because this is the first season and thus the first 60 obs. of dataset contains info on this season..

 

data merged_file_3200;
set merged_file_3800;
if _n_ le 60 and season=2011 then delete;
run;

 

My data looks like this (a subset of it):

 

 

"HomeTeam","AwayTeam","Season","Date","FTHG","FTAG","FTR","B365H","B365D","B365A","B365H_True","B365D_True","B365A_True","Score","run_score","score_before_game7","HomeTeam_score_before_game7","AwayTeam_score_before_game7"
"Aston_Villa","West_Ham","2011","08/14/2010","3","0","H","2","3.3","4","2.07329843","3.50442478","4.30434783","-0.23232323232323","-0.23232323232323",".","0","0"
"Blackburn","Everton","2011","08/14/2010","1","0","H","2.88","3.25","2.5","3.04027718","3.45557470","2.61989177","-0.38169515669515","-0.38169515669515",".","0","0"
"Bolton","Fulham","2011","08/14/2010","0","0","D","2.2","3.3","3.4","2.28668478","3.49896050","3.61158798","-0.27688651218062","-0.27688651218062",".","0","0"
"Chelsea","West_Brom","2011","08/14/2010","6","0","H","1.17","7","17","1.19630531","8.06040139","24.98145933","-0.04002968708851","-0.04002968708851",".","0","0"
"Sunderland","Birmingham","2011","08/14/2010","2","2","D","2.1","3.3","3.6","2.18726986","3.52074513","3.86431227","-0.45719095719095","-0.45719095719095",".","0","0"
"Tottenham","Man_City","2011","08/14/2010","0","0","D","2.4","3.3","3","2.50632911","3.50442478","3.16800000","-0.39898989898989","-0.39898989898989",".","0","0"
"Wigan","Blackpool","2011","08/14/2010","0","4","A","1.83","3.5","4.5","1.89279264","3.73711507","4.89970251","-0.52831988897562","-0.52831988897562",".","0","0"
"Wolves","Stoke","2011","08/14/2010","2","1","H","2.3","3.25","3.2","2.40120462","3.45581507","3.39933681","0.58354236343366","0.58354236343366",".","0","0"
"Liverpool","Arsenal","2011","08/15/2010","1","1","D","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","-0.38169515669515","-0.38169515669515",".","0","0"
"Man_United","Newcastle","2011","08/16/2010","3","0","H","1.25","5.5","15","1.27577320","6.03658537","19.80000000","-0.05050505050505","-0.05050505050505",".","0","0"
"Arsenal","Blackpool","2011","08/21/2010","6","0","H","1.17","7","19","1.19335873","7.92849864","27.85381862","-0.03590172011224","0.76000425612159","0.79590597623384","-0.32891737891737","0.79590597623384"
"Birmingham","Blackburn","2011","08/21/2010","2","1","H","2.2","3.25","3.4","2.29483952","3.46131941","3.63197211","-0.27533251062662","0.39575011045599","0.67108262108262","-0.25877825877825","0.67108262108262"
"Everton","Wolves","2011","08/21/2010","1","1","D","1.5","4","7","1.54601227","4.34482759","8.12903226","-0.12301587301587","0.46052649041779","0.58354236343366","-0.38169515669515","0.58354236343366"
"Stoke","Tottenham","2011","08/21/2010","1","2","A","3.4","3.3","2.2","3.61158798","3.49896050","2.28668478","0.56268568033273","0.16369578134284","-0.39898989898989","-0.29417502787068","-0.39898989898989"
"West_Brom","Sunderland","2011","08/21/2010","1","0","H","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","0.61830484330484","0.57827515621633","-0.04002968708851","-0.04002968708851","-0.45719095719095"
"West_Ham","Bolton","2011","08/21/2010","1","3","A","2.25","3.3","3.25","2.34711246","3.51319381","3.45658013","-0.42605542605542","-0.65837865837865","-0.23232323232323","-0.23232323232323","-0.43731431966726"
"Wigan","Chelsea","2011","08/21/2010","0","6","A","15","6","1.22","20.40892193","6.71149144","1.24687713","-0.04899817850637","-0.577318067482","-0.52831988897562","-0.52831988897562","0.16409298762239"
"Fulham","Man_United","2011","08/22/2010","2","2","D","6","4","1.57","6.72057075","4.30792683","1.61531866","-0.6190728945506","-0.40291127838898","0.21616161616161","-0.27688651218062","0.21616161616161"
"Newcastle","Aston_Villa","2011","08/22/2010","6","0","H","3.2","3.25","2.3","3.39933681","3.45581507","2.40120462","0.70582497212932","0.65531992162426","-0.05050505050505","-0.05050505050505","0.51767676767676"
"Man_City","Liverpool","2011","08/23/2010","3","0","H","2.38","3.25","3.1","2.47920939","3.43785992","3.27046441","0.59664560613327","0.2809890404767","-0.31565656565656","-0.31565656565656","-0.38169515669515"
"Blackburn","Arsenal","2011","08/28/2010","1","2","A","6","4","1.57","6.72057075","4.30792683","1.61531866","-0.14879688605803","0.24695322439796","0.39575011045599","0.39575011045599","-0.16688837478311"
"Blackpool","Fulham","2011","08/28/2010","2","2","D","3.2","3.25","2.3","3.39933681","3.45581507","2.40120462","-0.41645763656633","-0.84214103480499","-0.42568339823866","0.76000425612159","-0.42568339823866"
"Chelsea","Stoke","2011","08/28/2010","2","0","H","1.11","9","23","1.13326757","10.79746241","40.02961672","-0.02498150324237","-0.59604304329368","-0.57106154005131","0.36208934463515","-0.57106154005131"
"Man_United","West_Ham","2011","08/28/2010","3","0","H","1.14","8","19","1.16425532","9.36986301","29.10638298","-0.03435672514619","-0.69273538352485","-0.65837865837865","-0.40291127838898","-0.65837865837865"
"Tottenham","Wigan","2011","08/28/2010","0","1","A","1.25","5.75","13","1.27705011","6.37073864","16.67286245","0.94002229654403","0.36270422906203","-0.577318067482","0.16369578134284","-0.577318067482"
"Wolves","Newcastle","2011","08/28/2010","1","1","D","2.4","3.2","3.1","2.50364554","3.38694992","3.27512839","-0.39941756272401","0.06110892769378","0.46052649041779","0.46052649041779","0.65531992162426"
"Aston_Villa","Everton","2011","08/29/2010","1","0","H","2.5","3.25","2.88","2.61989177","3.45557470","3.04027718","-0.32891737891737","-1.35743793243793","-1.02852055352055","0.10121913111043","-1.02852055352055"
"Bolton","Birmingham","2011","08/29/2010","2","2","D","2.1","3.3","3.6","2.18726986","3.52074513","3.86431227","-0.45719095719095","-0.1838085661615","0.27338239102944","0.27338239102944","0.30546142310848"
"Liverpool","West_Brom","2011","08/29/2010","1","0","H","1.22","6","15","1.24687713","6.71149144","20.40892193","-0.04899817850637","0.52927697770995","0.57827515621633","-0.68746212849628","0.57827515621633"

 

Any help is much appreciated!

 

Best, Frederik

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @Taasby Assuming your dataset is sorted by season and each season does have more than 60 observations-

 

data want;
 do _n_=1 by 1 until(last.season);
  set have;
  by season;
  if _n_>60 then output;
 end;
run;

View solution in original post

7 REPLIES 7
RichardDeVen
Barite | Level 11

Compute a within group sequence number and filter using that.

 

Example:

* if needed, sort data before further processing;
proc sort data=have; by season date; run;
* The BY statement includes date in order to ensure the data being processed is already sorted correctly;
* An ERROR will occur if the rows within a season are not sorted by date;

data want(drop=counter); have have; by season date; if first.season then counter=1; else counter+1; * within group counter; if counter > 60; * subsetting if; run;

 

Taasby
Calcite | Level 5

Cheers Richard! 

 

I got the answer from another suggestion but your code worked perfectly fine as well! So thank you very much!

 

/Frederik

Amir
PROC Star

Hi,

 

Do you need something like the following (untested):

 

/* sort in required order */
proc sort data = merged_file_3800
          out  = sorted;
   by season date;
run;

/* by each season delete the first 60 obs */
data merged_file_3200(drop = counter);
   counter = 0;

   do until (last.season);
      set sorted;
      by season;

      counter + 1;

      if counter le 60 then
         delete;
      else
         output;
   end;
run;

Kind regards,

Amir.

Taasby
Calcite | Level 5

Cheers Amir!

 

Doesn't quite do the job unfortunately even though the code makes sense to me.. I get the following message and a dataset of 0 observations..

 

image.png

 

Best, Frederik

novinosrin
Tourmaline | Level 20

Hi @Taasby Assuming your dataset is sorted by season and each season does have more than 60 observations-

 

data want;
 do _n_=1 by 1 until(last.season);
  set have;
  by season;
  if _n_>60 then output;
 end;
run;
Taasby
Calcite | Level 5
That worked! Thank you so much!

/Frederik
Amir
PROC Star

Hi,

 

The delete statement returns processing back to the start of the data step, so a fix would be to change the if condition:

 

/* by each season delete the first 60 obs */
data merged_file_3200(drop = counter);
   counter = 0;

   do until (last.season);
      set sorted;
      by season;

      counter + 1;

      if counter gt 60 then
         output;
   end;
run;

 

Kind regards,

Amir.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1704 views
  • 1 like
  • 4 in conversation