Deleting observations with the same date: NBA data cleaning

Conleyw18 · Posted 03-08-2023 03:15 PM

I have game stats from the 2018-2019 NBA season, but some games are double counted, with the team names swapped between the two team variables. This is what the data looks like. Not all observations have this issue; this is just what it looks like at certain points in the data.

22	UTA	10/22/2018	MEM
23	OKC	10/22/2018	GSW
24	MEM	10/22/2018	UTA
25	GSW	10/22/2018	UTA

ballardw · Posted 03-08-2023 04:50 PM

If this were my data, I would be tempted to put the two team names in alphabetical order within each row and then sort the data, removing duplicates.

Something like:

data have;
   input num teama $ date :mmddyy10. teamb $;
   format date mmddyy10.;
datalines;
22 UTA 10/22/2018 MEM
23 OKC 10/22/2018 GSW
24 MEM 10/22/2018 UTA
25 GSW 10/22/2018 UTA
;

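data need;
   set have;
   /* put the two team names in alphabetical order within each row */
   array t(*) teama teamb;
   call sortc(of t(*));
run;

/* both copies of a game now share the same key, so NODUPKEY keeps only one */
proc sort data=need out=want nodupkey;
   by date teama teamb;
run;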

You should provide example data in the form of a data step, as above. That way we do not have to guess variable names or properties.

If the ORDER of the team names is important, such as winner/loser, then you would need to add a step to the NEED data step that captures that information in a new variable before the names are reordered.
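A minimal sketch of that idea, assuming the HAVE data set and variable names from the example above. The variable first_listed is a hypothetical name introduced here for illustration, and its length of 8 is an assumption that matches the default length teama gets from list input:

```sas
/* Sketch only: first_listed is a hypothetical variable, not part of the original data */
data need;
   set have;
   length first_listed $ 8;     /* assumed to match the length of teama */
   first_listed = teama;        /* remember which team was listed first */
   array t(*) teama teamb;
   call sortc(of t(*));         /* then put the two names in alphabetical order */
run;

proc sort data=need out=want nodupkey;
   by date teama teamb;
run;
```

With default PROC SORT behavior, the row kept for each game should be the one that appears first in the input, so first_listed reflects that row's original order. Whether that is the order you actually want (winner, home team, and so on) depends on how the source data was built.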

vijaypratap0195 · Posted 04-21-2023 01:42 PM

Could you please provide sample data and an example that shows where the values are doubled up?
