BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dapenDaniel
Obsidian | Level 7

Hi,

 

I have a dataset, which has three variables (FirmID, Year and Var1). I would like to delete the duplicate so that the same Var1 is only shown once within 3 years for each firm. What programs do I need to use? Thanks.

 

have

 

FirmIDYearVar1
10011995AA
10011995AB
10011995AC
10011995AD
10011996AA
10011996CS
10011996SW
10011997AC
10011997FD
10011998OL
10011999AD
10011999SC
10011999QW

 

 

want

 

FirmIDYearVar1
10011995AA
10011995AB
10011995AC
10011995AD
10011996CS
10011996SW
10011997FD
10011998OL
10011999AD
10011999SC
10011999QW

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

If you don't have a big table ,try SQL,otherwise try Hash Table.

 

data x;
infile cards expandtabs;
input FirmID	Year	Var1 $;
cards;
1001	1995	AA
1001	1995	AB
1001	1995	AC
1001	1995	AD
1001	1996	AA
1001	1996	CS
1001	1996	SW
1001	1997	AC
1001	1997	FD
1001	1998	OL
1001	1999	AD
1001	1999	SC
1001	1999	QW
;


proc sql;
select *
 from x as a 
  where not exists(
   select * from x 
    where FirmID=a.FirmID and 
     year between a.year-3 and a.year-1 and
      Var1=a.Var1);
quit;

View solution in original post

2 REPLIES 2
Ksharp
Super User

If you don't have a big table ,try SQL,otherwise try Hash Table.

 

data x;
infile cards expandtabs;
input FirmID	Year	Var1 $;
cards;
1001	1995	AA
1001	1995	AB
1001	1995	AC
1001	1995	AD
1001	1996	AA
1001	1996	CS
1001	1996	SW
1001	1997	AC
1001	1997	FD
1001	1998	OL
1001	1999	AD
1001	1999	SC
1001	1999	QW
;


proc sql;
select *
 from x as a 
  where not exists(
   select * from x 
    where FirmID=a.FirmID and 
     year between a.year-3 and a.year-1 and
      Var1=a.Var1);
quit;
ed_sas_member
Meteorite | Level 14

Hi @dapenDaniel 

 

Your question looks very similar to a previous one you posted.

Here is an attempt to do that:

proc sort data=have;
	by FirmID Var1 Year;
run;

data have1;
	set have;
	by FirmID Var1 Year;

	retain Year_ref;
	if first.Var1 then Year_ref = year;

	if first.Var1 or Year >= (Year_ref + 3) then output;
	
	drop Year_ref;
run;

proc sort data=have1 out=want;
	by FirmID Year;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 355 views
  • 1 like
  • 3 in conversation