BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
dapenDaniel
Obsidian | Level 7

Hello. I have a data set as following below.

 

ID              Year              Var1

1               1999                aa

2               1998                ab

2               1998                ac   

2               1998                ab

2               1998                ac

3               1998                ad

 

The red lines are the same and the blue line are the same. If all three variables (ID, year and Var1) are the same, then retain the first one and delete the duplicated one.

 

The data set following below is what I expect

 

ID              Year              Var1

1               1999                aa

2               1998                ab

2               1998                ac   

 

3               1998                ad

 

 

What code do I need? Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

data have;
input ID              Year              Var1 $;
cards;
1               1999                aa
2               1998                ab
2               1998                ac   
2               1998                ab
2               1998                ac
3               1998                ad
;

proc sql;
create table want as
select distinct *
from have;
quit;

View solution in original post

6 REPLIES 6
novinosrin
Tourmaline | Level 20

data have;
input ID              Year              Var1 $;
cards;
1               1999                aa
2               1998                ab
2               1998                ac   
2               1998                ab
2               1998                ac
3               1998                ad
;

proc sql;
create table want as
select distinct *
from have;
quit;
dapenDaniel
Obsidian | Level 7

Thank you.It works well!

dapenDaniel
Obsidian | Level 7

If I have 4 variables and want to retain the first one for those that are same for the three variables (ID, Year and Var1), what code do I need? I have not used SQL before and am still learning it. Thanks for your patience.

 

original Data set:

ID              Year              Var1      Var2

1               1999                aa        ba

2               1998                ab        bb

2               1998                ac        bc

2               1998                ab        bd

2               1998                ac        be

3               1998                ad        bf

 

Expected Data set:

ID              Year              Var1      Var2

1               1999                aa        ba

2               1998                ab        bb

2               1998                ac        bc 

3               1998                ad        bf

 

novinosrin
Tourmaline | Level 20
data have;
input ID              Year      var1 $        Var2 $;
cards;
1               1999                aa        ba
2               1998                ab        bb
2               1998                ac        bc
2               1998                ab        bd
2               1998                ac        be
3               1998                ad        bf
;
proc sort data=have out=want nodupkey;
by id year var1;
run;

Just a simple proc sort 

dapenDaniel
Obsidian | Level 7

Thank you so much! Have a great thanksgiving day!

novinosrin
Tourmaline | Level 20

You too. Enjoy black friday shopping

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 889 views
  • 0 likes
  • 2 in conversation