tejeshwar
Calcite | Level 5
I have a dataset with duplicate entries for a variable, say "abc". Some values of this variable have 3, 4, 6 or 7 duplicate entries.

What I want to do is pick up only the first 2 entries for each value, so that in the new dataset every value of "abc" has only 2 entries.

Can someone help?

Thanks
Olivier
Pyrite | Level 9
Here is what you need to do:
1) Sort your dataset by ABC.
2) Read the sorted data with a DATA step and create a new counter variable.
3) Reset the counter to zero at the first observation of each new value of ABC.
4) Add 1 to the counter on every observation.
5) Keep the observation only if the counter is less than or equal to 2.
[pre]
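/* Sort by abc; whatEver stands for any extra key that decides which rows come first */
PROC SORT DATA = myData OUT = sortedData ;
  BY abc whatEver ;
RUN ;

/* WHERE= on the output dataset keeps only the first two rows of each abc group */
DATA first2dup (WHERE=(countObs LE 2)) ;
  SET sortedData ;
  BY abc ;
  IF FIRST.abc THEN countObs = 0 ;  /* reset the counter at the start of each group */
  countObs + 1 ;                    /* sum statement: countObs is retained across rows */
RUN ;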
[/pre]
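As a quick check of the steps above, here is a minimal sketch on a small made-up input. The rows in myData are an assumption for illustration only (they are not from the original question); the dataset and variable names are the same placeholders used above.
[pre]
/* Hypothetical sample data: value A appears 3 times, value B appears 4 times */
DATA myData ;
  INPUT abc $ whatEver ;
  DATALINES ;
B 4
A 2
B 1
A 3
B 3
A 1
B 2
;
RUN ;

PROC SORT DATA = myData OUT = sortedData ;
  BY abc whatEver ;
RUN ;

DATA first2dup (WHERE=(countObs LE 2)) ;
  SET sortedData ;
  BY abc ;
  IF FIRST.abc THEN countObs = 0 ;
  countObs + 1 ;
RUN ;

/* Display the result to confirm that each abc value keeps two rows */
PROC PRINT DATA = first2dup ;
RUN ;
[/pre]
With this input, first2dup should hold four rows: A 1, A 2, B 1 and B 2, with countObs equal to 1 and 2 within each group. Add DROP=countObs to the output dataset options if you do not want the counter in the result.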
Regards.
Olivier
tejeshwar
Calcite | Level 5
Thanks Olivier, this works great!