arunams30
Calcite | Level 5

Hi,

I need to read a table and, if it contains duplicates, write the duplicate records to one dataset and the records without duplicates to another dataset. Please help.

8 REPLIES
pearsoninst
Pyrite | Level 9
This can be done in lots of different ways. Please specify what you are looking for.
arunams30
Calcite | Level 5

Hi,

Thanks for your reply.

The table has duplicates. When I read its records, the records that have duplicates should go to one dataset and the records without duplicates should go to another dataset.

 

Steelers_In_DC
Barite | Level 11

There are a few ways to do this. It's best to give an example of the data you have and what you want to see. Here's one solution:

 

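/* Sample data: the value 1 appears three times, the others once */
data have;
  input variable;
  cards;
1
1
1
2
3
4
5
;

/* HAVE must be sorted by VARIABLE for BY-group processing */
data dup nodup;
  /* Pass 1: count the records in the current BY group */
  do until(last.variable);
    set have;
    by variable;
    count + 1;
    if first.variable then count = 1;
    if count > 1 then dup = 'Y';
  end;
  /* Pass 2: re-read the same group and route every record */
  do until(last.variable);
    set have;
    by variable;
    if dup = 'Y' then output dup;
    else output nodup;
  end;
run;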

RW9
Diamond | Level 26

Or, to save yourself all that typing, you could do:

proc sort data=have out=uniques dupout=dups nodupkey;
  by <id variables>;
run;

 

And then, if you want only the singles, i.e. the records whose key is never duplicated (this wasn't clear from the original post):

proc sql;
  create table WANT as
  select * from UNIQUES
  where <id variables> not in (select <id variables> from DUPS);
quit;
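
As a possible one-step alternative, the split can also be sketched in PROC SQL alone. This is a minimal sketch, assuming the `have` data set with the single key `variable` from the earlier example; the output names `dups_all` and `singles` are made up for illustration. It relies on SAS remerging the group count back onto the detail rows, which the log reports with a note.

```sas
proc sql;
  /* every record whose key value occurs more than once */
  create table dups_all as
    select *
    from have
    group by variable
    having count(*) > 1;

  /* every record whose key value occurs exactly once */
  create table singles as
    select *
    from have
    group by variable
    having count(*) = 1;
quit;
```

Unlike the NODUPKEY approach, this puts all copies of a duplicated key into `dups_all`, including the first one.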

slchen
Lapis Lazuli | Level 10


It is easy to do with PROC SORT.

If you want to keep one record from each set of duplicates, use NODUPKEY: the OUT= data set is Non_dup, and the remaining duplicate records go to dup.
If you want to separate out the records whose key is unique, use NOUNIQUEKEY: those records go to the UNIQUEOUT= data set, and all other records go to OUT=.

 

data have;
input x $ y;
cards;
A 1
A 2
B 1
C 3
C 3
D 3
E 9
;


proc sort data=have out=Non_dup dupout=dup nodupkey ;
by x;
run;


proc sort data=have uniqueout=Unique out=all_dup nouniquekey;
by x;
run;

 

Astounding
PROC Star

Here are the pieces that you didn't tell us:

 

(a) What constitutes a duplicate?  Is just one variable the same, or are all variables the same?

(b) If there are duplicates, should all of them go into the same data set?  Or should the first one go into a separate data set and any additional duplicates go into a different data set?
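
To make the distinction in (a) concrete, here is a minimal sketch, assuming slchen's `have` data set with variables `x` and `y`; the output data set names are made up for illustration. The only thing that changes between the two definitions is the BY statement.

```sas
/* Duplicate = same key value: only x is compared */
proc sort data=have out=key_dups uniqueout=key_singles nouniquekey;
  by x;
run;

/* Duplicate = whole record identical: every variable is compared */
proc sort data=have out=row_dups uniqueout=row_singles nouniquekey;
  by _all_;
run;
```

With that sample data, the first step treats A 1 and A 2 as duplicates of each other, while the second treats only the two C 3 records as duplicates.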

arunams30
Calcite | Level 5
Hi,

Thanks for your reply.

The table has duplicates. When I read its records, the records that have duplicates should go to one dataset and the records without duplicates should go to another dataset.
Steelers_In_DC
Barite | Level 11

In that case I think you want to use the solution I provided earlier.  Run that with a subset and see if you get the desired results.
