Reg :Duplicates

R_Win
Calcite | Level 5

I have a data set:

Data m;
input id name $;
cards;
1 Raju
2 nani
2 nani
2 kool
3 india
4 usa
4 usa
4 usa
5 uk
run;

Now I want the output in another data set, with every observation that is duplicated left out entirely (not even one copy should be kept). The output should look like this:

id name
1 Raju
2 kool
3 india
5 uk

Because the observations 2 nani and 4 usa are repeated, they should not appear in the new data set.
deleted_user
Not applicable

Re: Reg :Duplicates

Hi,

You will get the desired output by using the following code.


Data m;
input id name $;
cards;
1 Raju
2 nani
2 nani
2 kool
3 india
4 usa
4 usa
4 usa
5 uk
run;
proc sort data=m;
by id name;
run;

data m_new;
set m;
by id name;
/* keep a row only if its id/name combination occurs exactly once */
if first.name and last.name;
run;
sbb
Lapis Lazuli | Level 10

Re: Reg :Duplicates

After sorting your data in the desired order with a BY variable list, use a DATA step with the FIRST.variable and LAST.variable flags to identify observations whose BY values occur exactly once. Output those to one data set and all the others to a separate data set.
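
The approach described above could be sketched roughly as follows. This is only an illustration, assuming the input data set is m as in the question; the data set names m_sorted, m_unique and m_dups are made up for the example.

```sas
/* Sort by the full BY variable list so identical rows are adjacent. */
proc sort data=m out=m_sorted;   /* m_sorted is a hypothetical name */
  by id name;
run;

/* m_unique and m_dups are hypothetical output names. */
data m_unique m_dups;
  set m_sorted;
  by id name;
  /* first.name and last.name are both 1 only when this
     id/name combination occurs exactly once */
  if first.name and last.name then output m_unique;
  else output m_dups;
run;
```

With the sample data in the question (reading the 4usa rows as 4 usa), m_unique should hold 1 Raju, 2 kool, 3 india and 5 uk, while the repeated 2 nani and 4 usa rows go to m_dups.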

Have a look at the SAS support website (http://support.sas.com/) and use its search facility to search on the words:

duplicate by variable

and you will find both SAS product documentation and technical reference papers on the topic.


Scott Barry
SBBWorks, Inc.
GertNissen
Barite | Level 11

Re: Reg :Duplicates

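proc summary data=m nway;
/* CLASS does not require sorted input, unlike BY */
class id name;
/* _FREQ_ counts the rows per id/name combination; keep those seen once */
output out=freq(where=(_FREQ_=1));
run;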
ssas
Calcite | Level 5

Re: Reg :Duplicates

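Hi,

You could try this:

proc sort data=m nodupkey out=b;
by id name;
run;

data c;
set b;
by id; /* a BY statement is needed for first.id to be defined */
if first.id then output c;
run;

Or, a simpler way:

proc sql;
/* keeps one copy of each distinct id/name pair */
create table c as select distinct id, name from m;
quit;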
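
Note that SELECT DISTINCT keeps one copy of each repeated row. If repeated rows should be dropped entirely, as the question asks, one possible variant is to group on both columns and keep only the groups with a single row. This is a sketch, assuming the input data set is m; the table name c_unique is made up for the example.

```sas
proc sql;
  /* c_unique is a hypothetical table name */
  create table c_unique as
  select id, name
  from m
  group by id, name
  having count(*) = 1;   /* keep only id/name pairs that occur once */
quit;
```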


Hope this helps.
sams
deleted_user
Not applicable

Re: Reg :Duplicates

As your data appear to be approximately in order, you could try:
[pre]data reduced ;
set m ;
by id NOTSORTED ;
if last.id ;
run ;[/pre]
This returns the last row in the current order, for each ID.
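
Because LAST.ID alone keeps one row for every ID, a fully repeated ID such as 4 would still contribute a row. A variant that still relies on the current order but drops repeated rows entirely might look like the sketch below. It assumes identical rows are always adjacent in m, and the data set name reduced_unique is made up for the example.

```sas
data reduced_unique ;          /* hypothetical data set name */
  set m ;
  by id name NOTSORTED ;       /* groups are runs of adjacent equal values */
  /* keep a row only when it is both the first and the last
     row of its id/name run, i.e. the run has length one */
  if first.name and last.name ;
run ;
```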

PeterC