Hi, I'm getting an error because my dataset has two observations with the same label. I need to delete the repeated observation when that is the case.
Example:
_LABEL_ X1 X2 X3
Obs1 1 2 3
Obs2 4 5 6
Obs1 7 8 9
Obs3 4 5 6
I need to delete Obs1 because it is repeated.
Any ideas?
data have;
input _LABEL_ $ X1 X2 X3;
datalines;
Obs1 1 2 3
Obs2 4 5 6
Obs1 7 8 9
Obs3 4 5 6
;
run;
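data want;
if _n_=1 then do;
if 0 then set have; /* define the variables of HAVE without reading any rows */
declare hash h(); /* empty hash that tracks the labels seen so far */
h.definekey('_LABEL_');
h.definedone();
end;
set have ;
if h.check()=0 then delete; /* label already seen: drop the repeated row */
else h.add(); /* first occurrence: remember the label and keep the row */
run;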
Like this?
data have;
input _LABEL_ $ X1 X2 X3;
datalines;
Obs1 1 2 3
Obs2 4 5 6
Obs1 7 8 9
Obs3 4 5 6
;
proc sort data = have;
by _LABEL_;
run;
data want;
set have;
by _LABEL_;
if first._LABEL_;
run;
Or do you need to delete both Obs1 observations?
In that case
data want;
set have;
by _LABEL_;
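/* Keep only labels that occur exactly once; "first = last" would also keep the middle rows of a label that occurs three or more times */
if first._LABEL_ and last._LABEL_;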
run;
Since the "Obs1" observations contain different values in the other columns, the question arises: which values should be kept?
Simply removing the duplicates with PROC SORT and NODUPKEY might lose valuable information.
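To see what would be discarded before committing to it, one option is to route the dropped rows to a separate dataset. This is a minimal sketch, assuming the have dataset from the question; the output names want and dups are only illustrative.

```sas
/* Keep one row per _LABEL_ in WANT and send the rows that
   NODUPKEY drops to DUPS so they can be inspected first */
proc sort data=have out=want nodupkey dupout=dups;
by _LABEL_;
run;

/* Review the dropped rows: their X1-X3 values are the ones that would be lost */
proc print data=dups;
run;
```

If dups contains values that matter, the rule for choosing which row to keep has to be decided before de-duplicating.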
Hello,
data have;
input _LABEL_ $ X1 X2 X3;
datalines;
Obs1 1 2 3
Obs2 4 5 6
Obs1 7 8 9
Obs3 4 5 6
;
run;
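data want;
if _n_=1 then
do;
/* load HAVE into a hash that allows several rows per key */
declare hash h(dataset:"have", multidata :'Y');
h.definekey('_LABEL_');
h.definedata(all:'YES');
h.definedone();
end;
do until (last);
set have end=last;
rc=h.find(); /* position on the first row stored for this label */
if h.find_next() ne 0 then output; /* no second row for this label: it is unique, so keep it */
end;
drop rc;
run;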