Hello
I have a data set that consist of a column, 2 consecutive obs can have the same value just the second has a '/'
DATA HAVE;
INPUT COLUMN $200;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
I want to build a program that check if 2 consecutive rows have the same value and delete both.
To have :
DATA WANT;
INPUT COLUMN $200;
DATALINES;
AAAA
CCCC
DDDD
GGGG
;
RUN;
Do you have any idea how to do that ?
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
data want;
merge have have(firstobs=2 rename=COLUMN=C);
COLUMN=compress(COLUMN,'/');
C=compress(c,'/');
if column=c or lag(COLUMN)=COLUMN then delete;
drop c;
run;
I like the BY statement with first-DOT and last-DOT's. This doesn't delete records, but instead just doesn't output them if the values aren't unique. You can see where I remove the "/" character in the second dataset. This also assumes that the original data is sorted.
DATA HAVE;
length column $200;
INPUT COLUMN;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;;;;;
RUN;
data temp;
set have;
if substr(column,1,1)="/" then column=substr(column,2, 200);
run;
data want;
set temp;
by column;
if first.column and last.column;
run;
Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF
View now: on-demand content for SAS users
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
data want;
merge have have(firstobs=2 rename=COLUMN=C);
COLUMN=compress(COLUMN,'/');
C=compress(c,'/');
if column=c or lag(COLUMN)=COLUMN then delete;
drop c;
run;
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
proc sql;
create table want(drop=c) as
select *,compress(COLUMN,'/') as c
from have
group by c
having count(c)=1;
quit;
data want;
merge
have
have (rename=(column=_column) firstobs=2)
;
if substr(column,1,1) ne '/' and column ne substr(_column,2);
drop _column;
run;
proc print data=want noobs;
run;
Result:
column AAAA CCCC DDDD GGGG
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.