Hello
I have a data set that consist of a column, 2 consecutive obs can have the same value just the second has a '/'
DATA HAVE;
INPUT COLUMN $200;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;I want to build a program that check if 2 consecutive rows have the same value and delete both.
To have :
DATA WANT;
INPUT COLUMN $200;
DATALINES;
AAAA
CCCC
DDDD
GGGG
;
RUN;Do you have any idea how to do that ?
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
data want;
merge have have(firstobs=2 rename=COLUMN=C);
COLUMN=compress(COLUMN,'/');
C=compress(c,'/');
if column=c or lag(COLUMN)=COLUMN then delete;
drop c;
run;
I like the BY statement with first-DOT and last-DOT's. This doesn't delete records, but instead just doesn't output them if the values aren't unique. You can see where I remove the "/" character in the second dataset. This also assumes that the original data is sorted.
DATA HAVE;
length column $200;
INPUT COLUMN;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;;;;;
RUN;
data temp;
set have;
if substr(column,1,1)="/" then column=substr(column,2, 200);
run;
data want;
set temp;
by column;
if first.column and last.column;
run;
Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF
View now: on-demand content for SAS users
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
data want;
merge have have(firstobs=2 rename=COLUMN=C);
COLUMN=compress(COLUMN,'/');
C=compress(c,'/');
if column=c or lag(COLUMN)=COLUMN then delete;
drop c;
run;
DATA HAVE;
infile cards truncover;
INPUT COLUMN $200.;
DATALINES;
AAAA
BBBB
/BBBB
CCCC
DDDD
EEEE
/EEEE
FFFF
/FFFF
GGGG
;
RUN;
proc sql;
create table want(drop=c) as
select *,compress(COLUMN,'/') as c
from have
group by c
having count(c)=1;
quit;
data want;
merge
have
have (rename=(column=_column) firstobs=2)
;
if substr(column,1,1) ne '/' and column ne substr(_column,2);
drop _column;
run;
proc print data=want noobs;
run;
Result:
column AAAA CCCC DDDD GGGG
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.