Hi,
I'm trying to remove rows where the same key appears with add_delete values of both "A" and "D".
data have ;
infile datalines delimiter=',';
input key :$50. add_delete :$1. ;
datalines;
1234_1234567654321_P_L720,A
1234_1234567654321_P_L738,A
1234_1234567654321_P_L738,D
1234_1234567654321_P_L821,A
1234_1234567654321_P_L821,D
1234_1234567654321_P_R209,A
1234_1234567654321_P_R209,D
1234_7654321234567_P_L720,A
1234_7654321234567_P_L720,D
1234_7654321234567_P_L738,A
1234_7654321234567_P_L738,D
1234_7654321234567_P_L821,D
1234_7654321234567_P_R209,A
1234_7654321234567_P_R209,D
;
The output of this example should be:
key | add_delete |
1234_1234567654321_P_L720 | A |
1234_7654321234567_P_L821 | D |
It might be as simple as counting the occurrences by key and if it's greater than 1, delete. I'm testing that now. Thanks in advance.
Like this?
data have ;
infile datalines dsd;
length key $50 add_delete $1;
input key add_delete;
datalines;
1234_1234567654321_P_L720,A
1234_1234567654321_P_L738,A
1234_1234567654321_P_L738,D
1234_1234567654321_P_L821,A
1234_1234567654321_P_L821,D
1234_1234567654321_P_R209,A
1234_1234567654321_P_R209,D
1234_7654321234567_P_L720,A
1234_7654321234567_P_L720,D
1234_7654321234567_P_L738,A
1234_7654321234567_P_L738,D
1234_7654321234567_P_L821,D
1234_7654321234567_P_R209,A
1234_7654321234567_P_R209,D
;
proc sql;
select
*
from have as a
where not exists (select * from have as b where a.key=b.key and a.add_delete ne b.add_delete);
quit;
If the data are sorted by KEY, then a DATA step with a BY statement will work, by keeping only those KEYs with a single observation:
data have ;
infile datalines delimiter=',';
input key :$50. add_delete $1. ;
datalines;
1234_1234567654321_P_L720,A
1234_1234567654321_P_L738,A
1234_1234567654321_P_L738,D
1234_1234567654321_P_L821,A
1234_1234567654321_P_L821,D
1234_1234567654321_P_R209,A
1234_1234567654321_P_R209,D
1234_7654321234567_P_L720,A
1234_7654321234567_P_L720,D
1234_7654321234567_P_L738,A
1234_7654321234567_P_L738,D
1234_7654321234567_P_L821,D
1234_7654321234567_P_R209,A
1234_7654321234567_P_R209,D
;
data want;
set have;
by key;
if first.key=1 and last.key=1;
run;
Alternatively, using a condition more analogous to @PGStats's suggestion.
data want;
merge have (where=(add_delete='A') in=ina)
have (where=(add_delete='D') in=ind);
by key;
if ina=0 or ind=0; /* subsetting IF: a WHERE statement cannot reference IN= variables */
run;
This just says to keep those KEYs in which either A never appears or D never appears.
For large datasets, this may be faster than the SQL solution because it only compares contiguous records for matching KEYs. But again, it requires the data to be sorted by KEY.
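If the data are not already in KEY order, a sort step can come first. The following is only a minimal sketch: HAVE_SORTED is a hypothetical dataset name used for illustration, and the subsetting IF assumes, as in the sample data, that a KEY never carries the same add_delete value twice.

```sas
/* Sketch: sort a copy of HAVE so BY-group processing is valid.      */
/* HAVE_SORTED is a hypothetical name, not part of the original post. */
proc sort data=have out=have_sorted;
   by key add_delete;
run;

data want;
   set have_sorted;
   by key;
   /* Keep a row only when it is the sole observation for its KEY */
   if first.key and last.key;
run;
```

Sorting into a separate output dataset leaves HAVE untouched, at the cost of extra disk space for large inputs.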
My first try was to use a similar DATA step, which didn't work:
data want;
set have;
by key;
if first.add_delete='A' and last.add_delete='D' then delete;
run;
Can you help me understand what's wrong with this setup?
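One likely explanation, offered as a sketch rather than a definitive answer: FIRST.variable and LAST.variable exist only for variables named in the BY statement, so with "by key;" there is no FIRST.add_delete at all. They are also numeric 0/1 flags marking the start and end of a BY group, not the first or last value of the variable, so comparing them to 'A' or 'D' cannot succeed. Finally, a DELETE on one row would not remove the other row of the pair. A version closer to the original intent needs to look at the whole KEY group before deciding. The double DO-loop below assumes HAVE is sorted by KEY; has_a and has_d are hypothetical helper variables introduced only for this illustration.

```sas
/* Sketch only: assumes HAVE is sorted by KEY.            */
/* has_a / has_d are hypothetical helper flags.           */
data want;
   /* First pass over one KEY group: record which values occur */
   do until (last.key);
      set have;
      by key;
      if add_delete = 'A' then has_a = 1;
      else if add_delete = 'D' then has_d = 1;
   end;
   /* Second pass over the same group: output unless both occurred */
   do until (last.key);
      set have;
      by key;
      if not (has_a and has_d) then output;
   end;
   drop has_a has_d;
run;
```

Because has_a and has_d are not retained, they reset to missing at the start of each DATA step iteration, which here corresponds to one KEY group.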
data have ;
infile datalines dsd;
length key $50 add_delete $1;
input key add_delete;
datalines;
1234_1234567654321_P_L720,A
1234_1234567654321_P_L738,A
1234_1234567654321_P_L738,D
1234_1234567654321_P_L821,A
1234_1234567654321_P_L821,D
1234_1234567654321_P_R209,A
1234_1234567654321_P_R209,D
1234_7654321234567_P_L720,A
1234_7654321234567_P_L720,D
1234_7654321234567_P_L738,A
1234_7654321234567_P_L738,D
1234_7654321234567_P_L821,D
1234_7654321234567_P_R209,A
1234_7654321234567_P_R209,D
;
proc sql;
select
*
from have as a
group by key
having count(distinct add_delete)=1;
quit;