BookmarkSubscribeRSS Feed
deleted_user
Not applicable
I would like to delete certain small subsets of my data.

For example say I have a set like this:
A B C D
1 0 2 3
1 10 2 2
1 4 3 2
2 2 3 3
2 6 7 3
2 3 3 2
2 7 3 2
3 1 0 3
3 5 3 2
3 7 1 3

I would like to delete the sets of observations with matching A values for which the there is an observation with the B value equal to zero. So it would look like this:

A B C D
2 2 3 3
2 6 7 3
2 3 3 2
2 7 3 2
3 1 0 3
3 5 3 2
3 7 1 3


something like

data data_set;
set data_set;
if B='0' then ??????;
run;

Thanks for the help.
10 REPLIES 10
deleted_user
Not applicable
Something like:

*flag the observations with a B=0;
data flags;
set input;
if b=0 then flag=1;
else flag=0;
run;

*keep all the observations without a flagged observation in their group;
proc sql;
create table no_zeros as
select *, max(flag) as zeros
from flags
group by A
having calculated zeros=0;
quit;

I think that will do it. Correct me if I'm wrong.
Flip
Fluorite | Level 6
proc sql;
delete from data_set where a in (select distinct a from data_set where b = 0);
quit;
deleted_user
Not applicable
I tried this and received the error below. I will try the first example now. Thanks.

15 proc sql;
116 delete from aptest where date and racenum in (select distinct date and racenum from aptest where
116! lifestarts = 0);
WARNING: This DELETE/INSERT statement recursively references the target table. A consequence of this
is a possible data integrity problem.
NOTE: 3071 rows were deleted from WORK.APTEST.


but there still remain observations for which lifestarts=0.
deleted_user
Not applicable
also received this error for the first attempt:

140 proc sql;
141 create table no_zeros as
142 select *, max(flag) as zeros
143 from flags
144 group by date and racenum
145 having calculated zeros=0;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.NO_ZEROS created, with 0 rows and 1479 columns.
deleted_user
Not applicable
140 proc sql;
141 create table no_zeros as
142 select *, max(flag) as zeros
143 from flags
144 group by date and racenum
145 having calculated zeros=0;

Firstly it's not an ERROR, it's a note. However you need to do

group by date, racenum

not

group by date and racenum

Flip's version is a much better one. You just need to alter it slightly. Instead of deleting records, just keep them.

proc sql;
create table no_zeros as
select * from your_dataset
where a not in (select distinct a from your_dataset where b=0);
quit;
deleted_user
Not applicable
Sorry for the error comment, just ment it wasnt doing what we wanted. Pznew, your code will do the job now, but if the other is more efficient I would like to give it a shot.


proc sql;
create table no_zeros as
select * from aptest
where date and racenum not in (select distinct date and racenum from aptest where lifestarts = 0);
quit;

This still does not seem to work. It just deleted the first racenum of each date. I think it is selecting a distinct date and then racenum=1 is the first racenum it comes to so it deltes it.


I have not used proc sql so I am kind of lost. The date, racenum fix from above doesnt work in this setting.
Flip
Fluorite | Level 6
Try where date || racenum not in (select distinct date || racenum from aptest where lifestarts = 0);

SQL will often give warnings, but once you understand what is happening, you can usually tell if they are a problem. Of course if you are validating the system you may need to code so that you do not get them. ie. creating a new data set or using a work dataset to do the selection.
deleted_user
Not applicable
283 proc sql;
284 create table no_zeros as
285 select * from aptest
286 where date || racenum not in (select distinct date || racenum from aptest where lifestarts = 0);
ERROR: Concatenation (||) requires character operands.
ERROR: Concatenation (||) requires character operands.
Flip
Fluorite | Level 6
Sorry;

how about:
create table no_zeros as select a.* from aptest as a, (select a, c from aptest where b=0) as b
where not(a.date = b.date and b.racenum = a.racenum);
deleted_user
Not applicable
Got it, Thanks!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1026 views
  • 0 likes
  • 2 in conversation