turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- Base SAS Programming
- /
- delete sets of observations

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-24-2009 12:04 PM

I would like to delete certain small subsets of my data.

For example say I have a set like this:

A B C D

1 0 2 3

1 10 2 2

1 4 3 2

2 2 3 3

2 6 7 3

2 3 3 2

2 7 3 2

3 1 0 3

3 5 3 2

3 7 1 3

I would like to delete the sets of observations with matching A values for which the there is an observation with the B value equal to zero. So it would look like this:

A B C D

2 2 3 3

2 6 7 3

2 3 3 2

2 7 3 2

3 1 0 3

3 5 3 2

3 7 1 3

something like

data data_set;

set data_set;

if B='0' then ??????;

run;

Thanks for the help.

For example say I have a set like this:

A B C D

1 0 2 3

1 10 2 2

1 4 3 2

2 2 3 3

2 6 7 3

2 3 3 2

2 7 3 2

3 1 0 3

3 5 3 2

3 7 1 3

I would like to delete the sets of observations with matching A values for which the there is an observation with the B value equal to zero. So it would look like this:

A B C D

2 2 3 3

2 6 7 3

2 3 3 2

2 7 3 2

3 1 0 3

3 5 3 2

3 7 1 3

something like

data data_set;

set data_set;

if B='0' then ??????;

run;

Thanks for the help.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 12:14 PM

Something like:

*flag the observations with a B=0;

data flags;

set input;

if b=0 then flag=1;

else flag=0;

run;

*keep all the observations without a flagged observation in their group;

proc sql;

create table no_zeros as

select *, max(flag) as zeros

from flags

group by A

having calculated zeros=0;

quit;

I think that will do it. Correct me if I'm wrong.

*flag the observations with a B=0;

data flags;

set input;

if b=0 then flag=1;

else flag=0;

run;

*keep all the observations without a flagged observation in their group;

proc sql;

create table no_zeros as

select *, max(flag) as zeros

from flags

group by A

having calculated zeros=0;

quit;

I think that will do it. Correct me if I'm wrong.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 12:23 PM

proc sql;

delete from data_set where a in (select distinct a from data_set where b = 0);

quit;

delete from data_set where a in (select distinct a from data_set where b = 0);

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Flip

03-24-2009 01:01 PM

I tried this and received the error below. I will try the first example now. Thanks.

15 proc sql;

116 delete from aptest where date and racenum in (select distinct date and racenum from aptest where

116! lifestarts = 0);

WARNING: This DELETE/INSERT statement recursively references the target table. A consequence of this

is a possible data integrity problem.

NOTE: 3071 rows were deleted from WORK.APTEST.

but there still remain observations for which lifestarts=0.

15 proc sql;

116 delete from aptest where date and racenum in (select distinct date and racenum from aptest where

116! lifestarts = 0);

WARNING: This DELETE/INSERT statement recursively references the target table. A consequence of this

is a possible data integrity problem.

NOTE: 3071 rows were deleted from WORK.APTEST.

but there still remain observations for which lifestarts=0.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 01:05 PM

also received this error for the first attempt:

140 proc sql;

141 create table no_zeros as

142 select *, max(flag) as zeros

143 from flags

144 group by date and racenum

145 having calculated zeros=0;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.NO_ZEROS created, with 0 rows and 1479 columns.

140 proc sql;

141 create table no_zeros as

142 select *, max(flag) as zeros

143 from flags

144 group by date and racenum

145 having calculated zeros=0;

NOTE: The query requires remerging summary statistics back with the original data.

NOTE: Table WORK.NO_ZEROS created, with 0 rows and 1479 columns.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 01:10 PM

140 proc sql;

141 create table no_zeros as

142 select *, max(flag) as zeros

143 from flags

144 group by date and racenum

145 having calculated zeros=0;

Firstly it's not an ERROR, it's a note. However you need to do

group by date, racenum

not

group by date and racenum

Flip's version is a much better one. You just need to alter it slightly. Instead of deleting records, just keep them.

proc sql;

create table no_zeros as

select * from your_dataset

where a not in (select distinct a from your_dataset where b=0);

quit;

141 create table no_zeros as

142 select *, max(flag) as zeros

143 from flags

144 group by date and racenum

145 having calculated zeros=0;

Firstly it's not an ERROR, it's a note. However you need to do

group by date, racenum

not

group by date and racenum

Flip's version is a much better one. You just need to alter it slightly. Instead of deleting records, just keep them.

proc sql;

create table no_zeros as

select * from your_dataset

where a not in (select distinct a from your_dataset where b=0);

quit;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 01:53 PM

Sorry for the error comment, just ment it wasnt doing what we wanted. Pznew, your code will do the job now, but if the other is more efficient I would like to give it a shot.

proc sql;

create table no_zeros as

select * from aptest

where date and racenum not in (select distinct date and racenum from aptest where lifestarts = 0);

quit;

This still does not seem to work. It just deleted the first racenum of each date. I think it is selecting a distinct date and then racenum=1 is the first racenum it comes to so it deltes it.

I have not used proc sql so I am kind of lost. The date, racenum fix from above doesnt work in this setting.

proc sql;

create table no_zeros as

select * from aptest

where date and racenum not in (select distinct date and racenum from aptest where lifestarts = 0);

quit;

This still does not seem to work. It just deleted the first racenum of each date. I think it is selecting a distinct date and then racenum=1 is the first racenum it comes to so it deltes it.

I have not used proc sql so I am kind of lost. The date, racenum fix from above doesnt work in this setting.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 02:43 PM

Try where date || racenum not in (select distinct date || racenum from aptest where lifestarts = 0);

SQL will often give warnings, but once you understand what is happening, you can usually tell if they are a problem. Of course if you are validating the system you may need to code so that you do not get them. ie. creating a new data set or using a work dataset to do the selection.

SQL will often give warnings, but once you understand what is happening, you can usually tell if they are a problem. Of course if you are validating the system you may need to code so that you do not get them. ie. creating a new data set or using a work dataset to do the selection.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Flip

03-24-2009 02:54 PM

283 proc sql;

284 create table no_zeros as

285 select * from aptest

286 where date || racenum not in (select distinct date || racenum from aptest where lifestarts = 0);

ERROR: Concatenation (||) requires character operands.

ERROR: Concatenation (||) requires character operands.

284 create table no_zeros as

285 select * from aptest

286 where date || racenum not in (select distinct date || racenum from aptest where lifestarts = 0);

ERROR: Concatenation (||) requires character operands.

ERROR: Concatenation (||) requires character operands.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to deleted_user

03-24-2009 03:12 PM

Sorry;

how about:

create table no_zeros as select a.* from aptest as a, (select a, c from aptest where b=0) as b

where not(a.date = b.date and b.racenum = a.racenum);

how about:

create table no_zeros as select a.* from aptest as a, (select a, c from aptest where b=0) as b

where not(a.date = b.date and b.racenum = a.racenum);

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Flip

03-24-2009 03:52 PM

Got it, Thanks!