BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Junyong
Pyrite | Level 9

The following have has 50000 observations. The data step deletes the second half of have, so the last sql returns 25000.

data have;
do i=1 to 50000;
output;
end;
run;

data have;
set have;
if i>25000 then delete;
run;

proc sql;
select nobs from dictionary.tables where libname="WORK" and memname="HAVE";
quit;

The following have identically has 50000 observations. The second sql identically deletes the second half of have, but the third sql still returns 50000.

data have;
do i=1 to 50000;
output;
end;
run;

proc sql;
delete from have where i>25000;
quit;

proc sql;
select nobs from dictionary.tables where libname="WORK" and memname="HAVE";
quit;

I have never thought about the second possibility because I don't delete observations using sql. I found this issue from the advanced programming sample questions—the nobs and nlobs of dictionary.columns are different. How are the data delete and sql delete different? Why does SAS distinguish these two deletes?

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

If you delete one row in a 10-billion-row table, you might want to avoid rewriting the table.

It's a matter of balance: waste due to rewriting vs waste due to unused space.

 

I don't think the rows can be un-deleted.

 

View solution in original post

4 REPLIES 4
ChrisNZ
Tourmaline | Level 20

In the first case you are overwriting the data set.

In the last two cases you keep the data set and you flag some observations as deleted, but the physical number of observations is unchanged.

Look at the log messages to see that.

data T1 T2 T3;
  do I=1 to 50000;
    output;
  end;
run;
     
data T1;
  set T1;
  if I > 25000 then delete;
run;

data T2;
  modify T2;
  if I > 25000 then delete;
run;

proc sql;
  delete from T3 where I > 25000;
quit;

data _null_;
  set T1 nobs=N1;
  set T2 nobs=N2;
  set T3 nobs=N3;
  putlog N1= N2= N3=;
  stop;
run;

NOTE: The data set WORK.T1 has 25000 observations and 1 variables.


NOTE: The data set WORK.T2 has been updated. There were 25000 observations rewritten, 0 observations added and 0 observations deleted.


NOTE: 25000 rows were deleted from WORK.T3.


N1=25000 N2=50000 N3=50000

 

Junyong
Pyrite | Level 9

Thanks for these details. Then, (1) what is the advantage of the modify delete and the sql delete over the set delete? It seems they are still occupying the space so there is no storage benefit. Why do we need these? (2) Can a user countermand the modify delete or the sql delete to rescue the observations seemingly deleted?

ChrisNZ
Tourmaline | Level 20

If you delete one row in a 10-billion-row table, you might want to avoid rewriting the table.

It's a matter of balance: waste due to rewriting vs waste due to unused space.

 

I don't think the rows can be un-deleted.

 

ChrisNZ
Tourmaline | Level 20

Actually my example is misleading. Apologies about this.

data T2;
  modify T2;
  if I > 20000 then delete;
run;

data T2;
  modify T2;
  if I > 20000 then remove;
run;

The first data step does nothing. The delete statement just stops the processing of the step for the flagged observations.

The second data step flags the rows as deleted.

 

NOTE: The data set WORK.T2 has been updated. There were 20000 observations rewritten, 0 observations added and 0 observations deleted.

NOTE: The data set WORK.T2 has been updated. There were 0 observations rewritten, 0 observations added and 30000 observations deleted.

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 777 views
  • 2 likes
  • 2 in conversation