BookmarkSubscribeRSS Feed
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Hi,

I need help with the following:
In the SAS Help files, I got the code:

data accounts;
input AcctNumber Credit;
datalines;
1001 1500
1002 4900
1003 3000
;

data accounts;
modify accounts;
if AcctNumber=1002 then remove;
run;


NOW I ADDED THE CODE:

data accounts2;
input AcctNumber Credit;
datalines;
1004 3800
1005 9200
;
proc append base = accounts data = accounts2; run;


THE OBSERVATION NUMBERS IN THE TABLE accounts ARE 1,3,4,5. THIS IS NOT WHAT I DESIRE TO HAVE; I WANT TO HAVE: 1,2,3,4.

NOW, IF I USE THE POINT= FUNCTION AS FOLLOWS:

data test;
do n = 1 to 5;
set accounts point = n; output;
end;
stop;
run;

THERE IS STILL AN OBSERVATION 2, BUT ALSO AN OBSERVATION 5.

IS THERE ANY WAY TO CORRECT THE TABLE VECTOR AFTER A REMOVE STATEMENT?
18 REPLIES 18
SPR
Quartz | Level 8 SPR
Quartz | Level 8
Hello PHK,

What do you want to achieve? If you want to simply renumber you accounts after deleting it is very easy to achieve using n+1 statement. Or you need something else?

Sincerely,
SPR
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Hi SPR,

I don't really understand how the n+1 statement could be used to solve the problem, for I don't really want to add a variable n in the dataset. (Or how does one use the n+1 statement without creating a variable named "n"?) My problem is that, whereas the final dataset has only 4 observations, SAS somehow has a reference to 5 observations (as illustrated by the last piece of code).

The example provided is only an illustration of the problem. Actually I am working with a large dataset, from which I remove the last 5% or so of the data. I then add new data using PROC APPEND, because I don't want to rewrite all the data in the original dataset. Afterwards, I am using a custom binary search using the POINT = statement. But this is exactly where the problem comes in that is illustrated by the example.

The problem, as in the example, is that when the second observation is removed from the first dataset, if you open the table in SAS Viewer, the row numbers of the remaining to records are 1 and 3, and not 1 and 2 as one would have expected. And when using the POINT = statement, it corresponds to what you see in the SAS Viewer: there still are row number references to 3 observations.

Kind regards,
PHK
Ksharp
Super User
You delte 2 ,so it will not appear any more.
if AcctNumber=1002 then remove;

So,change

data accounts2;
modify accounts2;
if AcctNumber=1005 then remove;
run;
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Sorry, I don't understand how this will solve the problem: I don't want to remove the observation in which AcctNumber=1005. I need the row number references (or table vector or whatever it is called) of the final accounts-table to be 1, 2, 3, 4 rather than: 1,2,4,5.

Kind regards,
PHK
Ksharp
Super User
Oh.I understand what you mean.It is very interesting which i have never imaged.
Then you need this:
[pre]
data accounts;
input AcctNumber Credit;
datalines;
1001 1500
1002 4900
1003 3000
;

data accounts;
set accounts;
if AcctNumber=1002 then delete;
run;
[/pre]


Ksharp
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Hi Ksharp,

Thank you. But as far as I know, the code that you are suggesting:

data accounts;
set accounts;
if AcctNumber=1002 then delete;
run;

... rewrites the accounts table (due to the SET statement). I know that the code you are suggesting will work, but my problem is that I have a fairly large dataset and I only want to remove the last part (about 5%) of the dataset. If I use the delete statement with the set statement, the other 95% of the data still gets rewritten (is that not so?) - and that is what I want to avoid, but without getting the observation references mixed up.

By the way:
My problem is also illustrated (but not even mentioned or addressed) on:
http://v8doc.sas.com/sashtml/lgref/z0202648.htm
See the output of the PROC PRINT: there is an observation 1 and 3, and not 1 and 2.
PHK
Calcite | Level 5 PHK
Calcite | Level 5
((... I was wondering whether this in any way relates to what is called a "rollback segment" in IT jargon?))
PHK
Calcite | Level 5 PHK
Calcite | Level 5
SQL ALTERNATIVES SEEM TO FAIL AS WELL:

data accounts;
input AcctNumber Credit;
datalines;
1001 1500
1002 4900
1003 3000
;
run;

proc sql;
delete from accounts
where AcctNumber=1002;
quit;

... SAME PROBLEM, AS ALSO INDICATED BY:

data test;
do n = 1 to 3;
set accounts point = n; output;
end;
stop;
run;

----
proc sql;
delete from accounts
where MONOTONIC()=2;
quit;
... ALSO DOES NOT WORK
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Closest I could get to resolving the problem thus far:

There is something like a PURGE statement, which seems to renumber records.
The only problem is: the statement is used in PROC IML
(http://support.sas.com/documentation/cdl/en/imlug/59656/HTML/default/langref_sect207.htm) and we do not have PROC IML.

The function can also be used elsewhere:
(http://support.sas.com/documentation/cdl/en/imlug/59656/HTML/default/worksasdatasets_sect16.htm).

However, I can see no where that it can be used in a datastep without the SET statement or in PROC SQL. Message was edited by: PHK
data_null__
Jade | Level 19
When you MODIFY a data set and REMOVE an obs the record is not physically removed but somehow internally flagged as being deleted.

When reading a SAS data set using direct access with SET and the POINT option when you point to a REMOVED record the SET fails and sets _ERROR_ = 1. The NOBS option gives the “physical” record count which includes any REMOVED records. You can detect the removed records by examination of _ERROR_ and take appropriate action. Note also that when SET fails for a REMOVED record the variables being read by the SET statement will retain the values from the last successful SET.

I don’t know if this will help you or not. If you explain in detail what you need to do with the data with removed records perhaps someone can help you.

[pre]
data accounts;
input AcctNumber Credit @@;
datalines;
1001 1500 1002 4900 1003 3000
;;;;
run;
data accounts;
modify accounts;
if AcctNumber=1002 then remove;
run;
data accounts2;
input AcctNumber Credit @@;
datalines;
1004 3800 1005 9200
;;;;
run;
proc append base = accounts data = accounts2;
run;

data _null_;
do point = 1 to nobs;
_error_ = 0;
set accounts point=point nobs=nobs;
if _error_ eq 1 then do;
put 'NOTE: OBS ' point 'is not Accessible';
call missing(of _all_);
end;
put _all_;
end;
stop;
run;
[/pre]
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Thank you, this is useful information, even though I won't be able to make use of it in what I need to do.
data_null__
Jade | Level 19
You haven't full explained what you "need to do".
Ksharp
Super User
Hey.I found very useful options,SAS is genius.
[pre]
options compress=yes reuse=yes;
data accounts;
input AcctNumber Credit;
datalines;
1001 1500
1002 4900
1003 3000
;

data accounts;
modify accounts;
if AcctNumber=1002 then remove;
run;



data accounts2;
input AcctNumber Credit;
datalines;
1004 3800
1005 9200
;
proc append base = accounts data = accounts2; run;
[/pre]


Or You can code like this:

[pre]

data accounts(compress=yes reuse=yes);
input AcctNumber Credit;
datalines;
1001 1500
1002 4900
1003 3000
;

data accounts;
modify accounts;
if AcctNumber=1002 then remove;
run;



data accounts2;
input AcctNumber Credit;
datalines;
1004 3800
1005 9200
;
proc append base = accounts data = accounts2; run;
[/pre]


Ksharp
PHK
Calcite | Level 5 PHK
Calcite | Level 5
Brilliant! Thanks! This seems to solve the problem in a very simple way!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 1912 views
  • 0 likes
  • 5 in conversation