Dataset not saving deleted observations

Occasional Contributor
Posts: 12

Dataset not saving deleted observations

I have deleted a couple of observations from my dataset. When I run the code, it works fine and deletes the cases. However, when I close SAS and open it the next day, I rerun my LIBNAME statement and go to where I left off in my code. When I run new code, the deleted cases come back; the deletions I made early in my code were not saved. Do I have to rerun all my code every time I log in? Shouldn't I be able to rerun the LIBNAME statement and pick up where I left off? Here is the code I used to delete the cases:

DATA hemo.PLAY;

        SET hemo.hemo_mrg2;

        IF _CaseNumber="10.02849" THEN DO;

                DOB='29SEP2009'd;

                _Age = DateofDeath - DOB;

        END;

        IF _CaseNumber="10.03520" THEN DO;

                DOB='02AUG2010'd;

                _Age = DateofDeath - DOB;

        END;

        IF UniqueKey in (23,24) THEN delete;

        IF _CaseNumber=11.04184 THEN delete;

        IF _CaseNumber=11.02221 THEN delete;

        IF _CaseNumber=11.04744 THEN delete;

        IF _CaseNumber=10.04942 THEN delete;

RUN;

Super User
Posts: 11,343

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

The cases are deleted from (or, better phrased, never written to) the OUTPUT data set hemo.play. Since you have not deleted them from hemo.hemo_mrg2, they are still there the next time the code runs.
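As a rough illustration of that point, here is a minimal sketch using made-up rows in the WORK library. The names work.source and work.play and the four data lines are assumptions for the example, not the real hemo data. DELETE only keeps a row from reaching the data set named on the DATA statement; the data set named on SET is never changed.

```sas
/* Hypothetical toy data standing in for hemo.hemo_mrg2 */
data work.source;
    input UniqueKey CaseNumber :$8.;
    datalines;
1 10.02849
23 11.04184
24 11.02221
2 10.03520
;
run;

/* DELETE stops a row from being written to the OUTPUT data set only */
data work.play;
    set work.source;
    if UniqueKey in (23, 24) then delete;
run;

/* Compare row counts: work.source keeps every row, work.play has fewer */
proc sql;
    select count(*) as n_source from work.source;
    select count(*) as n_play   from work.play;
quit;
```

Any later step that does SET work.source starts again from the full set of rows.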

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

What would you suggest? Would I need to do this?

DATA hemo.hemo_mrg2;

        IF _CaseNumber="10.02849" THEN DO;

                DOB='29SEP2009'd;

                _Age = DateofDeath - DOB;

        END;

        IF _CaseNumber="10.03520" THEN DO;

                DOB='02AUG2010'd;

                _Age = DateofDeath - DOB;

        END;

        IF UniqueKey in (23,24) THEN delete;

        IF _CaseNumber=11.04184 THEN delete;

        IF _CaseNumber=11.02221 THEN delete;

        IF _CaseNumber=11.04744 THEN delete;

        IF _CaseNumber=10.04942 THEN delete;

RUN;

Data hemo.play;

set hemo.hemo_mrg2;

run;

Super User
Posts: 19,770

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

Your original code appears correct, assuming it ran correctly.

Your hemo.play data set will exist with the deleted records removed. Your hemo.hemo_mrg2 data set will still have the records, as it was not modified.

Can you explain more clearly what you expect to happen that isn't happening?

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

So my original dataset has 78 observations in it. After running my code above and deleting 4 cases, my dataset now has 74 cases in it. However if I run:

proc print data=hemo.play;

var variablename;

run;

Then my data goes back to having 78 observations, so somehow the 4 observations that I deleted above are not being saved in hemo.play.

Super User
Posts: 19,770

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

Can you post the full log, showing those results?

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

LIBNAME hemo "C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012-Hemosiderin\Hemo SAS Data";

NOTE: Libref HEMO was successfully assigned as follows:

      Engine:        V9

      Physical Name: C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012-Hemosiderin\Hemo SAS Data

2    DATA hemo.PLAY;

3        SET hemo.hemo_mrg2;

4        IF _CaseNumber="10.02849" THEN DO;

5            DOB='29SEP2009'd;

6            _Age = DateofDeath - DOB;

7        END;

8        IF _CaseNumber="10.03520" THEN DO;

9            DOB='02AUG2010'd;

10           _Age = DateofDeath - DOB;

11       END;

12       IF UniqueKey in (23,24) THEN delete;

13       IF _CaseNumber=11.04184 THEN delete;

14       IF _CaseNumber=11.02221 THEN delete;

15       IF _CaseNumber=11.04744 THEN delete;

16       IF _CaseNumber=10.04942 THEN delete;

17   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      13:8   14:8   15:8   16:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.04 seconds

      cpu time            0.04 seconds

18   proc print data=hemo.play;

NOTE: Writing HTML Body file: sashtml.htm

19   format LastPlacedTimeTest time.;

WARNING: Variable LASTPLACEDTIMETEST not found in data set HEMO.PLAY.

20   run;

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: PROCEDURE PRINT used (Total process time):

      real time           2.36 seconds

      cpu time            2.27 seconds

21   DATA hemo.play;

22       SET hemo.hemo_mrg2;

23       IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

24   Run;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      23:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 78 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

Super User
Posts: 19,770

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

The PROC PRINT shows that you have 74 observations in hemo.play. You then replace hemo.play with a fresh copy of the original data set (hemo_mrg2), which still has 78 observations, so your next version of hemo.play has 78 observations.

The output is correct; the flaw is in your expectations. If you want to modify hemo_mrg2 permanently, overwrite it by using it as the output data set as well, but then you lose the original table, which may or may not be okay.

Otherwise, the last step should read from hemo.play and write to a new data set, i.e.:

data hemo.play2;

set hemo.play;

/* your other statements go here */

run;
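The two choices can be sketched side by side. This is only an illustration on made-up rows in the WORK library; work.mrg2, work.clean, work.clean2, and the flag variable are hypothetical names, not part of the original project.

```sas
/* Toy stand-in for hemo.hemo_mrg2 (made-up rows) */
data work.mrg2;
    input UniqueKey CaseNumber :$8.;
    datalines;
1 10.02849
23 11.04184
24 11.02221
;
run;

/* Option A: keep the original and write the cleaned rows elsewhere */
data work.clean;
    set work.mrg2;
    if UniqueKey in (23, 24) then delete;
run;

/* Later steps build on work.clean, not on work.mrg2 */
data work.clean2;
    set work.clean;
    flag = 1;   /* placeholder for further edits */
run;

/* Option B: overwrite in place; the deleted rows are gone from work.mrg2 */
data work.mrg2;
    set work.mrg2;
    if UniqueKey in (23, 24) then delete;
run;
```

Option A keeps an untouched copy to fall back on; Option B is simpler but cannot be undone without the original file.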

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

I made the changes as you suggested, but I still seem to be having trouble. I have included my log. It seems to run fine initially, but when I run a second IF-THEN step, it overrides the previous one. So while my number of observations is 74, as I want, the new variable I created is no longer recognized.

    LIBNAME hemo "C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012

  ! -Hemosiderin\Hemo SAS Data";

NOTE: Libref HEMO was successfully assigned as follows:

      Engine:        V9

      Physical Name: C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012

      -Hemosiderin\Hemo SAS Data

    DATA hemo.PLAY;

        SET hemo.hemo_mrg2;

        IF _CaseNumber="10.02849" THEN DO;

            DOB='29SEP2009'd;

            _Age = DateofDeath - DOB;

        END;

        IF _CaseNumber="10.03520" THEN DO;

            DOB='02AUG2010'd;

           _Age = DateofDeath - DOB;

       END;

       IF UniqueKey in (23,24) THEN delete;

       IF _CaseNumber=11.04184 THEN delete;

       IF _CaseNumber=11.02221 THEN delete;

       IF _CaseNumber=11.04744 THEN delete;

       IF _CaseNumber=10.04942 THEN delete;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      13:8   14:8   15:8   16:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.03 seconds

      cpu time            0.03 seconds

   Data hemo.play2;

   Set hemo.play;

   hemoscore=.;

   IF (_CaseNumber=11.02088) OR (_CaseNumber=11.04817) OR (_CaseNumber=11.03348) OR

! (_CaseNumber=12.04653) Then hemoscore=3;

   Else IF (_CaseNumber=10.01867) OR (_CaseNumber=10.02274) OR (_CaseNumber=11.01668) OR

! (_CaseNumber=11.01889) OR (_CaseNumber=11.03437) OR (_CaseNumber=12.00581) THEN hemoscore=2;

   Else hemoscore=1;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      21:5     21:31    21:57    21:83    22:10    22:36    22:62    22:88    22:114   22:140

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 478 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

  Data hemo.play2;

  set hemo.play;

  IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

  RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      54:4

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.03 seconds

   PROC FREQ data=hemo.play2;

   tables hemoscore * Presenceofpetechiae;

ERROR: Variable HEMOSCORE not found.

  RUN;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE FREQ used (Total process time):

      real time           0.00 seconds

      cpu time            0.00 seconds

Super User
Posts: 7,039

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

Why do you keep overwriting the same datasets?

1) DATA hemo.PLAY;

2) Data hemo.play2;

3) Data hemo.play2;

The computer is just doing what you tell it to do.

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

So I should not be using the set statement each time?

Super User
Posts: 19,770

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

It does recognize your new variable; what it doesn't find is a different variable, hemoscore.

Yes, you need to use a set statement each time.

In a data step:

A SET statement points to the input data set; the DATA statement points to the output data set. They do need to line up, though: each step's SET should name the data set that the previous step wrote.
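To make the "line up" point concrete, here is a minimal sketch on made-up data in the WORK library. The data set names work.play, work.play2, work.play3 and the three rows are assumptions for illustration only. Each SET names the output of the step before it, so hemoscore created in the first step is still there in the last one.

```sas
/* Toy input; hypothetical stand-in for hemo.play after the deletes */
data work.play;
    input CaseNumber :$8. Presenceofpetechiae :$12.;
    datalines;
11.02088 Yes
10.01867 No
12.00581 No
;
run;

/* Step 1: read work.play, write work.play2 with the new variable */
data work.play2;
    set work.play;
    if CaseNumber = "11.02088" then hemoscore = 3;
    else if CaseNumber in ("10.01867", "12.00581") then hemoscore = 2;
    else hemoscore = 1;
run;

/* Step 2: read the OUTPUT of step 1 (work.play2), not work.play */
data work.play3;
    set work.play2;
    if CaseNumber = "11.02088" then Presenceofpetechiae = "Not Reported";
run;

/* hemoscore is available because step 2 read work.play2 */
proc freq data=work.play3;
    tables hemoscore * Presenceofpetechiae;
run;
```

If step 2 instead used SET work.play and wrote to work.play2, it would replace the step 1 output with a data set that has no hemoscore.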

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

But I made hemoscore in the step above and it was recognized (see here)

  Data hemo.play2;

   Set hemo.play;

  hemoscore=.;

   IF (_CaseNumber=11.02088) OR (_CaseNumber=11.04817) OR (_CaseNumber=11.03348) OR

! (_CaseNumber=12.04653) Then hemoscore=3;

   Else IF (_CaseNumber=10.01867) OR (_CaseNumber=10.02274) OR (_CaseNumber=11.01668) OR

! (_CaseNumber=11.01889) OR (_CaseNumber=11.03437) OR (_CaseNumber=12.00581) THEN hemoscore=2;

   Else hemoscore=1;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      21:5     21:31    21:57    21:83    22:10    22:36    22:62    22:88    22:114   22:140

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 478 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

It's when I run this next step that it no longer recognizes that I made hemoscore:

Data hemo.play2;

  set hemo.play;

  IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

  RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      54:4

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.03 seconds

   PROC FREQ data=hemo.play2;

   tables hemoscore * Presenceofpetechiae;

ERROR: Variable HEMOSCORE not found.

  RUN;

Super User
Posts: 19,770

Re: Dataset not saving deleted observations

Posted in reply to uwmsasuser

Draw some diagrams of your input/output data sets and I think you'll begin to see the issues.

Occasional Contributor
Posts: 12

Re: Dataset not saving deleted observations

OK, hopefully I now understand what I need to do and am not overwriting any more datasets. I ran the code below and got the results I was looking for with no errors. Thank you all for your help!

Data temp;

Set hemo.hemo_mrg2;

hemoscore=1; 

if _CaseNumber=10.01867 or _CaseNumber=10.02274 or _CaseNumber=11.01668 or _CaseNumber=11.01889 or _CaseNumber=11.03437 or _CaseNumber=12.00581 then hemoscore=2;  

if _CaseNumber=11.02088 or _CaseNumber=11.04817 or _CaseNumber=11.03348 or _CaseNumber=12.04653 then hemoscore=3;

if _CaseNumber=11.04184 or _CaseNumber=11.02221 or _CaseNumber=11.04744 or _CaseNumber=10.04942 then delete;

if _CaseNumber= 11.02088 then Presenceofpetechiae="Not Reported";

run;

Data hemo.play2;

set temp;

run;

proc print data=hemo.play2;

var Presenceofpetechiae;

run;

PROC FREQ data=hemo.play2;

tables hemoscore * Presenceofpetechiae;

  RUN;
