SAS Programming

DATA Step, Macro, Functions and more
uwmsasuser
Calcite | Level 5

I have deleted a couple of observations from my dataset. When I run the code, it works fine and deletes the cases. However, when I close SAS and open it the next day to continue working, I run my LIBNAME statement and go to where I left off in my code. When I run new code, it adds the deleted cases back in; the deletions I made earlier in my code are not being saved. Do I have to rerun all my code every time I log in? Shouldn't I be able to just rerun the LIBNAME statement and pick up where I left off? Here is the code I used to delete the cases:

DATA hemo.PLAY;
        SET hemo.hemo_mrg2;
        IF _CaseNumber="10.02849" THEN DO;
                DOB='29SEP2009'd;
                _Age = DateofDeath - DOB;
        END;
        IF _CaseNumber="10.03520" THEN DO;
                DOB='02AUG2010'd;
                _Age = DateofDeath - DOB;
        END;
        IF UniqueKey in (23,24) THEN delete;
        IF _CaseNumber=11.04184 THEN delete;
        IF _CaseNumber=11.02221 THEN delete;
        IF _CaseNumber=11.04744 THEN delete;
        IF _CaseNumber=10.04942 THEN delete;
RUN;

ballardw
Super User

The cases are deleted from (or, better phrased, never written to) the OUTPUT data set hemo.play. Since you have not deleted them from hemo.hemo_mrg2, they are still there the next time the code reads that data set.
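
A minimal sketch of that behavior, using made-up WORK data sets (`work.source` and `work.subset` are hypothetical names, not from the thread). DELETE only keeps an observation out of the data set named in the DATA statement; the data set named in SET is left untouched.

```sas
/* Hypothetical input data set with three observations */
data work.source;
    input id;
    datalines;
1
2
3
;
run;

/* DELETE stops processing the current observation,
   so it is never written to work.subset */
data work.subset;
    set work.source;
    if id = 2 then delete;
run;

/* Count the rows in each data set: the input is unchanged,
   only the output is missing the deleted observation */
proc sql;
    select count(*) as n_source from work.source;
    select count(*) as n_subset from work.subset;
quit;
```

Any later step that reads `work.source` again starts from all of its observations, which is the situation described in the question.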

uwmsasuser
Calcite | Level 5

What would you suggest? Would I need to do this?

DATA hemo.hemo_mrg2;
        IF _CaseNumber="10.02849" THEN DO;
                DOB='29SEP2009'd;
                _Age = DateofDeath - DOB;
        END;
        IF _CaseNumber="10.03520" THEN DO;
                DOB='02AUG2010'd;
                _Age = DateofDeath - DOB;
        END;
        IF UniqueKey in (23,24) THEN delete;
        IF _CaseNumber=11.04184 THEN delete;
        IF _CaseNumber=11.02221 THEN delete;
        IF _CaseNumber=11.04744 THEN delete;
        IF _CaseNumber=10.04942 THEN delete;
RUN;

Data hemo.play;
set hemo.hemo_mrg2;
run;

Reeza
Super User

Your original code appears correct, assuming it ran correctly.

Your hemo.play data set will exist with the deleted records removed. Your hemo.hemo_mrg2 data set will still have the records, as it was not modified.

Could you explain more clearly what you expect to happen that isn't happening?

uwmsasuser
Calcite | Level 5

So my original dataset has 78 observations in it. After running my code above and deleting 4 cases, my dataset now has 74 cases in it. However, if I run:

proc print data=hemo.play;
var variablename;
run;

Then my data goes back to having 78 observations, so somehow the 4 observations that I deleted above are not being saved in hemo.play.

Reeza
Super User

Can you post the full log, showing those results?

uwmsasuser
Calcite | Level 5

LIBNAME hemo "C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012-Hemosiderin\Hemo SAS Data";

NOTE: Libref HEMO was successfully assigned as follows:

      Engine:        V9

      Physical Name: C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012-Hemosiderin\Hemo SAS Data

2    DATA hemo.PLAY;

3        SET hemo.hemo_mrg2;

4        IF _CaseNumber="10.02849" THEN DO;

5            DOB='29SEP2009'd;

6            _Age = DateofDeath - DOB;

7        END;

8        IF _CaseNumber="10.03520" THEN DO;

9            DOB='02AUG2010'd;

10           _Age = DateofDeath - DOB;

11       END;

12       IF UniqueKey in (23,24) THEN delete;

13       IF _CaseNumber=11.04184 THEN delete;

14       IF _CaseNumber=11.02221 THEN delete;

15       IF _CaseNumber=11.04744 THEN delete;

16       IF _CaseNumber=10.04942 THEN delete;

17   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      13:8   14:8   15:8   16:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.04 seconds

      cpu time            0.04 seconds

18   proc print data=hemo.play;

NOTE: Writing HTML Body file: sashtml.htm

19   format LastPlacedTimeTest time.;

WARNING: Variable LASTPLACEDTIMETEST not found in data set HEMO.PLAY.

20   run;

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: PROCEDURE PRINT used (Total process time):

      real time           2.36 seconds

      cpu time            2.27 seconds

21   DATA hemo.play;

22       SET hemo.hemo_mrg2;

23       IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

24   Run;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      23:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 78 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

Reeza
Super User

The proc print shows that you have 74 observations in hemo.play. You then replace hemo.play with a fresh copy of the original data set (hemo_mrg2), which still has 78 observations, so your next version of hemo.play has 78 observations.

The output is correct; the flaw is in your expectations. If you want to modify hemo_mrg2 permanently, write the output back over it, but then you lose the original table, which may or may not be okay.

I.e., the last step should be:

data hemo.play2;
set hemo.play;
/* your additional statements go here */
run;

uwmsasuser
Calcite | Level 5

I made the changes as you suggested and I still seem to be having trouble. I have included my log file. It seems to run fine initially, but when I run a second step with another IF-THEN statement, it overrides the previous one. So while my number of observations is at 74, like I want, it no longer recognizes the new variable I created.

    LIBNAME hemo "C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012

  ! -Hemosiderin\Hemo SAS Data";

NOTE: Libref HEMO was successfully assigned as follows:

      Engine:        V9

      Physical Name: C:\Users\gajeski\Desktop\ALTE Study\Hemosiderin\CEHSCC pilot Grant 2012

      -Hemosiderin\Hemo SAS Data

    DATA hemo.PLAY;

        SET hemo.hemo_mrg2;

        IF _CaseNumber="10.02849" THEN DO;

            DOB='29SEP2009'd;

            _Age = DateofDeath - DOB;

        END;

        IF _CaseNumber="10.03520" THEN DO;

            DOB='02AUG2010'd;

           _Age = DateofDeath - DOB;

       END;

       IF UniqueKey in (23,24) THEN delete;

       IF _CaseNumber=11.04184 THEN delete;

       IF _CaseNumber=11.02221 THEN delete;

       IF _CaseNumber=11.04744 THEN delete;

       IF _CaseNumber=10.04942 THEN delete;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      13:8   14:8   15:8   16:8

NOTE: There were 78 observations read from the data set HEMO.HEMO_MRG2.

NOTE: The data set HEMO.PLAY has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.03 seconds

      cpu time            0.03 seconds

   Data hemo.play2;

   Set hemo.play;

   hemoscore=.;

   IF (_CaseNumber=11.02088) OR (_CaseNumber=11.04817) OR (_CaseNumber=11.03348) OR

! (_CaseNumber=12.04653) Then hemoscore=3;

   Else IF (_CaseNumber=10.01867) OR (_CaseNumber=10.02274) OR (_CaseNumber=11.01668) OR

! (_CaseNumber=11.01889) OR (_CaseNumber=11.03437) OR (_CaseNumber=12.00581) THEN hemoscore=2;

   Else hemoscore=1;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      21:5     21:31    21:57    21:83    22:10    22:36    22:62    22:88    22:114   22:140

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 478 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

  Data hemo.play2;

  set hemo.play;

  IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

  RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      54:4

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.03 seconds

   PROC FREQ data=hemo.play2;

   tables hemoscore * Presenceofpetechiae;

ERROR: Variable HEMOSCORE not found.

  RUN;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE FREQ used (Total process time):

      real time           0.00 seconds

      cpu time            0.00 seconds

Tom
Super User

Why do you keep overwriting the same datasets?

1) DATA hemo.PLAY;

2) Data hemo.play2;

3) Data hemo.play2;

The computer is just doing what you tell it to do.

uwmsasuser
Calcite | Level 5

So I should not be using the set statement each time?

Reeza
Super User

It does recognize your new variable; what it can't find is a different variable, hemoscore.

Yes, you need to use a SET statement each time.

In a data step:

The SET statement points to the input data, and the DATA statement names the output data set. They do need to line up, though: each step should read the data set that the previous step wrote.
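
A minimal sketch of what "lining up" could look like, assuming the `hemo` libref is already assigned and that `_CaseNumber` and `Presenceofpetechiae` are character variables (the log's conversion notes suggest the former). The names `work.step1` and `work.step2` are hypothetical. Each SET reads the data set written by the step before it, so earlier changes carry forward.

```sas
/* Step 1: read the original data and drop unwanted observations */
data work.step1;
    set hemo.hemo_mrg2;
    if UniqueKey in (23, 24) then delete;
run;

/* Step 2: read step1 (not the original) and add a new variable */
data work.step2;
    set work.step1;
    hemoscore = 1;
run;

/* Step 3: read step2 so hemoscore is still present,
   then save the result in the permanent library */
data hemo.play2;
    set work.step2;
    if _CaseNumber = "11.02088" then Presenceofpetechiae = "Not Reported";
run;
```

If step 3 read `hemo.hemo_mrg2` or `work.step1` instead, the output would be rebuilt without `hemoscore`, which matches the "Variable HEMOSCORE not found" error in the log.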

uwmsasuser
Calcite | Level 5

But I made hemoscore in the step above and it was recognized (see here):

  Data hemo.play2;

   Set hemo.play;

  hemoscore=.;

   IF (_CaseNumber=11.02088) OR (_CaseNumber=11.04817) OR (_CaseNumber=11.03348) OR

! (_CaseNumber=12.04653) Then hemoscore=3;

   Else IF (_CaseNumber=10.01867) OR (_CaseNumber=10.02274) OR (_CaseNumber=11.01668) OR

! (_CaseNumber=11.01889) OR (_CaseNumber=11.03437) OR (_CaseNumber=12.00581) THEN hemoscore=2;

   Else hemoscore=1;

   RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      21:5     21:31    21:57    21:83    22:10    22:36    22:62    22:88    22:114   22:140

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 478 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.01 seconds

It's when I run this next step that it no longer recognizes that I made hemoscore:

Data hemo.play2;

  set hemo.play;

  IF _CaseNumber= 11.02088 THEN Presenceofpetechiae="Not Reported";

  RUN;

NOTE: Character values have been converted to numeric values at the places given by:

      (Line):(Column).

      54:4

NOTE: There were 74 observations read from the data set HEMO.PLAY.

NOTE: The data set HEMO.PLAY2 has 74 observations and 477 variables.

NOTE: DATA statement used (Total process time):

      real time           0.02 seconds

      cpu time            0.03 seconds

   PROC FREQ data=hemo.play2;

   tables hemoscore * Presenceofpetechiae;

ERROR: Variable HEMOSCORE not found.

  RUN;

Reeza
Super User

Draw some diagrams of your input/output data sets and I think you'll begin to see the issues.

uwmsasuser
Calcite | Level 5

OK, hopefully I understand what I need to do and am not overwriting any more datasets. I ran the code below and was able to get the results I was looking for with no errors. Thank you all for your help!

Data temp;
Set hemo.hemo_mrg2;
hemoscore=1;
if _CaseNumber=10.01867 or _CaseNumber=10.02274 or _CaseNumber=11.01668 or _CaseNumber=11.01889 or _CaseNumber=11.03437 or _CaseNumber=12.00581 then hemoscore=2;
if _CaseNumber=11.02088 or _CaseNumber=11.04817 or _CaseNumber=11.03348 or _CaseNumber=12.04653 then hemoscore=3;
if _CaseNumber=11.04184 or _CaseNumber=11.02221 or _CaseNumber=11.04744 or _CaseNumber=10.04942 then delete;
if _CaseNumber= 11.02088 then Presenceofpetechiae="Not Reported";
run;

Data hemo.play2;
set temp;
run;

proc print data=hemo.play2;
var Presenceofpetechiae;
run;

PROC FREQ data=hemo.play2;
tables hemoscore * Presenceofpetechiae;
RUN;
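
The logs earlier in the thread note that character values were converted to numeric wherever `_CaseNumber` was compared with an unquoted number. A possible variant of the step above, assuming `_CaseNumber` is a character variable whose stored text matches these values exactly (no leading blanks), uses the IN operator with quoted values. The data set name `work.temp_in` is hypothetical.

```sas
data work.temp_in;
    set hemo.hemo_mrg2;

    /* Quoted values are compared as character strings,
       so no character-to-numeric conversion is needed */
    if _CaseNumber in ("11.04184", "11.02221", "11.04744", "10.04942") then delete;

    /* Assign the score; everything not listed defaults to 1 */
    if _CaseNumber in ("11.02088", "11.04817", "11.03348", "12.04653") then hemoscore = 3;
    else if _CaseNumber in ("10.01867", "10.02274", "11.01668",
                            "11.01889", "11.03437", "12.00581") then hemoscore = 2;
    else hemoscore = 1;

    if _CaseNumber = "11.02088" then Presenceofpetechiae = "Not Reported";
run;
```

This is only a sketch; if the stored values differ in formatting from the quoted text, the comparisons would not match and the numeric form used above may be the safer choice.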
