BookmarkSubscribeRSS Feed
Mike018
Fluorite | Level 6

I have to initialize a variable in some precise situations but not at every loop of a data . So, I have used a retain statement on this variable. I initialize it to 0 (only in the case when I need to), but it didn't work.

Is anybody knows is it possible to initialize a variable when using a retain statement on it?

17 REPLIES 17
Reeza
Super User

Yes its possible.

If you show provide an example of you code, someone can probably show you where you're going wrong.

art297
Opal | Level 21

I agree with Fareeza that you have probably just overlooked something in your code and, to get anyone to give you feedback, you'll have to post your code.

But, as you can see from the following example, one can definitely do what you are trying to do:

proc sort data=sashelp.class out=retain_example;

  by sex;

run;

data retain_example;

  set retain_example;

  by sex;

  retain gender;

  if _n_ eq 1 then gender="Female";

  else if first.sex then gender="Male";

run;

ChrisNZ
Tourmaline | Level 20

2 things you might not know:

- A retain statement is not executable, so it makes no difference if you put it after a test: it will always be used.

- A retain initialisation is only done at the first data step iteration, when _N_=1.

art297
Opal | Level 21

Chris,

Yes, retain will work in either case, but its location definitely does affect the result.  Run the following code and take a look at the order of the resulting variables in each case:

proc sort data=sashelp.class out=retain_example;

  by sex;

run;

data retain_example1;

  set retain_example;

  by sex;

  retain gender;

  if _n_ eq 1 then gender="Female";

  else if first.sex then gender="Male";

run;

 

data retain_example2;

  retain gender;

  set retain_example;

  by sex;

  if _n_ eq 1 then gender="Female";

  else if first.sex then gender="Male";

run;

ChrisNZ
Tourmaline | Level 20

Indeed, retain will only affect the variables already defined/known when the statement is encountered as the data step is compiled.

The point I was trying to make, since the OP wants to use retain and initialise values only when needed, is that

if TEST then do;

retain A 0;

...

end;

is the same as

retain A 0;

if TEST then do;

...

end;

Mike018
Fluorite | Level 6

Thanks for your replies everybody, I'll prepare a simplified version of my program and I'll post it so maybe that will be able to find where I went wrong.

mojerry2
Fluorite | Level 6

data test;

set original;

retain gender;

format gender $6.;/* used if new column*/

gender="male";

if var1='F' then gender="female";

run;

in this example gender is always "male" but with condition of var1='F' gender is "female"

art297
Opal | Level 21

You aren't initializing gender in your retain statement and then overriding the value anyhow.  Given the following sample dataset, what result to you want for gender for each record?

data original;

  input var1 $;

  cards;

M

F

M

F

M

;

run;

Mike018
Fluorite | Level 6

This is a very light version of my program. I remove a lot of stuff that is useless to the comprehension of the problem.

The program have to generates a report of cashing for every bank.
In the infile, there are 2 types of record
  G : The positive amounts of money
  H : The negative amounts of money

So, I have to sort these records by type, bank_id, reference_id. I need to make a total for every bank. The totals of the types are already calculated and I recuperate theirs values via macro variables. At first, I print and make totals for the G records. After, I made totals for the H records. Also, I give you some explications about the variable cause theirs names are in french and maybe you're not going to understand Smiley Happy

TOTAL_ENCAISSEMENTS_CAISSE, TOTAL_FRAIS_ENCAISSEMENTS_CAISSE

                                : Variables I use to conserve the bank totals      
MONTANT_REEL, FRAIS_REEL : Variables that contain the monetary data of the current record
NO_REF : That's the reference_id
NO_CAISSE2 : That's the formatted bank_id

DATA AA;                                                            
  FILE RAPPORT;                                                                                                           
  FORMAT TOTAL_ENCAISSEMENTS_CAISSE                COMMA19.2;                               
  FORMAT TOTAL_FRAIS_ENCAISSEMENTS_CAISSE  COMMA14.2;                                                          
  FORMAT SOMME_ENC_FORMAT                                     COMMA19.2;                              
  FORMAT SOMME_CORRECTIONS_FORMAT                COMMA19.2;               
  FORMAT SOMME_FRAIS_FORMAT                                  COMMA14.2;                                              
  RETAIN NO_REF NO_CAISSE2 TOTAL_ENCAISSEMENTS_CAISSE,  TOTAL_FRAIS_ENCAISSEMENTS_CAISSE, SOMME_ENC_FORMAT;         
  SET DESJARD END=EOF;                                                 
  BY TYPE NO_CAISSE NO_REF;                                                                                    
  IF _N_ = 1 THEN                                                      
  DO;                                                                                                                                              TOTAL_ENCAISSEMENTS_CAISSE =0;                                                                                                                                                                                                                                                                        
    // Print the report column header                                           
  END;                                                               
  SELECT (TYPE);                                                     
     WHEN ('G', 'H')                                                 
     DO;                                                                                                             
          // Print the current record                                                  
           FRAIS_ENC_TEMP =                                            
                       SUM(TOTAL_ENCAISSEMENTS_CAISSE,MONTANT_REEL);               
           TOTAL_FRAIS_ENCAISSEMENTS_CAISSE = SUM    (TOTAL_FRAIS_ENCAISSEMENTS_CAISSE, FRAIS_REEL);                                                                                                                                                                                                                                                                                                                                                                              

   IF LAST.NO_CAISSE THEN    
     DO;  
    // It's the last record of this bank, so I print totals and I want to initialize the
   variables for the next bank.
       PUT @045 FRAIS_ENC_TEMP                   COMMA19.2             
               @064 TOTAL_FRAIS_ENCAISSEMENTS_CAISSE COMMA15.2;                                          
       TOTAL_ENCAISSEMENTS_CAISSE = 0;                                         
       TOTAL_FRAIS_ENCAISSEMENTS_CAISSE = 0;                                                 
       FRAIS_REEL = 0;                                                                    
     END;                                                             
     IF LAST.TYPE THEN                                                
       DO;                                                            
       IF (TYPE='G') THEN                                             
         DO;          
   // Positive numbers 
         SOMME_ENC_FORMAT = INPUT("&SOMME_ENCAISSEMENTS.",COMMA19.2);
         PUT @045 SOMME_ENC_FORMAT           COMMA19.2              
             @064 "&SOMME_FRAIS_ENCAISSEMENT.";                                                                     
         END;                                                                                                     
     ELSE IF (TYPE='H') THEN                                         
       DO;       
   // Negative numbers   
         SOMME_CORRECTIONS_FORMAT =                                
                         INPUT("&SOMME_CORRECTIONS.",19.2) * -1;     
         SOMME_FRAIS_FORMAT =                                      
                         INPUT("&SOMME_FRAIS_CORRECTION.",12.2) * -1;
         PUT @046 SOMME_CORRECTIONS_FORMAT COMMA19.2               
                  @064 SOMME_FRAIS_FORMAT COMMA15.2;                                                                                                                                                                                                                                                                                  
         END;                                                          
     END;                                                              
   ELSE                                                                
     DO;                                                                 
       TOTAL_ENCAISSEMENTS_CAISSE =                                                
                      SUM(TOTAL_ENCAISSEMENTS_CAISSE,MONTANT_REEL);                
       TOTAL_FRAIS_ENCAISSEMENTS_CAISSE =                              
                      SUM(TOTAL_FRAIS_ENCAISSEMENTS_CAISSE,FRAIS_REEL);
    END;           
END;              
  OTHERWISE;       
END;                

Mike018
Fluorite | Level 6

Here is an example of the output of the program

39,872,103.06

    654,321.08

5,036,428.90

   ----------

45,562,853.04    (This first total is ok)

            

30,000,003.85

   ----------

35,036,432.75    (The total is supposed to be 30,000,003.85 instead the program has calculated

                                5,036,428.90 (the previous) + 30,000,003.85)

            

            8.07

       412.84

   ----------

30,000,424.76   (The total is supposed to be 420.91 instead the program has calculated

                              30,000,003.85 (the previous) + 8.07 + 412.84)

Thanks is advance Smiley Happy

art297
Opal | Level 21

I don't have time to review your full code, and without a sample dataset (with intended resulting values) can't test the code, but I did notice some things that were questionable:

1. In your retain statement you separate some of the variables with commas rather than spaces.  I would think that would result in a fatal error with the datastep never even running.

2. It appears that you are retaining more variables than you really want/need to retain

3. You appear to output records at a point when you probably first want to include the data in the calculations, thus each time you get a bank's last record (within a type) you output the record, initialize some values, and then continue with the calculations.

4. Is there a reason why you are trying to calculate each type separately?  Since you have sorted bank within type, each banks values for a given type will be calculated, followed by the next type.

5. Should some of the other retained values be initialized at the point where you initialize two of them?

ChrisNZ
Tourmaline | Level 20

My advice would be for you to create a small (say 5- or 20- record) dummy dataset, write a small data step that does sums exactly what you want and apply the knowledge to your larger problem.

If you can't solve the issue, submit this small dataset and data step here.

As it is, you haven't reduced the problem enough in your sample code for us to help, especially since we can't see the data going in and what you want going out.

DBailey
Lapis Lazuli | Level 10

Couple of questions:

Why are you retaining the variables NO_REF NO_CAISSE?  Should the by variable allow you to do what you need?

You can also set the initial value of a retain variable directly without using the _N_ variable as in:

RETAIN

      TOTAL_ENCAISSEMENTS_CAISSE 0

      TOTAL_FRAIS_ENCAISSEMENTS_CAISSE 0

      SOMME_ENC_FORMAT 0;         

I'm having some difficulty following as I'm not conversant in French....but...It looks to me that the by variables might be out of order.  Shouldn't the logic by caisse then type then refno? 

Your code seems to reset the variables if it is the last caisse.  But it might be the last caisse and not the last type.  If that is the case, then your code goes on to re-add montant_reel to total_encaissements.

Sorry if that's no help...but its all I could see that might be an issue.

Mike018
Fluorite | Level 6

@ art297

1. It's an error when I simplify the code - in my program, space is the only separator between the retain variables

2. Ya, there were 3 variables that I retain unnecessarily.


@ Chris@NewZealand  I'll try it

@ DBailey

1. I didn't need to retain these 2 variables. I remove the retain statement on them.

2. They want that I print totals of the G type records and after print totals of the H type records. So, a bank may have totals in the first section and others in the second section.

Thanks, I'll try with a smaller dataset and I'll keep you informed!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 12590 views
  • 8 likes
  • 6 in conversation