I have to initialize a variable in some precise situations but not at every loop of a data . So, I have used a retain statement on this variable. I initialize it to 0 (only in the case when I need to), but it didn't work.
Is anybody knows is it possible to initialize a variable when using a retain statement on it?
Yes its possible.
If you show provide an example of you code, someone can probably show you where you're going wrong.
I agree with Fareeza that you have probably just overlooked something in your code and, to get anyone to give you feedback, you'll have to post your code.
But, as you can see from the following example, one can definitely do what you are trying to do:
proc sort data=sashelp.class out=retain_example;
by sex;
run;
data retain_example;
set retain_example;
by sex;
retain gender;
if _n_ eq 1 then gender="Female";
else if first.sex then gender="Male";
run;
2 things you might not know:
- A retain statement is not executable, so it makes no difference if you put it after a test: it will always be used.
- A retain initialisation is only done at the first data step iteration, when _N_=1.
Chris,
Yes, retain will work in either case, but its location definitely does affect the result. Run the following code and take a look at the order of the resulting variables in each case:
proc sort data=sashelp.class out=retain_example;
by sex;
run;
data retain_example1;
set retain_example;
by sex;
retain gender;
if _n_ eq 1 then gender="Female";
else if first.sex then gender="Male";
run;
data retain_example2;
retain gender;
set retain_example;
by sex;
if _n_ eq 1 then gender="Female";
else if first.sex then gender="Male";
run;
Indeed, retain will only affect the variables already defined/known when the statement is encountered as the data step is compiled.
The point I was trying to make, since the OP wants to use retain and initialise values only when needed, is that
if TEST then do;
retain A 0;
...
end;
is the same as
retain A 0;
if TEST then do;
...
end;
Thanks for your replies everybody, I'll prepare a simplified version of my program and I'll post it so maybe that will be able to find where I went wrong.
data test;
set original;
retain gender;
format gender $6.;/* used if new column*/
gender="male";
if var1='F' then gender="female";
run;
in this example gender is always "male" but with condition of var1='F' gender is "female"
You aren't initializing gender in your retain statement and then overriding the value anyhow. Given the following sample dataset, what result to you want for gender for each record?
data original;
input var1 $;
cards;
M
F
M
F
M
;
run;
This is a very light version of my program. I remove a lot of stuff that is useless to the comprehension of the problem.
The program have to generates a report of cashing for every bank.
In the infile, there are 2 types of record
G : The positive amounts of money
H : The negative amounts of money
So, I have to sort these records by type, bank_id, reference_id. I need to make a total for every bank. The totals of the types are already calculated and I recuperate theirs values via macro variables. At first, I print and make totals for the G records. After, I made totals for the H records. Also, I give you some explications about the variable cause theirs names are in french and maybe you're not going to understand
TOTAL_ENCAISSEMENTS_CAISSE, TOTAL_FRAIS_ENCAISSEMENTS_CAISSE
: Variables I use to conserve the bank totals
MONTANT_REEL, FRAIS_REEL : Variables that contain the monetary data of the current record
NO_REF : That's the reference_id
NO_CAISSE2 : That's the formatted bank_id
DATA AA;
FILE RAPPORT;
FORMAT TOTAL_ENCAISSEMENTS_CAISSE COMMA19.2;
FORMAT TOTAL_FRAIS_ENCAISSEMENTS_CAISSE COMMA14.2;
FORMAT SOMME_ENC_FORMAT COMMA19.2;
FORMAT SOMME_CORRECTIONS_FORMAT COMMA19.2;
FORMAT SOMME_FRAIS_FORMAT COMMA14.2;
RETAIN NO_REF NO_CAISSE2 TOTAL_ENCAISSEMENTS_CAISSE, TOTAL_FRAIS_ENCAISSEMENTS_CAISSE, SOMME_ENC_FORMAT;
SET DESJARD END=EOF;
BY TYPE NO_CAISSE NO_REF;
IF _N_ = 1 THEN
DO; TOTAL_ENCAISSEMENTS_CAISSE =0;
// Print the report column header
END;
SELECT (TYPE);
WHEN ('G', 'H')
DO;
// Print the current record
FRAIS_ENC_TEMP =
SUM(TOTAL_ENCAISSEMENTS_CAISSE,MONTANT_REEL);
TOTAL_FRAIS_ENCAISSEMENTS_CAISSE = SUM (TOTAL_FRAIS_ENCAISSEMENTS_CAISSE, FRAIS_REEL);
IF LAST.NO_CAISSE THEN
DO;
// It's the last record of this bank, so I print totals and I want to initialize the
variables for the next bank.
PUT @045 FRAIS_ENC_TEMP COMMA19.2
@064 TOTAL_FRAIS_ENCAISSEMENTS_CAISSE COMMA15.2;
TOTAL_ENCAISSEMENTS_CAISSE = 0;
TOTAL_FRAIS_ENCAISSEMENTS_CAISSE = 0;
FRAIS_REEL = 0;
END;
IF LAST.TYPE THEN
DO;
IF (TYPE='G') THEN
DO;
// Positive numbers
SOMME_ENC_FORMAT = INPUT("&SOMME_ENCAISSEMENTS.",COMMA19.2);
PUT @045 SOMME_ENC_FORMAT COMMA19.2
@064 "&SOMME_FRAIS_ENCAISSEMENT.";
END;
ELSE IF (TYPE='H') THEN
DO;
// Negative numbers
SOMME_CORRECTIONS_FORMAT =
INPUT("&SOMME_CORRECTIONS.",19.2) * -1;
SOMME_FRAIS_FORMAT =
INPUT("&SOMME_FRAIS_CORRECTION.",12.2) * -1;
PUT @046 SOMME_CORRECTIONS_FORMAT COMMA19.2
@064 SOMME_FRAIS_FORMAT COMMA15.2;
END;
END;
ELSE
DO;
TOTAL_ENCAISSEMENTS_CAISSE =
SUM(TOTAL_ENCAISSEMENTS_CAISSE,MONTANT_REEL);
TOTAL_FRAIS_ENCAISSEMENTS_CAISSE =
SUM(TOTAL_FRAIS_ENCAISSEMENTS_CAISSE,FRAIS_REEL);
END;
END;
OTHERWISE;
END;
Here is an example of the output of the program
39,872,103.06
654,321.08
5,036,428.90
----------
45,562,853.04 (This first total is ok)
30,000,003.85
----------
35,036,432.75 (The total is supposed to be 30,000,003.85 instead the program has calculated
5,036,428.90 (the previous) + 30,000,003.85)
8.07
412.84
----------
30,000,424.76 (The total is supposed to be 420.91 instead the program has calculated
30,000,003.85 (the previous) + 8.07 + 412.84)
Thanks is advance
I don't have time to review your full code, and without a sample dataset (with intended resulting values) can't test the code, but I did notice some things that were questionable:
1. In your retain statement you separate some of the variables with commas rather than spaces. I would think that would result in a fatal error with the datastep never even running.
2. It appears that you are retaining more variables than you really want/need to retain
3. You appear to output records at a point when you probably first want to include the data in the calculations, thus each time you get a bank's last record (within a type) you output the record, initialize some values, and then continue with the calculations.
4. Is there a reason why you are trying to calculate each type separately? Since you have sorted bank within type, each banks values for a given type will be calculated, followed by the next type.
5. Should some of the other retained values be initialized at the point where you initialize two of them?
My advice would be for you to create a small (say 5- or 20- record) dummy dataset, write a small data step that does sums exactly what you want and apply the knowledge to your larger problem.
If you can't solve the issue, submit this small dataset and data step here.
As it is, you haven't reduced the problem enough in your sample code for us to help, especially since we can't see the data going in and what you want going out.
Couple of questions:
Why are you retaining the variables NO_REF NO_CAISSE? Should the by variable allow you to do what you need?
You can also set the initial value of a retain variable directly without using the _N_ variable as in:
RETAIN
TOTAL_ENCAISSEMENTS_CAISSE 0
TOTAL_FRAIS_ENCAISSEMENTS_CAISSE 0
SOMME_ENC_FORMAT 0;
I'm having some difficulty following as I'm not conversant in French....but...It looks to me that the by variables might be out of order. Shouldn't the logic by caisse then type then refno?
Your code seems to reset the variables if it is the last caisse. But it might be the last caisse and not the last type. If that is the case, then your code goes on to re-add montant_reel to total_encaissements.
Sorry if that's no help...but its all I could see that might be an issue.
@ art297
1. It's an error when I simplify the code - in my program, space is the only separator between the retain variables
2. Ya, there were 3 variables that I retain unnecessarily.
@ Chris@NewZealand I'll try it
@ DBailey
1. I didn't need to retain these 2 variables. I remove the retain statement on them.
2. They want that I print totals of the G type records and after print totals of the H type records. So, a bank may have totals in the first section and others in the second section.
Thanks, I'll try with a smaller dataset and I'll keep you informed!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.