I need to retain a value from a separate dataset within a data step:
 
Here is what I have:
message   value
new_good 0.5
retain         .
new_good 0.7
retain         .
retain         .
new_bad   0.6
retain         .
new_good 0.3
retain         .
retain         .
retain         .
and I also have a separate dataset:
data ultimately_good_value;
x=1;
run;
This dataset is generated by a macro that is invoked through call execute in each instance where the message is new_bad. I do not include this macro for simplicity.
 
This is what I would like to have:
message   value   true_value
new_good 0.5       0.5
retain         .          0.5
new_good 0.7       0.7
retain         .          0.7
retain         .          0.7
new_bad   0.6       1
retain         .           1
new_good 0.3       0.3
retain         .          0.3
retain         .          0.3
retain         .          0.3
So when the message is new_good or retain I want to retain the true_value as value. But when the message is new_bad I want to retain value x from the dataset ultimately_good_value.
Here is my code:
data values; infile datalines; input message $ value; datalines; new_good 0.5 retain . new_good 0.7 retain . retain . new_bad 0.6 retain . new_good 0.3 retain . retain . retain . ; run; data ultimately_good_value; x=1; run; *My intuition is that the solution lies in a call execute statement with proc sql inside; data values_want; set values;
retain true_value . ; if message = "new_good" then true_value=value; if message = "new_bad" then true_value = call execute ("proc sql; select x from ultimately_good_value; quit; run;"); run; *But since I do not see how to replace the true_value in the dataset with the outcome of
the proc execute statement, this approach does not work;
I would appreciate any insight.
@Financier: I am not quite sure why you want to create a dataset containing the "ultimately good value" and not just create a macro variable or a function to get it.
For instance, you can create a function with PROC FCMP:
proc fcmp outlib=work.funcs.test;
  function ultimately_good_value();
    return(1);
    endsub;
run;
options cmplib=work.funcs;
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  if message = "new_bad" then 
  true_value = ultimately_good_value();
run;Or, even simpler, put the value in a macro variable:
%let ultimately_good_value=1; data values_want; set values; retain true_value .; if message = "new_good" then true_value=value; if message = "new_bad" then true_value = &ultimately_good_value.; run;
But if, for some obscure reason, you have to invoke a datastep and use that, you can do it with DOSUBL():
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  else if message = "new_bad" then do;
    _RC=dosubl('Data ultimately_good_value;x=1;run;');
    _dsid=open('ultimately_good_value');
    _RC=fetch(_dsid);
    true_value = getvarn(_dsid,varnum(_dsid,'x'));
    _RC=close(_dsid);
    end;
  drop _:;
run;Only remember to call close(_dsid); otherwise the next call to DOSUBL will fail.
If you only need the ultimately_good_value data set created once, you can check if it exists:
proc delete data=ultimately_good_value;run;
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  else if message = "new_bad" then do;
    if exist('ultimately_good_value')=0 then 
      _RC=dosubl('Data ultimately_good_value;x=1;run;');
    _dsid=open('ultimately_good_value');
    _RC=fetch(_dsid);
    true_value = getvarn(_dsid,varnum(_dsid,'x'));
    _RC=close(_dsid);
    end;
  drop _:;
run;I put the PROC DELETE in in case this was not the first time the code was run.
Regards,
Søren
I don't understand your example. If you want to combine data from two tables then merge them or use SQL join.
Thanks for reply!
Unfortunately, I can't simply merge them because the data ultimately_good_value does not exist at the beginning of the data step in my original code. I just assume that it exists here to be more clear what exactly I need to do.
Here is the code that does not assume that ultimately_good_value exists at the beginning of data step execution:
data values;
infile datalines; 
input message $ value;
datalines;                      
new_good 0.5
retain   .
new_good 0.7
retain   .
retain   .
new_bad  0.6
retain   .
new_good 0.3
retain   .
retain   .
retain   .
;
run;
%macro X;
data ultimately_good_value;
x=1;
run;
%mend;
data values_want;
set values; 
retain true_value .;
if message = "new_good" then true_value=value;
if message = "new_bad" then 
true_value = call execute ("%X;");
run;
So when the message is "new_bad" the code should do two things:
First, invoke a macro that generates a dataset ultimately_good_value (this macro depends on many things, including the current retained true_value and other variables, but for simplicity I just set x=1 here).
Second, replace the currently retained true_value with the value x from the newly created dataset while retain statement is running.
I do not know how to do the second step.
While it may be possible to get the result you want, it would not be possible to following the programming path you describe. SAS is not able to put a DATA step on hold, switch over to another DATA step, do some computations, then switch back to the original DATA step.
You don't show some of the details, so it's not possible to tell if this would work for you, but it should at least be considered. Have your DATA step compute the different possible values for X:
data values_want;
if _n_=1 then do;
*** calculations for every possible value for X, using as many variable names as needed to hold them all;
end;
set values;
...
if message = "new_bad" then do;
*** figure out which of the previously calculated values to use;
end;
run;
It helps to think it terms of operations that can be applied to full datasets rather than thinking in terms of step-by-step instructions. SAS is not really an asembly language programming system.
So given this description:
So when the message is "new_bad" the code should do two things: First, invoke a macro that generates a dataset ultimately_good_value (this macro depends on many things, including the current retained true_value and other variables, but for simplicity I just set x=1 here). Second, replace the currently retained true_value with the value x from the newly created dataset while retain statement is running.
I would turn it around and say first calculate the "ultimately_good_value" for every observation (or at least those that will need it) and then when the current value is "bad" then use the "ultimately_good_value".
Or perhaps you want to split the data into good and bad parts. Then run the bad part through the logic to generate the "good" values and then merge them together.
This can sometimes be done in a single data step. For example if you just wanted to implement a last observation carried forward method.
Or perhaps it basically just involves a lookup of valid values from another table. Which you can do with a merge, join, or even using a SAS format.
@Financier: I am not quite sure why you want to create a dataset containing the "ultimately good value" and not just create a macro variable or a function to get it.
For instance, you can create a function with PROC FCMP:
proc fcmp outlib=work.funcs.test;
  function ultimately_good_value();
    return(1);
    endsub;
run;
options cmplib=work.funcs;
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  if message = "new_bad" then 
  true_value = ultimately_good_value();
run;Or, even simpler, put the value in a macro variable:
%let ultimately_good_value=1; data values_want; set values; retain true_value .; if message = "new_good" then true_value=value; if message = "new_bad" then true_value = &ultimately_good_value.; run;
But if, for some obscure reason, you have to invoke a datastep and use that, you can do it with DOSUBL():
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  else if message = "new_bad" then do;
    _RC=dosubl('Data ultimately_good_value;x=1;run;');
    _dsid=open('ultimately_good_value');
    _RC=fetch(_dsid);
    true_value = getvarn(_dsid,varnum(_dsid,'x'));
    _RC=close(_dsid);
    end;
  drop _:;
run;Only remember to call close(_dsid); otherwise the next call to DOSUBL will fail.
If you only need the ultimately_good_value data set created once, you can check if it exists:
proc delete data=ultimately_good_value;run;
data values_want;
  set values; 
  retain true_value .;
  if message = "new_good" then true_value=value;
  else if message = "new_bad" then do;
    if exist('ultimately_good_value')=0 then 
      _RC=dosubl('Data ultimately_good_value;x=1;run;');
    _dsid=open('ultimately_good_value');
    _RC=fetch(_dsid);
    true_value = getvarn(_dsid,varnum(_dsid,'x'));
    _RC=close(_dsid);
    end;
  drop _:;
run;I put the PROC DELETE in in case this was not the first time the code was run.
Regards,
Søren
Thank you all for valuable suggestions. I modified the first code by Søren, so it actually does the trick. It puts the data step with a retain statement on hold, starts another data step, gets the value created through this auxiliary data step and sends it to the main data step without interruption of the retain statement. Using DOSUBL may be faster.
Here is the code:
data values;
infile datalines; 
input message $ value;
datalines;                      
new_good 0.5
retain   .
new_good 0.7
retain   .
retain   .
new_bad  0.6
retain   .
new_good 0.3
retain   .
retain   .
retain   .
;
run;
data ultimately_good_value;
x=1;
run;
%macro data_step();
data temp;
set ultimately_good_value;
call symput('macro_x',x);
run;
%mend data_step;
proc fcmp outlib=work.funcs.test;
function getval();
rc = run_macro('data_step', macro_x ); 
return (macro_x);
endsub;
quit;
options cmplib=work.funcs;
data values_want;
set values; 
retain true_value .;
if message = "new_good" then true_value=value;
if message = "new_bad" then 
true_value = getval() ;
run;
The dataset ultimately_good_value could also be created within the data_step macro.
Sorry for not giving out all the details. I can't create a macro variable before the data step because in my original code I do not know what x is going to be equal to. Every time x is determined based on several retained values, message flag and some other variables from a different dataset. This created a viscious circle: I do not know where to look for x in each case before I run a data step with a retain statement, but if I start this data step, I do not know how to fetch x once I learn where it is (and future values of x depend on the previous ones).
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
