Hi all. I have a data set with measured concentrations of biomarkers in serum. For values below the limit of detection (LOD), I would like to impute random values from the distribution of 0-LOD using maximum likelihood estimation. The LOD changes with each observation (based on amount of serum available). I've read some SAS blog posts on computing MLEs but this is all a bit over my head (statistics does not come easily to me!).
Here is some example data. "abovelod" is a binary indicator of whether the concentration is above the LOD. "lod" is the LOD value for each person. "concentration" is the value I'd like to impute.
data biomarkers; input id abovelod lod concentration;
datalines;
1 1 0.5 0.6
2 0 0.6 .
3 1 0.4 1.2
4 1 0.8 0.9
5 0 0.2 .
6 0 0.7 .
7 1 0.4 1.5
8 1 0.3 0.8
9 1 0.2 1.1
10 1 0.5 0.7
;
run;
data biomarkers; set biomarkers;
ln_concentration=log(concentration);
run;
Any direction appreciated, I know this is a larger ask.
Also, I would appreciate keeping the discussion away from how to handle values below the LOD. Thank you!
Edit: Variable "concentration" can be assumed to be log normal.
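One possible reading of this request, sketched under the lognormal assumption stated in the edit: fit the lognormal by maximum likelihood with the below-LOD values treated as left-censored, then draw each missing value from the fitted distribution truncated at that person's LOD. The data set names censored, pe, and imputed, the helper variables, and the seed are made up for illustration. With only ten rows the estimates will be very rough, so treat this as a starting point rather than a vetted solution.

```sas
/* Sketch only: assumes the biomarkers data set above and a lognormal concentration */
data censored;
   set biomarkers;
   /* PROC LIFEREG (lower, upper) syntax: a missing lower bound means left-censored at upper */
   if abovelod then do; lower = concentration; upper = concentration; end;
   else do; lower = .; upper = lod; end;
run;

/* Maximum likelihood fit; Intercept and Scale are the mean and SD of log(concentration) */
ods output ParameterEstimates=pe;
proc lifereg data=censored;
   model (lower, upper) = / dist=lnormal;
run;

data imputed;
   /* Read the two estimates once; variables read with SET are retained across rows */
   if _n_ = 1 then do;
      set pe(keep=Parameter Estimate where=(Parameter='Intercept') rename=(Estimate=mu));
      set pe(keep=Parameter Estimate where=(Parameter='Scale') rename=(Estimate=sigma));
   end;
   set biomarkers;
   call streaminit(2024);   /* arbitrary seed so reruns give the same draws */
   if not abovelod then do;
      /* Inverse-CDF draw from the fitted distribution restricted to (0, lod) */
      p_lod = cdf('normal', (log(lod) - mu) / sigma);
      z = quantile('normal', rand('uniform') * p_lod);
      concentration = exp(mu + sigma * z);
   end;
   ln_concentration = log(concentration);
   drop Parameter mu sigma p_lod z;
run;
```

Scaling the uniform draw by p_lod, the fitted probability of falling below that person's LOD, is what keeps every imputed value between 0 and the LOD.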
Since you have a different limit for each subject and, at least as shown, a very small sample of LOD values, I'm not sure what you would actually mean by "mle" in this case.
Drawing a random value from the range 0 to LOD is easy:
data biomarkers; input id abovelod lod concentration;
/* rand('uniform') draws from (0,1); redraw until the value is at or below this person's LOD */
if not(abovelod) then do;
  concentration = rand('uniform');
  if concentration > lod then do until (concentration le lod);
    concentration = rand('uniform');
  end;
end;
datalines;
1 1 0.5 0.6
2 0 0.6 .
3 1 0.4 1.2
4 1 0.8 0.9
5 0 0.2 .
6 0 0.7 .
7 1 0.4 1.5
8 1 0.3 0.8
9 1 0.2 1.1
10 1 0.5 0.7
;
run;
Back in the reptilian part of my brain, I thought there was more to "rand('uniform')". I checked, and there is: you can use the 2nd and 3rd arguments of "rand('uniform',...,...)" to replace the do group
if not(abovelod) then do;
with
if not(abovelod) then concentration=rand('uniform',0,lod);
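Put together, the shorter version might look like the step below. It is a minimal sketch that assumes the biomarkers data set from the question already exists and that your SAS release accepts the lower and upper bound arguments to rand('uniform'); the output name biomarkers_uniform and the seed are made up for illustration.

```sas
data biomarkers_uniform;
   set biomarkers;
   call streaminit(12345);   /* arbitrary seed, only so reruns give the same draws */
   /* One uniform draw between 0 and this person's LOD; no redraw loop needed */
   if not(abovelod) then concentration = rand('uniform', 0, lod);
run;
```

Unlike the redraw loop, this form does not depend on every LOD being below 1.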