Hi,
I am trying to calculate the frequency of patients who had >10% weight loss among 2 weights measured between 2-4 months before diagnosis. I have attached the pseudo-data.
Each patient has multiple visits with weight measurements. Their diagnosis of the disease is marked by "1" in the diagnosis column, and the date of encounter is the "dateoftest". I am trying to look at a 2-4 months window before the diagnosis date (diagnosis = 1), pull at least 2 weight measures, and compare if there is a >10% weight loss between those two weight measures. If >10% weight loss then flag the patients. There should be two wt measured between 2-4 months before diagnosis, If only 1 wt measure then it is marked as missing. Or if no weight is measured, it is also marked as missing.
Finally, calculate the frequency of patients with >10% weight loss and those with missing weight within 2-4 months.
I would really appreciate the help.
Thanks,
Sandyzman
I am not going to do the work of converting your XLSX file into a SAS data set, so the following code is untested:
data want (drop=nxt_: );
merge have (keep=id dateoftest diagnosis rename=(dateoftest=diag_date) where=(diagnosis=1)
have ;
by id;
set have (firstobs=2 keep=id dateoftest rename=(id=nxt_id dateoftest=nxt_date));
retain wgt1 wgt2 .;
if first.id then call missing(of wgt:);
if intnx('month',diag_date,-4,'sameday') <= dateoftest <= intnx('month',diag_date,-2,'sameday');
if wgt1=. then wgt1=weight;
else wgt2=coalesce(weight,wgt2);
if intck('month',nxt_date,diag_date,'c')>2; **See editted note below**;
if nmiss(wgt1,wgt2)=0 then wgt_change_ratio= wgt2/wgt1-1;
run;
The program assumes your data in dataset HAVE is sorted by ID/DATEOFTEST.
It merges the diagnosis date record (when diagnosis=1) will all the records for a given id. And it keeps only those whose dates fall within the 4-2 month prior window, keeping the earliest weight in the window as WGT1 and the latest within-window weight as WGT2.
Once the within-windows observations are exhausted (i.e. the next obs is outside the window), the weight change is calculated and the data is output.
Editted Note: Corrected the 4th argument to intck from 's' to 'c'. But the code is still untested.
You mention "at least" two weight measure in the 4-month-prior to 2-month-prior time period. Then later, you say that there should be 2 such tests. So are you really excluding the possibiliyt of more than 2 visits in that window? If not, then the case of more than two qualifying tests, which weights should be used to measure weight loss?
Also, what is "two months before" Is April15, 2022 two months before June 10, 2022? Similarly, is April 15, 2022 more than four months before August 17, 2022?
I am not going to do the work of converting your XLSX file into a SAS data set, so the following code is untested:
data want (drop=nxt_: );
merge have (keep=id dateoftest diagnosis rename=(dateoftest=diag_date) where=(diagnosis=1)
have ;
by id;
set have (firstobs=2 keep=id dateoftest rename=(id=nxt_id dateoftest=nxt_date));
retain wgt1 wgt2 .;
if first.id then call missing(of wgt:);
if intnx('month',diag_date,-4,'sameday') <= dateoftest <= intnx('month',diag_date,-2,'sameday');
if wgt1=. then wgt1=weight;
else wgt2=coalesce(weight,wgt2);
if intck('month',nxt_date,diag_date,'c')>2; **See editted note below**;
if nmiss(wgt1,wgt2)=0 then wgt_change_ratio= wgt2/wgt1-1;
run;
The program assumes your data in dataset HAVE is sorted by ID/DATEOFTEST.
It merges the diagnosis date record (when diagnosis=1) will all the records for a given id. And it keeps only those whose dates fall within the 4-2 month prior window, keeping the earliest weight in the window as WGT1 and the latest within-window weight as WGT2.
Once the within-windows observations are exhausted (i.e. the next obs is outside the window), the weight change is calculated and the data is output.
Editted Note: Corrected the 4th argument to intck from 's' to 'c'. But the code is still untested.
@sandyzman1 wrote:
Hi mkeintz,
Thank you so much for the help. I really appreciate it. However, when I imported the xlsx file into SAS and ran the code. I got the following error. I couldn't figure out what is the issue. Would you please suggest how to resolve these issues? Thanks.
NOTE: Invalid argument to function INTCK('month',18235,18270,'s') at line 84 column 6.
id=1 diag_date=01/08/2010 diagnosis=0 dateoftest=11/03/2009 weight=99 FIRST.id=0 LAST.id=0 nxt_id=1
nxt_date=12/04/2009 wgt1=99 wgt2=. wgt_change_ratio=. _ERROR_=1 _N_=2
NOTE: Invalid argument to function INTCK('month',17938,18003,'s') at line 84 column 6.
id=3 diag_date=04/16/2009 diagnosis=0 dateoftest=02/09/2009 weight=119 FIRST.id=1 LAST.id=0 nxt_id=3
nxt_date=02/10/2009 wgt1=119 wgt2=. wgt_change_ratio=. _ERROR_=1 _N_=12
Best,
sandyzman1
This is one of the consequences of submitting untested code, which is my usual response in the absence of sample data in the form of a working data step.
I will edit my earlier response to change the "S" parameter in the intck function to "C". The "S" parameter serves an analogous role in the INTNX function to the "C" that I should have used in the INTCK function.
Hi mkeintz,
Thank you so much for your response. I understand that I should have originally attached the working dataset. I thought the file would be too long to attach. However, I am still getting the error message, and I am new to SAS so I don't totally understand what I going on. I got the following error now.
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column). 4 at 65:6 4 at 65:61
I have attached the SAS data file and also copied the dataline (if the file doesn't work).
data test;
infile datalines delimiter=',';
input id dateoftest :mmddyy8. diagnosis weight;
datalines;
1, 4/8/2009, 0, 98,
1, 11/3/2009, 0, 99,
1, 12/4/2009, 0, ,
1, 1/8/2010, 1, 89,
1, 3/24/2010, 0, 76,
1, 4/4/2011, 0, 75,
1, 7/14/2011, 0, 70,
2, 4/23/2016, 0, 122,
2, 5/20/2016, 0, 124,
2, 7/10/2016, 0, ,
2, 9/25/2017, 0, 120,
3, 2/9/2009, 0, 119,
3, 2/10/2009, 0, ,
3, 4/16/2009, 1, 120,
3, 5/23/2009, 0, 105,
3, 6/17/2010, 0, 98,
3, 5/12/2011, 0, 94,
4, 3/23/2018, 0, ,
4, 2/10/2018, 0, ,
4, 4/18/2018, 0, ,
4, 4/15/2018, 1, ,
4, 7/3/2019, 0, ,
5, 12/4/2014, 0, 130,
5, 1/3/2015, 0, 120,
5, 2/12/2015, 0, 95,
5, 4/23/2015, 1, 96,
5, 5/5/2015, 0, 87,
5, 8/9/2016, 0, 88,
6, 12/4/2013, 0, 98,
6, 1/3/2013, 0, 80,
6, 2/12/2013, 0, 79,
6, 4/23/2013, 1, 78,
6, 5/5/2013, 0, 75,
6, 8/9/2014, 0, 74,
7, 12/4/2018, 0, 81,
7, 1/3/2018, 0, 80,
7, 2/12/2018, 0, 79,
7, 4/23/2018, 1, 78,
7, 5/5/2018, 0, 75,
7, 8/9/2018, 0, 74,
8, 4/8/2012, 0, 140,
8, 11/3/2012, 0, 99,
8, 12/4/2012, 0, ,
8, 1/8/2013, 1, 89,
8, 3/24/2013, 0, 76,
8, 4/4/2013, 0, 75,
8, 7/14/2013, 0, 70,
;
run;
I really appreciate your help. Thanks.
Sandyzman1
One would need to see the log with the code that generated the message you report, but you haven't provided it.
But this is a good opportunity to see the value of the log. You report the following message:
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column). 4 at 65:6 4 at 65:61
This means you should look at the line identified as line 65 in your log. The statement there is presumably calculating a value, but one or more of the inputs to that calulation is a missing value. Also note that this happens exactly 4 times. All the rest of your observations did not have the problem, i.e. they did not have to base a calculation on a missing value.
I notice that the 4 observations for ID=2 contain NO observations with diagnosis=1, so any calculation for ID=2 based on the date of diagnosis would generate the above message. So you could avoid that result by putting the calculation in the THEN clause of an IF not missing(diag_date) THEN .... statement.
BTW, what do you want to do in the WANT dataset for
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.