jsd1
Calcite | Level 5

I have a large dataset consisting of gastrointestinal tract pH values that are recorded every 10 minutes over a period of up to 150 days. There are about 70 patients, representing 4 experimental treatment groups. Overall there are about 800,000 lines of data, which makes it challenging to interpret. Consequently, I would like to compare the treatments with respect to the proportion of time that measurements stay below certain critical pH thresholds. Example data are included in the attached file.

 

Any thoughts on how to do this?  Note that the measurement periods are not the same for all patients, hence the desire to look at this as the proportion of overall time under the specified pH values.

6 REPLIES
PaigeMiller
Diamond | Level 26

For each of the 800,000 records, create a new variable that has the value 1 if the pH is below the threshold, and 0 if not. I will call this variable UNDER_THRESHOLD.

 

Then PROC SUMMARY will give you the proportion of time below the threshold (the mean of the 0/1 variable) for each patient, and for each treatment group.

 

proc summary data=have;
    class patient treatment;
    var under_threshold;
    output out=_stats_ mean=;
run;

From there you can perform statistical tests.
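
The flag described above could be built in a DATA step along these lines. This is only a sketch: the data set name HAVE and the variables PATIENT, TREATMENT and UNDER_THRESHOLD match the PROC SUMMARY call, but the variable name pH and the 6.8 cutoff are assumptions for illustration. The few rows are copied from the sample posted later in this thread.

```sas
/* Tiny stand-in for the real data; rows copied from the sample in this thread */
data have;
    input patient treatment :$8. pH;
    datalines;
530 PosCntrl 6.96
530 PosCntrl 7.00
575 NegCntrl 7.06
575 NegCntrl 6.66
575 NegCntrl 6.40
;
run;

%let threshold = 6.8;   /* placeholder cutoff, not a value from the thread */

data have;
    set have;
    /* SAS treats a missing value as smaller than any number, */
    /* so keep missing pH readings out of the "below" group   */
    if missing(pH) then under_threshold = .;
    else under_threshold = (pH < &threshold);
run;
```

Because the mean of a 0/1 variable is the proportion of 1s, the MEAN= output from PROC SUMMARY is then the proportion of readings below the cutoff.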

--
Paige Miller
ballardw
Super User

If you attempted to attach a file it didn't make it.

Best anyway is to provide a small example data set in the form of data step code so that what you have is conveyed to us accurately. Any attachment of a file in another form such as Excel, csv, or text would require us to make assumptions to create data and we may not make the same decisions since we are not intimately familiar with all the aspects of your project.

 

First thing I would do is make sure that you have actual SAS date, time or datetime values. These are numeric with matching formats. If this does not sound familiar then ask.
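
As a rough illustration of what "actual SAS datetime values" means, text timestamps in the style of the sample posted later in this thread (for example 03-19-2021 18:40) could be read as below. This is a sketch under assumptions: the fields are taken to be separated by blanks, and the data set name EXAMPLE and the variable names are made up for illustration.

```sas
data example;
    /* Read the date and the time as separate numeric values */
    input date :mmddyy10. time :time5. Patient_ID Treatment :$8. pH;
    /* Combine them: DHMS(date, 0, 0, seconds since midnight) gives a datetime */
    Timestamp = dhms(date, 0, 0, time);
    format Timestamp datetime16.;
    drop date time;
    datalines;
03-19-2021 18:40 530 PosCntrl 6.96
03-19-2021 18:50 530 PosCntrl 6.94
03-20-2021 0:00 593 NegCntrl 6.51
;
run;
```

The stored value is a plain number (seconds since 01JAN1960); the format only controls how it is displayed, which is why differences between timestamps can be computed directly.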

With the date/datetime values for each measurement the next step would be to sort the data by patient and date. This will allow determining the time/duration since the previous measurement in a data step.

Simple code like:

Flagvalue = (Phmeasure > 0.5);

would create a numeric variable coded 1 when the comparison is true and 0 otherwise. Substitute your own threshold, and use < instead of > to flag values below it.

With example data this could be used to create the requested proportions, using the duration as a weight.

Or create graphs.

There might be some fiddly bits about how you want to treat the intervals when the measure crosses your boundary value.
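
The sort, duration and weighting steps described above might look roughly like this. It is a sketch, not a definitive implementation: the names HAVE, SORTED, DURATIONS, PROP_TIME, Patient_ID, Treatment, Timestamp, pH, BELOW and DURATION are assumptions, the 6.8 cutoff is a placeholder, and each reading is assumed to describe the interval that ends at that reading (one possible answer to the boundary question). The rows come from the sample posted later in this thread.

```sas
data have;
    input Patient_ID Treatment :$8. Timestamp :datetime18. pH;
    format Timestamp datetime16.;
    datalines;
575 NegCntrl 19MAR2021:21:40:00 7.04
575 NegCntrl 19MAR2021:21:50:00 6.72
575 NegCntrl 19MAR2021:22:00:00 6.52
575 NegCntrl 19MAR2021:22:10:00 6.44
530 PosCntrl 19MAR2021:18:40:00 6.96
530 PosCntrl 19MAR2021:18:50:00 6.94
530 PosCntrl 19MAR2021:19:00:00 7.00
;
run;

%let threshold = 6.8;   /* placeholder cutoff, not a value from the thread */

proc sort data=have out=sorted;
    by Patient_ID Timestamp;
run;

data durations;
    set sorted;
    by Patient_ID;
    /* Seconds since the previous reading; DIF must run on every row */
    duration = dif(Timestamp);
    /* The first reading of a patient has no preceding interval */
    if first.Patient_ID then duration = .;
    /* Missing pH would otherwise compare as below the cutoff */
    if not missing(pH) then below = (pH < &threshold);
run;

/* Duration-weighted mean of the 0/1 flag = proportion of time below cutoff */
/* Rows with a missing weight are left out of the calculation               */
proc means data=durations nway noprint;
    class Treatment Patient_ID;
    var below;
    weight duration;
    output out=prop_time mean=prop_below;
run;
```

With evenly spaced 10-minute readings the weighted and unweighted proportions are nearly the same; the weighting matters mainly when there are gaps in the recording.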

 

Are there interventions taking place for which you are interested in the before/after behavior?

jsd1
Calcite | Level 5
There are four treatments. The negative control has no insult and no intervention. The positive control has an insult imposed, but no intervention. In treatments Buffer1 and Buffer2, patients are subjected to an insult and are also administered a compound to combat it.
StatDave
SAS Super FREQ

Ultimately, this is just a comparison of four proportions. As suggested, create a variable (Below) that indicates whether the observed value is above (0) or below (1) the threshold in each observation. An observation should represent one time for one patient. You can then use PROC LOGISTIC to fit an appropriate model and compare the group proportions. The ILINK option in the LSMEANS statement will report the group proportions and the DIFF option will provide tests of the pairwise comparisons. Note that the reported difference estimate is not a difference of the proportions. If you need estimates of the proportion differences along with confidence intervals for the difference, then use the NLMeans macro after PROC LOGISTIC as illustrated in this note.

 

proc logistic;
class group / param=glm;
model below(event="1")=group;
lsmeans group / ilink diff cl;
run;

 

 

StatDave
SAS Super FREQ

On second thought, since the multiple observations within each patient are correlated, you should aggregate the data down to the patient level with a count, for each patient, of the number of observations below threshold and the total number of observations for that patient. The ratio of the two is the proportion below threshold. Assuming that the patients themselves are independent, you can then use the logistic model I suggested before, but with the events/trials syntax to model the patient-level data. Again, the NLMeans macro can be used if needed.

proc summary nway;
class group patient; 
var below;
output out=GrpCnts n=total sum=nbelow;
run;
proc logistic data=GrpCnts;
class group / param=glm;
model nbelow/total = group;
lsmeans group / ilink diff cl;
run;
jsd1
Calcite | Level 5
Timestamp         Patient_ID  Treatment  pH
03-19-2021 18:40  530         PosCntrl   6.96
03-19-2021 18:50  530         PosCntrl   6.94
03-19-2021 19:00  530         PosCntrl   7.00
03-19-2021 19:10  530         PosCntrl   6.99
03-19-2021 19:20  530         PosCntrl   7.04
03-19-2021 19:30  530         PosCntrl   7.01
03-19-2021 19:40  530         PosCntrl   7.03
03-19-2021 19:50  530         PosCntrl   7.03
03-19-2021 20:00  530         PosCntrl   7.03
03-19-2021 20:10  530         PosCntrl   7.04
03-19-2021 20:20  530         PosCntrl   7.06
03-19-2021 20:30  530         PosCntrl   7.07
03-19-2021 20:40  530         PosCntrl   7.07
03-19-2021 20:50  530         PosCntrl   7.07
03-19-2021 21:00  530         PosCntrl   7.08
03-19-2021 19:00  575         NegCntrl   7.06
03-19-2021 19:10  575         NegCntrl   6.66
03-19-2021 19:20  575         NegCntrl   6.69
03-19-2021 19:30  575         NegCntrl   6.78
03-19-2021 19:40  575         NegCntrl   6.90
03-19-2021 19:50  575         NegCntrl   6.97
03-19-2021 20:00  575         NegCntrl   6.96
03-19-2021 20:10  575         NegCntrl   7.00
03-19-2021 20:20  575         NegCntrl   7.03
03-19-2021 20:30  575         NegCntrl   7.01
03-19-2021 20:40  575         NegCntrl   7.01
03-19-2021 20:50  575         NegCntrl   7.06
03-19-2021 21:00  575         NegCntrl   7.03
03-19-2021 21:10  575         NegCntrl   7.03
03-19-2021 21:20  575         NegCntrl   7.01
03-19-2021 21:30  575         NegCntrl   7.04
03-19-2021 21:40  575         NegCntrl   7.04
03-19-2021 21:50  575         NegCntrl   6.72
03-19-2021 22:00  575         NegCntrl   6.52
03-19-2021 22:10  575         NegCntrl   6.44
03-19-2021 22:20  575         NegCntrl   6.40
03-19-2021 22:30  575         NegCntrl   6.40
03-19-2021 20:40  593         NegCntrl   7.00
03-19-2021 20:50  593         NegCntrl   6.92
03-19-2021 21:00  593         NegCntrl   6.89
03-19-2021 21:10  593         NegCntrl   6.94
03-19-2021 21:20  593         NegCntrl   6.94
03-19-2021 21:30  593         NegCntrl   6.97
03-19-2021 21:40  593         NegCntrl   6.99
03-19-2021 21:50  593         NegCntrl   6.80
03-19-2021 22:00  593         NegCntrl   6.76
03-19-2021 22:10  593         NegCntrl   6.67
03-19-2021 22:20  593         NegCntrl   6.68
03-19-2021 22:30  593         NegCntrl   6.65
03-19-2021 22:40  593         NegCntrl   6.71
03-19-2021 22:50  593         NegCntrl   6.65
03-19-2021 23:00  593         NegCntrl   6.65
03-19-2021 23:10  593         NegCntrl   6.65
03-19-2021 23:20  593         NegCntrl   6.67
03-19-2021 23:30  593         NegCntrl   6.65
03-19-2021 23:40  593         NegCntrl   6.68
03-19-2021 23:50  593         NegCntrl   6.58
03-20-2021 0:00   593         NegCntrl   6.51
03-20-2021 0:10   593         NegCntrl   6.46
03-20-2021 0:20   593         NegCntrl   6.50
03-20-2021 0:30   593         NegCntrl   6.47
03-19-2021 9:50   654         Buffer1    6.80
03-19-2021 10:00  654         Buffer1    6.69
03-19-2021 10:10  654         Buffer1    6.66
03-19-2021 10:20  654         Buffer1    6.66
03-19-2021 10:30  654         Buffer1    6.55
03-19-2021 10:40  654         Buffer1    6.52
03-19-2021 10:50  654         Buffer1    6.58
03-19-2021 11:00  654         Buffer1    6.59
03-19-2021 11:10  654         Buffer1    6.66
03-19-2021 11:20  654         Buffer1    6.70
03-19-2021 11:30  654         Buffer1    6.59
03-19-2021 11:40  654         Buffer1    6.65
03-19-2021 11:50  654         Buffer1    6.73
03-19-2021 12:00  654         Buffer1    6.76
03-19-2021 12:10  654         Buffer1    6.70
03-19-2021 12:20  654         Buffer1    6.69
03-19-2021 12:30  654         Buffer1    6.79
03-19-2021 12:40  654         Buffer1    6.91
03-19-2021 12:50  654         Buffer1    6.63
03-19-2021 13:00  654         Buffer1    6.58
03-19-2021 13:10  654         Buffer1    6.62
03-19-2021 13:20  654         Buffer1    6.69
03-19-2021 13:30  654         Buffer1    6.59
03-19-2021 13:40  654         Buffer1    6.69
03-19-2021 13:50  654         Buffer1    6.65
03-19-2021 10:30  664         Buffer1    7.00
03-19-2021 10:40  664         Buffer1    6.86
03-19-2021 10:50  664         Buffer1    6.76
03-19-2021 11:00  664         Buffer1    6.76
03-19-2021 11:10  664         Buffer1    6.76
03-19-2021 11:20  664         Buffer1    6.73
03-19-2021 11:30  664         Buffer1    6.78
03-19-2021 11:40  664         Buffer1    6.76
03-19-2021 11:50  664         Buffer1    6.79
03-19-2021 12:00  664         Buffer1    6.79
03-19-2021 12:10  664         Buffer1    6.82
03-19-2021 12:20  664         Buffer1    6.82
03-19-2021 12:30  664         Buffer1    6.79
03-19-2021 12:40  664         Buffer1    6.86
03-19-2021 12:50  664         Buffer1    6.82
03-19-2021 13:00  664         Buffer1    6.86
03-19-2021 13:10  664         Buffer1    6.87
03-19-2021 13:20  664         Buffer1    6.92
03-19-2021 13:30  664         Buffer1    6.99
03-19-2021 13:40  664         Buffer1    6.92
03-19-2021 13:50  664         Buffer1    6.86
03-19-2021 14:00  664         Buffer1    6.89
03-19-2021 14:10  664         Buffer1    6.90
03-19-2021 14:20  664         Buffer1    6.93
03-19-2021 18:10  672         NegCntrl   6.99
03-19-2021 18:20  672         NegCntrl   6.98
03-19-2021 18:30  672         NegCntrl   6.92
03-19-2021 18:40  672         NegCntrl   6.97
03-19-2021 18:50  672         NegCntrl   6.94
03-19-2021 19:00  672         NegCntrl   6.92
03-19-2021 19:10  672         NegCntrl   6.85
03-19-2021 19:20  672         NegCntrl   6.90
03-19-2021 19:30  672         NegCntrl   6.91
03-19-2021 19:40  672         NegCntrl   6.94
03-19-2021 19:50  672         NegCntrl   6.91
03-19-2021 20:00  672         NegCntrl   6.81
03-19-2021 20:10  672         NegCntrl   6.94
03-19-2021 20:20  672         NegCntrl   6.92
03-19-2021 20:30  672         NegCntrl   6.92
03-19-2021 20:40  672         NegCntrl   6.94
03-19-2021 20:50  672         NegCntrl   6.87
03-19-2021 21:00  672         NegCntrl   6.92
03-19-2021 21:10  672         NegCntrl   6.91
03-19-2021 21:20  672         NegCntrl   6.92
03-19-2021 21:30  672         NegCntrl   6.92
03-19-2021 21:40  672         NegCntrl   6.91
03-19-2021 21:50  672         NegCntrl   6.92
03-19-2021 9:00   772         Buffer2    7.02
03-19-2021 9:10   772         Buffer2    6.97
03-19-2021 9:20   772         Buffer2    6.94
03-19-2021 9:30   772         Buffer2    6.92
03-19-2021 9:40   772         Buffer2    6.95
03-19-2021 9:50   772         Buffer2    6.97
03-19-2021 10:00  772         Buffer2    7.03
03-19-2021 10:10  772         Buffer2    6.92
03-19-2021 10:20  772         Buffer2    7.01
03-19-2021 10:30  772         Buffer2    7.01
03-19-2021 10:40  772         Buffer2    7.02
03-19-2021 10:50  772         Buffer2    7.01
03-19-2021 11:00  772         Buffer2    7.02
03-19-2021 11:10  772         Buffer2    7.05
03-19-2021 11:20  772         Buffer2    7.08
03-19-2021 11:30  772         Buffer2    7.10
03-19-2021 11:40  772         Buffer2    7.16
03-19-2021 11:50  772         Buffer2    7.10
03-19-2021 12:00  772         Buffer2    7.09
03-19-2021 12:10  772         Buffer2    7.19
03-19-2021 12:20  772         Buffer2    7.24
03-19-2021 12:30  772         Buffer2    7.19
03-19-2021 12:40  772         Buffer2    7.20
03-19-2021 12:50  772         Buffer2    7.24
03-19-2021 13:00  772         Buffer2    7.22
03-19-2021 12:20  915         PosCntrl   6.93
03-19-2021 12:30  915         PosCntrl   6.85
03-19-2021 12:40  915         PosCntrl   6.94
03-19-2021 12:50  915         PosCntrl   6.89
03-19-2021 13:00  915         PosCntrl   6.89
03-19-2021 13:10  915         PosCntrl   6.87
03-19-2021 13:20  915         PosCntrl   6.89
03-19-2021 13:30  915         PosCntrl   6.92
03-19-2021 13:40  915         PosCntrl   6.93
03-19-2021 13:50  915         PosCntrl   7.01
03-19-2021 14:00  915         PosCntrl   6.93
03-19-2021 14:10  915         PosCntrl   6.93
03-19-2021 14:20  915         PosCntrl   7.03
03-19-2021 14:30  915         PosCntrl   6.94
03-19-2021 14:40  915         PosCntrl   6.96
03-19-2021 14:50  915         PosCntrl   6.99
03-19-2021 15:00  915         PosCntrl   6.99
03-19-2021 15:10  915         PosCntrl   6.96
03-19-2021 15:20  915         PosCntrl   6.92
03-19-2021 15:30  915         PosCntrl   6.86
03-19-2021 15:40  915         PosCntrl   6.90
03-19-2021 15:50  915         PosCntrl   7.13
03-19-2021 16:00  915         PosCntrl   6.99
03-19-2021 16:10  915         PosCntrl   6.87
03-19-2021 16:20  915         PosCntrl   6.90
03-19-2021 9:00   942         Buffer2    7.01
03-19-2021 9:10   942         Buffer2    6.78
03-19-2021 9:20   942         Buffer2    6.71
03-19-2021 9:30   942         Buffer2    6.75
03-19-2021 9:40   942         Buffer2    6.72
03-19-2021 9:50   942         Buffer2    6.80
03-19-2021 10:00  942         Buffer2    6.80
03-19-2021 10:10  942         Buffer2    6.71
03-19-2021 10:20  942         Buffer2    6.59
03-19-2021 10:30  942         Buffer2    6.51
03-19-2021 10:40  942         Buffer2    6.62
03-19-2021 10:50  942         Buffer2    6.69
03-19-2021 11:00  942         Buffer2    6.71
03-19-2021 11:10  942         Buffer2    6.71
03-19-2021 11:20  942         Buffer2    6.73
03-19-2021 11:30  942         Buffer2    6.75
03-19-2021 11:40  942         Buffer2    6.72
03-19-2021 11:50  942         Buffer2    6.75
03-19-2021 12:00  942         Buffer2    6.79
03-19-2021 12:10  942         Buffer2    6.78
03-19-2021 12:20  942         Buffer2    6.66
03-19-2021 12:30  942         Buffer2    6.68

