Hi all,
I am preparing a report on how a set of users interact with an online reporting tool. I have split the users into 3 groups (G1, G2, and G3) and am comparing them on about 20 metrics. Data collection runs from 2019 to 2022, but not every participant has data in every interval of that period. Each subject seems to have its own reporting periods, which do not line up with those of other subjects. I used Tableau and a date scaffold to create the following plots for three of the metrics that I have to report on. The date scaffold is very similar to the idea presented here: https://tarsolutions.co.uk/blog/tableau-scaffolding-dates-calculating-deferred-revenue/
Some plots are clearly different between the 3 groups (like the green plots). Others are not really that different (like the yellow line plots). I have attached a sample of my data, containing a few users, to this post. The original data file included Reporting_Period_Start_Date and Reporting_Period_End_Date, and the quantity being compared was "Value", defined as "Numerator"/"Denominator". I derived Date_scaf and the corresponding Daily_Numerator, Daily_Denominator, and AVGs from the original data.
So my question is: I can see that the green line plots differ a lot between the 3 groups, but is there a statistical test for this difference?
I tried using ANOVA:
proc anova data=SAS_Stat_Q;
  class Group;
  model AVGs = Group;
  /* Levene's test for equal variances plus Welch's ANOVA */
  means Group / hovtest welch;
run;
But this is not what I need. The test takes each group's mean to be the sum of AVGs over all records in that group divided by the number of records in that group (a classic mean). What I need is a test that works date by date in Date_scaf. For example, 11/26/2020 gives the following table:
Date_scaf | Reporting_Period_End_Date | Reporting_Period_Start_Date | AVGs | Daily_Denominator | Daily_Numerator | Denominator | Numerator | Value | personID | Group |
11/26/2020 | 11/28/2020 | 11/1/2020 | 10.54166 | 0.285714 | 3.011903 | 8 | 84.3333 | 10.54166 | person009 | G3 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 12.73263 | 0.857143 | 10.91369 | 24 | 305.5832 | 12.73263 | person004 | G3 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 20.76333 | 0.714286 | 14.83095 | 20 | 415.2666 | 20.76333 | person007 | G3 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 1.356139 | 0.678571 | 0.920237 | 19 | 25.76664 | 1.356139 | person011 | G2 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 2.133332 | 0.071429 | 0.152381 | 2 | 4.266664 | 2.133332 | person021 | G2 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 11.18214 | 0.5 | 5.59107 | 14 | 156.5499 | 11.18214 | person010 | G1 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 7.892497 | 0.714286 | 5.637498 | 20 | 157.8499 | 7.892497 | person055 | G1 |
11/26/2020 | 11/28/2020 | 11/1/2020 | 6.466663 | 0.321429 | 2.07857 | 9 | 58.19996 | 6.466663 | person047 | G1 |
So for G1, the average that I want for this date is sum(Daily_Numerator for persons 010, 055, 047)/sum(Daily_Denominator for persons 010, 055, 047). This would be 13.30713/1.5357 = 8.6652.
Similar averages should be calculated for groups G2 and G3 and the results of all three to be compared.
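To make the per-date group average above concrete, here is a minimal sketch of how it could be computed in SAS. It assumes the scaffolded data set is the SAS_Stat_Q used in the PROC ANOVA call and that it contains the columns Date_scaf, Group, Daily_Numerator, and Daily_Denominator from the table. The names Group_Daily, Sum_Num, Sum_Den, and Group_Ratio are made up for illustration. This only builds the ratio-of-sums series for each group; it is not a statistical test in itself.

```sas
/* Sketch only: SAS_Stat_Q and its column names are taken from the post;   */
/* Group_Daily, Sum_Num, Sum_Den and Group_Ratio are hypothetical names.   */

/* Sum the daily numerators and denominators for each date and group.     */
/* NWAY keeps only the rows for the full Date_scaf x Group combination.   */
proc summary data=SAS_Stat_Q nway;
  class Date_scaf Group;
  var Daily_Numerator Daily_Denominator;
  output out=Group_Daily (drop=_type_ _freq_)
         sum=Sum_Num Sum_Den;
run;

/* Ratio of sums: one value per date per group.                           */
data Group_Daily;
  set Group_Daily;
  /* Guard against dividing by zero or by a missing denominator. */
  if Sum_Den > 0 then Group_Ratio = Sum_Num / Sum_Den;
run;
```

For the 11/26/2020 rows shown in the table, the G1 row of Group_Daily should come out at about 8.665, in line with the hand calculation above.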
Does anyone know what statistical test should be used here?
Hello,
Do you have SAS/ETS (part of SAS Econometrics in SAS Viya)?
You could use PROC SIMILARITY to compare time series.
proc similarity has dynamic time warping (DTW) so time series can have :
SAS/ETS 15.2 User's Guide
The SIMILARITY Procedure
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/etsug/etsug_similarity_syntax02.htm
See also here :
Fundamentals of Statistical Consulting
Week 10 Comparison of two time series
https://www.maths.usyd.edu.au/u/jchan/Consult/W10_CompareTwoTimeSeries.pdf
BR,
Koen