Good morning all
I have a censored data set as below and would like to calculate the 50th, 93rd, 95th and 99th percentiles. The data below has an even number of observations, but the count may sometimes be odd.
I am using SAS 9.4 and Enterprise Guide 7.1
Thanking you in advance
The data is as below:
data Aluminium;
input Name $ Result $;
cards;
Al <5
Al <15
Al <5
Al 28
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 25
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 28
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 25
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
;
run;
I think you can do this by using survival analysis, but I am not an expert in that area. You need to create a binary indicator variable that specifies whether the time was observed or censored. You then create a numerical value from the remaining part of the Result string.
data Have;
set Aluminium;
censored = (substr(Result, 1,1)='<');
w = scan(Result, -1, "<"); /* scans from the right */
t = input(w, best.);
drop w;
run;
After you get the data in this form, look at PROC LIFETEST to analyze the data. For example, the following basic analysis gives the 25th, 50th, and 75th percentiles of the survival time. I do not know the options to get a table of the percentiles that you want, although you can read them off the graph of the survival probability:
ods graphics on;
proc lifetest data=Have; /* Have is the data set created in the DATA step above */
time t*Censored(1);
run;
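One possible way to get percentiles beyond the quartiles, offered as a sketch and not as part of the reply above, is to save the survival estimates with the OUTSURV= option and look up the first time at which the estimated cumulative probability reaches each level. The data set names Surv, Want and Pctl are made up for this example, and the "first time reached" rule is an assumption: PROC LIFETEST's own quartile calculation can differ when the survival estimate equals a level exactly. Keep in mind that the TIME statement treats the flagged values as right-censored, whereas a result such as <5 is a value below a limit, so the estimates should be checked before they are relied on.

```sas
/* Save the product-limit (Kaplan-Meier) estimates instead of printing them */
proc lifetest data=Have outsurv=Surv noprint;
   time t*censored(1);
run;

/* Hypothetical helper data set: the percentile levels asked for in the question */
data Want;
   input pct @@;
   datalines;
50 93 95 99
;
run;

/* For each level, take the smallest t at which 1 - S(t) reaches pct/100.
   A level that is never reached gets a missing estimate. */
proc sql;
   create table Pctl as
   select w.pct, min(s.t) as estimate
   from Want as w
        left join Surv as s
        on s.survival is not missing
           and 1 - s.survival >= w.pct / 100
   group by w.pct;
quit;

proc print data=Pctl noobs;
run;
```

Running PROC LIFETEST without NOPRINT alongside this makes it easy to compare the 50th percentile from Pctl with the median in the procedure's own quartile table.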
@Reeza wrote:
The main problem is that you don't really have a defined point in time, it's more a step type function? I would start by doing a PROC FREQ to get the counts of each level. The percentiles will be based on the frequency table cumulative percents.
You could try a censored approach, but I think in this case the math will work out the same.
But if you want your results in a particular order, you may need to modify the values, because <10 sorts before <5 when the values are character strings. The mix of < values and plain integers is also very problematic.
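The ordering issue can be worked around by sorting on a numeric key before running PROC FREQ. The following is a minimal sketch, assuming the Aluminium data set from the question; the data set name Parsed and the variables censored and value are made up for the example, and placing <x ahead of x at equal values is just one possible convention.

```sas
/* Split each Result into a numeric value and a "<" flag */
data Parsed;
   set Aluminium;
   censored = (substr(Result, 1, 1) = '<');           /* 1 = reported as "<x" */
   value    = input(compress(Result, '<'), best12.);  /* numeric part of Result */
run;

/* Sort numerically; at equal values, "<x" is placed before "x" */
proc sort data=Parsed;
   by value descending censored;
run;

/* ORDER=DATA lists the levels in sorted order, not in character order */
proc freq data=Parsed order=data;
   tables Result;
run;
```

With the levels in numeric order, the cumulative percent column can be read from the smallest reported value to the largest.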
Consider the following proc freq output from your example data step:
                                   Cumulative    Cumulative
Result    Frequency     Percent     Frequency       Percent
-----------------------------------------------------------
25                2        5.00             2          5.00
28                2        5.00             4         10.00
<12               8       20.00            12         30.00
<15               4       10.00            16         40.00
<19               4       10.00            20         50.00
<2                4       10.00            24         60.00
<3                4       10.00            28         70.00
<5               12       30.00            40        100.00