mmohotsi
Obsidian | Level 7

Good morning all

 

I have a censored data set, shown below, and would like to calculate the 50th, 93rd, 95th and 99th percentiles. This sample has an even number of observations, but the count may sometimes be odd.

 

I am using SAS 9.4 and Enterprise Guide 7.1

 

Thanking you in advance

 

The data is as below:

 

data Aluminium;
input Name $ Result $;
cards;
Al <5
Al <15
Al <5
Al 28
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 25
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 28
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
Al <5
Al <15
Al <5
Al 25
Al <5
Al <3
Al <12
Al <19
Al <2
Al <12
;
run;

5 REPLIES
Rick_SAS
SAS Super FREQ

I think you can do this by using survival analysis, but I am not an expert in that area. You need to create a binary indicator variable that specifies whether the time was observed or censored. You then create a numerical value from the remaining part of the Result string.

 

data Have;
set Aluminium;
censored = (substr(Result, 1,1)='<');
w = scan(Result, -1, "<");  /* scans from the right */
t = input(w, best.);
drop w;
run;

After you get the data in this form, look at PROC LIFETEST to analyze the data. For example, the following basic analysis gives the 25th, 50th, and 75th percentiles of the survival time. I do not know the options to get a table of the percentiles that you want, although you can read them off the graph of the survival probability:

 

ods graphics on;
proc lifetest data=Have;
time t*Censored(1);
run;
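
For the specific percentiles asked about (50th, 93rd, 95th and 99th), one simple alternative is sketched below. It is not a survival-analysis method: it assumes each censored value is replaced by its reporting limit (so "<5" counts as 5) and lets PROC UNIVARIATE compute the percentiles through PCTLPTS=. Wherever a percentile falls among censored values, the result is therefore only an upper bound. The output data set name Pctls is a placeholder.

```sas
/* Substitution sketch: uses t from the Have data set above,      */
/* so a censored "<5" is counted as 5 (an assumption, not a fix). */
proc univariate data=Have noprint;
   var t;
   /* PCTLPRE=P names the output variables P50, P93, P95, P99 */
   output out=Pctls pctlpts=50 93 95 99 pctlpre=P;
run;

proc print data=Pctls noobs;
run;
```

The 93rd, 95th and 99th percentiles are less affected by the substitution when the largest results are uncensored, as the values 25 and 28 are in the sample data.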

 

mmohotsi
Obsidian | Level 7
Good morning SAS Super FREQ

The first part of the solution gave me a useful clue. Thank you for the quick response.
Regards
MMohotsi
Reeza
Super User
The main problem is that you don't really have a defined point in time, it's more a step type function? I would start by doing a PROC FREQ to get the counts of each level. The percentiles will be based on the frequency table cumulative percents.

You could try a censored approach, but I think in this case the math will work out the same.
ballardw
Super User

@Reeza wrote:
The main problem is that you don't really have a defined point in time, it's more a step type function? I would start by doing a PROC FREQ to get the counts of each level. The percentiles will be based on the frequency table cumulative percents.

You could try a censored approach, but I think in this case the math will work out the same.

But if you want the results in a particular order, you may need to modify the values, because <10 would sort before <5 when compared as character values. The mix of < prefixes and plain integers is also very problematic.

Consider the following proc freq output from your example data step:

                                   Cumulative    Cumulative
Result    Frequency     Percent     Frequency      Percent
-----------------------------------------------------------
25               2        5.00             2         5.00
28               2        5.00             4        10.00
<12              8       20.00            12        30.00
<15              4       10.00            16        40.00
<19              4       10.00            20        50.00
<2               4       10.00            24        60.00
<3               4       10.00            28        70.00
<5              12       30.00            40       100.00
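
One possible way around the character ordering, sketched under the assumption that every Result is either a plain number or a number prefixed with "<": split Result into a numeric value and a censoring flag, sort on those, and ask PROC FREQ to keep the data order. The names Ordered, value and censored are placeholders.

```sas
/* Split Result into a numeric value and a censoring flag */
data Ordered;
   set Aluminium;
   censored = (substr(Result, 1, 1) = '<');
   value = input(compress(Result, '<'), best12.);
run;

/* Sort numerically; at equal values a censored "<x" goes before x */
proc sort data=Ordered;
   by value descending censored;
run;

/* ORDER=DATA lists the levels in the order they appear in the data */
proc freq data=Ordered order=data;
   tables Result;
run;
```

With the levels in numeric order, the cumulative percent column can be read against the percentile of interest, although a percentile that lands on a censored level is still only known to be below that limit.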


mmohotsi
Obsidian | Level 7
Good day

I think the challenge with this approach is the ordering of the field "Result". Combining the response from Rick_SAS with the two latest replies might give a light at the end of the tunnel.
