Hello,
I'm working with a dataset from a 5-factor full factorial screening design (32 runs) with 3 centerpoints. The raw response is heavily right-skewed and looks roughly exponential. I've tried log (natural and base-10), square-root, and various Box-Cox transformations, but none of them gives anything close to a normal distribution.
Condition | Pattern | data1 |
1 | +−−+− | 0.59 |
2 | −+−−− | 1.6 |
3 | −−−+− | 1.78 |
4 | +−++− | 0.45 |
5 | +−−−− | 1.37 |
6 | −++−− | 0.87 |
7 | ++−+− | 0.05 |
8 | −−−−− | 4.46 |
9 | ++−−− | 0.14 |
10 | −+−+− | 0.36 |
11 | +++−− | 0.11 |
12 | +−+−− | 0.8 |
13 | −+++− | 0.33 |
14 | ++++− | 0.05 |
15 | −−+−− | 2.43 |
16 | −−++− | 1.53 |
17 | 0 | 0.86 |
18 | 0 | 0.79 |
19 | 0 | 0.9 |
20 | −+−−+ | 0.94 |
21 | +−+++ | 0.87 |
22 | −−+++ | 0.91 |
23 | −−−++ | 0.72 |
24 | −+−++ | 0.05 |
25 | −−+−+ | 2.74 |
26 | −++−+ | 0.72 |
27 | −++++ | 0.08 |
28 | +−+−+ | 2.92 |
29 | −−−−+ | 4.08 |
30 | ++−−+ | 0.88 |
31 | +−−−+ | 3.98 |
32 | +−−++ | 0.82 |
33 | +++++ | 0.08 |
34 | +++−+ | 0.78 |
35 | ++−++ | 0.06 |
1. What kind of transform is appropriate to handle the data set?
2. If no transformation is appropriate, how can the data be modeled (for example, with Fit Model)?
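One thing that may be worth checking before giving up on transformations: the normality assumption in a least-squares fit applies to the model residuals, not to the raw response. In a factorial experiment where the factors have real effects, the pooled response will usually look non-normal whatever transform is applied. Below is a rough sketch in Python (standard library only) that fits a main-effects model to the log response from the table above and compares skewness before and after fitting. The `00000` coding of the centerpoints, the choice of the natural log, and the main-effects-only model are all assumptions made for illustration, not a recommendation for this particular experiment.

```python
import math

# (pattern, response) pairs copied from the table above. ASCII "-" replaces the
# typographic minus, and the centerpoints are coded "00000" (an assumption).
runs = [
    ("+--+-", 0.59), ("-+---", 1.60), ("---+-", 1.78), ("+-++-", 0.45),
    ("+----", 1.37), ("-++--", 0.87), ("++-+-", 0.05), ("-----", 4.46),
    ("++---", 0.14), ("-+-+-", 0.36), ("+++--", 0.11), ("+-+--", 0.80),
    ("-+++-", 0.33), ("++++-", 0.05), ("--+--", 2.43), ("--++-", 1.53),
    ("00000", 0.86), ("00000", 0.79), ("00000", 0.90),
    ("-+--+", 0.94), ("+-+++", 0.87), ("--+++", 0.91), ("---++", 0.72),
    ("-+-++", 0.05), ("--+-+", 2.74), ("-++-+", 0.72), ("-++++", 0.08),
    ("+-+-+", 2.92), ("----+", 4.08), ("++--+", 0.88), ("+---+", 3.98),
    ("+--++", 0.82), ("+++++", 0.08), ("+++-+", 0.78), ("++-++", 0.06),
]

CODE = {"+": 1.0, "-": -1.0, "0": 0.0}
X = [[CODE[c] for c in pattern] for pattern, _ in runs]  # coded factor levels
y = [value for _, value in runs]                         # raw response
z = [math.log(v) for v in y]                             # natural-log response


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5 (0 for a symmetric sample)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Main-effects least-squares fit on the log scale. The coded columns are
# mutually orthogonal and each sums to zero, so every coefficient can be
# computed on its own as sum(x * z) / sum(x * x).
intercept = sum(z) / len(z)
coefs = [
    sum(row[j] * zi for row, zi in zip(X, z)) / sum(row[j] ** 2 for row in X)
    for j in range(5)
]
fitted = [intercept + sum(b * x for b, x in zip(coefs, row)) for row in X]
residuals = [zi - fi for zi, fi in zip(z, fitted)]

print("skewness of raw response: ", round(skewness(y), 2))
print("skewness of log response: ", round(skewness(z), 2))
print("skewness of log residuals:", round(skewness(residuals), 2))
print("main-effect coefficients (log scale):", [round(b, 3) for b in coefs])
```

If the residuals from a fit like this look reasonably symmetric (a normal quantile plot would be a better check than a single skewness number), the transformed response may be adequate for modeling even though its raw histogram is not normal. Interactions are left out here purely to keep the sketch short; a screening analysis would normally consider them as well.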
I think you would be better off asking in the JMP community.