Solved: Re: Help interpreting linear trend

rogersaj · Posted 09-22-2017 03:01 PM

Hi,

Dependent variable = gestational age at first clinic visit (continuous) = p1v1_gest

Independent variable = village distance from clinic = distcat

I'd like to know if the gestational age at first clinic visit varies LINEARLY with village distance from clinic.

Code:

proc glm data = ...;
  class distcat;
  model p1v1_gest = distcat / solution;
  estimate "Linear trend for distcat" distcat -3 -1 1 3;
  contrast 'linear' distcat -3 -1 1 3;
run;

Results: (see attached)

I have the following questions (If you can only answer one, the first is most important!):

Can you please help me interpret the output as regards the "Linear trend for distcat"?
How does the "estimate" result differ from the "contrast" result in terms of interpretation?
Was the use of -3 -1 1 3 appropriate or am I supposed to put in the median values and make them sum to zero?
Once I add in covariates into the model, how will that change my interpretation?
If, instead of distance, I had three marital status categories (Single, Currently Married, Previously Married) could I still test for a linear trend? If so, how would that interpretation work?

Thanks SO much!

AJ

sld · Posted 09-23-2017 03:57 PM

This SAS note discusses the selection of coefficients for linear trend

http://support.sas.com/kb/22/912.html

Note that {-3, -1, 1, 3} are appropriate for a categorical factor with 4 equally spaced levels that are in order from smallest to largest ordinal value. Based on the labels for distcat in your output, it's not apparent that the levels of distcat are evenly spaced, or in fact, what value should be associated with each level of distcat. And the levels clearly are not in order, which is the point made by @PGStats.
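If the levels are ordered but not evenly spaced, one common way to get linear-trend coefficients is to assign each level a representative value (a midpoint or median distance, say) and subtract the mean, so the coefficients sum to zero. Here is a minimal sketch of that arithmetic. The four distances are made-up placeholders, not values from the attached output, and PROC IML requires SAS/IML.

```sas
/* Hypothetical representative distances for the 4 levels, listed from
   nearest to farthest. These are placeholders, not the poster's values. */
proc iml;
  x = {2, 5, 10, 20};
  c = x - mean(x);      /* centered values: coefficients that sum to zero */
  print x c;
quit;
```

With these placeholder values the mean is 9.25, so the coefficients are -7.25 -4.25 0.75 10.75. They would be typed into the ESTIMATE and CONTRAST statements in the same order in which PROC GLM lists the distcat levels.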

The ESTIMATE and CONTRAST statements both test H0: no linear trend. The p-values are the same, and t^2 for ESTIMATE is equal to F for CONTRAST. In other words, they are the same test. The Estimate value reported by ESTIMATE is the linear combination of the distcat means using the specified coefficients; notably, it is not an estimate of the slope.
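To see both points on something runnable, here is a small sketch on simulated data (not the poster's data). It assumes four evenly spaced levels coded 1 to 4. With coefficients c = {-3, -1, 1, 3} and level scores x = {1, 2, 3, 4}, the sum of c*x is 10, so dividing the linear combination by 10 rescales it to the unweighted least-squares slope of the four level means per one-category step. The DIVISOR= option on ESTIMATE does that division.

```sas
/* Simulated toy data, NOT the poster's data: 4 ordered, evenly spaced levels */
data toy;
  call streaminit(2017);
  do distcat = 1 to 4;
    do i = 1 to 25;
      p1v1_gest = 14 + 1.5*distcat + rand('normal', 0, 4);
      output;
    end;
  end;
  drop i;
run;

proc glm data=toy;
  class distcat;
  model p1v1_gest = distcat / solution;
  /* Same 1-df hypothesis, reported as t by ESTIMATE and as F by CONTRAST */
  estimate 'Linear trend' distcat -3 -1 1 3;
  contrast 'Linear trend' distcat -3 -1 1 3;
  /* Same combination divided by sum(c*x) = 10: change in mean per category step */
  estimate 'Slope per category step' distcat -3 -1 1 3 / divisor=10;
run;
quit;
```

The two ESTIMATE rows should show the same t and p-value, since dividing by a constant rescales the estimate and its standard error together. Only the units of the Estimate column change.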

I won't speculate about the impact of adding covariates. It depends on what model you specify, for example, whether you include interaction between distcat and covariates.

It's not obvious to me that Single, Currently Married, and Previously Married are ordered values. A linear trend makes sense only for ordered values.

If you have actual distance values, you could and most likely should use those values in a regression model rather than categorizing distance to use in an ANOVA model. You needlessly give up information by categorizing, not to mention the arbitrariness of deciding how many categories to use and where to put the cutpoints.
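A minimal sketch of that regression, assuming the raw distance is available. The dataset name visits and the variable distance_km are hypothetical names used only for illustration.

```sas
/* visits and distance_km are hypothetical names, not from the original post */
proc glm data=visits;
  /* No CLASS statement: distance enters as a continuous predictor */
  model p1v1_gest = distance_km / clparm;
run;
quit;
```

Here the parameter estimate for distance_km is a slope in the usual sense (change in mean gestational age at first visit per unit of distance), and CLPARM adds its confidence limits.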

HTH


PGStats · Posted 09-22-2017 11:40 PM

Your output seems to indicate that your distcat levels are not ordered properly in your tests to represent a linear effect. Specifying option ORDER=DATA in the proc statement might help you solve this problem. Anyway, you should use the E option in your contrast and estimate statements to check the ordering of distcat levels.
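Something along these lines could work as a sketch. ORDER=DATA uses the order in which the levels first appear in the input, so the data need to be sorted into the intended order first. The dataset name visits and the numeric ordering variable distord (1 = nearest category, 4 = farthest) are hypothetical.

```sas
/* visits and distord are hypothetical names; distord is assumed to be a
   numeric code running from the nearest to the farthest distance category */
proc sort data=visits out=visits_sorted;
  by distord;
run;

proc glm data=visits_sorted order=data;
  class distcat;
  model p1v1_gest = distcat / solution;
  /* The E option prints the coefficient vector so you can see which
     distcat level each of -3 -1 1 3 is actually applied to */
  estimate 'Linear trend for distcat' distcat -3 -1 1 3 / e;
  contrast 'linear' distcat -3 -1 1 3 / e;
run;
quit;
```

The Class Level Information table and the E output together show whether -3 lands on the nearest category and 3 on the farthest.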

PG
