Dear All:
I need help using both PROC SURVEYFREQ and PROC SURVEYMEANS for the following analysis.
Thank you all in advance for your help.
I have the following variables:
Stratum1 (1,2)
Months (0,1,2)
X1 (1=yes,2=No) ---- nominal categorical
X2 ------ interval continuous
Weight1 ---- weight variable
I need to:
(1) test the equality of the proportions of ones in X1 across months (0,1,2), and conduct a multiple comparison test for these proportions.
(2) test the equality of the averages of X2 across months (0,1,2), and conduct a multiple comparison test for these means.
(3) test for trend: for example, whether the averages (X2) or the proportions of ones (X1) increase or decrease across months.
Any help will be appreciated.
Please email me copy of your answer.
With many thanks
Steve
Email: sstoline@gmail.com
data1--- as an example
=============================
X2 X1 stratum1 months weight1
29.4822 2 2 0 16.61921705
33.2849 2 2 0 16.61921705
16.9370 1 1 0 2.09612124
19.0055 1 1 0 2.09612124
21.1151 1 2 0 16.61921705
25.4055 2 2 0 16.61921705
24.4164 1 1 0 1.56167161
19.0767 1 1 0 1.56167161
23.1562 2 1 0 2.09612124
23.3479 1 1 0 2.09612124
21.7370 1 1 0 2.09612124
24.9726 2 2 0 16.61921705
20.3836 1 2 0 16.61921705
20.0575 1 2 0 20.88637405
17.6603 1 2 0 16.61921705
25.2274 2 2 0 16.61921705
20.9644 1 1 0 2.09612124
32.8055 2 2 0 14.51497509
39.5233 2 1 0 1.56167161
17.7288 1 1 1 2.09612124
18.3096 1 2 1 16.61921705
25.8055 2 2 1 16.61921705
30.0904 1 2 1 16.61921705
27.8082 2 1 1 2.09612124
37.3863 2 1 1 2.09612124
26.2548 2 1 1 2.09612124
17.8795 1 2 1 20.88637405
19.8000 1 2 1 16.61921705
28.6932 2 1 1 1.56167161
27.1041 2 2 1 16.61921705
36.1836 2 2 1 16.61921705
19.8192 1 1 1 2.09612124
31.9644 2 2 1 20.88637405
27.0740 1 1 1 2.09612124
23.5288 2 1 1 2.09612124
21.9068 1 2 1 14.51497509
20.3534 1 2 1 16.61921705
29.4767 2 2 1 14.51497509
22.8274 1 2 1 16.61921705
21.2685 1 2 1 20.88637405
31.8932 1 1 1 1.56167161
31.8795 2 1 1 2.09612124
21.2630 2 1 1 2.09612124
18.7562 2 1 1 2.09612124
16.8822 1 2 1 20.88637405
22.0164 2 2 2 20.88637405
27.4959 2 1 2 2.09612124
27.4904 2 1 2 1.56167161
2.7096 2 2 2 14.51497509
23.0027 1 2 2 16.61921705
19.2767 1 2 2 16.61921705
30.4466 1 2 2 16.61921705
36.5425 2 2 2 16.61921705
32.2521 2 1 2 1.56167161
36.9534 1 2 2 14.51497509
30.4164 2 2 2 14.51497509
20.2767 2 1 2 2.09612124
17.2356 1 2 2 20.88637405
20.8247 2 2 2 16.61921705
20.8795 1 2 2 16.61921705
27.5260 2 2 16.61921705
19.2493 1 2 2 16.61921705
17.8521 1 2 2 16.61921705
19.6822 1 2 2 16.61921705
36.8356 2 1 2 1.5616716
Well, rather than SURVEYMEANS, I would check out SURVEYREG. The CLASS, TEST, and LSMESTIMATE statements should go a long way toward giving the answers you are working towards in #2 and #3. For #1, I assume that X1 would be the dependent variable. The SURVEYFREQ documentation, especially Example 90.1, should give you the code you need.
Steve Denham
Dear Steve:
First I really thank you very much for your help.
Clarification:
I am not sure what you meant by "For #1, I assume that X1 would be the dependent variable". If so, should the table statement look like
table X1*months/ row nowt wchisq;
or
table months*X1/ row nowt wchisq;
many thanks
Steven
I would go with the latter. The example (90.1) puts the response as the last variable.
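For reference, a minimal sketch of the full call for #1, assuming the sample data set data1 posted above. CHISQ requests the Rao-Scott chi-square test of association between month and X1, which is one way to test whether the proportion of ones is the same across months; WCHISQ (the Wald chi-square from the question) could be swapped in. This call does not produce pairwise comparisons between months.

```sas
proc surveyfreq data=data1;
  strata stratum1;
  weight weight1;
  /* months = rows, X1 (the response) = columns;
     ROW adds row percents, NOWT hides the weighted frequencies */
  tables months*X1 / row nowt chisq;
run;
```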
Steve Denham
Dear Steve:
I tried to use both the TEST and LSMESTIMATE statements, but I could not make them work. I could not find an example for the TEST statement, and I could not get the LSMESTIMATE statement to run. Could you help me with some simple code using the sample data above?
thank you very much
steven
Here's a first try. I don't think you will need the TEST statement.
/* TOTAL= is the total number of PSUs you sampled from (it can also be a data set); I used 640, replace it with your own value */
proc surveyreg data=data1 total=640;
strata stratum1;
class months;
model X2 = months;
weight weight1;
lsmeans months / diff;
/* comparisons against month 0, adjusted for multiplicity; pick your own ADJUST= method */
lsmestimate months '0 v 1' -1 1 0,
'0 v 2' -1 0 1 / adjust=simulate;
/* linear trend across the three months */
lsmestimate months 'Linear trend' -1 0 1;
run;
This gives the multiple comparison in the first LSMESTIMATE statement, and the linear trend in the second. I set the total PSU to 640 for my analysis, and used adjust=simulate. (Oh and I had to modify the data. In the fifth record from the bottom, there is a missing value, and I had no idea whether it was for X1 or stratum. I inserted a 2).
Effect | Label | Estimate | Standard Error | DF | t Value | Pr > t | Adj P |
---|---|---|---|---|---|---|---|
months | 0 v 1 | -0.3734 | 1.9941 | 64 | -0.19 | 0.8521 | 0.9760 |
months | 0 v 2 | -1.5843 | 2.3346 | 64 | -0.68 | 0.4998 | 0.7297 |

Adjustment for multiplicity

Linear trend (same columns, no Adj P):

Effect | Label | Estimate | Standard Error | DF | t Value | Pr > t |
---|---|---|---|---|---|---|
months | Linear trend | -1.5843 | 2.3346 | 64 | -0.68 | 0.4998 |
I hope this helps.
Steve Denham
Dear Steve:
Thank you very much. It helped me a lot.
A few more things:
1- Is there a way to plot the confidence intervals for the means with PROC SURVEYMEANS? If so, I need help with this.
proc surveymeans data = data1;
strata stratum1;
weight weight1;
domain months;
var x2;
run;
2- Same question for PROC SURVEYFREQ: is there a way to plot the confidence intervals for the proportions? Also, how do I get the CIs for the proportions in the SAS output?
proc surveyfreq data = data1;
strata stratum1;
weight weight1;
table months*x1;
run;
something like:
once again, I really thank you very much for all your help.
Steven
Hi Steven,
For SURVEYFREQ, check the PLOTS= option in the TABLES statement. To get CI's reported/calculated, add the /CL option to the TABLES statement.
For SURVEYMEANS, I would add the PERCENTILES = or QUANTILES = options in the PROC statement. Unfortunately, SURVEYMEANS doesn't have an obvious PLOTS= option that I could find anywhere. You would probably have to use an ODS OUTPUT statement to export the statistics and quantile datasets, then manipulate them some, and finally use PROC SGPLOT.
Or you could use SURVEYREG and do a domain analysis, or a main effects model and use the CL and PLOTS= options in the LSMEANS statement.
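As a rough sketch of the ODS OUTPUT route for the means: the code below asks SURVEYMEANS for the domain means with confidence limits (the CLM keyword), saves the domain table, and plots it with SGPLOT. The ODS table name Domain and the column names Mean, LowerCLMean, and UpperCLMean are what I would expect here, but treat them as assumptions and check the saved data set with PROC CONTENTS first; domain_means is just an illustrative name.

```sas
ods output Domain=domain_means;      /* save the domain statistics table */
proc surveymeans data=data1 mean clm;
  strata stratum1;
  weight weight1;
  domain months;
  var x2;
run;

proc sgplot data=domain_means;
  /* point = domain mean, error bars = confidence limits */
  scatter x=months y=Mean / yerrorlower=LowerCLMean yerrorupper=UpperCLMean;
  xaxis label='Month' type=discrete;
  yaxis label='Mean of X2';
run;
```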
Steve Denham
Dear Steve:
I was able to get the graph of the confidence intervals for the means for 11 months with PROC SURVEYREG. It looks great, many thanks.
However, I could not get the graph of the confidence intervals for the percents with PROC SURVEYFREQ.
I need a plot of the confidence intervals for the percent of ones in X1 in each month. That is, there will be three confidence intervals: the CI for the percent of ones in X1 in month 0, in month 1, and in month 2.
I still need help with this part.
Also, if I have 11 months, how do I set up the test for trend? I could not make it work.
Sorry for bothering
many thanks
Steven
Let's do the last first. For 11 months, a linear trend in the means would be:
LSMESTIMATE months 'Linear trend' -5 -4 -3 -2 -1 0 1 2 3 4 5/cl;
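The coefficients are simply equally spaced scores centered at zero, which assumes the months are equally spaced and appear in order as CLASS levels. A small sketch that prints them to the log for k levels (k = 11 comes from the question; the variable names are only for illustration):

```sas
data _null_;
  k = 11;                    /* number of month levels */
  do i = 1 to k;
    c = i - (k + 1) / 2;     /* centered score; the scores sum to zero */
    put c @;                 /* trailing @ keeps all values on one log line */
  end;
  put;                       /* release the held line */
run;
```

For an even number of levels the scores come out as half-integers; multiplying them all by 2 gives integer coefficients for the same contrast.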
To get the plots in SURVEYFREQ, make sure ODS GRAPHICS ON; precedes the call:
ods graphics on;
proc surveyfreq data = data1;
strata stratum1;
weight weight1;
table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot orient=vertical);
run;
Drawing lines to connect looks like it will require major editing of the plots. You may want to ask in the Graphics forum for help on that, if it is really needed.
Steve Denham
Dear Steve:
I think this will be the last thing on this issue (I hope so). I tried the code you showed me above and did my best to play with the PLOTS= option, but it gives me the following graph.
Is there a way to get just the percents of ones in X1 for the three months (0,1,2) in one graph, like the one from PROC SURVEYREG for the means shown in my previous post? Also, I could not see what (only) in the PLOTS= option added to the plot.
By the way, is it possible to conduct a trend test in PROC SURVEYFREQ? If so, please let me know how to do it.
once again many thanks
Steven
This is what I tried:
data data1;
input X2 X1 stratum1 months weight1;
datalines;
29.4822 2 2 0 16.61921705
33.2849 2 2 0 16.61921705
16.9370 1 1 0 2.09612124
19.0055 1 1 0 2.09612124
21.1151 1 2 0 16.61921705
25.4055 2 2 0 16.61921705
24.4164 1 1 0 1.56167161
19.0767 1 1 0 1.56167161
23.1562 2 1 0 2.09612124
23.3479 1 1 0 2.09612124
21.7370 1 1 0 2.09612124
24.9726 2 2 0 16.61921705
20.3836 1 2 0 16.61921705
20.0575 1 2 0 20.88637405
17.6603 1 2 0 16.61921705
25.2274 2 2 0 16.61921705
20.9644 1 1 0 2.09612124
32.8055 2 2 0 14.51497509
39.5233 2 1 0 1.56167161
17.7288 1 1 1 2.09612124
18.3096 1 2 1 16.61921705
25.8055 2 2 1 16.61921705
30.0904 1 2 1 16.61921705
27.8082 2 1 1 2.09612124
37.3863 2 1 1 2.09612124
26.2548 2 1 1 2.09612124
17.8795 1 2 1 20.88637405
19.8000 1 2 1 16.61921705
28.6932 2 1 1 1.56167161
27.1041 2 2 1 16.61921705
36.1836 2 2 1 16.61921705
19.8192 1 1 1 2.09612124
31.9644 2 2 1 20.88637405
27.0740 1 1 1 2.09612124
23.5288 2 1 1 2.09612124
21.9068 1 2 1 14.51497509
20.3534 1 2 1 16.61921705
29.4767 2 2 1 14.51497509
22.8274 1 2 1 16.61921705
21.2685 1 2 1 20.88637405
31.8932 1 1 1 1.56167161
31.8795 2 1 1 2.09612124
21.2630 2 1 1 2.09612124
18.7562 2 1 1 2.09612124
16.8822 1 2 1 20.88637405
22.0164 2 2 2 20.88637405
27.4959 2 1 2 2.09612124
27.4904 2 1 2 1.56167161
2.7096 2 2 2 14.51497509
23.0027 1 2 2 16.61921705
19.2767 1 2 2 16.61921705
30.4466 1 2 2 16.61921705
36.5425 2 2 2 16.61921705
32.2521 2 1 2 1.56167161
36.9534 1 2 2 14.51497509
30.4164 2 2 2 14.51497509
20.2767 2 1 2 2.09612124
17.2356 1 2 2 20.88637405
20.8247 2 2 2 16.61921705
20.8795 1 2 2 16.61921705
27.5260 1 2 2 16.61921705
19.2493 1 2 2 16.61921705
17.8521 1 2 2 16.61921705
19.6822 1 2 2 16.61921705
36.8356 2 1 2 1.5616716
;
run;
ods graphics on;
proc surveyfreq data = data1;
strata stratum1;
weight weight1;
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot); /* BARCHART */
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent); */
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot orient=vertical); */
table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot orient=vertical);
run;
ods graphics off;
Steven,
I think the only way to get a plot of just the percents of ones would be to use ODS Output, and PROC SGPLOT.
ods graphics on;
ods output wtfreqplot=wtfreqplot;
proc surveyfreq data = data1;
strata stratum1;
weight weight1;
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot); /* BARCHART */
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent); */
/* table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot orient=vertical); */
table months*x1/clwt plots(only)=wtfreqplot(scale=percent type=dotplot orient=vertical);
run;
proc sgplot data=wtfreqplot;
where _column='1';
highlow x=_row high=u_percent low=l_percent
/ close=_percent legendlabel="Percent";
series x=_row y=_percent;
xaxis label='Month';
yaxis label='Percent';
run;
The ODS OUTPUT statement and the PROC SGPLOT step give a plot that I think looks something like what you want. Unfortunately, I can't figure out how to suppress the two legends so as to give only one. Maybe someone else can do that.
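One possible way to handle the legends, offered as a sketch rather than a tested fix: the NOAUTOLEGEND option on the PROC SGPLOT statement turns off the automatic legend altogether. The data set and variable names are the same ones used in the step above.

```sas
proc sgplot data=wtfreqplot noautolegend;   /* NOAUTOLEGEND drops the automatic legend */
  where _column='1';
  highlow x=_row high=u_percent low=l_percent / close=_percent;
  series x=_row y=_percent;
  xaxis label='Month';
  yaxis label='Percent';
run;
```

If a single legend entry is wanted instead of none, naming one plot with NAME= and listing only that name in a KEYLEGEND statement is another option.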
Steve Denham
Dear Steve:
Thank you very much for everything. I think I have what I need at this point.
Once again many thanks
Steven
Dear Steve:
I need some help with a similar data set: how do I compare averages and proportions with PROC SURVEYMEANS and PROC SURVEYFREQ?
The data contains the following variables:
x1 = mother age
x2 = father age
x3 = (yes, no) = (1,2)
x4 = (yes, no) = (1,2)
x5 = (yes, no) = (1,2)
x6 = (yes, no) = (1,2)
x7 = stratum
x8 = weight
this is part of the data can be used as an example:
=====================================
x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 |
38.8466 | 1 | 1 | 1 | 1 | 2 | 16.61921705 | |
22.9233 | 25 | 1 | 2 | 2 | 16.61921705 | ||
35.4849 | 34 | 2 | 1 | 2 | 2 | 2 | 14.51497509 |
18.9863 | 19 | 1 | 1 | 2 | 2 | 2 | 20.88637405 |
27.9425 | 23 | 2 | 1 | 2 | 2 | 2 | 16.61921705 |
23.0904 | 25 | 1 | 1 | 2 | 2 | 2 | 20.88637405 |
26.0055 | 34 | 2 | 2 | 2 | 2 | 1 | 1.56167161 |
28.6411 | 38 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
38.4986 | 42 | 1 | 2 | 2 | 2 | 2 | 16.61921705 |
31.1288 | 32 | 2 | 2 | 2 | 2 | 1 | 1.56167161 |
36.9534 | 36 | 2 | 2 | 2 | 2 | 2 | 14.51497509 |
22.1425 | 40 | 2 | 2 | 2 | 2 | 2 | 16.61921705 |
29.7890 | 32 | 2 | 2 | 2 | 2 | 2 | 14.51497509 |
28.6904 | 30 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
20.0877 | 32 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
31.6192 | 30 | 2 | 2 | 2 | 2 | 1 | 1.56167161 |
23.9836 | 24 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
18.6110 | 26 | 1 | 1 | 2 | 2 | 2 | 20.88637405 |
26.5151 | 28 | 1 | 1 | 2 | 2 | 1 | 2.09612124 |
29.2630 | 29 | 2 | 1 | 1 | 2 | 1 | 1.56167161 |
19.6630 | 19 | 2 | 2 | 2 | 1 | 2 | 16.61921705 |
17.4466 | 17 | 1 | 2 | 1 | 2.09612124 | ||
35.3342 | 41 | 2 | 2 | 2 | 2 | 2 | 14.51497509 |
29.6466 | 22 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
21.1890 | 30 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
25.4055 | 40 | 1 | 1 | 2 | 2 | 2 | 16.61921705 |
19.0164 | 18 | 1 | 2 | 2 | 16.61921705 | ||
38.4877 | 32 | 2 | 2 | 1 | 2.09612124 | ||
31.0630 | 32 | 2 | 2 | 2 | 2 | 2 | 14.51497509 |
22.5616 | 24 | 2 | 2 | 2 | 2 | 1 | 1.56167161 |
26.3178 | 36 | 2 | 2 | 2 | 2 | 2 | 14.51497509 |
21.6959 | 20 | 2 | 2 | 2 | 2 | 1 | 2.09612124 |
23.7479 | 23 | 2 | 2 | 2 | 1 | 2 | 16.61921705 |
19.2658 | 23 | 1 | 1 | 2 | 2 | 1 | 2.09612124 |
I need help with two things:
(1) compare the means of the variables X1 and X2 (continuous variables) in Proc Surveymeans.
case 1: independent data
case 2: paired (matched) data.
(2) Compare the proportions of ones in the two variables X3 and X4 in Proc Surveyfreq.
(3) Similar to (2) compare the proportions of ones in the two variables X5 and X6 in Proc Surveyfreq. Once I know how to do (2), I will be able to do (3)
thank you very much in advance
steven
email: sstoline@gmail.com
To compare means under model assumptions, you'll need to use PROC SURVEYREG. If you are interested in the raw means, then Example 92.2 in the SURVEYMEANS documentation should cover what you are looking for under (1). For case 2 (paired data), I think you will have to create another variable that indicates pair membership, and use SURVEYREG with it as a CLASS variable.
The latter questions depend on what you are trying to do. If it is just a matter of the proportion of ones in X3 and X4 separately, then Example 90.1 in the SURVEYFREQ documentation is what is needed. However, if you want the proportion of ones in X3 and X4 combined, you will probably have to restructure your dataset to combine the variables.
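For the paired case, one common alternative to a pair-membership CLASS variable is to analyze the within-pair difference directly. A sketch, assuming each row is one mother/father pair; data2 is a placeholder name for the second data set, and pairs and d are new names introduced only for this example:

```sas
data pairs;
  set data2;        /* data2 = placeholder name for the data set described above */
  d = x1 - x2;      /* mother age minus father age; missing if either is missing */
run;

proc surveymeans data=pairs mean stderr clm t;
  strata x7;        /* x7 = stratum */
  weight x8;        /* x8 = weight */
  var d;            /* the T keyword tests H0: mean difference = 0 */
run;
```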
Steve Denham