Hello, I am new to SAS. I am creating a model that looks at the mean HIV viral load over time by gender, race/ethnicity, age group, residency, HIV risk exposure, and first (baseline) CD4 count. I know how to use the CLASS statement and the REF= option to look at the difference in viral load for a variable with only 2 levels. I want to compare results for a variable with more than 2 levels. For example, race/ethnicity has 3 levels: Black, White, and Other. So I want to compare Black to White, Black to Other, and White to Other. Because I set White as the reference, PROC MIXED compares Black vs White and Other vs White. But I am trying to get Black vs Other without creating a new model.
My proc mixed SAS code is below:
*sort data set with average viral load;
proc sort data = avgvlqrt out= avgvlqrt_sorted nodupkey;
by rfa_id time avgvl;
run;
*Proc mix to look at youth mean viral load by residence, cd4, sex, race, and gender ;
PROC MIXED data= avgvlqrt_sorted covtest noclprint noitprint method=reml PLOTS(MAXPOINTS=NONE);
class sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="White") agegroupdx (ref="Adolesce") risk_hars_n (ref="Heterosexual") cd4base_dx (ref="< 200");
model avgvl= time sex_hars res_at_hiv_dx race_combined risk_hars_n agegroupdx cd4base_dx / solution;
random time / subject=rfa_id Type= un vcorr;
RUN;
You can add the LSMEANS statement in your PROC MIXED program. For example,
lsmeans race_combined / diff;
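To make this concrete, here is a minimal sketch on simulated data. The data set demo and the variables id, race, year, and y are made up for illustration and are not the poster's data. With ref="Black", the Solution for Fixed Effects table reports only Other vs Black and White vs Black, while the Differences of Least Squares Means table produced by the DIFF option should list all three pairwise comparisons, including White vs Other.

```sas
/* Simulated long-format data: 60 hypothetical subjects, 4 yearly visits each */
data demo;
  call streaminit(2024);
  length race $5;
  do id = 1 to 60;
    /* cycle subjects through the three race levels */
    race = choosec(mod(id, 3) + 1, 'Black', 'White', 'Other');
    u = rand('normal', 0, 0.5);          /* subject-level random intercept */
    do year = 0 to 3;
      y = 4 + 0.3*(race = 'White') - 0.2*(race = 'Other') - 0.1*year
            + u + rand('normal', 0, 0.3);
      output;
    end;
  end;
run;

proc mixed data=demo method=reml;
  class id race (ref="Black");
  model y = year race / solution;        /* Solution table: each level vs Black */
  random intercept / subject=id;
  lsmeans race / diff cl;                /* Diffs table: all pairwise comparisons */
run;
```

In a main-effects model like this sketch, the LS-mean differences against Black equal the corresponding coefficients in the Solution for Fixed Effects table, so the DIFF output only adds the comparison that the reference coding leaves out.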
Hello,
So after speaking with my PI, this is not what I am looking for. I need to get estimates for the fixed effects with more than 2 levels in SAS. Do you know how I can achieve this? As I understand it, LSMEANS computes least squares means (LS-means) of fixed effects.
The LSMEANS statement computes the least squares means for each level of the fixed effect. Does that not work for you?
Yes, LSMEANS outputs the result. I am very new to SAS, so I hope I am not confusing myself. I can upload my results. My SAS code is below. On the CLASS line I added all my fixed-effect variables and set a reference level for each. For example, race is race_combined (ref="Black"). In the results I get the Solution for Fixed Effects table, which compares White vs Black and Other vs Black. The random variable is viral load. How do I get White vs Other? Will the output be within my "Solution for Fixed Effects" table?
I also used the LSMEANS statement. Are the Solution for Fixed Effects and the Least Squares Means the same thing?
************************************* Proc Mixed Code ***********************************;
**************************************** Mixed Model ************************************;
*********************************Viral Load by Year for Youth********************************;
ods rtf file = "C:\Users\cjacks1\Downloads\Data\Youthprocmixedmodelstest.rtf";
data avegvl_youth;
set Firstvisit;
keep rfa_id test_year testdate year sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup ccd4c ctimeyr days year mn quarter time presentvl result_vl_log10;
by rfa_id test_year testdate year;
if result_vl>=0 then presentvl=1 ;
*else if result_vl="." then delete;
run;
*Code includes and removes repeated viral load that were averaged by year;
proc sql;
create table youthavgvl_yr as
select *, mean(case when presentvl =1 then result_vl_log10
else .
end) as avgvlyryouth
from avegvl_youth
group by rfa_id, year;
quit;
************************************* 1st Proc Mixed Code *****************************************************;
************************************* Youth Part 1 *****************************************************;
*sort data set with average viral load by rfa_id time and average viral load by year;
*proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
*by rfa_id year time avgvlyryouth;
*run;
*ods rtf;
*Proc mix to look at youth mean viral load by residence, cd4, sex, race, and gender ;
* multilevel (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
*PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml PLOTS(MAXPOINTS=NONE);
*class rfa_id sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
*model avgvlyryouth = year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup/ solution outp = pred;
* random int year / subject=rfa_id Type= un vcorr;
*title 'Mixed Model Results For Viral Load Trend Over Time For Youth Visits (Reference 1)';
*RUN;
*ods rtf close;
*ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4 ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined ctimeyr*hiv_risk ctimeyr*baseline_cd4 ctimeyr*agegroup
************************************* Youth Proc Mixed Models *****************************************************;
*CJacks notes;
*3 proc mixed models where used to determine which TYPE= was the best for the data analysis of mean vl over time;
*TYPE= UN, CS, and AR(1);
*sort data set with average viral load by rfa_id time and average viral load by year;
proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey; *remove duplicates;
by rfa_id year time avgvlyryouth;
run;
*When I ran the procedure without adding intercept or int after random the output showed everything but with it some outputs where
cut out;
PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml PLOTS(MAXPOINTS=NONE);
class sex_hars (ref="F") res_at_hiv_dx (ref="Urban") race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref="> 500") agegroup(ref="20-24");
model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4
agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4
ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx ctimeyr*race_combined ctimeyr*hiv_risk
ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp = pred_un ;
*random intercept year/ Type= un subject=rfa_id;
random year / type= un subject= rfa_id;
lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk baseline_cd4 agegroup /diff;
title 'Mixed Model Test for Youth Unstructured';
RUN;
ods rtf close;
Output Log:
483  *CJacks notes;
484  *3 proc mixed models where used to determine which TYPE= was the best
484! for the data analysis of mean vl over time;
485  *TYPE= UN, CS, and AR(1);
486  *sort data set with average viral load by rfa_id time and average
486! viral load by year;
487
488  proc sort data = youthavgvl_yr out= youthavgvl_yr_sorted nodupkey;
488! *remove duplicates;
489  by rfa_id year time avgvlyryouth;
490  run;
NOTE: There were 57534 observations read from the data set WORK.YOUTHAVGVL_YR.
NOTE: 10467 observations with duplicate key values were deleted.
NOTE: The data set WORK.YOUTHAVGVL_YR_SORTED has 47067 observations and 19 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time 0.04 seconds
      cpu time 0.04 seconds
491  *When I ran the procedure without adding intercept or int after
491! random the output showed everything but with it some outputs where
492  cut out;
493  PROC MIXED data= youthavgvl_yr_sorted covtest noitprint method=reml
493! PLOTS(MAXPOINTS=NONE);
494  class sex_hars (ref="F") res_at_hiv_dx (ref="Urban")
494! race_combined (ref="Black") hiv_risk (ref="MSM") baseline_cd4 (ref=">
494! 500") agegroup(ref="20-24");
495  model avgvlyryouth= year ccd4c ctimeyr sex_hars res_at_hiv_dx
495! race_combined hiv_risk baseline_cd4
496  agegroup ccd4c*sex_hars ccd4c*res_at_hiv_dx
496! ccd4c*race_combined ccd4c*hiv_risk ccd4c*baseline_cd4
497  ccd4c*agegroup ctimeyr*sex_hars ctimeyr*res_at_hiv_dx
497! ctimeyr*race_combined ctimeyr*hiv_risk
498  ctimeyr*baseline_cd4 ctimeyr*agegroup/ solution outp =
498! pred_un ;
499  *random intercept year/ Type= un subject=rfa_id;
500  random year / type= un subject= rfa_id;
501  lsmeans sex_hars res_at_hiv_dx race_combined hiv_risk
501! baseline_cd4 agegroup /diff;
502  title 'Mixed Model Test for Youth Unstructured';
503  RUN;
NOTE: 7898 observations are not included because of missing values.
NOTE: The data set WORK.PRED_UN has 47067 observations and 26 variables.
NOTE: PROCEDURE MIXED used (Total process time):
      real time 7.01 seconds
      cpu time 6.75 seconds
504  ods rtf close;
What you wanted is already in the LSMEANS DIFF output, but with so much output it is hard to spot. To see it more easily, restrict the LSMEANS statement to the race effect:
lsmeans race_combined /diff;
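If you would rather keep all effects on the LSMEANS statement, another option is to capture the differences table with ODS OUTPUT and print only the rows for one effect. The following is a sketch on simulated data, not the poster's model: the data set demo2, its variables, and the output data set name lsm_diffs are all hypothetical. It relies on Diffs being the ODS table name PROC MIXED uses for the Differences of Least Squares Means.

```sas
/* Simulated data with two class effects so the Diffs table mixes effects */
data demo2;
  call streaminit(7);
  length race $5 sex $1;
  do id = 1 to 120;
    race = choosec(mod(id, 3) + 1, 'Black', 'White', 'Other');
    sex  = choosec(mod(id, 2) + 1, 'F', 'M');
    u = rand('normal', 0, 0.5);          /* subject-level random intercept */
    do year = 0 to 3;
      y = 4 + 0.3*(race = 'White') - 0.2*(race = 'Other')
            + 0.1*(sex = 'M') - 0.1*year + u + rand('normal', 0, 0.3);
      output;
    end;
  end;
run;

proc mixed data=demo2 method=reml;
  class id race (ref="Black") sex (ref="F");
  model y = year race sex / solution;
  random intercept / subject=id;
  lsmeans race sex / diff;
  ods output Diffs=lsm_diffs;            /* save all pairwise differences */
run;

/* Print only the race comparisons (Black, White, Other in all pairs) */
proc print data=lsm_diffs noobs;
  where upcase(Effect) = "RACE";
run;
```

With the poster's variable names, the same idea would filter on race_combined instead of race.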