angelapadilla
Calcite | Level 5

Hello! This is my thesis project. I want to know the influence of THI (temperature and humidity index) on EIPH (exercise-induced pulmonary haemorrhage) in horses. We computed an average THI for each month. I want to run the same analysis for each part of the program below, but separately by year and by month. In part 1, I related THI to positive/negative bleeding status; in part 2, I divided the EIPH grades into slight/serious bleeding. I hope this helps you understand it. I tried many ways, but it always came back with an error in the model. I just took the class, so I only know the basics. If you can help me out here, I'd appreciate it!

 

 

FILENAME REFFILE '/folders/myfolders/sasuser.v94/DatosFinales_ITH.xlsx';

PROC IMPORT DATAFILE=REFFILE DBMS=XLSX OUT=WORK.IMPORT;
    GETNAMES=YES;
RUN;

/* Check variable names and types, then list the imported rows */
PROC CONTENTS DATA=WORK.IMPORT;
RUN;

PROC PRINT DATA=WORK.IMPORT;
RUN;

libname eiph '/folders/myfolders/sasuser.v94/';

data eiph.import;
    set import;
run;

/* Percentage of each EIPH grade within each THI value */
proc sgplot data=eiph.import pctlevel=group;
    vbar thi / stat=percent group=eiph grouporder=data;
    title "Proportions by EIPH grade according to THI";
run;

/* Mean THI by month (mes), one line per year */
proc sgplot data=eiph.import;
    vline mes / response=thi stat=mean markers group=year;
    title 'THI VALUES BY MONTH AND BY YEAR';
run;

Proc freq data=eiph.import;
    Tables thi*eiph/nocol nopercent chisq;
Run;

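/* Part 1: positive (1) vs. negative (0) for bleeding */
data eiphpos;
    set IMPORT;

    if eiph=0 then
        eiphpos=0;
    else if eiph>0 then
        eiphpos=1;
run;

/* Logistic model for the probability of bleeding, with THI as a CLASS effect */
proc glimmix data=eiphpos;
    class thi;
    model eiphpos(descending)=THI / dist=binary link=logit solution;
    output out=grado pred(ilink)=eiphp;
    lsmeans thi / pdiff lines ilink;
run;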

proc sgplot data=eiphpos pctlevel=group;
    vbar thi /stat=percent group=eiphpos grouporder=data;

Proc freq data=eiphpos;
    Tables thi*eiphpos/nocol nopercent chisq;
Run;

proc sort data=grado;
    by thi eiphp;

proc sgplot data=grado;
    series x=THI y=eiphp;
    scatter x=THI y=eiphpos / jitter;
run;

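/* Part 2: among bleeders only, serious (grade >= 4) vs. slight (grade below 4) */
data eiph.grad;
    set IMPORT;

    if eiph>=4 then
        eiphgrad=1;
    else if eiph>0 then
        eiphgrad=0;
    else
        eiphgrad=.;   /* eiph=0 is excluded; a missing grade also stays missing */
                      /* (the original test eiph<4 would have coded a missing grade as 0) */
run;

/* Logistic model for the probability of serious bleeding */
proc glimmix data=eiph.grad;
    class thi;
    model eiphgrad(descending)=THI / dist=binary link=logit solution;
    output out=gradog pred(ilink)=eiphg;
    lsmeans thi / pdiff lines ilink;
run;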

proc sgplot data=eiph.grad pctlevel=group;
    vbar thi /stat=percent group=eiphgrad grouporder=data;

Proc freq data=eiph.grad;
    Tables thi*eiphgrad/nocol nopercent chisq;
Run;

proc sort data=gradog;
    by thi eiphg;

proc sgplot data=gradog;
    series x=THI y=eiphg;
    scatter x=THI y=eiphgrad / jitter;
run;

 

Reeza
Super User

Welcome to the SAS forums. 

 

I think we'll need some more information here. For example, you're saying: 

I tried many ways but it always came with an error in the model.

 

But we don't know what errors are occurring, and we don't have your data, so we can't run the code either. If you could include the error and the log, that will likely help. You can also comment your code: this tells us what you think each section is doing, and then we can try to verify whether the logic matches your intentions. Code can be syntactically correct but still not do what you intended, i.e., contain logic errors.

 

It would probably help if you could include some data so we can run the code as well. If you can't for confidentiality reasons, you can try to find a data set in the SASHELP library or the SAS documentation that's similar enough, and modify your code to work with that data set. SASHELP.HEART may be suitable, for example.
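As a rough illustration of that suggestion, the sketch below mimics the structure of the posted analysis with SASHELP.HEART. The mapping is an assumption made only for illustration: Status ('Dead'/'Alive') stands in for the bleeding indicator and Cholesterol for THI. The data set name heart and the variable dead are made up here.

```sas
/* Stand-in data: SASHELP.HEART ships with SAS, so anyone can run this */
data heart;
    set sashelp.heart;
    dead = (status = 'Dead');   /* 1 = died, 0 = alive; plays the role of eiphpos */
run;

/* Same kind of binary logistic model as in the post,            */
/* here with a continuous predictor (an assumption of the sketch) */
proc glimmix data=heart;
    model dead(event='1') = cholesterol / dist=binary link=logit solution;
run;
```

If a problem can be reproduced on a data set like this, others can run the code and see the same log.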

 

PS. If you can format your code and post in the correct subforum, that helps. I've formatted your code (the second-to-last icon in SAS Studio does this automatically) and moved your post to the stats forum, so hopefully someone can help.

 


angelapadilla
Calcite | Level 5

That's the data.

 

That code runs perfectly. I just want to know how I can do the same analysis but split by year and by month, to see when the THI*EIPH relationship is significant.

Reeza
Super User

Do you want the model run for each year/month? If so, add a BY statement (you may need to sort your data first). A BY statement accepts multiple variables and performs the same analysis separately for each combination of their values. However, if you're looking to determine the effects of year/month, you may want to add them to the model as variables, though if you have time series data you should verify stationarity and similar assumptions first.


For example, if you wanted to run a separate regression model by sex this would do that:

 

proc sort data=sashelp.class out=class;
by sex;
run;

proc reg data=class;
by sex;
model weight = height age;
run;
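
Applied to the data in this thread, a minimal sketch might look like the following. It assumes the eiphpos data set and the year and thi variables from the original post; eiphpos_sorted is a made-up name, and THI is treated as continuous here for brevity, whereas the posted code uses CLASS thi. Because the post says THI was averaged per month, THI is probably constant within each year/month combination, so grouping by both year and mes would likely leave no THI variation to model. This sketch therefore groups by year only.

```sas
/* BY-group processing requires the data to be sorted by the BY variables */
proc sort data=eiphpos out=eiphpos_sorted;
    by year;
run;

/* One logistic model per year; year itself does not appear in the MODEL statement */
proc glimmix data=eiphpos_sorted;
    by year;
    model eiphpos(descending) = thi / dist=binary link=logit solution;
run;
```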
angelapadilla
Calcite | Level 5

Hi again! I tried the BY statement and the numbers remained the same. How do I put them as a third variable in the model? Because it says that can't be done.

Reeza
Super User

@angelapadilla wrote:

Hi again! I tried the BY statement and the numbers remained the same. How do I put them as a third variable in the model? Because it says that can't be done.


I don't know what either of those means. Please explain in detail.

ballardw
Super User

@angelapadilla wrote:

Hi again! I tried the BY statement and the numbers remained the same. How do I put them as a third variable in the model? Because it says that can't be done.


Show the code you attempted.

 

It is often self-defeating to put a BY variable in the MODEL statement as well, because within each BY group that variable takes only a single value.

If you do something like:

 

proc glm data=mydata;
   by year month;
   model thisvar = thatvar year month;
run;

then each of the models (one for each year/month combination) in effect looks like:

   model thisvar = thatvar 2018 1;   (for year=2018 and month=1)

and SAS tries to estimate parameters for year and month even though they are constant within the group.

The syntax

proc glm data=mydata;
   by year month;
   model thisvar = thatvar;
run;

creates one model for each combination of year and month without attempting to create parameters for the year and month variables.

Any output data sets would also contain the values of year and month, so you can inspect and compare the results across groups.
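
One way to see this, sketched with SASHELP.CLASS rather than the thesis data: capture the parameter estimates with ODS OUTPUT and print them. The data set names class and pe are made up for this example; ParameterEstimates is the ODS table that PROC GLM produces when the SOLUTION option is used.

```sas
proc sort data=sashelp.class out=class;
    by sex;
run;

/* Route the parameter-estimates table into a data set named pe */
ods output ParameterEstimates=pe;

proc glm data=class;
    by sex;
    model weight = height / solution;
run;
quit;

/* pe holds one set of estimates per BY group, with Sex as a column */
proc print data=pe;
run;
```

With BY year month on the thesis data, the same approach would presumably give one block of rows per year/month combination.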

Rick_SAS
SAS Super FREQ

@angelapadilla wrote:

Hi again! I tried the BY statement and the numbers remained the same. How do I put them as a third variable in the model? Because it says that can't be done.

 

Be sure that you understand the difference between BY variables and CLASS variables in SAS. If you want to be able to compare the effects of different months and days, they need to be class variables and part of the model. If you just want to subset the data by month and day, then use the BY statement.
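
A small sketch of the CLASS approach, using SASHELP.CLASS instead of the thesis data: here Sex is part of a single model, so its effect is estimated and tested. For the thesis data, the grouping variables would presumably take the place of sex; that substitution is an assumption, not something tested here.

```sas
/* One model fitted to all rows; Sex is a CLASS effect in the model */
proc glm data=sashelp.class;
    class sex;
    model weight = height sex / solution;
    lsmeans sex / pdiff;   /* compares the adjusted means of the two groups */
run;
quit;
```

With a BY statement instead, each sex would get its own separate fit and no test comparing the groups would be produced.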

 
