How do I run a PROC GLM analysis of insect LT50s against days/year of relative humidity at 50 global sites if some of the independent variables have values of 0? Removing some of the correlated variables with 0 values, which SAS treats as "missing data", will affect the analyses of several other environmental factors...
It's not a problem to have an independent variable with values of 0, unless there is something about it that you haven't described. I also don't understand why you say "SAS treats it as missing"; it does not treat 0 as missing unless YOU have done something to replace 0 with missing.
Thanks for your help!
I ran LT50 assays on 50 insect lines, each with an individual stock number, which were collected from 30 global locations and are colored differently in the attached figure. For the 30-35C maximum temperature band, about 1/3 of the localities have only 0 or 1 days. Also, there are 10 insect lines with slightly different LT50s from a large city, circled in yellow. My questions are:
1) How should we analyze the data with "0" values in the x variables (circled in green), given the "normality" concern? and
2) Should we take the average LT50 value of those 10 lines from the same locality?
Some people, such as myself, refuse to download attachments. You can make a screen capture of what you want to show me and include it in your reply by clicking on the "Insert Photos" icon.
How should we analyze the data with "0" values in the x variables (circled in green), given the "normality" concern?
Normality of the independent variables is not a concern at all. There is no requirement that the independent variable be normally distributed for PROC GLM.
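To make that concrete, here is a minimal sketch in plain Python (not SAS, and not PROC GLM itself) of an ordinary least squares fit where the predictor is mostly zeros. The numbers and the names `days`, `lt50`, and `fit_line` are made up for illustration and are not the poster's data. The point is only that 0 is an ordinary value: every row enters the fit, and the normality assumption in a linear model concerns the errors (residuals), not the distribution of x.

```python
# Made-up numbers for illustration only; NOT the poster's data.
# days: days/year in a temperature band (note the many zeros)
# lt50: median lethal time for the matching insect line
days = [0, 0, 0, 1, 0, 2, 5, 9, 14, 20]
lt50 = [4.0, 4.5, 3.5, 4.0, 4.0, 5.0, 6.5, 8.5, 11.0, 14.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(days, lt50)

# Residuals are what the normality assumption is about, not x itself.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(days, lt50)]

print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"rows used in the fit: {len(residuals)} of {len(days)}")
```

Rows with x = 0 are used like any other row. A row would only drop out if the value were genuinely missing, which is a different thing from 0.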
Should we take the average LT50 value of those 10 lines from the same locality?
This really depends on how the data was obtained, and what the 10 lines from the same locality mean. This requires a more detailed explanation of the data.
Thank you so much for your help! Please find the inserted photo. The insect collector told us those ten lines were actually from two neighboring regions of a big city. However, we want to study which environmental factors, taken from a database with a 0.5-degree latitude/longitude grid, may be related to LT50 values at 30 different worldwide sites for 50 insect lines. Therefore, would it be reasonable to use the mean LT50 from that location in the correlation analyses? By the way, there is a good correlation between LT50 and 25-30C Tmax (p=0.0003).
I don't think you have explained what the 10 lines are. And now you introduce latitude and longitude. Please, provide a complete COMPLETE explanation of the study and how the data was collected and what you want to learn; leaving nothing out and make sure your explanation is COMPLETE COMPLETE COMPLETE.
We assessed the ability of 50 age-matched insect lines/populations from 30 diverse geographic regions to survive infection. Median lethal times (LT50) were monitored using five replicates (20-30 insects each) per sex per insect line, and experiments were repeated at least twice. Some locations have more than one insect line; in particular, 10 insect lines come from Tokyo, Japan. We want to know whether LT50 is correlated with, or differs by, maximum temperature across the 30 geographic regions/localities. There will be ten (X, Y) pairs for the single Tokyo, Japan location, as in the previously inserted figure. Should we use the average LT50 of the 10 lines from Tokyo, or show all of them?
Another question: we tried to divide the 50 insect lines into 8 biome groups, each with 3-13 insect lines. How should we compare the group LT50s: using the original 10-15 replicates of each of the 50 insect lines (with N being the number of replicates), or using the mean LT50 of each insect line (with N being the number of lines) within the 8 groups?
This is a much more complicated problem than originally presented, and I think the best thing you can do is sit down with a statistician (preferably one who is experienced in biological work) and go through the whole thing slowly and carefully. It's not clear to me which factors are random and which are fixed (are "lines" random or fixed?), which variables are nested within other variables, and how the other items you mention will fit into the final analysis/final model. So I cannot provide that kind of assistance here.