04-29-2013 10:43 PM
Let's see if I can explain this well. I need to create a couple of new variables and need to figure out the easiest way to do it. I will give a hypothetical example of the type of data I have. There is the participant, say a school child, and I have the school they attended and whether or not they had the disease. Right now my dataset has child, school, number of days at that school, and disease (yes/no).
I am trying to determine whether kids who are exposed for a longer period to other kids who have the disease of interest are at greater risk for the disease. So I have the number of days the kids were in school together, and whether kids changed schools. The data is in long form, so there is a new record for each day they were in school together. If kids were in school together for 20 days, there will be 20 records.
I need to create two new variables. The first adds up the number of days at risk, increasing with each new record. The second is the cumulative risk, that is, the overall total number of days at risk (this one will be easy once the first is created). The cumulative risk will also be used in a shorter version of the dataset with only one record per participant. The variables var1 and var2 below are what I need to create. Any help would be greatly appreciated.
Child  school  at risk  var1 (days at risk)  var2 (cumulative risk)
1      2       yes      1                    4
1      2       yes      2                    4
1      2       yes      3                    4
1      2       yes      4                    4
04-29-2013 11:00 PM
OK. I need more data.
data have;
  input Child school risk $;
cards;
1 2 yes
1 2 yes
1 2 yes
1 2 yes
;
run;
data want temp;
  set have;
  by child school;
  if risk='yes' then var1+1;
  else var1=0;
  output want;
  if last.school then output temp;
run;
data want;
  merge want temp(keep=child school var1 rename=(var1=var2));
  by child school;
run;
04-30-2013 12:47 AM
Thank you for your response. You said you needed more data. What more information do you need? Let me try to clarify a little more, since I think your code is getting at the right thing but it doesn't seem to be completely correct.
Var1 and var2 are two different variables, and var2 should be equal to max(var1) for each participant. For var1, with each new record for a participant, var1 should increase by 1, and var1 should only equal 0 if risk="no", so I was thinking:
If risk="no" then var1=0;
else if risk="yes" then do;
This is where your code would come in I think. I will try to run it now and see what happens. Thanks again. Let me know if you need any additional information.
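To make the rule above concrete, here is a minimal sketch of how it might be written out. It rests on assumptions that are not confirmed anywhere in this thread: the data are already sorted by child, the count starts over after a risk="no" record (rather than only showing 0 on that record), and the sample rows, the dataset names step1, want, and short, and the second child are hypothetical.

```sas
/* Hypothetical sample data: child 1 has one "no" day, child 2 has none */
data have;
  input child school risk $;
cards;
1 2 yes
1 2 yes
1 2 no
1 2 yes
2 3 yes
2 3 yes
;
run;

/* var1: running count of days at risk.
   Assumption: it restarts for each child and after any risk="no" record. */
data step1;
  set have;
  by child;
  if first.child then var1=0;
  if risk="no" then var1=0;
  else if risk="yes" then var1+1;  /* sum statement: var1 is retained */
run;

/* var2 = max(var1) per child, using two passes over each BY group */
data want;
  /* first pass: find the largest var1 for this child */
  do until (last.child);
    set step1;
    by child;
    var2=max(var2, var1);  /* max() ignores the initial missing value */
  end;
  /* second pass: re-read the same records and write them out with var2 */
  do until (last.child);
    set step1;
    by child;
    output;
  end;
run;

/* Short version: one record per child, keeping only the cumulative value */
data short;
  set want;
  by child;
  if last.child;
  keep child var2;
run;
```

With these sample rows, child 1 should get var1 = 1, 2, 0, 1 and var2 = 2, and child 2 should get var1 = 1, 2 and var2 = 2. If the count is meant to keep running across "no" days instead of restarting, the risk="no" branch would need a separate counter rather than resetting var1.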