Hello All!
I am trying to write a loop that cycles through all mothers in my dataset and flags each day in their gestational period on which there was a heatwave. I have a temperature measurement for each day of a three-year period, and each mother has a gestational start date and a gestational end date within that period.
A day should be flagged as a heatwave if its temperature is higher than a certain threshold. For example, if I am using the 90th percentile, any day with a temperature greater than the 90th percentile of all the temperatures over the three-year period is a heatwave.
I would like to output a dataset that has MotherID and the sum of the flagged heatwave days within the gestational period for each mother. My data looks like this:
MotherID  gest_start_date  Gest_end_date  1/1/13  1/2/13  1/3/13  ...  12/31/16
1         5/2/13           2/4/14         81      82      79      ...  91
2         1/18/14          9/2/14         71      73      78      ...  73
3         2/20/13          10/15/13       82      85      81      ...  82
4         8/12/15          5/12/16        75      76      77      ...  78
Any help will be much appreciated!
I am not seeing what that "sum" would actually be. A count of days over a given value would not be too difficult but it is not clear what you might be summing.
Please provide an example of your starting data set as data step code. The instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon, or attached as text, so that we can see exactly what you have and test code against it.
It would also help to show us the expected result for a couple of records.
Since your apparent temperature dates are not normally valid SAS date variable names I am not going to attempt to fake dummy data to work with.
The code you provided to create the starting data does not work for my dataset. Let me try to explain this better. For each mother, I would like to cycle through each day within her gestational period and count all the days with a temperature greater than a certain threshold. For this example, let's use the 90th percentile. I have variables that already indicate this threshold.
MotherID  gest_st_date  Gest_end_date  1/1/13  1/2/13  1/3/13  ...  12/31/16  90per  HW1/1/13  HW1/2/13  HW1/3/13  ...  HW12/31/16
1         5/2/13        2/4/14         81      82      79      ...  91        82     0         1         0         ...  1
2         1/18/14       9/2/14         71      73      78      ...  73        80     0         0         0         ...  0
3         2/20/13       10/15/13       82      85      81      ...  82        84     0         1         0         ...  0
4         8/12/15       5/12/16        75      76      77      ...  78        77     0         0         1         ...  1
The variables in bold are intermediate variables that I don't want to keep. I would like to sum over these bold variables to get the total number of heatwave days for each mother, counting only the dates within her gestational period. Is this clearer?
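The counting described above can be done without the intermediate HW flags. What follows is only a minimal sketch, not a tested solution for the real data. It assumes the daily temperatures are stored in consecutively numbered variables (hypothetical names t1-t4, with t1 corresponding to 01JAN2013), that the threshold is in a variable named p90, and that the gestational dates are true SAS date values. The data set names, variable names, and dates are all made up for illustration.

```sas
/* Hypothetical example data: one row per mother, daily temperatures   */
/* t1-t4 for 01JAN2013 through 04JAN2013, and a per-mother threshold.  */
/* All names and dates here are assumptions, not the poster's layout.  */
data mothers_wide;
  input MotherID gest_start :mmddyy10. gest_end :mmddyy10. t1-t4 p90;
  format gest_start gest_end mmddyy10.;
  datalines;
1 01/02/2013 01/04/2013 81 82 79 91 82
2 01/01/2013 01/03/2013 71 73 78 73 80
;
run;

data hw_counts (keep=MotherID hw_days);
  set mothers_wide;
  array t{*} t1-t4;              /* daily temperatures in date order   */
  first_day = '01JAN2013'd;      /* calendar date that t1 represents   */
  hw_days = 0;
  /* Convert the gestational dates to array positions, clipped to the  */
  /* range of days that actually exist in the data.                    */
  lo = max(gest_start - first_day + 1, 1);
  hi = min(gest_end - first_day + 1, dim(t));
  do i = lo to hi;
    /* >= reproduces the flags in the posted table; use > instead if   */
    /* only days strictly above the threshold should count.            */
    if t{i} >= p90 then hw_days = hw_days + 1;
  end;
run;
```

With these made-up rows, mother 1 gets hw_days = 2 (the days with 82 and 91) and mother 2 gets hw_days = 0. For the real data, the array would span every daily temperature variable and first_day would be the date of the first one.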
Some of your problem information resides in the variable (column) names, i.e. the dates, which, by the way, are not proper SAS variable names. Your data may exist in this form in a spreadsheet, but how is it represented in your SAS data set?
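If the data can be restructured so that the dates are values rather than column names, the count becomes a simple join. The following is a sketch under stated assumptions: a long data set with one row per calendar day (hypothetical names temps, date, temp), a separate data set with one row per mother (hypothetical name mothers_long), and a single 90th percentile computed over all days. None of these names come from the original post.

```sas
/* Hypothetical long-format inputs; all names are assumptions. */
data temps;                       /* one row per calendar day */
  input date :mmddyy10. temp;
  format date mmddyy10.;
  datalines;
01/01/2013 81
01/02/2013 82
01/03/2013 79
01/04/2013 91
;
run;

data mothers_long;                /* one row per mother */
  input MotherID gest_start :mmddyy10. gest_end :mmddyy10.;
  format gest_start gest_end mmddyy10.;
  datalines;
1 01/02/2013 01/04/2013
2 01/01/2013 01/03/2013
;
run;

/* 90th percentile of all daily temperatures */
proc means data=temps noprint;
  var temp;
  output out=pctl p90=p90;
run;

/* Copy the percentile into a macro variable for use as the threshold */
proc sql noprint;
  select p90 into :p90 from pctl;
quit;

/* Count the days inside each gestational period above the threshold. */
/* A comparison evaluates to 1 or 0, so SUM() yields a count.         */
proc sql;
  create table hw_counts_long as
  select m.MotherID,
         sum(t.temp > &p90) as hw_days
  from mothers_long as m
       left join temps as t
         on t.date between m.gest_start and m.gest_end
  group by m.MotherID;
quit;
```

The left join keeps mothers whose gestational period matches no temperature rows, and they end up with a count of 0. Changing the threshold only requires a different statistic keyword in the OUTPUT statement.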
Please add options mprint symbolgen; before the code you used to try to create the data step, and run it again. Then show the log, including the submitted code, the generated macro messages, and any error or other notes, in a code box opened using the forum's {I} or "running man" icon so we can determine whether the code needs to be changed.
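As a small illustration of what those options do, the sketch below turns them on around a call to a made-up macro (%demo is hypothetical and merely stands in for whatever macro call failed). MPRINT writes the SAS statements generated by the macro to the log, and SYMBOLGEN writes the value each macro variable resolves to.

```sas
options mprint symbolgen;        /* log generated code and macro variable values */

%macro demo(ds);                 /* hypothetical macro, for illustration only */
  proc print data=&ds (obs=3);
  run;
%mend demo;

%demo(sashelp.class)

options nomprint nosymbolgen;    /* switch the extra log output off again */
```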