Hi all, can I get some help with the code below? I am trying to determine menopausal status by combining three variables, but the code doesn't seem to work, and I am not sure why. Thank you.
/* Categorize based on age */
if missing(age) then age_cat = 'Missing';
else if age > 55 then age_cat = 'Postmenopausal';
else age_cat = 'Premenopausal';
/* Categorize based on mens_stopped_perm */
if missing(mens_stopped_perm) then mens_stopped_cat = 'Missing';
else if mens_stopped_perm = 'Yes, natural menopause' then mens_stopped_cat = 'Postmenopausal';
else mens_stopped_cat = 'Premenopausal';
/* Categorize based on period */
if missing(periods) then period_cat = 'Missing';
else if periods = 'N' then period_cat = 'Postmenopausal';
else period_cat = 'Premenopausal';
/* Determine final menopausal status */
if age_cat = 'Postmenopausal' or mens_stopped_cat = 'Postmenopausal' or period_cat = 'Postmenopausal' then
menopausal_status = 'Postmenopausal';
else if age_cat = 'Premenopausal' or mens_stopped_cat = 'Premenopausal' or period_cat = 'Premenopausal' then
menopausal_status = 'Premenopausal';
else
menopausal_status = 'Missing';
"Doesn't seem to work" is awfully vague.
Are there errors in the log? Post the code and log in a code box, opened with the "</>" icon, to maintain the formatting of error messages.
No output? Post the log in a code box.
Unexpected output? Provide the input data as data step code pasted into a code box, along with the actual results and the expected results. The instructions at https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "</>" icon, or attached as text, so that we can see exactly what you have and test code against it.
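As a rough sketch of what such a post could look like, the step below builds a tiny test data set with the three input variables from the question. The data set name sample and every value in it are made up for illustration, and the variable types and lengths are assumptions; the real data may differ. A pipe delimiter is used because one of the values contains a comma.

```sas
/* Hypothetical test data: the name "sample" and all values are made up. */
/* Types and lengths are assumed, not taken from the real data set.      */
data sample;
  infile datalines dsd dlm='|' truncover;
  length age 8 mens_stopped_perm $ 30 periods $ 1;
  input age mens_stopped_perm periods;
datalines;
60|Yes, natural menopause|N
40|No|Y
.||
;
run;
```

Anyone can paste a step like this into a SAS session and run the recoding logic against it.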
As a bare minimum discuss why you think it isn't working.
A guess, based on the code shown and years of SAS experience, is that you are getting truncated values for Age_cat, likely
'Missing' 'Postmen' 'Premeno'
If that is the issue, then the length of Age_cat is being set by default from the first use of the variable in the code: the first value assigned, 'Missing', has 7 characters, so longer values are cut to 7. Adding a LENGTH statement before the first use fixes this. The code below sets the length of the target variable AGE_CAT to 15 characters, which is long enough to hold Postmenopausal.
Length age_cat $ 15;
if missing(age) then age_cat = 'Missing';
else if age > 55 then age_cat = 'Postmenopausal';
else age_cat = 'Premenopausal';
Do similar for each of the target variables. When you have multiple variables with the same length you can make the assignment in one statement:
Length age_cat mens_stopped_cat period_cat $ 15 ;
Some functions also default to creating variables of 200 characters, which can lead to other odd results. A LENGTH statement restricts the length in those cases as well.
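To see the truncation for yourself, a minimal sketch along these lines may help. The data set names have, no_length, and with_length and the three ages are made up for illustration; only the recoding logic comes from the question.

```sas
/* Hypothetical test data: the ages are made up for illustration. */
data have;
  input age;
datalines;
60
40
.
;
run;

/* No LENGTH statement: age_cat takes its length from the first */
/* assignment, 'Missing', which has 7 characters.               */
data no_length;
  set have;
  if missing(age) then age_cat = 'Missing';
  else if age > 55 then age_cat = 'Postmenopausal';
  else age_cat = 'Premenopausal';
run;

/* LENGTH declared before first use: the full values fit. */
data with_length;
  set have;
  length age_cat $ 15;
  if missing(age) then age_cat = 'Missing';
  else if age > 55 then age_cat = 'Postmenopausal';
  else age_cat = 'Premenopausal';
run;

proc print data=no_length; run;
proc print data=with_length; run;
```

In no_length the printed values should be cut to 7 characters (Postmen, Premeno, Missing), while with_length keeps the full text.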
I did not show it in my code above, but I did set the length. I am not getting any errors in the log when I run the code, but when I print menopausal_status it is all missing. Thanks!
Please show us a portion of the actual data used. (If the data is confidential, mask the confidential parts by assigning sequential numbers or random numbers). Please provide a portion of your SAS data set as working SAS data step code (Examples and instructions). Do not provide the data as screen captures or in Excel files.
Also, please show us the ENTIRE data step that you are using.
Show us the result of
proc freq data=<dataset name>;
  tables age_cat * mens_stopped_cat * period_cat * menopausal_status / list missing;
run;
This will show each combination of the variables on a single, easy-to-read line, which is very useful for checking value recoding.
Here's the frequency table: Somehow, I am getting an output (no change to code)
data cohort2015;
  length age_cat mens_stopped_cat period_cat menopausal_status $20.;
  set cohort2015_1;
  /* Categorize based on age */
  if missing(age) then age_cat = 'Missing';
  else if age > 55 then age_cat = 'Postmenopausal';
  else age_cat = 'Premenopausal';
  /* Categorize based on mens_stopped_perm */
  if missing(mens_stopped_perm) then mens_stopped_cat = 'Missing';
  else if mens_stopped_perm = 'Yes, natural menopause' then mens_stopped_cat = 'Postmenopausal';
  else mens_stopped_cat = 'Premenopausal';
  /* Categorize based on period */
  if missing(periods) then period_cat = 'Missing';
  else if periods = 'N' then period_cat = 'Postmenopausal';
  else period_cat = 'Premenopausal';
  /* Determine final menopausal status */
  if age_cat = 'Postmenopausal' or mens_stopped_cat = 'Postmenopausal' or period_cat = 'Postmenopausal' then
    menopausal_status = 'Postmenopausal';
  else if age_cat = 'Premenopausal' or mens_stopped_cat = 'Premenopausal' or period_cat = 'Premenopausal' then
    menopausal_status = 'Premenopausal';
  else
    menopausal_status = 'Missing';
run;
I suspect that means your earlier run included code with an issue that you missed. If your previous data step had more code that you removed before posting this bit, then something in that other code is a likely culprit. For example, you may have copied, pasted, and edited similar code for other variables but accidentally left the menopausal_status variable in use elsewhere.
Mysteries like this are why you will see requests for the entire data step, and why we prefer a log with the code and all the messages. You might have missed something that is not an error, such as NOTE: Variable age_cat is uninitialized. If all three of the newly created variables had that note, the result variable would be missing everywhere because all of its inputs were missing.
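As a minimal sketch of how that kind of note arises, the step below tests age_cat without ever assigning it, as could happen if the categorizing lines were dropped or a name was mistyped. The data set name demo is made up, and this is only an illustration, not a reconstruction of the original run.

```sas
/* Hypothetical illustration: age_cat is declared but never assigned, */
/* so it is blank on every row and the log notes it is uninitialized. */
data demo;
  length age_cat menopausal_status $ 20;
  if age_cat = 'Postmenopausal' then menopausal_status = 'Postmenopausal';
  else if age_cat = 'Premenopausal' then menopausal_status = 'Premenopausal';
  else menopausal_status = 'Missing';
run;
```

The step runs without errors, so the only clue is the note in the log; because neither comparison can be true, every row falls through to the final ELSE branch.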