I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
This line is incorrect for starters:
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
Should be:
if year in (2008, 2010:2013) then do;
You could also create 3 user define formats and apply them conditionally.
if year in (2013) then edu_format="ed1_fmt";
else if year in (2015) then edu_format= "ed2_fmt";
else if year in (2012) then edu_format= "ed3_fmt";
Education = putn(education_variable, edu_format);
@DanielQuay wrote:
I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
This line is incorrect for starters:
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
Should be:
if year in (2008, 2010:2013) then do;
You could also create 3 user define formats and apply them conditionally.
if year in (2013) then edu_format="ed1_fmt";
else if year in (2015) then edu_format= "ed2_fmt";
else if year in (2012) then edu_format= "ed3_fmt";
Education = putn(education_variable, edu_format);
@DanielQuay wrote:
I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
Thanks Reeza, I knew it was going to be something ridiculously simple but I was striking out.
An alternate approach is to create a custom informat that takes any of the text values and will create a "standard" coding level (create with more levels than you think you may need at first sometimes helps), use that informat to create a standardized variable and have a custom format for the levels to display meaningful standard text.
I have a project that has increased the number of "values" from 20 the first year to over 40 after 5 years. But they need to get mapped to 5 report categories. This way I only have to add one or two lines to the Informat part for the "new values". By having the informat definition with an other=_error_ option then I get the invalid data diagnostic in the log and need not spend a lot of time pre-validating any "new" codes.
Hi,
It looks like you are on the right approach, although it's a little difficult from your post, as your code & problem description don't appear to match (e.g. ED_LEVEL in the description vs HC_DEVLP in the code).
Also your code appears to have several syntax mistakes
Could you supply a small dummy sample of the input and the expected output (removing any personal identifying information)
Thanks
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.