I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
This line is incorrect for starters:
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
Should be:
if year in (2008, 2010:2013) then do;
You could also create 3 user define formats and apply them conditionally.
if year in (2013) then edu_format="ed1_fmt";
else if year in (2015) then edu_format= "ed2_fmt";
else if year in (2012) then edu_format= "ed3_fmt";
Education = putn(education_variable, edu_format);
@DanielQuay wrote:
I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
This line is incorrect for starters:
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
Should be:
if year in (2008, 2010:2013) then do;
You could also create 3 user define formats and apply them conditionally.
if year in (2013) then edu_format="ed1_fmt";
else if year in (2015) then edu_format= "ed2_fmt";
else if year in (2012) then edu_format= "ed3_fmt";
Education = putn(education_variable, edu_format);
@DanielQuay wrote:
I am working in a multi-year data set. The problem I'm encountering is that I have the same variable name but they don't always have the same response category. So in 2013,
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Associate degree (AA, AS)
ED_LEVEL= 4 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 5 = Master's degree (e.g., MA, MS)
ED_LEVEL= 6 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
But in 2015 the categories changed, now:
ED_LEVEL= 1 = Less than 12th grade, no diploma
ED_LEVEL= 2 = High school graduate or GED completed
ED_LEVEL= 3 = Technical/trade/vocational school completed
ED_LEVEL= 4 = Associate degree (AA, AS)
ED_LEVEL= 5 = Bachelor's degree(e.g., BA, BS)
ED_LEVEL= 6 = Master's degree (e.g., MA, MS)
ED_LEVEL= 7 = Professional degree (e.g., JD, MD, DDS)
ED_LEVEL= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
ED_LEVEL= 9 = Don't Know
And just to spice things up a bit in 2012 the variable was:
EDU_M= 1 = 8th grade or less
EDU_M= 2 = 9th - 12th grade, no diploma
EDU_M= 3 = High school graduate or GED completed
EDU_M= 4 = Some college credit but no degree
EDU_M= 5 = Associate degree (AA, AS)
EDU_M= 6 = Bachelor's degree(e.g., BA, BS)
EDU_M= 7 = Master's degree (e.g., MA, MS)
EDU_M= 8 = Doctorate (e.g., Phd, EdD) or professional degree (e.g., MD, DDS)
EDU_M= 9 = Don't Know
What I am hoping to do is combine them all into a common variable: EDUCATION.
What I have tried thus far is (on another variable):
do;
if year = '2008' or '2010' or '2011' or '2012' or '2013' then do;
if HC_DEVLP = 1 then DEV_DEL = 1;
else if HC_DEVLP = 2 then DEV_DEL = 2;
else if HC_DEVLP = 9 then DEV_DEL = 9;
else DEV_DEL='.';
end;
if year = '2007' or '2014' or '2015' then do;
if HC_DEVLP = 1 then DEV_DEL2=1;
else if HC_DEVLP = 2 then DEV_DEL2=2;
else if HC_DEVLP = 9 then DEV_DEL2=9;
else if No5Developmentaldelay = 0 then DEV_DEL2 = 1;
else if No5Developmentaldelay = 1 then DEV_DEL2 = 2;
else DEV_DEL2='.';
end;
end;
but that did not work so I am reaching out for some advice because I'm not having much luck figuring it out at this point.
Thanks Community.
Thanks Reeza, I knew it was going to be something ridiculously simple but I was striking out.
An alternate approach is to create a custom informat that takes any of the text values and will create a "standard" coding level (create with more levels than you think you may need at first sometimes helps), use that informat to create a standardized variable and have a custom format for the levels to display meaningful standard text.
I have a project that has increased the number of "values" from 20 the first year to over 40 after 5 years. But they need to get mapped to 5 report categories. This way I only have to add one or two lines to the Informat part for the "new values". By having the informat definition with an other=_error_ option then I get the invalid data diagnostic in the log and need not spend a lot of time pre-validating any "new" codes.
Hi,
It looks like you are on the right approach, although it's a little difficult from your post, as your code & problem description don't appear to match (e.g. ED_LEVEL in the description vs HC_DEVLP in the code).
Also your code appears to have several syntax mistakes
Could you supply a small dummy sample of the input and the expected output (removing any personal identifying information)
Thanks
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.