I have a data set which contains Character variable "Housing_type". The response to that variables are "3-room", "4-room", "5-room", "Condominium", "Landed property" and "Other".
I would like to recode 3-room, 4-room and 5-room into "HDB" and Condominium, Landed Property and Other into "Non-HDB".
How may I do so please?
@SSin wrote:
I have a data set which contains Character variable "Housing_type". The response to that variables are "3-room", "4-room", "5-room", "Condominium", "Landed property" and "Other".
I would like to recode 3-room, 4-room and 5-room into "HDB" and Condominium, Landed Property and Other into "Non-HDB".
How may I do so please?
Another approach is to use a custom format, which if you do this frequently may be a better solution. Example:
Proc format library=work; value $ht "3-room", "4-room", "5-room" = 'HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; data want; set have; ht = put(housing_type,$ht.); run;
Or try
Proc Print data=have; format housing_type $ht.; run;
An advantage of a custom format is that if you find that you need to add an item, such as possibly "6-room" then you only need to add it to the format definition and rerun the proc format code. No need to add extra if/then/else.
If using the format directly with the variable instead of creating an additional variable then you have the option to use the groups dynamically in most analysis, reporting or graphing procedures.
Such as:
Proc format library=work; value $ht "3-room", "4-room", "5-room" = 'HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; value $ht_3lvl "2-room","3-room"='Small HDB' "4-room", "5-room" = 'Large HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; run; Proc freq data=have; tables housing_type; format housing_type $ht_3lvl.; run;
I of course have no idea if "Small" or "Large" actually apply to your need, that is just an example.
Since your data likely does not have any 2-room from your description the "Small" would just be 3-room.
I'm squinting, but I still can't see your program. If you post the entire DATA step that you tried, I would be happy to respond.
You can use IF/THEN to recode variables. The general structure is as follows:
length new_housing_type $12.;
if housing_type in ('3-room' '4-room' '5-room') then new_housing_type = 'HBD';
else if housing_type in ("Condo") then new_housing_type = 'Condo';
else new_housing_type = 'Other';
You'll need to modify it to match your exact cases. Note these are case sensitive comparisons.
@SSin wrote:
I have a data set which contains Character variable "Housing_type". The response to that variables are "3-room", "4-room", "5-room", "Condominium", "Landed property" and "Other".
I would like to recode 3-room, 4-room and 5-room into "HDB" and Condominium, Landed Property and Other into "Non-HDB".
How may I do so please?
@SSin wrote:
I have a data set which contains Character variable "Housing_type". The response to that variables are "3-room", "4-room", "5-room", "Condominium", "Landed property" and "Other".
I would like to recode 3-room, 4-room and 5-room into "HDB" and Condominium, Landed Property and Other into "Non-HDB".
How may I do so please?
Another approach is to use a custom format, which if you do this frequently may be a better solution. Example:
Proc format library=work; value $ht "3-room", "4-room", "5-room" = 'HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; data want; set have; ht = put(housing_type,$ht.); run;
Or try
Proc Print data=have; format housing_type $ht.; run;
An advantage of a custom format is that if you find that you need to add an item, such as possibly "6-room" then you only need to add it to the format definition and rerun the proc format code. No need to add extra if/then/else.
If using the format directly with the variable instead of creating an additional variable then you have the option to use the groups dynamically in most analysis, reporting or graphing procedures.
Such as:
Proc format library=work; value $ht "3-room", "4-room", "5-room" = 'HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; value $ht_3lvl "2-room","3-room"='Small HDB' "4-room", "5-room" = 'Large HDB' "Condominium", "Landed property" , "Other" = 'Non-HDB' ; run; Proc freq data=have; tables housing_type; format housing_type $ht_3lvl.; run;
I of course have no idea if "Small" or "Large" actually apply to your need, that is just an example.
Since your data likely does not have any 2-room from your description the "Small" would just be 3-room.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.