aongkeko
Calcite | Level 5

Hello SAS Community,

 

I am trying to create a new "age_group" variable from the "age" variable. I was able to create one using my first condition: if Age=<'20' then Age_Group='child';

 

but when I add the other conditions, it's not working anymore. Below is the code I am trying. Thanks for the assistance!

 

 

data final_age;
set project.final_sample;
if Age=<'20' then Age_Group='child';
if Age=>'21 and =<30' the Age_Group='early_adult';

if Age=>'31 and <=60' then Age_Group='middle adult';
if Age=>'61 and <=80' then Age_Group='late adult';
if Age=>'80' then Age_Group='senior'
run;


proc print data=final_age;
run;



PeterClemmensen
Tourmaline | Level 20

Is age a character or numeric variable?
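
One quick way to answer that, sketched here with the data set name from the question, is to let PROC CONTENTS list each variable's type. The "Type" column shows Num or Char for Age.

```sas
/* Lists every variable in the data set with its type (Num or Char) and length */
proc contents data=project.final_sample;
run;
```

If Age turns out to be numeric, compare it against unquoted numbers (Age <= 20), because '20' in quotes is a character literal. A compound condition also needs the variable on both sides, for example Age >= 21 and Age <= 30, or the shorthand 21 <= Age <= 30.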

PeterClemmensen
Tourmaline | Level 20

Here is how I would do it. This kind of classification problem is best solved with PROC FORMAT. That way, you only have to edit the code in one place if you use the logic in more than one place. Also, the code is cleaner.

 

Feel free to ask 🙂

 

proc format;
    value agegroup 
        0  -< 20  =  'Child'
        20 -< 30  =  'Early Adult'
        30 -< 60  =  'Middle Adult'  /* starts at 30 so that age 30 is not left unformatted */
        60 -< 80  =  'Late Adult'    /* starts at 60 for the same reason */
        80 - high =  'Senior';
run;

data have;
input age;
datalines;
2 
7 
14
21
34
45
63
78
84
94
;

data want;
   set have;
   Group=put(Age, AgeGroup.);
run;
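
For comparison, here is a minimal sketch of how the IF logic from the question could be written so that it runs. It assumes Age is numeric and takes the cut-offs from the question (20 and under is 'child'), so age 20 lands in a different group than with the format above. The LENGTH statement is there because the first assignment would otherwise fix Age_Group at 5 characters and truncate the longer labels.

```sas
data final_age;
    set project.final_sample;
    length Age_Group $12;                 /* longest label, 'middle adult', has 12 characters */
    /* A missing numeric value compares lower than any number, so test for it first */
    if missing(Age)   then Age_Group = ' ';
    else if Age <= 20 then Age_Group = 'child';
    else if Age <= 30 then Age_Group = 'early adult';
    else if Age <= 60 then Age_Group = 'middle adult';
    else if Age <= 80 then Age_Group = 'late adult';
    else                   Age_Group = 'senior';
run;
```

Using ELSE IF means each row stops at the first condition that matches, so the ranges cannot overlap and only the upper bound of each group needs to be written.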
aongkeko
Calcite | Level 5

Thanks, @PeterClemmensen, I think I am close to getting it:

 

here's what I did:

 

data final_age;
set project.final_sample;
Group=put(Age, AgeGroup.);
run;
proc format;
value agegroup
0 -<20 = 'Child'
20 -<30 = 'Early Adult'
31 -<60 = 'Middle Adult'
61 -<80 = 'Late Adult'
80 - high = 'Senior';
run;
proc print data=final_age;
run;

 

but:

 

My data set already has a variable named "Group", and the code above overwrote its values with the labels from the PROC FORMAT. It didn't create a new column.

Kurt_Bremser
Super User

If you want to create a new variable, you must do so in the assignment statement:

newvar = put(Age, AgeGroup.);

instead of

Group=put(Age, AgeGroup.);
PeterClemmensen
Tourmaline | Level 20

Simply use some fitting name other than Group (one that doesn't already exist in your data).

aongkeko
Calcite | Level 5

It worked! Thanks @PeterClemmensen  and @Kurt_Bremser !!!!!

aongkeko
Calcite | Level 5

Dear @PeterClemmensen  and @Kurt_Bremser,

 

Can I do the same for other variables, as shown below, or do I need to create a new proc format value statement for each variable?

 

thanks so very much!

 

data final_age;
set project.final_sample;
AgeGroup=put(Age, AgeGroup.);
newvar=put(Total_Charge, ChargeGroup.);
run;
proc format;
value agegroup
0 -<20 = 'Child'
20 -<30 = 'Early Adult'
30 -<60 = 'Middle Adult'
60 -<80 = 'Late Adult'
80 - high = 'Senior';
value chargegroup
0 - <500 = 'less than 500'
run;

PeterClemmensen
Tourmaline | Level 20

You're missing a semicolon, but otherwise no problem 🙂

 

proc format;
    value agegroup
    0 -<20 = 'Child'
    20 -<30 = 'Early Adult'
    30 -<60 = 'Middle Adult'
    60 -<80 = 'Late Adult'
    80 - high = 'Senior';

    value chargegroup
    0 -< 500 = 'less than 500';
run;
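
One thing to keep in mind with chargegroup as written: it only covers 0 up to (but not including) 500, and a value outside every range is returned as the plain number converted to text. The sketch below shows one way to close that gap. The labels '500 or more' and 'Unknown' are hypothetical and not from this thread, and Total_Charge is assumed to be numeric.

```sas
proc format;
    value chargegroup
        0 -< 500   = 'less than 500'
        500 - high = '500 or more'    /* hypothetical label, for illustration only */
        other      = 'Unknown';       /* catches missing and negative values */
run;

data final_age;
    set project.final_sample;
    ChargeGroup = put(Total_Charge, chargegroup.);
run;

/* Quick check that every row landed in one of the expected groups */
proc freq data=final_age;
    tables ChargeGroup;
run;
```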
aongkeko
Calcite | Level 5

Dear @PeterClemmensen  and @Kurt_Bremser ,

 

I think I messed up my sample data set when I tried creating newvar for other existing variables. I redid my initial steps, but I can no longer get the correct output. Here's the log:

 

731 data final_age;
732 set project.final_sample;
733 agegroup=put(Age, agegroup);
--------
85
76
ERROR 85-322: Expecting a format name.

ERROR 76-322: Syntax error, statement will be ignored.

734 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.FINAL_AGE may be incomplete. When this step was stopped there were 0
observations and 8 variables.
WARNING: Data set WORK.FINAL_AGE was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


735 proc format;
736 value agegroup;
NOTE: Format AGEGROUP is already on the library WORK.FORMATS.
NOTE: Format AGEGROUP has been output.
737 0 -<20 = 'Child'
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
738 20 -<30 = 'Early Adult'
739 31 -<60 = 'Middle Adult'
740 61 -<80 = 'Late Adult'
741 80 - high = 'Senior';
NOTE: The previous statement has been deleted.
742 run;

WARNING: RUN statement ignored due to previous errors. Submit QUIT; to terminate the procedure.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds

NOTE: The SAS System stopped processing this step because of errors.

 

 

sorry for pestering you guys on this! will appreciate any hint 🙂

 

 

PeterClemmensen
Tourmaline | Level 20

Remember: a format name ends with a period.
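
A small self-contained sketch of the difference. The format name agedemo and its two labels are hypothetical, chosen so that running this does not replace your own agegroup format.

```sas
proc format;
    value agedemo                      /* hypothetical format, for illustration only */
        low -< 20 = 'Child'
        20 - high = 'Adult';
run;

data _null_;
    age = 34;
    grp = put(age, agedemo.);          /* the trailing period marks agedemo as a format */
    /* grp = put(age, agedemo);           without the period: ERROR 85-322, Expecting a format name */
    put grp=;                          /* writes the result to the log */
run;
```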

aongkeko
Calcite | Level 5

Sorry, I added the period, but the result is the same 😞

 

here's the log output again:

 

820 data final_age;
821 set project.final_sample;
822 agegroup=put(Age,agegroup.);
823 run;

NOTE: There were 1000 observations read from the data set PROJECT.FINAL_SAMPLE.
NOTE: The data set WORK.FINAL_AGE has 1000 observations and 8 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds


824 proc format;
825 value agegroup;
NOTE: Format AGEGROUP is already on the library WORK.FORMATS.
NOTE: Format AGEGROUP has been output.
826 0 -<20 = 'Child'
-
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
827 20 -<30 = 'Early Adult'
828 31 -<60 = 'Middle Adult'
829 61 -<80 = 'Late Adult'
830 80 - high = 'Senior';
NOTE: The previous statement has been deleted.
831 run;

WARNING: RUN statement ignored due to previous errors. Submit QUIT; to terminate the procedure.
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds

NOTE: The SAS System stopped processing this step because of errors.


832 proc print data=final_age;
833 run;

NOTE: There were 1000 observations read from the data set WORK.FINAL_AGE.
NOTE: PROCEDURE PRINT used (Total process time):
real time 0.28 seconds
cpu time 0.26 seconds

 

Kurt_Bremser
Super User

First of all, create the format BEFORE(!) you use it.

Second, place the semicolons where needed:

proc format;
value agegroup
  0 -< 20 = 'Child'
  20 -< 30 = 'Early Adult'
  30 -< 60 = 'Middle Adult'  /* starts at 30 so that age 30 is not left unformatted */
  60 -< 80 = 'Late Adult'    /* starts at 60 for the same reason */
  80 - high = 'Senior'
; /* this semicolon ends the value statement */
run;
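
Putting both points together, the whole program could be ordered as in this sketch. It uses the data set and format names from this thread. Age_Group is just one possible name for the new variable, and Age is assumed to be numeric.

```sas
/* Step 1: define the format before any step that uses it */
proc format;
    value agegroup
        0 -< 20   = 'Child'
        20 -< 30  = 'Early Adult'
        30 -< 60  = 'Middle Adult'
        60 -< 80  = 'Late Adult'
        80 - high = 'Senior'
    ;   /* one semicolon, after the last range, ends the VALUE statement */
run;

/* Step 2: apply the format; note the period after the format name */
data final_age;
    set project.final_sample;
    Age_Group = put(Age, agegroup.);   /* a new name, so no existing column is overwritten */
run;

/* Step 3: inspect the result */
proc print data=final_age;
run;
```

In the log above, the semicolon directly after value agegroup ended the statement early, so SAS stored an empty format and then rejected the range lines as a separate, invalid statement.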
aongkeko
Calcite | Level 5

Thanks guys! I really appreciate your patience! I'm back to my original working sample!

 

happy thanksgiving 😄

 

 
