Dhana18
Obsidian | Level 7

Hi, I am trying to create an age group variable from an age variable, which is a character variable. This code is not producing anything. What is wrong with it?

data STIREPOT.STI2020_Age_Grp;
set STIREPOT.STI2020_D;
length age_grp $ 6;
IF Age='0' And age='4' THEN AGE_GRP='0-4';
IF Age='5' AND age='9' THEN AGE_GRP= '5-9';
IF Age='10' AND age='14' THEN AGE_GRP= '10-14';
IF Age='15' AND age='19' THEN AGE_GRP= '15-19';
IF Age='20' AND age='24' THEN AGE_GRP= '20-24';
IF Age='25' AND age='29' THEN AGE_GRP= '25-29';
IF Age='30' AND age='34' THEN AGE_GRP= '30-34';
IF Age='35' AND age='39' THEN AGE_GRP= '35-39';
IF Age='40' AND age='44' THEN AGE_GRP= '40-44';
IF Age='45' AND age='49' THEN AGE_GRP= '45-49';
IF Age='50' AND age='100' THEN AGE_GRP= '>=50';
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Ron_Cody
Obsidian | Level 7

Hi. 

 

Is Age a character or numeric variable?  You don't want to use AND because Age can never be equal to two different values.  If Age were numeric I would suggest something like:

 

If missing(Age) then Age_Grp = ' ';
else if Age le 4 then Age_Grp = '0 to 4';
else if Age le 9 then Age_Grp = '5 to 9';

Whether you want lt or le at each boundary is up to you.
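For the case in the original question, where Age is stored as character, the same IF / ELSE IF chain could be written out as in the sketch below. It assumes ages are whole numbers of at most three digits. The names Test_Char, Age_Groups and Age_Num are made up for illustration and are not from the thread.

```sas
/* Sketch only. Test_Char, Age_Groups and Age_Num are hypothetical names. */
data Test_Char;
   input Age $;
   datalines;
0
4
5
12
37
50
abc
;
run;

data Age_Groups;
   set Test_Char;
   length Age_Grp $ 6;
   /* ?? suppresses the invalid-data note; unreadable values become missing */
   Age_Num = input(Age, ?? 3.);
   if missing(Age_Num)   then Age_Grp = ' ';
   else if Age_Num le 4  then Age_Grp = '0-4';   /* would also catch negative ages */
   else if Age_Num le 9  then Age_Grp = '5-9';
   else if Age_Num le 14 then Age_Grp = '10-14';
   else if Age_Num le 19 then Age_Grp = '15-19';
   else if Age_Num le 24 then Age_Grp = '20-24';
   else if Age_Num le 29 then Age_Grp = '25-29';
   else if Age_Num le 34 then Age_Grp = '30-34';
   else if Age_Num le 39 then Age_Grp = '35-39';
   else if Age_Num le 44 then Age_Grp = '40-44';
   else if Age_Num le 49 then Age_Grp = '45-49';
   else                       Age_Grp = '>=50';
run;

proc print data=Age_Groups;
run;
```

Because each ELSE IF is only reached when the earlier tests failed, a single upper bound per line is enough. If negative ages are possible in your data, add an explicit test for them before the first group.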

 

 


6 REPLIES 6
PaigeMiller
Diamond | Level 26
IF Age='0' And age='4' 

This is never true. Age cannot be both '0' and '4'. 


Do yourself a favor, make age numeric and then things are a lot easier:

 

data STIREPOT.STI2020_Age_Grp;
set STIREPOT.STI2020_D;
length age_grp $ 6;
age1=input(age,3.); /* creates a numeric variable AGE1 from the character variable AGE */
if 0<=age1<=4 then age_grp='0-4';
else if 5<=age1<=9 then ...

 

ADDING: if you create AGE_GRP as a character variable with values such as '0-4', these values will not sort properly in most outputs, because SAS sorts them alphabetically (for example, '10-14' comes before '5-9'). If you want proper sorting, apply a custom format to the numeric AGE1 instead; many SAS procedures can then keep the groups in numerical order.
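A minimal sketch of that idea, assuming a numeric variable AGE1 holding whole-number ages. The data set Have and the format name agegrp are hypothetical. PROC FREQ groups rows by the formatted value, and ORDER=INTERNAL (its default) orders the groups by the underlying numeric value rather than by the label text.

```sas
/* Sketch only. Have and agegrp are hypothetical names. */
data Have;
   input Age1 @@;
   datalines;
3 7 12 12 18 23 47 52 61
;
run;

proc format;
   value agegrp
      0  - 4    = '0-4'
      5  - 9    = '5-9'
      10 - 14   = '10-14'
      15 - 19   = '15-19'
      20 - 24   = '20-24'
      25 - 29   = '25-29'
      30 - 34   = '30-34'
      35 - 39   = '35-39'
      40 - 44   = '40-44'
      45 - 49   = '45-49'
      50 - high = '>=50';
run;

/* AGE1 stays numeric; only its display is grouped by the format */
proc freq data=Have order=internal;
   tables Age1;
   format Age1 agegrp.;
run;
```

With this approach '5-9' should be listed before '10-14', because the ordering comes from the numeric values 5 to 9 and 10 to 14, not from the character labels.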

--
Paige Miller
Patrick
Opal | Level 21

Converting age to a numerical value and using a format could make things as simple as below.

proc format;
  value age_grp(default=6)
    0-4     = '0-4'
    5-9     = '5-9'
    ....and so on ....
    50-high  = '>=50'
    .       = 'miss'
    other   = 'na'
  ;
run;

data stirepot.sti2020_age_grp;
  set stirepot.sti2020_d;
  age_grp=put(input(age,best32.),age_grp6.);
run;
jimbarbour
Meteorite | Level 14

I am of the same opinion as @Patrick.  Whenever I see ranges being assigned to something continuous like age, I think Proc Format.  See code and results, below.  Proc Format has the advantage that if something comes along that's outside of your normal range, e.g. Ron who is 115 in my test data, you can still catch it by specifying HIGH as the upper end of the last range.  You can also specify OTHER which is a catch all for anything that you didn't define.  In my example, I have a negative age and an invalid (garbage data) age, both of which are handled well by the format.

 

Jim

DATA	Test_Ages;
	INPUT	Name	:	$32.	Age;
DATALINES;
Bob 5
Susan 10
Joyce 15
Michiko 20
Marjorie 25
Corie 30
Devendra 35
Sugantha 40
Venkata 45
DaiJun 50
Jasmin 55
Randell 60
Ron 115
@~a#!!Y2P yy2-
Future_Person -999
;
RUN;

PROC	FORMAT;
	VALUE	Age_Grps
		0  - 4 		= 	'0-4'
		5  - 9 		= 	'5-9'
		10 - 14		= 	'10-14'
		15 - 19		= 	'15-19'
		20 - 24		= 	'20-24'
		25 - 29		= 	'25-29'
		30 - 34		= 	'30-34'
		35 - 39		= 	'35-39'
		40 - 44		= 	'40-44'
		45 - 49		= 	'45-49'
		50 - HIGH	= 	'>=50'
		OTHER 	  	= 	'Invalid'
		;
RUN;

DATA	Want;
	SET	Test_Ages;
	Age_Grp	=	PUT(Age, Age_Grps.);
RUN;


 

ballardw
Super User

Another vote for a Format tied to a numeric variable.

 

One of the advantages of formats is that you can have multiple formats and use the one that you want at the time a report or analysis procedure is run. I have age grouping formats for 5-year intervals, 10-year intervals and age groups related to various levels of target groups based on a service model.
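As a rough illustration of switching formats at run time, the sketch below defines two groupings and applies each to the same numeric variable in separate PROC FREQ steps. The data set Have, the variable Age1, the format names age10yr and agebroad, and the cut points are all assumptions made for the example.

```sas
/* Sketch only. Have, Age1, age10yr and agebroad are hypothetical names. */
data Have;
   input Age1 @@;
   datalines;
3 7 12 12 18 23 47 52 61
;
run;

proc format;
   value age10yr
      0  - 9    = '0-9'
      10 - 19   = '10-19'
      20 - 29   = '20-29'
      30 - 39   = '30-39'
      40 - 49   = '40-49'
      50 - high = '>=50';
   value agebroad
      0  - 17   = 'Under 18'
      18 - high = '18 and over';
run;

/* Same data, 10-year groups */
proc freq data=Have;
   tables Age1;
   format Age1 age10yr.;
run;

/* Same data, two broad groups; no new variable is needed */
proc freq data=Have;
   tables Age1;
   format Age1 agebroad.;
run;
```

The stored data never changes; only the FORMAT statement in each procedure decides which grouping is reported.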

You can create formats for character values as well; however, to be reliable you have to list every single value, because with character values you will quickly find that "11" normally sorts before, i.e. is "less than", the value "2". Ranges of numeric values are much easier to specify and give the expected results.
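The character-ordering pitfall can be seen with a small test like the one below. It is only a sketch: the format name $agec and the test values are invented for the example.

```sas
/* Sketch only. $agec and the test values are hypothetical. */
data _null_;
   /* Character comparison goes left to right, so '1' < '2' decides it */
   if '11' < '2' then put 'Character: "11" is less than "2"';
   /* Numeric comparison behaves as expected */
   if 11 > 2 then put 'Numeric: 11 is greater than 2';
run;

proc format;
   /* A character range: anything that collates between '0' and '4' */
   value $agec
      '0' - '4' = '0-4'
      other     = 'other';
run;

data _null_;
   length Age $ 3;
   do Age = '3', '10', '25', '7';
      Grp = put(Age, $agec.);
      put Age= Grp=;   /* '10' and '25' also fall inside the '0' - '4' range */
   end;
run;
```

Values such as '10' and '25' start with a character between '0' and '4', so the character range picks them up as well, which is why listing every value (or converting to numeric) is the safer route.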

Ron_Cody
Obsidian | Level 7

I might suggest "Learning SAS by Example, 2nd edition" or "Getting Started with SAS Programming Using SAS Studio in the Cloud", both by me (Ron Cody). Just go to support.sas.com/cody or enter "Ron Cody" in Amazon search. You will see examples similar to your question.
