u58780790
Calcite | Level 5

Read sample input of 5 rows and 3 columns (Name, Gender, Age) from data lines into a temporary SAS dataset, and create a new column, Agegroup, with the values >45, 18-44, 0-5, 5-11, 11-18 while reading the data lines.

Derive the age group from the Age value. Below are example Age values and the corresponding Agegroup values.

Age   Agegroup
5     0-5
65    >45
90    >45
30    18-44

SOLUTION:

 

data task;
input name $ gender $ age;
datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;
a=age;
if 0<a<=5 then agegroup=0-5;
if 5<a<=11 then agegroup=5-11;
if 11<a<=18 then agegroup =11-18;
if 18<a<=44 then agegroup=18-44;
if a>45 then agegroup=>45;
run;
 
 
I tried it this way, but it does not work. Please help me out.
 
ChrisNZ
Tourmaline | Level 20

Surely your teacher and course material showed you how to do this.

What have you tried?

u58780790
Calcite | Level 5
Hi,

Yes, I tried, but I can't get the logic right. Please help me out.
andreas_lds
Jade | Level 19

Please

  • change the title so that it is in closer connection to the problem you have,
  • use the "insert sas code" button to insert properly formatted code,
  • show what you have tried so far.

I would use proc format to define a format, to avoid if-then-else, and then, after the input statement, the put function:

agegroup = put(Age, AgeGroupFmt.);
u58780790
Calcite | Level 5
data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sampath M 55
;
a=age
aggrp=put(a,agegroup.);
run;

proc format;
value agegroup
0-5='0-5'
5-11='5-11'
11-18='11-18'
18-44='18-44'
>45='>45';
run;

Astounding
PROC Star

You have some nearly working pieces.  Let's change a few things.

First, to use a format you have to create it first.  So PROC FORMAT must be moved to before the DATA step.

Second, there is a syntax error in the PROC FORMAT.  The way to get 45 and higher in the same group is to specify:

45 - high = '>45';

Third, when two ranges share an endpoint, expect that the first range that mentions the number determines which group it belongs to.  So 5 will go into the "0-5" category, not into the "5-11" category.

Fourth, there is no need to create the variable A.  You have AGE, and can use it:

aggrp=put(age, agegroup.);
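
The points above can be checked with a small sketch. It assumes the shared-endpoint ranges from the original post with the `45 - high` fix applied, and a hypothetical dataset name `boundary_check`. A DO loop stands in for real input so that the boundary ages are easy to see.

```sas
/* Define the format first, so that the DATA step can use it. */
proc format;
  value agegroup
    0-5     = '0-5'
    5-11    = '5-11'
    11-18   = '11-18'
    18-44   = '18-44'
    45-high = '>45';
run;

/* boundary_check is a hypothetical name; the DO list supplies test ages. */
data boundary_check;
  do age = 5, 11, 18, 30, 45, 90;
    aggrp = put(age, agegroup.);  /* PUT applies a format to a value */
    output;
  end;
run;

proc print data=boundary_check;
run;
```

With the first-range rule described above, ages 5, 11 and 18 should each print with the lower of their two candidate groups.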

u58780790
Calcite | Level 5
proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
aggrp=input(age,agegroup.);
run;


Is this right?
It doesn't produce any output; it shows an error at the aggrp line.
Astounding
PROC Star

You switched the PUT function to the INPUT function.  Restore the PUT function.

tarheel13
Rhodochrosite | Level 12

You're still going to get errors with the code like that. You need to move the aggrp assignment before the DATALINES statement, but I also think your ranges in proc format are not good. What if someone has an age between 5 and 6 or between 11 and 12? All of those people are going to get left out with the way you've assigned the ranges. Please read the documentation.

https://documentation.sas.com/doc/en/vdmmlcdc/8.1/proc/n03qskwoints2an1ispy57plwrn9.htm
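
One way to close those gaps, as a sketch only: PROC FORMAT lets a range exclude an endpoint with `<`, so neighbouring ranges can meet without leaving holes. The format name `agegrpc` and dataset name `gap_check` are hypothetical, and labelling ages between 44 and 45 as '18-44' is an assumption, since the original task does not say where they belong.

```sas
proc format;
  value agegrpc
    0-5     = '0-5'     /* 0 through 5, both ends included       */
    5<-11   = '5-11'    /* above 5, up to and including 11       */
    11<-18  = '11-18'   /* above 11, up to and including 18      */
    18<-<45 = '18-44'   /* above 18 and below 45 (assumption)    */
    45-high = '>45';    /* 45 and above, as in the thread        */
run;

/* Fractional ages that fell into the gaps of the integer-only ranges. */
data gap_check;
  do age = 5, 5.5, 11.5, 44.5, 45;
    aggrp = put(age, agegrpc.);
    output;
  end;
run;

proc print data=gap_check;
run;
```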

Also, it is ALWAYS a good idea to check your work. Please run the proc means that I use to check the derivation of aggrp.

proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
	infile datalines;
	input name $ gender $ age;
	aggrp=put(age,agegroup.);
	datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
proc print data=sample;
run;

proc means data=sample n nmiss min max;	
	var age;
	class aggrp / missing;
run;

 

Kurt_Bremser
Super User

@u58780790 wrote:
proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
aggrp=input(age,agegroup.);
run;


Is this right?
It doesn't produce any output; it shows an error at the aggrp line.

The crucial point is the DATALINES statement, which is documented here; you will find the clue in there.
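
For readers who want the clue spelled out: the SAS documentation describes DATALINES as the last statement in the DATA step, immediately before the first data line, with a lone semicolon ending the data. Anything written after that semicolon is no longer part of the step. A minimal sketch, where the derived variable `is_adult` is hypothetical and only there to show the placement:

```sas
data sample;
  input name $ gender $ age;
  /* All derivations belong here, before DATALINES. */
  is_adult = (age >= 18);   /* hypothetical example variable: 1 or 0 */
  datalines;
Vinod M 10
Shalini F 18
Sam M 55
;

proc print data=sample;
run;
```

An assignment placed after the closing semicolon would sit outside any step, which is why SAS reports an error on that line.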

tarheel13
Rhodochrosite | Level 12

Those formats are not going to work the way you've written them. Please try my code.


proc format;
	value agegroupf
	1='0-5'
	2='5-11'
	3='11-18'
	4='18-44'
	5='>45';
run;

data task2;
	set task;
	format agegroup agegroupf.;
	if age > .z then do;
		if age lt 5 then agegroup=1;
		else if 5 le age lt 11 then agegroup=2;
		else if 11 le age lt 18 then agegroup=3;
		else if 18 le age lt 45 then agegroup=4;
		else if age ge 45 then agegroup=5;
	end;
run;

title "Check agegroup derivation";
proc means data=task2 n nmiss min max;
	var age;
	class agegroup / missing;
run;
title;
u58780790
Calcite | Level 5
if age > .z then do;

Can you explain why this is there?
tarheel13
Rhodochrosite | Level 12

Well, I put it there to only include non-missing age values. 
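
A short sketch of why that comparison works: SAS sorts every numeric missing value (`._`, `.`, `.A` through `.Z`) below any actual number, and `.Z` is the largest of them, so `age > .z` is true only for real numbers. The dataset name `missing_check` and both flag variables are hypothetical; the MISSING function is shown alongside as an equivalent test.

```sas
data missing_check;
  do age = ., .a, .z, 0, 30;
    by_compare  = (age > .z);        /* 1 only for nonmissing values */
    by_function = not missing(age);  /* same test via MISSING()      */
    output;
  end;
run;

proc print data=missing_check;
run;
```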

u58780790
Calcite | Level 5
Can it be done in a DATA step without PROC FORMAT?
tarheel13
Rhodochrosite | Level 12

I have shown a data step solution. If you don't want to do agegroup=1,2,3,4,5 and apply a format then just set agegroup = '0-5', '5-11' and so on and so forth.
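
A minimal sketch of that variant, reading the data lines from the original post. Two things here are assumptions rather than part of the reply above: the boundaries follow the example table in the question (age 5 goes to '0-5'), and a LENGTH statement is added because otherwise the first assignment, '0-5', would fix the variable at 3 characters and truncate the longer labels. The dataset name `task3` is hypothetical.

```sas
data task3;
  input name $ gender $ age;
  length agegroup $5;                 /* room for '11-18' and '18-44' */
  if missing(age)   then agegroup = ' ';
  else if age <= 5  then agegroup = '0-5';
  else if age <= 11 then agegroup = '5-11';
  else if age <= 18 then agegroup = '11-18';
  else if age < 45  then agegroup = '18-44';
  else                   agegroup = '>45';
  datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;

proc print data=task3;
run;
```

The ELSE chain means each row is tested only until the first condition matches, so the upper bound of one group doubles as the lower bound of the next.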
