u58780790
Calcite | Level 5

Read sample input of 5 rows and 3 columns (Name, Gender, Age) using datalines into a temporary SAS dataset, and create a new column, Age group, with the values (>45, 18-44, 0-5, 5-11, 11-18) while reading the datalines into the SAS dataset.

Based on the Age column, create a new column, Age group.

Derive the age group from the age value. Below are example Age values and the corresponding age group values.

Age   Agegroup
5     0-5
65    >45
90    >45
30    18-44

 

 

SOLUTION:

 

data task;
input name $ gender $ age;
datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;
a=age;
if 0<a<=5 then agegroup=0-5;
if 5<a<=11 then agegroup=5-11;
if 11<a<=18 then agegroup =11-18;
if 18<a<=44 then agegroup=18-44;
if a>45 then agegroup=>45;
run;
 
 
I have tried it this way, but it does not work. Please help me out.
 
ChrisNZ
Tourmaline | Level 20

Surely your teacher and course material showed you how to do this.

What have you tried?

u58780790
Calcite | Level 5
Hi,

I did try, but I cannot work out the logic. Please help me out.
andreas_lds
Jade | Level 19

Please

  • change the title so that it is in closer connection to the problem you have,
  • use the "insert sas code" button to insert properly formatted code,
  • show what you have tried so far.

I would use proc format to define a format, to avoid if-then-else, and then, after the input statement, apply it with the put function:

agegroup = put(Age, AgeGroupFmt.);
u58780790
Calcite | Level 5
data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sampath M 55
;
a=age
aggrp=put(a,agegroup.);
run;

proc format;
value agegroup
0-5='0-5'
5-11='5-11'
11-18='11-18'
18-44='18-44'
>45='>45';
run;

Astounding
PROC Star

You have some nearly working pieces.  Let's change a few things.

 

First, to use a format you have to first create it.  So PROC FORMAT must be moved to before the DATA step.

 

Second, there is a syntax error in the PROC FORMAT.  The way to get 45 and higher in the same group is to specify:

 

45 - high = '>45';

 

Third, expect that the first mention of a number determines which group it belongs to.  So 5 will go into the "0-5" category, not into the "5-11" category.

 

Fourth, there is no need to create the variable A.  You have AGE, and can use it:

 

aggrp=put(age, agegroup.);
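
Taken together, the points above might look like the sketch below. It assumes upper-inclusive boundaries (5 belongs to '0-5'), and it uses the `<` range modifier to exclude an endpoint explicitly rather than relying on first-match order. The `18 <-< 45` cutoff and the test ages are illustrative choices, not part of the original post.

```sas
/* Define the format first so the DATA step can use it. */
proc format;
  value agegroup
    0   -  5    = '0-5'     /* 0 through 5, both ends included      */
    5  <-  11   = '5-11'    /* above 5, up to and including 11      */
    11 <-  18   = '11-18'   /* above 11, up to and including 18     */
    18 <-< 45   = '18-44'   /* above 18, below 45 (assumed cutoff)  */
    45  -  high = '>45';    /* 45 and higher, as suggested above    */
run;

/* Try a few boundary values without needing any input data. */
data _null_;
  do age = 5, 5.5, 11, 18, 44.5, 45;
    aggrp = put(age, agegroup.);
    put age= aggrp=;        /* writes each pair to the log */
  end;
run;
```

With these ranges, 5 should land in '0-5' and 5.5 in '5-11', so fractional ages are covered as well as whole numbers.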

u58780790
Calcite | Level 5
proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
aggrp=input(age,agegroup.);
run;


Is this right?
It doesn't produce any output; it shows an error at the aggrp line.
Astounding
PROC Star

You switched the PUT function to the INPUT function.  Restore the PUT function.

tarheel13
Rhodochrosite | Level 12

You're still going to get errors with the code like that. You need to move the aggrp assignment before the DATALINES statement. I also think your ranges in PROC FORMAT are not good: what if someone has an age between 5 and 6, or between 11 and 12? None of those values fall into any range the way you've assigned them. Please read the documentation.

https://documentation.sas.com/doc/en/vdmmlcdc/8.1/proc/n03qskwoints2an1ispy57plwrn9.htm

Also, it is ALWAYS a good idea to check your work. Please run the proc means that I use to check the derivation of aggrp.

proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
	infile datalines;
	input name $ gender $ age;
	aggrp=put(age,agegroup.);
	datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
proc print data=sample;
run;

proc means data=sample n nmiss min max;	
	var age;
	class aggrp / missing;
run;
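
The gap concern can be checked directly. In this sketch the format name gapfmt and the test ages are made up for illustration; the ranges are the integer ones from the code above. A value that matches no range gets no label and is written as a plain number instead.

```sas
proc format;
  value gapfmt            /* hypothetical name, same ranges as above */
    0  - 5    = '0-5'
    6  - 11   = '5-11'
    12 - 18   = '11-18'
    19 - 44   = '18-44'
    45 - high = '>45';
run;

data _null_;
  do age = 5, 5.5, 11.5, 44.5;
    aggrp = put(age, gapfmt.);
    put age= aggrp=;      /* 5.5, 11.5 and 44.5 match no range */
  end;
run;
```

If ages are always whole numbers the integer ranges are enough; otherwise contiguous ranges, or an OTHER= category, would avoid unlabeled values.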

 

Kurt_Bremser
Super User

@u58780790 wrote:
proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;


data sample;
input name $ gender $ age;
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;
aggrp=input(age,agegroup.);
run;


Is this right?
It doesn't produce any output; it shows an error at the aggrp line.

The crucial point is the DATALINES statement; you will find the clue in its documentation.

tarheel13
Rhodochrosite | Level 12

Those formats are not going to work the way you've written them. Please try my code.


proc format;
	value agegroupf
	1='0-5'
	2='5-11'
	3='11-18'
	4='18-44'
	5='>45';
run;

data task2;
	set task;
	format agegroup agegroupf.;
	if age > .z then do;
		if age lt 5 then agegroup=1;
		else if 5 le age lt 11 then agegroup=2;
		else if 11 le age lt 18 then agegroup=3;
		else if 18 le age lt 45 then agegroup=4;
		else if age ge 45 then agegroup=5;
	end;
run;

title "Check agegroup derivation";
proc means data=task2 n nmiss min max;
	var age;
	class agegroup / missing;
run;
title;
u58780790
Calcite | Level 5
if age > .z then do;

Can you explain why this is there?
tarheel13
Rhodochrosite | Level 12

Well, I put it there to only include non-missing age values. 
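
For background: SAS sorts missing numeric values below every real number, and .z is the largest of the special missing values, so age > .z is true only when age holds an actual number. The sketch below compares that test with the MISSING function, a common alternative; the test values are made up.

```sas
data _null_;
  do age = ., .a, .z, -1, 0, 30;
    nonmiss_z   = (age > .z);        /* 1 only for real numbers      */
    nonmiss_fun = not missing(age);  /* same result, stated directly */
    put age= nonmiss_z= nonmiss_fun=;
  end;
run;
```

Both flags should agree on every row; which form to use is mostly a matter of readability.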

u58780790
Calcite | Level 5
Can it be done in a data step without proc format?
tarheel13
Rhodochrosite | Level 12

I have shown a data step solution. If you don't want to do agegroup=1,2,3,4,5 and apply a format then just set agegroup = '0-5', '5-11' and so on and so forth.
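
A minimal sketch of that suggestion, assuming the cutoffs from the IF/ELSE chain shown earlier and the sample rows from the original question. The LENGTH statement is an addition: without it, the variable's length would be taken from its first assignment, and longer labels such as '11-18' would be truncated. The dataset name task_char is made up.

```sas
data task_char;                        /* hypothetical dataset name  */
  input name $ gender $ age;
  length agegroup $ 5;                 /* room for the longest label */
  if missing(age)  then agegroup = ' ';
  else if age < 5  then agegroup = '0-5';
  else if age < 11 then agegroup = '5-11';
  else if age < 18 then agegroup = '11-18';
  else if age < 45 then agegroup = '18-44';
  else                  agegroup = '>45';
  datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;
run;

proc print data=task_char;
run;
```

All assignments sit before DATALINES, which has to be the last statement of the step, immediately ahead of the data rows.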
