u58780790
Calcite | Level 5
Where is the data step solution? Can you show me?
tarheel13
Rhodochrosite | Level 12

Would you please look at what @Reeza just posted? A lot of people have given you solutions now. 

u58780790
Calcite | Level 5
Yup, I am seeing that.
Thank you!
ballardw
Super User

There are reasons not to use the "data step" solutions but instead use a format directly.

Reason #1: Using the format when printing or otherwise using the variable means that you do not have to modify the data set.

Reason #2: Most reporting, analysis and graphing procedures/options will honor the groups created by the format.

Reason #3: You can change the format associated with a variable without recreating the data set, using code such as Proc Datasets.

Reason #4: If you have multiple variables with the same (or sometimes just close enough) values, they can share one format instead of each needing an additional variable.

Reason #5: If the format definition is changed (but the name remains the same), the new groups (or spelling changes) are applied without any work on the data set at all.

 

There are more reasons but that gets you started.

 

The example below creates a data set to play with and two formats, then uses Proc Freq to count the members of the groups under each of the two formats.

data junk;
   do i=1 to 100;
      age = rand('integer',85);
      output;
   end;
run;

proc format;
value agegrpa
1-12 = 'Pre-teen'
13-17= 'Teen'
18-24= 'Young adult'
25-45= '25 to 45'
46-high='46+'
;
value agegrpb
1-9   =' 1 to 9'
10-19 ='10 to 19'
20-29 ='20 to 29'
30-39 ='30 to 39'
40-49 ='40 to 49'
50-59 ='50 to 59'
60-69 ='60 to 69'
70-79 ='70 to 79'
80-89 ='80 to 89'
;
run;

proc freq data=junk;
   title "Using agegrpa";
   table age;
   format age agegrpa.;
run;
proc freq data=junk;
   title "Using agegrpb";
   table age;
   format age agegrpb.;
run;

I've worked on projects that used as many as 12 different sets of age groups, because the data fed multiple programs that targeted different age groups and wanted reports based on their own target groups.
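
Reason #3 can be sketched as follows. This is only an illustration, assuming the junk data set and the agegrpb format from the example above already exist in the WORK library; Proc Datasets updates the format stored with the variable in place, so the data itself is not rewritten.

```sas
/* Sketch only: assumes work.junk and the agegrpb format were created */
/* by the example above.                                              */
proc datasets library=work nolist;
   modify junk;              /* edit the data set's attributes in place */
   format age agegrpb.;      /* store agegrpb as the format for age     */
quit;

/* No FORMAT statement is needed here: Proc Freq picks up the stored format. */
proc freq data=junk;
   title "Using the format stored on junk";
   table age;
run;
title;                       /* clear the title for later steps */
```

To switch groupings later, rerun the Proc Datasets step with a different format name.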

Astounding
PROC Star

One last response, modifying your latest program slightly:

 

proc format;
value agegroup
0-5='0-5'
6-11='5-11'
12-18='11-18'
19-44='18-44'
45-high='>45';
run;

data sample;
input name $ gender $ age;
aggrp = put(age, agegroup.);
datalines;
Vinod M 10
Shalini F 18
Reena F 25
Rishi M 40
Sam M 55
;

Once a semicolon marks the end of the datalines, you no longer need a run statement.
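
To check the result, one option is simply to list the data set. A minimal sketch, assuming the sample data set and the agegroup format from the program above:

```sas
/* Sketch only: assumes work.sample was created by the step above. */
proc print data=sample noobs;
   var name age aggrp;       /* aggrp holds the label returned by PUT */
run;
```

With the data lines above, ages 10, 18, 25, 40, and 55 should show aggrp values of 5-11, 11-18, 18-44, 18-44, and >45.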

Reeza
Super User

You missed a few concepts with your initial code:

 

  1. No code after datalines is executed. You needed to move your IF statements BEFORE your datalines/cards statements.
  2. Use IF/ELSE IF, not multiple IF statements
  3. AgeGroup values need to be in quotation marks as it's a character variable.
  4. Set a length for AgeGroup ahead of time so that the value isn't truncated.

 

Example is here.

 

data task;
input name $ gender $ age;
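a=age;

/* Declare the length first so longer values such as "11-18" are not truncated. */
length agegroup $10;

/* ELSE IF stops testing as soon as one range matches. */
if 0<a<=5 then agegroup="0-5";
else if 5<a<=11 then agegroup="5-11";
else if 11<a<=18 then agegroup="11-18";
else if 18<a<=44 then agegroup="18-44";
else if a>=45 then agegroup=">45";   /* >= so that age 45 is not left blank */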

datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;
run;
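
Point 4 is easy to miss, so here is a small sketch of what happens without the LENGTH statement. The data set name trunc_demo and the variable grp are made up for illustration. SAS sets the length of a character variable from the first place it appears in the step, which here is the three-character value "0-5".

```sas
/* Sketch only: trunc_demo and grp are hypothetical names. */
data trunc_demo;
   age = 30;
   if age <= 5 then grp = "0-5";           /* first use: grp gets length 3 */
   else if age <= 44 then grp = "18-44";   /* stored as "18-" (truncated)  */
run;

proc print data=trunc_demo;
run;
```

Adding a statement such as length grp $10; before the first assignment avoids the truncation.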

@u58780790 wrote:

Read sample input of 5 rows and 3 columns (Name, Gender, Age) using datalines into a temporary SAS data set, and create a new column Age-group with values (>45, 18-44, 0-5, 5-11, 11-18) while reading the datalines into the SAS data set.

Based on age column,

Create a new column Age group

Based on the age value, derive the age group. Below is an example of age values and the corresponding age group values.

Age   Agegroup
5     0-5
65    >45
90    >45
30    18-44

 

 

SOLUTION:

 

data task;
input name $ gender $ age;
datalines;
Reena F 25
Shyam M 40
Deva M 53
John M 63
Mery F 9
;
a=age;
if 0<a<=5 then agegroup=0-5;
if 5<a<=11 then agegroup=5-11;
if 11<a<=18 then agegroup =11-18;
if 18<a<=44 then agegroup=18-44;
if a>45 then agegroup=>45;
run;
 
 
I have tried it this way, but it is not working. Please help me out.
 

 

