BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
NewUsrStat
Pyrite | Level 9

Hi, 

suppose to have the following:

 

Subj   Age

a      30-40

a      30-40

a      50-60

a      50-60

b      30-40

b      30-40

b      30-40 

.................

 

 

Is there a way, for each Subj to replace for the age variable the values with the greatest ones? In other words, 30-40 for "a" should be replaced with 50-60. Thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

Ok . Try this

 

data have;
input Subj $ Age $;
datalines;
a 30-40 
a 30-40 
a 50-60 
a 50-60 
b 30-40 
b 30-40 
b 30-40 
;

proc sql;
create table want as
   select Subj,
          (select max(Age) 
           from have 
           where a.Subj=Subj 
           group by Subj) as Age
   from have as a;
quit;

View solution in original post

4 REPLIES 4
NewUsrStat
Pyrite | Level 9
I have a lot of ranges. I just have to keep the highest. Ranges are: 21-25, 26-30, 31-35, 36-40, 41-45, 46-50, 51-55, 56-60, 61-65 and 66-70.
PeterClemmensen
Tourmaline | Level 20

Ok . Try this

 

data have;
input Subj $ Age $;
datalines;
a 30-40 
a 30-40 
a 50-60 
a 50-60 
b 30-40 
b 30-40 
b 30-40 
;

proc sql;
create table want as
   select Subj,
          (select max(Age) 
           from have 
           where a.Subj=Subj 
           group by Subj) as Age
   from have as a;
quit;
PaigeMiller
Diamond | Level 26

May I suggest that you use a numeric variable to represent age rather than a text string? The advantage (which may not make a difference in your data, I don't know) is that a text string such as 5-9 will seem to be higher (because that's how text is evaluated, alphabetically) than 21-25. And so if you had a 5-9 for a subject, you would possibly get the wrong answer. Similarly, if there are people in the 100-110 year age group, these will appear to be less than the 21-25 age group. Even if this doesn't apply to your data, it is still a good idea for you (and anyone else reading along) to make age a numeric variable.

 

This problem doesn't happen with numeric variables, since these will be evaluated numerically, either 5 or 9 is less than 25, and 100 is greater than 25. And so the code from @PeterClemmensen would also likely fail in this more general case.

 

I would do this:

 

data want;
    set have;
    age_n=input(scan(age,1,'-'),3.);
run;

 

and then use @PeterClemmensen 's code (or something very similar to it) to get the maximum of AGE_N.

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 477 views
  • 2 likes
  • 3 in conversation