Hello,
I'm new to Sas and was wondering if this code could be simplified:
I'm basically adding a column for the person's age range (5 year range), for other functional procedures i add afterwards, but this is the part that takes longest to process in my program (plus my god how ugly :D)
DATA tranches1;
set DEMO;
if Age < 18 then
TraLibelle = "Moins de 18 ans";
else if 18 < AGE <= 24 then
TraLibelle = "De 18 à 24 ans";
else if 24 < AGE <= 29 then
TraLibelle = "De 25 à 29 ans";
else if 29 < AGE <= 34 then
TraLibelle = "De 30 à 34 ans";
else if 34 < AGE <= 39 then
TraLibelle = "De 35 à 39 ans";
else if 39 < AGE <= 44 then
TraLibelle = "De 40 à 44 ans";
else if 44 < AGE <= 49 then
TraLibelle = "De 45 à 49 ans";
else if 49 < AGE <= 54 then
TraLibelle = "De 50 à 54 ans";
else if 54 < AGE <= 59 then
TraLibelle = "De 55 à 59 ans";
else if 59 < AGE <= 64 then
TraLibelle = "De 60 à 64 ans";
else
TraLibelle = "65 ans et plus";
RUN;
Five year grouping is made for age 25 to 64. It can be used for fun as:
data have; do i = 1 to 20; age = int(99 * uniform(123)); output; end; drop i; run; data want; array T[11] $ 20 _temporary_ ( "Moins de 18 ans" "De 18 à 24 ans" "De 25 à 29 ans" "De 30 à 34 ans" "De 35 à 39 ans" "De 40 à 44 ans" "De 45 à 49 ans" "De 50 à 54 ans" "De 55 à 59 ans" "De 60 à 64 ans" "65 ans et plus" ); set have; if age < 18 then i = 1; else if age < 25 then i = 2; else if age > 64 then i = 11; else i = 2 + ceil((age - 24) / 5); TraLibelle = T[i]; drop i; run;
Hi @polpel IF THEN ELSE for your task is rather boring and tedious. Please take a look at USER DEFINED FORMATS aka PROC FORMAT. Do a simple search on the net, you will get gazillion examples.
hi, thanks for the quick response.
I already have a format set so that i can later order by ascending age range, here's the code:
PROC FORMAT;
value $tranche_age
"Moins de 18 ans"=01
"De 18 à 24 ans"=02
"De 25 à 29 ans"=03
"De 30 à 34 ans"=04
"De 35 à 39 ans"=05
"De 40 à 44 ans"=06
"De 45 à 49 ans"=07
"De 50 à 54 ans"=08
"De 55 à 59 ans"=09
"De 60 à 64 ans"=10
"65 ans et plus"=11
;
QUIT;
could i possibly use this to implement what i'm looking to do?
right now i was able to change my code to this:
DATA tranches1;
set DEMO;
select;
when (Age < 18) TraLibelle = "Moins de 18 ans";
when (18 <= Age <= 24) TraLibelle = "De 18 à 24 ans";
when (24 <= Age <= 29) TraLibelle = "De 25 à 29 ans";
when (29 <= Age <= 34) TraLibelle = "De 30 à 34 ans";
when (34 <= Age <= 39) TraLibelle = "De 35 à 39 ans";
when (39 <= Age <=44) TraLibelle = "De 40 à 44 ans";
when (44 <= Age <=49) TraLibelle = "De 45 à 49 ans";
when (49 <= Age <=54) TraLibelle = "De 50 à 54 ans";
when (54 <= Age <=59) TraLibelle = "De 55 à 59 ans";
when (59 <= Age <=64) TraLibelle = "De 60 à 64 ans";
otherwise TraLibelle = "65 ans et plus";
end;
RUN;
Hi @polpel ,
Try SELECT-WHEN statement:
1) nice blog post by DO-LOOP: https://blogs.sas.com/content/iml/2016/06/20/select-when-sas-data-step.html
2) documentation: https://documentation.sas.com/?docsetId=lestmtsref&docsetTarget=p09213s9jc2t99n1vx0omk2rh9ps.htm&doc...
All the best
Bart;
DATA tranches1;
set DEMO;
select ;
when ( Age < 18 ) TraLibelle = "Moins de 18 ans";
when ( AGE <= 24 ) TraLibelle = "De 18 à 24 ans";
when ( AGE <= 29 ) TraLibelle = "De 25 à 29 ans";
when ( AGE <= 34 ) TraLibelle = "De 30 à 34 ans";
when ( AGE <= 39 ) TraLibelle = "De 35 à 39 ans";
when ( AGE <= 44 ) TraLibelle = "De 40 à 44 ans";
when ( AGE <= 49 ) TraLibelle = "De 45 à 49 ans";
when ( AGE <= 54 ) TraLibelle = "De 50 à 54 ans";
when ( AGE <= 59 ) TraLibelle = "De 55 à 59 ans";
when ( AGE <= 64 ) TraLibelle = "De 60 à 64 ans";
otherwise TraLibelle = "65 ans et plus";
end;
RUN;
@polpel ,
One more approach would be double array and do-loop. But treat it more like "extra SAS tutorial" rather than real use case.
All the best
Bart
data Demo;
do Age = 7 to 77 by 3;
output;
end;
run ;
DATA tranches2;
set DEMO;
array A[0:11] _temporary_ (0 18 24 29 34 39 44 49 54 59 64 999);
array T[ 11] $ 20 _temporary_ (
"Moins de 18 ans"
"De 18 à 24 ans"
"De 25 à 29 ans"
"De 30 à 34 ans"
"De 35 à 39 ans"
"De 40 à 44 ans"
"De 45 à 49 ans"
"De 50 à 54 ans"
"De 55 à 59 ans"
"De 60 à 64 ans"
"65 ans et plus"
);
do _N_ = 1 to 11;
if A[_N_-1] < Age <= A[_N_] then
do;
TraLibelle = T[_N_] ;
leave;
end;
end;
RUN;
strange, I have the following output in both cases (select+when and arrays):
Obs Age TraLibelle 1 7 Moins de 18 ans 2 10 Moins de 18 ans 3 13 Moins de 18 ans 4 16 Moins de 18 ans 5 19 De 18 Ă 24 ans 6 22 De 18 Ă 24 ans 7 25 De 25 Ă 29 ans 8 28 De 25 Ă 29 ans 9 31 De 30 Ă 34 ans 10 34 De 30 Ă 34 ans 11 37 De 35 Ă 39 ans 12 40 De 40 Ă 44 ans 13 43 De 40 Ă 44 ans 14 46 De 45 Ă 49 ans 15 49 De 45 Ă 49 ans 16 52 De 50 Ă 54 ans 17 55 De 55 Ă 59 ans 18 58 De 55 Ă 59 ans 19 61 De 60 Ă 64 ans 20 64 De 60 Ă 64 ans 21 67 65 ans et plus 22 70 65 ans et plus 23 73 65 ans et plus 24 76 65 ans et plus
Bart
Five year grouping is made for age 25 to 64. It can be used for fun as:
data have; do i = 1 to 20; age = int(99 * uniform(123)); output; end; drop i; run; data want; array T[11] $ 20 _temporary_ ( "Moins de 18 ans" "De 18 à 24 ans" "De 25 à 29 ans" "De 30 à 34 ans" "De 35 à 39 ans" "De 40 à 44 ans" "De 45 à 49 ans" "De 50 à 54 ans" "De 55 à 59 ans" "De 60 à 64 ans" "65 ans et plus" ); set have; if age < 18 then i = 1; else if age < 25 then i = 2; else if age > 64 then i = 11; else i = 2 + ceil((age - 24) / 5); TraLibelle = T[i]; drop i; run;
hi, thanks for response,
only i don't understand what you're doing in the first step...
data want;
array T[11] $ 20 _temporary_ (
"Moins de 18 ans"
"De 18 à 24 ans"
"De 25 à 29 ans"
"De 30 à 34 ans"
"De 35 à 39 ans"
"De 40 à 44 ans"
"De 45 à 49 ans"
"De 50 à 54 ans"
"De 55 à 59 ans"
"De 60 à 64 ans"
"65 ans et plus"
);
set have;
if age < 18 then i = 1;
else if age < 25 then i = 2;
else if age > 64 then i = 11;
I am not sure what you mean by "----- i don't understand what you're doing in the first step...".
However, I will explain the logic used.
You require grouping of ages - a total of 11 groups. Each group is labelled and stored in an array T[ ]. Ages 25 to 64 need be grouped into 5-year widths which means 8 intervals(i=3 to i=10). Ages 18 to 24 has a width of 6 years. Rest of the years are falling in open intervals.
First interval i = 1 takes ages below 18 years.
Ages 18 to 24, 6-year width, will be placed in 2nd interval (i = 2).
Ages 65+ will be placed in i = 11.
Ages 25 to 64 takes intervals as said above. The corresponding i (i=3, ...,10) is computed using:
else i = 2 + ceil((age - 24) / 5);
Hope this clarifies your query.
Kind regards,
DataSP
Hello again and thanks for the quick answers!
I was able to use your method but I get an error at the end...
Here's the complete code
DATA calc;
array T[11] $ 20 _temporary_ (
"Moins de 18 ans"
"De 18 à 24 ans"
"De 25 à 29 ans"
"De 30 à 34 ans"
"De 35 à 39 ans"
"De 40 à 44 ans"
"De 45 à 49 ans"
"De 50 à 54 ans"
"De 55 à 59 ans"
"De 60 à 64 ans"
"65 ans et plus"
);
Lotdeselection = &NUM_CTR.;
set DEMO (where=(type_ben = "Adhérent"));
if age < 18 then i = 1;
else if age < 25 then i = 2;
else if age > 64 then i = 11;
else i = 2 + ceil((age - 24) / 5);
TraLibelle = T[i];
drop i;
RUN;
proc tabulate data=calc /*format=Z3.1*/out=tbdN /*style=[1]*/;
class TraLibelle sexe LotdeSelection;
var prorata;
table TraLibelle*LotdeSelection,sexe*(prorata)*(sum /*colpctsum*F=7.2*/);
run;
ODS SELECT ALL ;
proc sort data=tbdN (drop=_page_ _TYPE_ _TABLE_);
by TraLibelle;
run;
proc transpose data=tbdN out=T_;
by TraLibelle LotdeSelection;
var prorata_Sum;
id SEXE;
run;
proc sql;
create table tbd5 as
select LotdeSelection,
TraLibelle,/*put(TraLibelle,$tranche_age.) as test,*/
round(Homme,1) as NbhFinsurv,
round(Femme,1) as NbfFinsurv
from T_
ORDER BY put(TraLibelle,$tranche_age.);
quit;
and the problem is that whenever someone is less than eighteen, it displays a "." instead of "Moins de 18 ans". Is it because there are only 4 under 18 women and no under 18 men?
Best regards
If there is NONE having age < 18, you will get a missing Value which I thought it is legal value. If you want that such missing value to be handled in your way, let me know.
well, i have one value that is inexistent for men, but there are 4 women who are under 18.
if no one is under 18 (men or women), the "Moins de 18 ans" range shouldn't be displayed, but this isn't the case...
A format approach:
proc format library=work; value agegrp 0 - 18 = "Moins de 18 ans" 18 <- 24 = "De 18 à 24 ans" 24 <- 29 = "De 25 à 29 ans" 29 <- 34 = "De 30 à 34 ans" 34 <- 39 = "De 35 à 39 ans" 39 <- 44 = "De 40 à 44 ans" 44 <- 49 = "De 45 à 49 ans" 49 <- 54 = "De 50 à 54 ans" 54 <- 59 = "De 55 à 59 ans" 59 <- 64 = "De 60 à 64 ans" 65 -high = "65 ans et plus" ; RUN ; proc freq data=demo; tables age; format age agegrp.; run;
Formats have an advantage that you do not need to create additional variables, just a new format with different rules and apply in analysis, report or graphing procedure.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.