BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
pk2012
Calcite | Level 5

Hi All,

I am trying to create quintiles for my sample based on SIZE for each year. My sample has five years: 2004 to 2008. I sorted the data by year and SIZE. I, however, do not know how to create quintile for each year.

FIRM   YEAR    SIZE

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

Expected output:

FIRM   YEAR    SIZE Quintile

A          2004     10      1

B          2004     12      2

C          2004     13      3

D          2004     18      4

E          2004     19      5

F          2005      8        1

G          2005    10        2

H          2005    38        3

I          2005     41        4

J          2005     44        5

Please help.

1 ACCEPTED SOLUTION

Accepted Solutions
TomKari
Onyx | Level 15

Hi, pk

1. To find the actual quintile values:

PROC MEANS DATA=have NOPRINT NWAY NONOBS;

VAR SIZE;

CLASS YEAR;

OUTPUT  OUT=want1 P20()= P40()= P60= P80= / AUTONAME;

RUN;

2. To assign the quintile to the data (which looks like what you want):

Data must be sorted by YEAR;

PROC RANK DATA=have GROUPS=5 TIES=MEAN OUT=want2;

BY YEAR;

VAR SIZE;

RANKS rank_SIZE;

RUN;

Tom

View solution in original post

9 REPLIES 9
Reeza
Super User

Do you only have 5 observations per year as in your example?

pk2012
Calcite | Level 5

No, I have moe than five observations in the sample.

pk2012
Calcite | Level 5

Dear Linlin,

I tried the code but it does not create quintiles for each year. Each year has more than five observations.

Linlin
Lapis Lazuli | Level 10

Sorry. I misunderstood your question. I will delete my post.

Linlin
Lapis Lazuli | Level 10

is this helpful?

data have;

input FIRM $  YEAR    SIZE;

cards;

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

k          2005     46

l          2005     50

m          2005     70

;

proc sql noprint;

  select distinct year into :years separated by ' '

     from have;

quit;

%macro test;

%do i=1 %to %sysfunc(countw(&years));

%let year=%scan(&years,&i);

proc univariate noprint data=have;

where year=&year;

var size;

output out=quintile&year pctlpts=20 40 60 80 pctlpre=pct;

run;

/* write the cutpoints to macro variables */

data _null_;

set quintile&year;

call symput('q1',pct20) ;

call symput('q2',pct40) ;

call symput('q3',pct60) ;

call symput('q4',pct80) ;

run;

/* create a new variable containing the quintiles */

data _&year;

set have(where=(year=&year));;

if size <= &q1 then x_quint=1;

else if size <= &q2 then x_quint=2;

else if size <= &q3 then x_quint=3;

else if size <= &q4 then x_quint=4;

else x_quint=5;

run;

%end;

%mend;

%test

data want;

  set _:;

run;

proc print data=want;run;

                 Obs    FIRM    YEAR    SIZE    x_quint

                        1     A      2004     10        1

                        2     B      2004     12        2

                        3     C      2004     13        3

                        4     D      2004     18        4

                        5     E      2004     19        5

                        6     F      2005      8        1

                        7     G      2005     10        1

                        8     H      2005     38        2

                        9     I      2005     41        2

                       10     J      2005     44        3

                       11     k      2005     46        4

                       12     l      2005     50        4

                       13     m      2005     70        5

TomKari
Onyx | Level 15

Hi, pk

1. To find the actual quintile values:

PROC MEANS DATA=have NOPRINT NWAY NONOBS;

VAR SIZE;

CLASS YEAR;

OUTPUT  OUT=want1 P20()= P40()= P60= P80= / AUTONAME;

RUN;

2. To assign the quintile to the data (which looks like what you want):

Data must be sorted by YEAR;

PROC RANK DATA=have GROUPS=5 TIES=MEAN OUT=want2;

BY YEAR;

VAR SIZE;

RANKS rank_SIZE;

RUN;

Tom

Haikuo
Onyx | Level 15

You could make the following code more robust by making another pass to figure out the maximum number of obs each year, I just assume it is less than 100: (raw data stolen from LinLin's post)

data have;

input FIRM $  YEAR    SIZE;

cards;

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

k          2005     46

l          2005     50

m          2005     70

;

data want;

array t(100) _temporary_;

  do _n_=1 by 1 until (last.year);

   set have;

     by year;

      t(_n_)=size;

  end;

  _20=PCTL(20,of t(*));  _40=PCTL(40,of t(*));  _60=PCTL(60,of t(*));  _80=PCTL(80,of t(*));  _100=PCTL(100,of t(*));

  do _n_=1 by 1 until (last.year);

   set have;

     by year;

    if size <= _20 then q=1; else if size <=_40 then q=2; else if size <=_60 then q=3; else if size <=_80 then q=4; else if siZE <=_100 then q=5;

      output;

  end;

  call missing (of t(*));

  drop _:;

  run;

  proc print;run;

Haikuo

pk2012
Calcite | Level 5

Thank you everyone for the codes.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 6044 views
  • 6 likes
  • 6 in conversation