Quintile for each year

Accepted Solution Solved
Reply
Contributor
Posts: 28
Accepted Solution

Quintile for each year

Hi All,

I am trying to create quintiles for my sample based on SIZE for each year. My sample has five years: 2004 to 2008. I sorted the data by year and SIZE. I, however, do not know how to create quintile for each year.

FIRM   YEAR    SIZE

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

Expected output:

FIRM   YEAR    SIZE Quintile

A          2004     10      1

B          2004     12      2

C          2004     13      3

D          2004     18      4

E          2004     19      5

F          2005      8        1

G          2005    10        2

H          2005    38        3

I          2005     41        4

J          2005     44        5

Please help.


Accepted Solutions
Solution
‎09-24-2012 07:35 PM
PROC Star
Posts: 1,090

Re: Quintile for each year

Hi, pk

1. To find the actual quintile values:

PROC MEANS DATA=have NOPRINT NWAY NONOBS;

VAR SIZE;

CLASS YEAR;

OUTPUT  OUT=want1 P20()= P40()= P60= P80= / AUTONAME;

RUN;

2. To assign the quintile to the data (which looks like what you want):

Data must be sorted by YEAR;

PROC RANK DATA=have GROUPS=5 TIES=MEAN OUT=want2;

BY YEAR;

VAR SIZE;

RANKS rank_SIZE;

RUN;

Tom

View solution in original post


All Replies
Super User
Posts: 17,784

Re: Quintile for each year

Do you only have 5 observations per year as in your example?

Contributor
Posts: 28

Re: Quintile for each year

No, I have moe than five observations in the sample.

PROC Star
Posts: 7,360

Re: Quintile for each year

Contributor
Posts: 28

Re: Quintile for each year

Dear Linlin,

I tried the code but it does not create quintiles for each year. Each year has more than five observations.

Super Contributor
Posts: 1,636

Re: Quintile for each year

Sorry. I misunderstood your question. I will delete my post.

Super Contributor
Posts: 1,636

Re: Quintile for each year

is this helpful?

data have;

input FIRM $  YEAR    SIZE;

cards;

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

k          2005     46

l          2005     50

m          2005     70

;

proc sql noprint;

  select distinct year into :years separated by ' '

     from have;

quit;

%macro test;

%do i=1 %to %sysfunc(countw(&years));

%let year=%scan(&years,&i);

proc univariate noprint data=have;

where year=&year;

var size;

output out=quintile&year pctlpts=20 40 60 80 pctlpre=pct;

run;

/* write the cutpoints to macro variables */

data _null_;

set quintile&year;

call symput('q1',pct20) ;

call symput('q2',pct40) ;

call symput('q3',pct60) ;

call symput('q4',pct80) ;

run;

/* create a new variable containing the quintiles */

data _&year;

set have(where=(year=&year));;

if size <= &q1 then x_quint=1;

else if size <= &q2 then x_quint=2;

else if size <= &q3 then x_quint=3;

else if size <= &q4 then x_quint=4;

else x_quint=5;

run;

%end;

%mend;

%test

data want;

  set _:;

run;

proc print data=want;run;

                 Obs    FIRM    YEAR    SIZE    x_quint

                        1     A      2004     10        1

                        2     B      2004     12        2

                        3     C      2004     13        3

                        4     D      2004     18        4

                        5     E      2004     19        5

                        6     F      2005      8        1

                        7     G      2005     10        1

                        8     H      2005     38        2

                        9     I      2005     41        2

                       10     J      2005     44        3

                       11     k      2005     46        4

                       12     l      2005     50        4

                       13     m      2005     70        5

Solution
‎09-24-2012 07:35 PM
PROC Star
Posts: 1,090

Re: Quintile for each year

Hi, pk

1. To find the actual quintile values:

PROC MEANS DATA=have NOPRINT NWAY NONOBS;

VAR SIZE;

CLASS YEAR;

OUTPUT  OUT=want1 P20()= P40()= P60= P80= / AUTONAME;

RUN;

2. To assign the quintile to the data (which looks like what you want):

Data must be sorted by YEAR;

PROC RANK DATA=have GROUPS=5 TIES=MEAN OUT=want2;

BY YEAR;

VAR SIZE;

RANKS rank_SIZE;

RUN;

Tom

Respected Advisor
Posts: 3,124

Re: Quintile for each year

You could make the following code more robust by making another pass to figure out the maximum number of obs each year, I just assume it is less than 100: (raw data stolen from LinLin's post)

data have;

input FIRM $  YEAR    SIZE;

cards;

A          2004     10

B          2004     12

C          2004     13

D          2004     18

E          2004     19

F          2005      8

G          2005    10

H          2005    38

I          2005     41

J          2005     44

k          2005     46

l          2005     50

m          2005     70

;

data want;

array t(100) _temporary_;

  do _n_=1 by 1 until (last.year);

   set have;

     by year;

      t(_n_)=size;

  end;

  _20=PCTL(20,of t(*));  _40=PCTL(40,of t(*));  _60=PCTL(60,of t(*));  _80=PCTL(80,of t(*));  _100=PCTL(100,of t(*));

  do _n_=1 by 1 until (last.year);

   set have;

     by year;

    if size <= _20 then q=1; else if size <=_40 then q=2; else if size <=_60 then q=3; else if size <=_80 then q=4; else if siZE <=_100 then q=5;

      output;

  end;

  call missing (of t(*));

  drop _:;

  run;

  proc print;run;

Haikuo

Contributor
Posts: 28

Re: Quintile for each year

Thank you everyone for the codes.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 2391 views
  • 6 likes
  • 6 in conversation