Mirisage
Obsidian | Level 7
Hello Colleagues,

I am trying to split the values of income variable “PPI” below into 10 parts (deciles). PPI has over 10,000 records.

I tried the program below, but every value of the newly created ‘PPI_decile’ variable comes out as “10”, whereas the values should run from 1 through 10.

Could anyone help me revise this program or suggest a more efficient approach?



Data data_families1;
Input ID PPI;
Cards;
1 750
2 800
3 850
4 950
5 1250
6 1500
7 .
8 1600
9 1700
10 1850
11 2500
12 750
13 2100
14 2500
15 3750
16 .
17 750
;
Run;


/*DECILE CALCULATIONS*/
proc univariate data=data_families1;
var PPI;
output out=decile pctlpts=10 20 30 40 50 60 70 80 90 pctlpre=pct;
run;

/*Write the cut points to macro variable*/
data _null_;
set data_families1;
call symput ('q1', pct10);
call symput ('q2', pct20);
call symput ('q3', pct30);
call symput ('q4', pct40);
call symput ('q5', pct50);
call symput ('q6', pct60);
call symput ('q7', pct70);
call symput ('q8', pct80);
call symput ('q9', pct90);
run;

/*Creating a new variable containing the deciles*/
data data_families2;
set data_families1;
if PPI=. then PPI_decile=.;
else if PPI <=&q1 then PPI_decile=1;
else if PPI<=&q2 then PPI_decile=2;
else if PPI<=&q3 then PPI_decile=3;
else if PPI<=&q4 then PPI_decile=4;

else if PPI<=&q5 then PPI_decile=5;
else if PPI<=&q6 then PPI_decile=6;
else if PPI<=&q7 then PPI_decile=7;
else if PPI<=&q8 then PPI_decile=8;
else if PPI<=&q9 then PPI_decile=9;
else PPI_decile=10;
run;



Thank you

Mirisage
Cynthia_sas
Diamond | Level 26
Hi:
Did your code example get cut off??? Remember that if your code contains < or > symbols, you need to "protect" them, as described in this previous forum posting:
http://support.sas.com/forums/thread.jspa?messageID=27609&#27609

cynthia
Mirisage
Obsidian | Level 7
Hi Cynthia,

No, it did not get cut off.

So, the question is how to split the PPI variable below into 10 parts (deciles) using SAS.

Data data_families1;
Input ID PPI;
Cards;
1 750
2 800
3 850
4 950
5 1250
6 1500
7 .
8 1600
9 1700
10 1850
11 2500
12 750
13 2100
14 2500
15 3750
16 .
17 750
;
Run;
data_null__
Jade | Level 19
I think you want PROC RANK with the GROUPS option.

[pre]
Data data_families1;
Input ID PPI @@;
Cards;
1 750 2 800 3 850 4 950 5 1250 6 1500 7 .
8 1600 9 1700 10 1850 11 2500 12 750 13 2100
14 2500 15 3750 16 . 17 750
;
Run;
proc rank data=data_families1 groups=10 out=deciles;
var ppi;
ranks decile; /* group numbers run from 0 to 9; missing ppi stays missing */
run;
proc print data=deciles;
run;
[/pre]
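Because PROC RANK numbers the groups from 0 to 9 rather than 1 to 10, a follow-up step can shift them if the 1 to 10 labelling from the original question is wanted. The following is only a sketch: it assumes the PROC RANK step above has produced the data set deciles with the rank variable decile, and the output data set name deciles_1to10 is made up for illustration.

```sas
/* Sketch: convert PROC RANK groups (0-9) to decile labels 1-10.  */
/* Assumes WORK.DECILES with variable DECILE from the step above. */
/* DECILES_1TO10 is a hypothetical name chosen for this example.  */
data deciles_1to10;
  set deciles;
  /* Records with missing PPI have a missing rank; leave them missing */
  if not missing(decile) then PPI_decile = decile + 1;
run;

proc freq data=deciles_1to10;
  tables PPI_decile / missing;
run;
```

The PROC FREQ step is just a quick check that the labels span 1 to 10 and that the missing PPI records were not assigned to a group.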
Cynthia_sas
Diamond | Level 26
Oh, I just wondered because
[pre]
else if PPI
[/pre]

(which is where your post ends) is not a complete, valid SAS statement.

It just seemed odd.

cynthia
Mirisage
Obsidian | Level 7
Hi Cynthia,

This is the complete program I used to try to split the above data set into deciles.

/*DECILE CALCULATIONS*/
proc univariate data=data_families1;
var PPI;
output out=percentile pctlpts=10 20 30 40 50 60 70 80 90 pctlpre=pct;
run;

/*Write the cutpoints to macro variables*/
data _null_;
set data_families1;
call symput ('q1', pct10);
call symput ('q2', pct20);
call symput ('q3', pct30);
call symput ('q4', pct40);
call symput ('q5', pct50);
call symput ('q6', pct60);
call symput ('q7', pct70);
call symput ('q8', pct80);
call symput ('q9', pct90);
run;

/*Create a new variable containing the DECILES*/
data data_families2;
set data_families1;
if PPI=. then PPI_quint=.;
else if PPI <=&q1 then PPI_quint=1;
else if PPI<=&q2 then PPI_quint=2;
else if PPI<=&q3 then PPI_quint=3;
else if PPI<=&q4 then PPI_quint=4;

else if PPI<=&q5 then PPI_quint=5;
else if PPI<=&q6 then PPI_quint=6;
else if PPI<=&q7 then PPI_quint=7;
else if PPI<=&q8 then PPI_quint=8;
else if PPI<=&q9 then PPI_quint=9;

else PPI_quint=10;
run;


/*Test to make sure it worked*/
proc means data=data_families2 missing;
class PPI_quint;
var PPI;
run;
Mirisage
Obsidian | Level 7
Hi Cynthia,

Sorry, although I pasted the complete program again in the window above, it is truncated automatically. So, only a part is shown.
Mirisage
Obsidian | Level 7
Hi data_null_,

Thank you very much for this code.

It worked correctly.



Hi Cynthia,

Thank you as well for your support.

Mirisage
Peter_C
Rhodochrosite | Level 12
> Hello Colleagues,
>
> I am trying to split the values of income variable “PPI” below into 10 parts (deciles). PPI has over 10,000 records.
>
> I attempted the program indicated below but all values for the newly created ‘PPI_decile’ variable turn out to be “10” whereas it should have been 1, 2,3,4, 5, 6, 7,8, 9 and 10.
>
> I wonder if anyone of you could help me to revise this program or suggest a more efficient approach.
>
>
>
> Data data_families1;
> Input ID PPI;
> Cards;
> 1 750
> 2 800
> 3 850
> 4 950
> 5 1250
> 6 1500
> 7 .
> 8 1600
> 9 1700
> 10 1850
> 11 2500
> 12 750
> 13 2100
> 14 2500
> 15 3750
> 16 .
> 17 750
> ;
> Run;
>
>
> /*DECILE CALCULATIONS*/
> proc univariate data=data_families1;
> var PPI;
> output out=decile pctlpts=10 20 30 40 50 60 70 80 90
> pctlpre=pct;
> run;
>
> /*Write the cut points to macro variable*/
> data _null_;
> set data_families1;
> call symput ('q1', pct10);
> call symput ('q2', pct20);
> call symput ('q3', pct30);
> call symput ('q4', pct40);
> call symput ('q5', pct50);
> call symput ('q6', pct60);
> call symput ('q7', pct70);
> call symput ('q8', pct80);
> call symput ('q9', pct90);
> run;
>
> /*Creating a new variable containing the deciles*/
> data data_families2;
> set data_families1;
> if PPI=. then PPI_decile=.;
> else if PPI LE&q1 then PPI_decile=1;
> else if PPI LE &q2 then PPI_decile=2;
> else if PPI LE &q3 then PPI_decile=3;
> else if PPI LE &q4 then PPI_decile=4;
>
> else if PPI LE &q5 then PPI_decile=5;
> else if PPI LE &q6 then PPI_decile=6;
> else if PPI LE &q7 then PPI_decile=7;
> else if PPI LE &q8 then PPI_decile=8;
> else if PPI LE &q9 then PPI_decile=9;
> else PPI_decile=10;
> run;
>
>
>
> Thank you
>
> Mirisage

Mirisage
1. Your program was entirely there, as quoting your message reveals. It was just not displayed from the first <= onward, so in the quoted copy above I've replaced those operators with LE.

2. Your problem was not caused by using symput() instead of symputX() (although that choice does not help diagnose the problem).

3. Your solution placed every record into PPI_decile=10 because you loaded the macro variables from data_families1, the input to proc univariate, instead of from its output data set, out=decile.

With that change, your solution works just fine.
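The change described in point 3 might look like the sketch below. It assumes the PROC UNIVARIATE step from the original post has already run and written its one-row output to out=decile with variables pct10 through pct90. The array, the loop, and the use of symputx() are choices made for this illustration, not part of the original program. In the original step, pct10 to pct90 do not exist in data_families1, so they are missing, each macro variable resolves to a missing value, and no nonmissing PPI satisfies PPI <= . (missing sorts below every number), which is why every record fell through to 10.

```sas
/* Sketch: load the nine cut points from the PROC UNIVARIATE output */
/* data set (out=decile), not from the raw input data_families1.    */
data _null_;
  set decile;
  array p{9} pct10 pct20 pct30 pct40 pct50 pct60 pct70 pct80 pct90;
  do i = 1 to 9;
    /* symputx() trims the blanks that symput() would leave in the value */
    call symputx(cats('q', i), p{i});
  end;
run;

/* Write a few of the cut points to the log to confirm they are not missing */
%put q1=&q1 q5=&q5 q9=&q9;
```

With the macro variables populated this way, the final data step from the original post can be run unchanged.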
Now [inappropriate question], is it better than the proc rank approach?

answer = "special missing value" [question not applicable]
Mirisage
Obsidian | Level 7
Hi Peter,

Thank you very much for this.

Mirisage