Programming the statistical procedures from SAS

Quantiles and Sum of Variables...

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

Quantiles and Sum of Variables...

Hi everyone!

I am fairly new to SAS. Maybe you could help me with two problems:

1. I am trying to export the quantiles for several consecutive variables with the PROC UNIVARIATE procedure. However, even though I am mentioning all variables in the VAR statement (VAR x1-x200) the output just shows the results for one of the variables. Is there a way to export the quantiles of all variables at once?

2. There are several variables (including the observations) which I would like to sum up. I'm well aware of the SUM function. What I'm looking for is a way in creating n new variables, each a sum of the already existing variables.

In detail: x1= a1+....+a100, x2=a101+...+a200,. .....  a50=a4900+...+a5000

Since I would rather not type in all SUM commands by hand I would be glad if one of you had an idea. I've already thought about arrays abut couldn't think of an appropriate way.

Looking forward to your help! THANK YOU!


Accepted Solutions
Solution
‎11-03-2011 10:53 AM
Respected Advisor
Posts: 2,655

Re: Quantiles and Sum of Variables...

That seems odd--I ran that code on existing data and got all the quantiles for each variable, named as per x1_p30 (30th percentile of x1), x1_p60 (60th percentile of x1), etc.

Unfortunately, I agree that the decimal place thing doesn't work for PROC MEANS, and for PROC UNIVARIATE, it gets complicated with the pctlpts and pctlpre options.  However, I got the following to work quite well, and the variables px12_5, px197_5, px22_5, px297_5, px32_5, px397_5, px42_5, and px497_5 are in the output dataset with labels that look like "the   2.5000 percentile, x1", etc.:

proc univariate data=tryit;

var x1-x4;

output out=checkit2

     pctlpts= 2.5 97.5

     pctlpre= px1-px4;

run;

If you are only getting a single value, I have no idea what might be going on, unless you have BY, CLASS or WHERE statements that essentially partition the data before processing.  Can you show your code?

Steve Denham

View solution in original post


All Replies
PROC Star
Posts: 7,416

Quantiles and Sum of Variables...

Not sure if I correctly understand your requests.  If I do, for Q1, use ods.  Take a look at:

http://support.sas.com/documentation/cdl/en/procstat/63104/HTML/default/viewer.htm#procstat_univaria...

For Q2, couldn't you just use a variable list?  E.g.,

data have;

  input a b c;

  sum=sum(of a--c);

  cards;

12 0  36

0  0  20

30 29  0

1  30  4

55  0  0

0  12  0

;

If the variables are as shown in your example, you could also use a numbered variable list an not have to worry about the actual order of the variables in your dataset.  e.g.

sum=sum(of a1-a5000);

Contributor
Posts: 33

Re: Quantiles and Sum of Variables...

Thank you art297 for your quick reply!

Unfortunately my problems still remain. Please excuse my vague description.

To Question 1: I'd like to create a new dataset and not a table with the quantiles of the variables. I've used the following code:

proc univariated data=x;

Var x1 x2 x3

output out= quantiles pctlpts=90.0, 60.0, 30.0, pctlpre=q;

run;

However as mentioned in my original post, the dataset just gives me the result for one variable. Is there a way to export the quantiles of all variables at once?

To question 2: I am well aware of the sum(of....) command. In fact it's what I'm using at the moment. Though, I'm having lots of variables and I'd like sum up 100 of them in every new variable, i. e. not all in one variable - cause in this case the sum(of x1-x5000) would be the way to go.

In detail:

x1=a1+....+a100

x2=a101+....+a200

....

x50=a4900+....+a5000

At the moment the only way I can think of is the sum(of ...) command for every new variable - i.e. x1=sum(of a1 to a100). But that would mean a lot of typing for creating 50 new variables.

I hope you get a better picture of my difficulties. Maybe you could help me outSmiley Happy Thank you for your effort!!!

Respected Advisor
Posts: 2,655

Re: Quantiles and Sum of Variables...

What about shifting to PROC MEANS from UNIVARIATE?

This should give an output dataset named work.want with variables x1_P30, x1_P60, x1_P90, x2_P30, etc.

proc means noprint data=have;

var x1-x200;

output out=want p30= p60= p90=/autoname;

run;

Steve Denham

PROC Star
Posts: 7,416

Re: Quantiles and Sum of Variables...

And, in addition to Steve's advice, couldn't you do the sums with something like:

data have (drop=a: ySmiley Happy;

  array a(5000);

  array x(50);

  array y(100);

  retain a (5000*1);

  do i=1 to 4901 by 100;

    do j=0 to 99;

      y(j+1)=a(i+j);

    end;

    k+1;

    x(k)=sum(of ySmiley Happy;

  end;

run;

However, I'm not sure what you mean by you want a dataset rather than a table.  The words may be getting in the way, thus I think you'll have to explain what you mean by a "dataset".

Contributor
Posts: 33

Re: Quantiles and Sum of Variables...

Thank you Steve and art297 for your great support!

To Steve's advice:

I've tried the PROC MEANS procedure but it won't let me output more than one quantile (i.e. p90 or p60) at once. In addition to that it does not seem to work for quantiles with decimal places (i.e. 99.9%).

To Art297's advice:

First of allI want to thank you for your patience with me. Unfortunately the code you posted just seems to work for variables with one observation. When I use it on more than one observation I get the error: "Array subscript out ofrange at line....". I've tried to find the cause for the error but do not have a clue how to change your code. To be frank I still do not get how your code exactly works. Why does summing up all y (ySmiley Happy lead to x1-x50? Shouldn’t be the sum of all y be the same for all x? Could you tell me what the RETAIN a(5000*1) function does in the given example?

P.S.: With dataset I meant a new data file. With table I meant the print out in the results section.

Once again thank you for your awesome support!

Solution
‎11-03-2011 10:53 AM
Respected Advisor
Posts: 2,655

Re: Quantiles and Sum of Variables...

That seems odd--I ran that code on existing data and got all the quantiles for each variable, named as per x1_p30 (30th percentile of x1), x1_p60 (60th percentile of x1), etc.

Unfortunately, I agree that the decimal place thing doesn't work for PROC MEANS, and for PROC UNIVARIATE, it gets complicated with the pctlpts and pctlpre options.  However, I got the following to work quite well, and the variables px12_5, px197_5, px22_5, px297_5, px32_5, px397_5, px42_5, and px497_5 are in the output dataset with labels that look like "the   2.5000 percentile, x1", etc.:

proc univariate data=tryit;

var x1-x4;

output out=checkit2

     pctlpts= 2.5 97.5

     pctlpre= px1-px4;

run;

If you are only getting a single value, I have no idea what might be going on, unless you have BY, CLASS or WHERE statements that essentially partition the data before processing.  Can you show your code?

Steve Denham

Contributor
Posts: 33

Re: Quantiles and Sum of Variables...

Thank you so much Steve!!!

As it turns out, I forgot to state all variable names in the pctlpre option - I just put in letter q. So much for that....

Now there is only question 2 still to be solved....Smiley Wink

SAS Super FREQ
Posts: 3,547

Re: Quantiles and Sum of Variables...

Steve has the right idea.

Since "Mark is fairly new to SAS," I don't want to complicate the discussion. But for the "old timers" I will mention that if you have SAS/IML this problem is pretty simple:

proc iml;
use sashelp.class;
read all var _NUM_ into x[colname=varnames];

probs = {0.025 0.05 0.95 0.975};
call qntl(q, x, probs); /* compute quantiles of each column */
labels = "P" + strip(putn(probs, "5.3"));
labels = tranwrd(labels, ".", "_");
print q[rowname=labels colname=varnames];

/* then output matrix (or it's transpose) */

PROC Star
Posts: 7,416

Re: Quantiles and Sum of Variables...

Mark, Here is the explanation you wanted in the form of comments in the code.  I'll break the code into two parts: one creating a test file to approximate your data, and another to do the calculation.

/*create a test file to mimic your real data. The test*/

/*file, have, will have 1 record and 5000 variables*/

data have;

  array a(5000);

/*  assign a 1 to all 5000 variables*/

  retain a (5000*1);

run;

data want (drop=a: ySmiley Happy;

  set have;

/*  assign all 5000 variables to the array a*/

  array a(5000) a1-a5000;

/*  initialize k to 0 at the beginning of reading each new record*/

  k=0;

/*initialize 2 additional arrays to represent x1 to*/

/*x50 .. the variables that will contain the sums and*/

/*y1 to y100 .. the variables that will temporarily*/

/*contain the values from each set of 100 variables*/

  array x(50);

  array y(100);

/*loop through the variables 100 at a time*/

  do i=1 to 4901 by 100;

/*fill the array y with each set of 100 variables*/

    do j=0 to 99;

      y(j+1)=a(i+j);

    end;

/*calculate the desired sum*/

    k+1;

    x(k)=sum(of ySmiley Happy;

/*move onto the next set of 100 variables*/

  end;

run;

Contributor
Posts: 33

Re: Quantiles and Sum of Variables...

Thank you all!

@Rick: Thank you for the alternative method!

@Art297: Wow.... Thank you so much! The comments really help a lot... But I still think that this code is not working in case of more than one record for each variable (see my last reply)Smiley Sad

PROC Star
Posts: 7,416

Re: Quantiles and Sum of Variables...

You would have to explain your data with an example.  It sounds like you first want to sum or doing something with the individual records before doing your analysis .. like maybe first run proc summary to create a file with aggregated data.

Contributor
Posts: 33

Re: Quantiles and Sum of Variables...

@Art297

Aggregated data wouldn't help. To give you an idea:

Let's assume my data consists of 5000 variables (a1-a5000) each with 100 records.

My goal is to sum up those variables as follows (horizontal addition):

x1= a1+...+a100 .....x50=a4900+...+a5000

Each new variable (x1 to x50) should have 100 records.

Maybe this helps to clear things up! THANK YOU!

Super User
Posts: 9,769

Re: Quantiles and Sum of Variables...

How about:

options symbolgen mprint mlogic;
%macro a;
%let start=1;
%let end=100;
data want;
 set have;
 %do i=1 %to 50;
  x&i=sum(of a&start - a&end);
  %let start=%eval(&end+1);
  %let end=%eval(&start+99);
 %end;
run;
%mend a;

%a


Ksharp

PROC Star
Posts: 7,416

Re: Quantiles and Sum of Variables...

KSharp,

That is what my originally suggested code did but the OP said it wasn't what was needed.  I'm confused about what the OP is seeking.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 20 replies
  • 914 views
  • 7 likes
  • 5 in conversation