Sum all numeric variables

Accepted Solution Solved
Reply
Contributor
Posts: 39
Accepted Solution

Sum all numeric variables

My data looks like this.

 

Obs   Id   Numeric       Character

1        1       1                  xxx

2        1       2                  xxx

3        2       3                  xyz

4        2        4                 xyz

5        3        5                 zzz

6        3         6                zzz

 

What I want to do is create a new table that is grouped by the ID variable, while summing the numeric variable and retaining the value of the character variable (which is the same for every ID).

 

I know how to do this for this exact example with proc sql, a select statement and summing Numeric, and using a group by statement, but the problem is there are hundreds of numeric variables in my actual data set and I want to do it for every one. Anyone have any ideas?

 

Thanks


Accepted Solutions
Solution
‎10-23-2015 02:26 PM
Respected Advisor
Posts: 4,804

Re: Sum all numeric variables

Use the AUTONAME option of proc summary to avoid trouble with variable names

 


proc summary data=have;
    by id notsorted;
    id character;
    var _numeric_;
    output out=want( drop=id_sum _: ) sum= / autoname;
run;
PG

View solution in original post


All Replies
Trusted Advisor
Posts: 1,783

Re: Sum all numeric variables

UNTESTED CODE

proc summary data=whatever;
    id character;
    class id;
    var _numeric_;
    output out=sums sum=;
run;
Super User
Posts: 19,038

Re: Sum all numeric variables

This sums at the overall level plus the ID level, you may want to restrict it using the NWAY option to only the ID level.
Solution
‎10-23-2015 02:26 PM
Respected Advisor
Posts: 4,804

Re: Sum all numeric variables

Use the AUTONAME option of proc summary to avoid trouble with variable names

 


proc summary data=have;
    by id notsorted;
    id character;
    var _numeric_;
    output out=want( drop=id_sum _: ) sum= / autoname;
run;
PG
Trusted Advisor
Posts: 1,783

Re: Sum all numeric variables

What "trouble with variable names" can happen in this situation?

Respected Advisor
Posts: 4,804

Re: Sum all numeric variables

When ID is being used as a class or by variable but is also a _numeric_, it ends up being twice in the output dataset. Proc Summary doesn't like that. 

PG
Contributor hbi
Contributor
Posts: 66

Re: Sum all numeric variables

I am not sure which interpretation is correct, so I have included two versions.

 

If the objective is to do running sums of each numeric variable (individually), grouped by ID, then in theory, something like this could work: 

 

DATA have; 
   INPUT id Numeric1 Numeric2 Numeric3 Numeric4 Character $;
   DATALINES; 
1 1 11 115 1199 xxx
1 2 22 225 2299 xxx
2 3 33 335 3399 xyz
2 4 44 445 4499 xyz
3 5 55 555 5599 zzz
3 6 66 665 6699 zzz
;
RUN;


/* untested code */
DATA want;
  SET have;
  BY id; /* assumes your data is sorted properly */
  
  LENGTH Sum1-Sum4 8;
  ARRAY num_array Numeric1-Numeric4;
  ARRAY sum_array Sum1-Sum4;

  DO OVER num_array;
    IF first.id THEN sum_array = num_array;
    ELSE             sum_array + num_array;
  END;

RUN;

 

If the objective is to lump all numeric variables together (with the exception of ID) and find their running sum, something like this could work: 

 

/* untested code */
DATA want;
  SET have;
  BY id;
  
  /* id is numeric and it will be summed; thus negate it ... */
  obs_total = SUM(of _numeric_) - id;
  
  IF first.id THEN running_total = obs_total;
  ELSE             running_total + obs_total;

RUN;

 

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 378 views
  • 2 likes
  • 5 in conversation