BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jacob_klimek
Obsidian | Level 7

My data looks like this.

 

Obs   Id   Numeric       Character

1        1       1                  xxx

2        1       2                  xxx

3        2       3                  xyz

4        2        4                 xyz

5        3        5                 zzz

6        3         6                zzz

 

What I want to do is create a new table that is grouped by the ID variable, while summing the numeric variable and retaining the value of the character variable (which is the same for every ID).

 

I know how to do this for this exact example with proc sql, a select statement and summing Numeric, and using a group by statement, but the problem is there are hundreds of numeric variables in my actual data set and I want to do it for every one. Anyone have any ideas?

 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Use the AUTONAME option of proc summary to avoid trouble with variable names

 


proc summary data=have;
    by id notsorted;
    id character;
    var _numeric_;
    output out=want( drop=id_sum _: ) sum= / autoname;
run;
PG

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

UNTESTED CODE

proc summary data=whatever;
    id character;
    class id;
    var _numeric_;
    output out=sums sum=;
run;
--
Paige Miller
Reeza
Super User
This sums at the overall level plus the ID level, you may want to restrict it using the NWAY option to only the ID level.
PGStats
Opal | Level 21

Use the AUTONAME option of proc summary to avoid trouble with variable names

 


proc summary data=have;
    by id notsorted;
    id character;
    var _numeric_;
    output out=want( drop=id_sum _: ) sum= / autoname;
run;
PG
PaigeMiller
Diamond | Level 26

What "trouble with variable names" can happen in this situation?

--
Paige Miller
PGStats
Opal | Level 21

When ID is being used as a class or by variable but is also a _numeric_, it ends up being twice in the output dataset. Proc Summary doesn't like that. 

PG
hbi
Quartz | Level 8 hbi
Quartz | Level 8

I am not sure which interpretation is correct, so I have included two versions.

 

If the objective is to do running sums of each numeric variable (individually), grouped by ID, then in theory, something like this could work: 

 

DATA have; 
   INPUT id Numeric1 Numeric2 Numeric3 Numeric4 Character $;
   DATALINES; 
1 1 11 115 1199 xxx
1 2 22 225 2299 xxx
2 3 33 335 3399 xyz
2 4 44 445 4499 xyz
3 5 55 555 5599 zzz
3 6 66 665 6699 zzz
;
RUN;


/* untested code */
DATA want;
  SET have;
  BY id; /* assumes your data is sorted properly */
  
  LENGTH Sum1-Sum4 8;
  ARRAY num_array Numeric1-Numeric4;
  ARRAY sum_array Sum1-Sum4;

  DO OVER num_array;
    IF first.id THEN sum_array = num_array;
    ELSE             sum_array + num_array;
  END;

RUN;

 

If the objective is to lump all numeric variables together (with the exception of ID) and find their running sum, something like this could work: 

 

/* untested code */
DATA want;
  SET have;
  BY id;
  
  /* id is numeric and it will be summed; thus negate it ... */
  obs_total = SUM(of _numeric_) - id;
  
  IF first.id THEN running_total = obs_total;
  ELSE             running_total + obs_total;

RUN;

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 17154 views
  • 2 likes
  • 5 in conversation