BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Khaladdin
Quartz | Level 8

Hi all,

I have a question on SAS Programming. It is about conditional sum. But it is very specific for me. Therefore, I want to ask as an example. I have the following dataset:

Group A Quantity
    1    10      7
    1     8      4
    1     7      3
    1    10     5
    2    11     6
    2    13     8
    2     9      7
    2    13     9


I want to add two more columns to this dataset. The new dataset should be:

Group   A   Quantity   B     NewColumn
     1     10        7        10      12 (7+5)
     1      8         4        10      12
     1      7         3        10      12
     1     10        5        10      12
     2     13        6        13      15 (6+9)
     2     10        8        13      15
     2      9         7        13      15
     2     13        9        13      15

So, the column B should be equal tha maximum value of each group and it is the same for all observations of each group. In this example, Group number 1 has 4 values. They are 10, 8, 7, 10. The maximum among these values is 10. Therefore, the values of the observations of the B column for the first group are all equal to 10. Maximum number for group number 2 is 13. Therefore, the values of the observations of the B column for the second group are all equal to 13.
The column C is more complicated. Its value depends on the all columns. Similiar to B column, it will be the same within group. More detailed, it is the sum of the specific observations of QUANTITIES column. These specific observations should belong to the observations that have the maximum value in each group. In our example, it is 12 for the first group. The reason is, the maximum number of first group is 10. and the quantities belong to 10 are 7 and 5. So, the sum of these is 12. For the second group it is 15. because the maximum value of the second group is 13 and the quantities belong to 13 are 6 and 9. So the sum is 15.
I hope. I can explain it. Many thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
rogerjdeangelis
Barite | Level 11
Great explanation and nice presentation

data have;
input Group A Quantity;
cards4;
 1 10 7
 1 8 4
 1 7 3
 1 10 5
 2 11 6
 2 13 8
 2 9 7
 2 13 9
;;;;
run;quit;

data want;
 retain group a quantity;
 keep  group a quantity max sum;
 retain max beg end .;
 do until(last.group);
    set have;
    by group;
    if first.group    then beg=quantity;
    if a > max        then max=a;
    if last.group then do;
            end=quantity;
            sum=beg+end;
    end;
 end;
 do until(last.group);
   set have;
   by group;
   output;
 end;
 call missing(max, beg, end);

run;quit;

Up to 40 obs WORK.WANT total obs=8

Obs    GROUP     A    QUANTITY    MAX    SUM

 1       1      10        7        10     12
 2       1       8        4        10     12
 3       1       7        3        10     12
 4       1      10        5        10     12
 5       2      11        6        13     15
 6       2      13        8        13     15
 7       2       9        7        13     15
 8       2      13        9        13     15



View solution in original post

10 REPLIES 10
rogerjdeangelis
Barite | Level 11
Great explanation and nice presentation

data have;
input Group A Quantity;
cards4;
 1 10 7
 1 8 4
 1 7 3
 1 10 5
 2 11 6
 2 13 8
 2 9 7
 2 13 9
;;;;
run;quit;

data want;
 retain group a quantity;
 keep  group a quantity max sum;
 retain max beg end .;
 do until(last.group);
    set have;
    by group;
    if first.group    then beg=quantity;
    if a > max        then max=a;
    if last.group then do;
            end=quantity;
            sum=beg+end;
    end;
 end;
 do until(last.group);
   set have;
   by group;
   output;
 end;
 call missing(max, beg, end);

run;quit;

Up to 40 obs WORK.WANT total obs=8

Obs    GROUP     A    QUANTITY    MAX    SUM

 1       1      10        7        10     12
 2       1       8        4        10     12
 3       1       7        3        10     12
 4       1      10        5        10     12
 5       2      11        6        13     15
 6       2      13        8        13     15
 7       2       9        7        13     15
 8       2      13        9        13     15



Khaladdin
Quartz | Level 8

Amazing. Many thanks.

art297
Opal | Level 21

Just of note: @rogerjdeangelis's code will produce the wrong answer if there is a 3 or more way tie for the maximum value. e.g.:

 

data have;
  input Group   A   Quantity   _B     _NewColumn;
  cards;
     1     10        7        10      13 (7+1+5)
     1     10        1        10      13 (7+1+5)
     1      8         4        10      13
     1      7         3        10      13
     1     10        5        10      12
     2     13        6        13      15 (6+9)
     2     10        8        13      15
     2      9         7        13      15
     2     13        9        13      15
 ;

As such, I prefer to use proc sql for such calculations. The following can be written so that everything is nested within one call (i.e., not create a temporary file and then work from it), but I don't think either is more efficient than the other:

 

proc sql;
  create table need as
    select *, max(A) as B,
           (calculated B=A)*quantity as __NewColumn
      from have
        group by Group
  ;
  create table want (drop=_:) as
    select *, sum(__NewColumn) as NewColumn
      from need
        group by Group
  ;
 quit;

Art, CEO, AnalystFinder.com

 

collinelliot
Barite | Level 11

Your description of the problem is far clearer than 90% of what gets posted here.  Here's a possible solution using SQL:

 

data have;
input Group A Quantity;
datalines;
1 10 7
1 8 4
1 7 3
1 10 5
2 11 6
2 13 8
2 9 7
2 13 9
;

proc sql;
    CREATE TABLE want AS
    SELECT a.group, a.a, a.quantity, max(a.a) AS b,
           sum(a.quantity * (a.a = b.max_a)) AS new_column
    FROM have AS a
    LEFT JOIN (SELECT group, max(a) AS max_a
               FROM have
               GROUP BY group) AS b
    ON a.group = b.group
    GROUP BY a.group;
quit;
Khaladdin
Quartz | Level 8
Many thanks. I cannot check it now. I will check it tomorrow. If it works, i will accept it. Many thanks again for your time.
rogerjdeangelis
Barite | Level 11

I guess I read the problem wrong, I thought the op wanted the first  + last  as sum

not the associated with values..

Khaladdin
Quartz | Level 8
Many thanks for your comments. My question is, Assume that I will use sql and create table. Then, will it be seen like a table or dataset? The reason is, I will further work on that datasets. Therefore, I need datasets.
collinelliot
Barite | Level 11

Table and dataset are the same thing, really. Table is SQL nomenclature and dataset is SAS. The code that I've sent you is creating a SAS dataset. If you're querying data in Oracle, for example, there might be some details based on how you're connecting, but the syntax should work.

Khaladdin
Quartz | Level 8
Many thanks. I will check it and reply tomorrow. It will very helpful for me.
Quentin
Super User

This is untested (I'm away from sas) and will probably have mistakes, but a triple DoW loop should work. One pass to get the max per group, second pass to get the sum, third pass to output the records. Something like:

 

data want ;
  do until(last.group) ; 
    by group ;
    set have ;
    B=max(A,B) ;
  end ;

  do until(last.group) ;
    set have ;
    by group ;
    if A = B then NewColumn = sum(NewColumn, Quantity) ;
  end;

  do until(last.group);
    set have ;
    by group;
    output ;
  end ;
run;
BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Discussion stats
  • 10 replies
  • 3279 views
  • 7 likes
  • 5 in conversation