BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Desmond
Calcite | Level 5

I have got some data like this:

Attr_F1Attr_F2Attr_F3obs_lenwant
1233sum( of Attr_F1 - Attr_F3)
1232sum( of Attr_F1 - Attr_F2)

 

I would like to get the sum() of first several columns according to the value of the fourth column(obs_len).

For example: if obs_len =2, then I would like to sum up Attr_F1 and Attr_F2, etc.

 

Could you help me on this? Hopefully a data step to solve it. Thanks a lot.

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

@Desmond wrote:

Hi Thanks a lot for your reply.

However I got this error:

ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing,
 zero,
       or invalid.


So, it appears that you have an observation with a missing value of OBS_LEN. You need to decide how to define the value of variable WANT in this case -- and also in other special cases not covered by your sample data.

 

For example, you could define:

  1. If OBS_LEN is missing or all values of ATTR_F[*] are missing, WANT should be missing as well.
  2. Otherwise, if OBS_LEN is zero or negative, WANT should be set to 0 ("empty sum").
  3. Missing values involved in the sum are treated as zeros, unless all summands are missing (in which case WANT is set to missing).

A more robust DATA step implementing the above rules could look like this:

data want;
set have;
array Attr_F[3];
if obs_len<=.z | ~n(of Attr_F[*]) then want=.;
else if obs_len<=0 then want=0;
else do _n_=1 to obs_len;
  if Attr_F[_n_]>.z then want=sum(want,Attr_F[_n_]);
end;
run;

View solution in original post

6 REPLIES 6
FreelanceReinh
Jade | Level 19

Hello @Desmond and welcome to the SAS Support Communities!

 

Try this:

data want;
set have;
array Attr_F[3];
do _n_=1 to obs_len;
  want=sum(want,Attr_F[_n_]);
end;
run;
Desmond
Calcite | Level 5

Hi Thanks a lot for your reply.

However I got this error:

ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing,
 zero,
       or invalid.

Kurt_Bremser
Super User

The code DOES work, see this:

data have;
input Attr_F1 Attr_F2 Attr_F3 obs_len;
datalines;
1 2 3 3
1 2 3 2
;
run;

data want;
set have;
array Attr_F[3];
do _n_=1 to obs_len;
  want=sum(want,Attr_F[_n_]);
end;
run;

proc print data=want noobs;
run;

Log:

24         data have;
25         input Attr_F1 Attr_F2 Attr_F3 obs_len;
26         datalines;

NOTE: The data set WORK.HAVE has 2 observations and 4 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds
      
29         ;

30         run;
31         
32         data want;
33         set have;
34         array Attr_F[3];
35         do _n_=1 to obs_len;
36           want=sum(want,Attr_F[_n_]);
37         end;
38         run;

NOTE: There were 2 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 2 observations and 5 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds
      

39         
40         proc print data=want noobs;
41         run;

NOTE: There were 2 observations read from the data set WORK.WANT.
NOTE: The PROCEDURE PRINT printed page 1.
2                                                          Das SAS System                            08:14 Wednesday, April 24, 2019

NOTE: PROZEDUR PRINT used (Total process time):
      real time           0.02 seconds
      cpu time            0.00 seconds

Result:

Attr_F1    Attr_F2    Attr_F3    obs_len    want

   1          2          3          3         6 
   1          2          3          2         3 

This once again illustrates the utmost importance of properly posting example data in a data step with datalines.

FreelanceReinh
Jade | Level 19

@Desmond wrote:

Hi Thanks a lot for your reply.

However I got this error:

ERROR: Invalid DO loop control information, either the INITIAL or TO expression is missing or the BY expression is missing,
 zero,
       or invalid.


So, it appears that you have an observation with a missing value of OBS_LEN. You need to decide how to define the value of variable WANT in this case -- and also in other special cases not covered by your sample data.

 

For example, you could define:

  1. If OBS_LEN is missing or all values of ATTR_F[*] are missing, WANT should be missing as well.
  2. Otherwise, if OBS_LEN is zero or negative, WANT should be set to 0 ("empty sum").
  3. Missing values involved in the sum are treated as zeros, unless all summands are missing (in which case WANT is set to missing).

A more robust DATA step implementing the above rules could look like this:

data want;
set have;
array Attr_F[3];
if obs_len<=.z | ~n(of Attr_F[*]) then want=.;
else if obs_len<=0 then want=0;
else do _n_=1 to obs_len;
  if Attr_F[_n_]>.z then want=sum(want,Attr_F[_n_]);
end;
run;
Desmond
Calcite | Level 5

Hi That's exactly the case.

Thanks a lot!! best!

novinosrin
Tourmaline | Level 20
data have;
input Attr_F1 Attr_F2 Attr_F3 obs_len;
datalines;
1 2 3 3
1 2 3 2
;
run;

data want;
set have;
array t(999);
array a(*) attr_:;
call pokelong(peekclong(addrlong(a(1)),obs_len*8),addrlong(t(1)),obs_len*8);
want=sum(of t(*));
drop t:;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 702 views
  • 4 likes
  • 4 in conversation