DATA Step, Macro, Functions and more

How to calculate the sum of continous nonmissing value?

Accepted Solution Solved
Reply
Contributor hua
Contributor
Posts: 66
Accepted Solution

How to calculate the sum of continous nonmissing value?

Hello all,

 

I tried to find the sum of continous non-missing value in a dataset by group (stn and year), but still have some trouble to do so. I hope someone in the SAS community groups can help. Thanks a lot!

I have a dataset looks like this:

stn year month day var
1 2000 12 28 0.2
1 2000 12 29 0.5
1 2000 12 30 .
1 2000 12 31 2
1 2001 1 1 0.2
1 2001 1 2 1
1 2001 1 3 .
2 2000 12 28 0.6
2 2000 12 29 .
2 2000 12 30 .
2 2000 12 31 .
2 2001 1 1 0
2 2001 1 2 1.5
2 2001 1 3

3

 

And I want to get  the following:

stn year month day var sum
1 2000 12 28 0.2 0.7
1 2000 12 29 0.5 0.7
1 2000 12 30 .  
1 2000 12 31 2 2
1 2001 1 1 0.2 1.2
1 2001 1 2 1 1.2
1 2001 1 3 .  
2 2000 12 28 0.6 0.6
2 2000 12 29 .  
2 2000 12 30 .  
2 2000 12 31 .  
2 2001 1 1 0 4.5
2 2001 1 2 1.5 4.5
2 2001 1 3 3 4.5

If any one can assist in this problem I would greatly appreciate it!

 

Thank you and hope for your reply!

 

Best,

Hua


Accepted Solutions
Solution
‎06-27-2017 10:12 PM
PROC Star
Posts: 7,473

Re: How to calculate the sum of continous nonmissing value?

Here is one way:

data need;
  set have;
  _check=missing(var);
run;

data want (drop=_:);
  do until(last._check);
    set need;
    by stn year _check notsorted;
    if first._check then call missing(sum);
    sum+var;
  end;
  do until(last._check);
    set need;
    by stn year _check notsorted;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

View solution in original post


All Replies
Trusted Advisor
Posts: 1,566

Re: How to calculate the sum of continous nonmissing value?

You can expend your group (stn year) by adding a new variable to it:

 

data temp;
  set have;
    by stn year;
         retain new_var;
         if first.year then new_var=0;
         if var=. then new_var+1;
run;

Now use temp dataset to sum by stn year new_var and adjust the sum to each of the new group observation.

If you preffer you can drop the new_var from the final dataset.

  

Contributor hua
Contributor
Posts: 66

Re: How to calculate the sum of continous nonmissing value?

Thanks Shmuel, I tried your code, but it doesn't work. After I run the code, the new_var came out as from 1 to n where the var is missing value and if the var is nonmissing value new_var keep constant.
Solution
‎06-27-2017 10:12 PM
PROC Star
Posts: 7,473

Re: How to calculate the sum of continous nonmissing value?

Here is one way:

data need;
  set have;
  _check=missing(var);
run;

data want (drop=_:);
  do until(last._check);
    set need;
    by stn year _check notsorted;
    if first._check then call missing(sum);
    sum+var;
  end;
  do until(last._check);
    set need;
    by stn year _check notsorted;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

Contributor hua
Contributor
Posts: 66

Re: How to calculate the sum of continous nonmissing value?

Thank you very much! It worked well, and I got what I want!
Valued Guide
Posts: 765

Re: How to calculate the sum of continous nonmissing value?

[ Edited ]

Hi, another idea ...


data have;
set have;
new + missing(var) + missing(lag(var));
run;

 

proc sql;

create table want (drop=new) as select *, sum(var) as sum from have 
group by stn, year, month, new
order by stn, year, month, day;
quit;

 

Contributor hua
Contributor
Posts: 66

Re: How to calculate the sum of continous nonmissing value?

Thanks for your reply! I haven't catch up your idea, what do you mean by new + missing(var) + missing(lag(var)); why do we need to do this step? And I tried this code. It seems like most of the results are correct, but several got the wrong number. I don't understand the code, so I could not debug it.
PROC Star
Posts: 7,473

Re: How to calculate the sum of continous nonmissing value?

@MikeZdeb's code would have produced the same result as my code but, contrary to your specs, he included month as a group variable.

 

The following would produce the same result as my code:

data have;
  set have;
  new + missing(var) + missing(lag(var));
run;

proc sql;
  create table want (drop=new) as select *, sum(var) as sum from have 
    group by stn, year, new 
      order by stn, year, month, day;
quit;

He used new to create a separate group variable. missing(var) would increase new by 1 each time there was a missing value for var.

missing(lag(var)) checked the value of the previous var, thus initially had a value of 1. That was also essential to increase the group when the previous value of var was missing.

 

The correct summing was accomplish with the group statement.

 

Art, CEO, AnalystFinder.com

 

Contributor hua
Contributor
Posts: 66

Re: How to calculate the sum of continous nonmissing value?

Yes! That make sence! I tried again, it worked. Thank you both! I'm really appreciate it!
Valued Guide
Posts: 765

Re: How to calculate the sum of continous nonmissing value?

Hi ... MONTH was NOT used as a GROUP variable, just as an ORDER variable to control the look of the output since SQL goes its own merry way when doing the various calculations ...

 

proc sql;
create table want (drop=new) as select *, sum(var) as sum from have
group by stn, year, new
order by stn, year, month, day;
quit;

 

 

Valued Guide
Posts: 765

Re: How to calculate the sum of continous nonmissing value?

The statement ...

 

new + missing(var) + missing(lag(var));

 

creates a new variable named NEW (how prosaic) that is incremented by 1 each time ...

 

#1 a missing value of VAR is encountered in the current observation being processed in the data step

#2 a missing value of VAR in the previous observation that was processed

 

The expression MISSING(VAR) is TRUE and evaluates as 1 if the value is missing and evaluates as 1 (otherwise as zero). The same is true for the MISSING(LAG(VAR)) portion (add a 1 or a zero). Stements like that are a bit cryptic if you have not been introduced to using logical expressions (TRUE/FALSE that evaluates as 1/0) in equations ... just saves a lot of typing longer ways to accomplish the same task but I agree it might be hard to follow.

 

Not sure why it did not work for everything. Send me an example (msz03@albany.edu) and I can help the DEBUG process.

 

new + 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 317 views
  • 3 likes
  • 4 in conversation