BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hua
Obsidian | Level 7 hua
Obsidian | Level 7

Hello all,

 

I tried to find the sum of continous non-missing value in a dataset by group (stn and year), but still have some trouble to do so. I hope someone in the SAS community groups can help. Thanks a lot!

I have a dataset looks like this:

stn year month day var
1 2000 12 28 0.2
1 2000 12 29 0.5
1 2000 12 30 .
1 2000 12 31 2
1 2001 1 1 0.2
1 2001 1 2 1
1 2001 1 3 .
2 2000 12 28 0.6
2 2000 12 29 .
2 2000 12 30 .
2 2000 12 31 .
2 2001 1 1 0
2 2001 1 2 1.5
2 2001 1 3

3

 

And I want to get  the following:

stn year month day var sum
1 2000 12 28 0.2 0.7
1 2000 12 29 0.5 0.7
1 2000 12 30 .  
1 2000 12 31 2 2
1 2001 1 1 0.2 1.2
1 2001 1 2 1 1.2
1 2001 1 3 .  
2 2000 12 28 0.6 0.6
2 2000 12 29 .  
2 2000 12 30 .  
2 2000 12 31 .  
2 2001 1 1 0 4.5
2 2001 1 2 1.5 4.5
2 2001 1 3 3 4.5

If any one can assist in this problem I would greatly appreciate it!

 

Thank you and hope for your reply!

 

Best,

Hua

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

Here is one way:

data need;
  set have;
  _check=missing(var);
run;

data want (drop=_:);
  do until(last._check);
    set need;
    by stn year _check notsorted;
    if first._check then call missing(sum);
    sum+var;
  end;
  do until(last._check);
    set need;
    by stn year _check notsorted;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

View solution in original post

10 REPLIES 10
Shmuel
Garnet | Level 18

You can expend your group (stn year) by adding a new variable to it:

 

data temp;
  set have;
    by stn year;
         retain new_var;
         if first.year then new_var=0;
         if var=. then new_var+1;
run;

Now use temp dataset to sum by stn year new_var and adjust the sum to each of the new group observation.

If you preffer you can drop the new_var from the final dataset.

  

hua
Obsidian | Level 7 hua
Obsidian | Level 7
Thanks Shmuel, I tried your code, but it doesn't work. After I run the code, the new_var came out as from 1 to n where the var is missing value and if the var is nonmissing value new_var keep constant.
art297
Opal | Level 21

Here is one way:

data need;
  set have;
  _check=missing(var);
run;

data want (drop=_:);
  do until(last._check);
    set need;
    by stn year _check notsorted;
    if first._check then call missing(sum);
    sum+var;
  end;
  do until(last._check);
    set need;
    by stn year _check notsorted;
    output;
  end;
run;

Art, CEO, AnalystFinder.com

hua
Obsidian | Level 7 hua
Obsidian | Level 7
Thank you very much! It worked well, and I got what I want!
MikeZdeb
Rhodochrosite | Level 12

Hi, another idea ...


data have;
set have;
new + missing(var) + missing(lag(var));
run;

 

proc sql;

create table want (drop=new) as select *, sum(var) as sum from have 
group by stn, year, month, new
order by stn, year, month, day;
quit;

 

hua
Obsidian | Level 7 hua
Obsidian | Level 7
Thanks for your reply! I haven't catch up your idea, what do you mean by new + missing(var) + missing(lag(var)); why do we need to do this step? And I tried this code. It seems like most of the results are correct, but several got the wrong number. I don't understand the code, so I could not debug it.
art297
Opal | Level 21

@MikeZdeb's code would have produced the same result as my code but, contrary to your specs, he included month as a group variable.

 

The following would produce the same result as my code:

data have;
  set have;
  new + missing(var) + missing(lag(var));
run;

proc sql;
  create table want (drop=new) as select *, sum(var) as sum from have 
    group by stn, year, new 
      order by stn, year, month, day;
quit;

He used new to create a separate group variable. missing(var) would increase new by 1 each time there was a missing value for var.

missing(lag(var)) checked the value of the previous var, thus initially had a value of 1. That was also essential to increase the group when the previous value of var was missing.

 

The correct summing was accomplish with the group statement.

 

Art, CEO, AnalystFinder.com

 

hua
Obsidian | Level 7 hua
Obsidian | Level 7
Yes! That make sence! I tried again, it worked. Thank you both! I'm really appreciate it!
MikeZdeb
Rhodochrosite | Level 12

Hi ... MONTH was NOT used as a GROUP variable, just as an ORDER variable to control the look of the output since SQL goes its own merry way when doing the various calculations ...

 

proc sql;
create table want (drop=new) as select *, sum(var) as sum from have
group by stn, year, new
order by stn, year, month, day;
quit;

 

 

MikeZdeb
Rhodochrosite | Level 12

The statement ...

 

new + missing(var) + missing(lag(var));

 

creates a new variable named NEW (how prosaic) that is incremented by 1 each time ...

 

#1 a missing value of VAR is encountered in the current observation being processed in the data step

#2 a missing value of VAR in the previous observation that was processed

 

The expression MISSING(VAR) is TRUE and evaluates as 1 if the value is missing and evaluates as 1 (otherwise as zero). The same is true for the MISSING(LAG(VAR)) portion (add a 1 or a zero). Stements like that are a bit cryptic if you have not been introduced to using logical expressions (TRUE/FALSE that evaluates as 1/0) in equations ... just saves a lot of typing longer ways to accomplish the same task but I agree it might be hard to follow.

 

Not sure why it did not work for everything. Send me an example (msz03@albany.edu) and I can help the DEBUG process.

 

new + 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 2067 views
  • 3 likes
  • 4 in conversation