Hi SAS community,
Today I faced one problem when trying to count the number of observations receiving 0-value in my sample.
What I want is to count: for each distinct Type, how many "r" receive 0-value in every year.
My data is as below:
Type date r
13016K 01JAN1987 .
13016K 02JAN1987 .
13016K 05JAN1987 .
.
.
13016K 27DEC2019 0.2352941176
13016K 30DEC2019 0
13016K 31DEC2019 0
.
999673 27DEC2019 0.0368244859
999673 30DEC2019 -0.012915129
999673 31DEC2019 0.0345794393
I use the code like that
Proc sql;
create table want as
select *,
sum(case when r = 0 then 1 else 0 end) as nobs_0_r
from have
group by type,year;
Quit;
But the results turned out is not what I expected
Type date r nob_0_r
13016K 01APR1987 . 0
13016K 27JAN1987 . 0
13016K 13AUG1987 . 0
.
.
13016K 03JAN2019 0 199
13016K 02JAN2019 0 199
13016K 01JAN2019 0 199
.
.
A glance of the result, we can see that the date is not following the order, which should be wrong or else. Do you know how to fix this code, or if it is available, can you please provide me any other code working for my requirement (for each distinct Type, how many "r" receive 0-value in every year)?
And apart from that, I am not sure whether we should use "array" or else in this case, because apart from "r" I have a couple of other variables, just being aware of using "array" to deal with this requirement can harm the value of other variables as well.
Thanks in advance.
Why did you tell SQL to include ALL of the variables if you only wanted three?
To get the YEAR from a DATE use the YEAR() function.
To get 0/1 result just use a boolean expression.
create table want as
select
type
, year(date) as year
, sum(r=0) as nobs_0_r
from have
group by type,year
;
Some example starting data, preferably in the form of data step code pasted into a code box opened on the forum with the </> icon.
What the desired output for the example data should be.
I am afraid that wrong output without input data is margin at best for diagnosing or suggesting specific code.
Your code will generate a message in the log similar to:
NOTE: The query requires remerging summary statistics back with the original data.
That says the SUM was calculated for the groups and merged back on to the entire data. So every combination of type and year gets that summary total.
But without an example of what you expect for the output I can't make a suggestion because there are several ways the output might make sense and I don't see a reason to provide an example to "guess" at the desired output.
Thank you @ballardw for your answer.
Yes, the log also results in: "NOTE: The query requires remerging summary statistics back with the original data.". And, now, from your explanation, I understand that SUM was calculated for the groups and merged back on to the entire data.
What I expected for my output is:
Type year nob_0_r
13016K 1987 0
13016K 1988 0
13016K 1989 3
.
.
As suggested by @ballardw , I also attached one part of my input (for 2 distinct Type from 1/1/1987 to 31/12/2019).
Thanks in advance.
Hi all,
After reading something interesting regarding proc means, I come out with a solution like that:
* I flag 1 for each "r" receiving 0 value
I got year by using year=year(date);
data count;
set have;
by Type year;
if r=0 then zero=1;
run;
*Then I use proc mean to calculate the number of 0-value per year of each stock;
proc means data=count sum noprint;
var zero;
by Type year;
output out=want n=obs sum=zero_total;
run;
I am wondering if my codes contain error or I have some fallacies in writing this code for getting my results.
Why did you tell SQL to include ALL of the variables if you only wanted three?
To get the YEAR from a DATE use the YEAR() function.
To get 0/1 result just use a boolean expression.
create table want as
select
type
, year(date) as year
, sum(r=0) as nobs_0_r
from have
group by type,year
;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.