BookmarkSubscribeRSS Feed
GKati
Pyrite | Level 9

Hello All,

 

1. I would like to calculate the mean for a restricted sample of Final_claims where tag_sample=1.  How would I build this into the following code?

 

2. Also, I would like to create a variable N_mean that records the number of observations (per PC4 and year) that were used in the above calculation of the mean. 

 

data final_claims;

set final_claims;

by PC4 year;

y=mean(x);

run;

 

Thanks

 

8 REPLIES 8
Kurt_Bremser
Super User

Such is best done in SQL:

proc sql;
create table want as
select
  PC4, year,
  avg(x) as y,
  count(*) as n_mean
from final_claims
where tag_sample = 1
group by PC4, year
;
quit;

 Edit: moved the where condition up.

GKati
Pyrite | Level 9

1.  When running KurtBremser's code I get the following error message after the "where"-statement  

proc sql;
create table want as
select
  PC4, year,
  avg(x) as y,
  count(*) as n_mean
from final_claims
where tag_sample = 1
group by PC4, year
;
quit;

ERROR 22-322: Syntax error, expecting one of the following: ;, !, !!, &, (, *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND,

              BETWEEN, CONTAINS, EQ, EQT, EXCEPT, GE, GET, GT, GTT, HAVING, IN, INTERSECT, IS, LE, LET, LIKE, LT, LTT, NE, NET,

              NOT, NOTIN, OR, ORDER, OUTER, UNION, ^, ^=, |, ||, ~, ~=. 

ERROR 76-322: Syntax error, statement will be ignored.

 

2. Also, I was hoping to get a VARIABLE with these values. This seems to refer to the separate table. Is it possible to directly calculate these as a variable?

Kurt_Bremser
Super User

If you want your stat re-merged, do this:

proc sql;
create table want as
select
  *,
  avg(x) as y,
  count(*) as n_mean
from final_claims
where tag_sample = 1
group by PC4, year
;
quit;

I advise against overwriting a dataset in the same step. If something happens, you need to recreate the dataset.

GKati
Pyrite | Level 9

 

 

Kurt_Bremser
Super User

Try this:

proc sql;
create table want as
select
  a.*,
  avg(x) as y,
  count(*) as n_mean
from final_claims a
where tag_sample = 1
group by PC4, year
union all
select
  b.*,
  0 as y,
  0 as n_mean
from final_claims b
where tag_sample ne 1
;
quit;
Jagadishkatam
Amethyst | Level 16

Please use the proc sql if you want to get the mean in this case

 

data final_claims;

set final_claims;

by PC4 year;

y=mean(x);

run;

proc sql;
create table final_claims2 as select * , mean(x) as y from final_claims group by PC4, year;
quit;
Thanks,
Jag
ballardw
Super User

You should also be aware of the procedures designed for such:

 

proc summary data final_claims nway;

   class PC4 year;

   where tag_sample=1;

   var x;

   output out=want  mean=x_mean n=x_n;

run;

 

Note that without the nway you can get summaries for all the data meeting the where clause, PC4 only, Year only in addition to the combinations of PC4 and year. The automatic variable _type_ is used to indicate which combination the resulting record represents.

 

Also if you have many variables you can place them all on the var statement and use Mean= n= std= /autoname to create output variables with the statistic added to the variable name or even specific statistics for only some of the variables.

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1611 views
  • 1 like
  • 4 in conversation