Hi Friends,
I want to generate a new column for the mean value within the same ID and the same year. How can I write the sas code without using proc sql?
Thanks.
Data I have
ID | year | value |
19450405FCIS | 1 | 143 |
19450405FCIS | 1 | 125 |
19450405FCIS | 2 | 128 |
19450405FCIS | 2 | 137 |
19600917FNMS | 1 | 133 |
19600917FNMS | 1 | 143 |
19600917FNMS | 2 | 119 |
19600917FNMS | 2 | 122 |
data I want:
ID | year | value | average |
19450405FCIS | 1 | 143 | 134 |
19450405FCIS | 1 | 125 | 134 |
19450405FCIS | 2 | 128 | 132.5 |
19450405FCIS | 2 | 137 | 132.5 |
19600917FNMS | 1 | 133 | 138 |
19600917FNMS | 1 | 143 | 138 |
19600917FNMS | 2 | 119 | 120.5 |
19600917FNMS | 2 | 122 | 120.5 |
@superbibi wrote:
I want to generate a new column for the mean value within the same ID and the same year. How can I write the sas code without using proc sql?
You know, I absolutely despise requirements like "Can't use SQL" when you are in a situation where it is reasonable to use SQL because it is the easiest and most direct approach.
So that's my answer. SQL is the easiest and most direct approach. I'd be happy to provide the actual code if you want to see it.
Good morning @PaigeMiller +1
Hi @superbibi This is the perfect situation/case where you can take advantage of proc sql remerge and also for the reason the dataset doesn't have to sorted by id. This is the reasoning behind this.
Do a proc means on the original data, then merge the output from the proc means back onto the original data.
Here's a fully worked example.
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
If you're still having issues please post your code and log and explain what type of issues you're having.
@superbibi wrote:
Hi Friends,
I want to generate a new column for the mean value within the same ID and the same year. How can I write the sas code without using proc sql?
Thanks.
Data I have
ID year value 19450405FCIS 1 143 19450405FCIS 1 125 19450405FCIS 2 128 19450405FCIS 2 137 19600917FNMS 1 133 19600917FNMS 1 143 19600917FNMS 2 119 19600917FNMS 2 122
data I want:
ID year value average 19450405FCIS 1 143 134 19450405FCIS 1 125 134 19450405FCIS 2 128 132.5 19450405FCIS 2 137 132.5 19600917FNMS 1 133 138 19600917FNMS 1 143 138 19600917FNMS 2 119 120.5 19600917FNMS 2 122 120.5
as @PaigeMiller said this is very easy to do in SQl and also @RW9 suggested code is given below. modified code/simplified by using nway as per suggestion of @Reeza. Thanks @reeza
proc means data= have mean nway noprint;
class id year;
output out = have1 mean=/autoname;
run;
proc sort data=have;
by id year;
run;
proc sort data=have1;
by id year;
run;
data want;
merge have have1(keep=id year value_mean);
by id year;
run;
@kiranv_ wrote:
as @PaigeMiller said this is very easy to do in SQl and also @RW9 suggested code is given below. modified code/simplified by using nway as per suggestion of @Reeza@. Thanks @Reeza
proc means data= have mean nway noprint;
class id year;
output out = have1 mean=/autoname;
run; proc sort data=have; by id year; run; proc sort data=have1; by id year; run; data want; merge have have1(keep=id year value_mean); by id year; run;
Yes, way back in the Cretaceous era, when I was fighting off dinosaur attacks with one hand and programming SAS with the other hand, and there was no PROC SQL in SAS, I did it this way as well. The sort of data set HAVE1 is not necessary.
Something fancy but I will not use this:
data have;
input ID $ year value;
cards;
19450405FCIS 1 143
19450405FCIS 1 125
19450405FCIS 2 128
19450405FCIS 2 137
19600917FNMS 1 133
19600917FNMS 1 143
19600917FNMS 2 119
19600917FNMS 2 122
;
data _null_;
if _n_=1 then do;
dcl hash H (ordered: "A",multidata:'y') ;
h.definekey ("id",'year') ;
h.definedata ("id",'year',"value", "average") ;
h.definedone () ;
end;
call missing(average);
do until(last.year);
set have end=lr;
by id year;
average=mean(value,average);
h.add();
end;
h.replace();
if lr then h.output(dataset:'want');
run;
Hi @superbibi ---" How can I write the sas code without using proc sql?"
Aside the above doube DOW is another one equivalent of remerge.
That leads to some kind of assumptions that this could perhaps be some interview question by a nut or atleast i have faced so weird syntax questions where once I forgot the syntax, and i was rejected with the feedback I do not know SAS. I was like, oh well thank you.
On the other hand, it could your curiosity to perhaps learn all the possibilities besides sql presumably you are good in sql.
1. Traditional:
notsorted class statement and proc means/summary
sorted- by statement and proc means/summary
2. SQL - remerge
3. One by one datastep on the first pass, calculate avg and then merge back on the original
4. Fancy DOW double
5. Hash
6. Array along with dow -load nums , calc avg and double pass
7. Super fancy APP load with peekc and calc avg and double pass
8. Self imposed interleave mimicking DOW in1, in2 perform avg on in1 and output in2
prolly once my caffeine sets in, i should add more lol
Thank you all. At least I know now proc sql is simpler than another SAS code.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.