Dear friends,
How can I replace missing values in a specific column with the mean for each year? It is very important that different years do not affect each other: the average must be computed within each year.
My dataset is:
year | anomaly |
---|---|
1/1/2014 | 42 |
1/2/2014 | 43 |
1/3/2014 | 45 |
1/4/2014 | . |
1/5/2014 | 55 |
1/6/2014 | 65 |
1/7/2014 | 75 |
1/8/2014 | 63 |
1/9/2014 | 50 |
1/10/2014 | 48 |
1/11/2014 | 42 |
1/12/2014 | . |
1/1/2015 | 125 |
1/2/2015 | 128 |
1/3/2015 | 125 |
1/4/2015 | 132 |
1/5/2015 | 139 |
1/6/2015 | . |
1/7/2015 | 158 |
1/8/2015 | 150 |
1/9/2015 | 142 |
1/10/2015 | . |
1/11/2015 | 122 |
1/12/2015 | 123 |
1/1/2016 | 1135 |
1/2/2016 | 1139 |
1/3/2016 | 1135 |
1/4/2016 | 1144 |
1/5/2016 | . |
1/6/2016 | 1151 |
1/7/2016 | 1159 |
1/8/2016 | 1144 |
1/9/2016 | 1140 |
1/10/2016 | 1138 |
1/11/2016 | . |
1/12/2016 | 1129 |
1/1/2017 | 5512 |
1/2/2017 | 5516 |
1/3/2017 | 5514 |
1/4/2017 | 5520 |
1/5/2017 | 5525 |
1/6/2017 | 5529 |
1/7/2017 | 5522 |
1/8/2017 | 5519 |
1/9/2017 | . |
1/10/2017 | 5518 |
1/11/2017 | 5514 |
1/12/2017 | . |
data have;
input year :mmddyy10. anomaly;
format year mmddyy10.;
cards;
1/1/2014 42
1/2/2014 43
1/3/2014 45
1/4/2014 .
1/5/2014 55
1/6/2014 65
1/7/2014 75
1/8/2014 63
1/9/2014 50
1/10/2014 48
1/11/2014 42
1/12/2014 .
1/1/2015 125
1/2/2015 128
1/3/2015 125
1/4/2015 132
1/5/2015 139
1/6/2015 .
1/7/2015 158
1/8/2015 150
1/9/2015 142
1/10/2015 .
1/11/2015 122
1/12/2015 123
1/1/2016 1135
1/2/2016 1139
1/3/2016 1135
1/4/2016 1144
1/5/2016 .
1/6/2016 1151
1/7/2016 1159
1/8/2016 1144
1/9/2016 1140
1/10/2016 1138
1/11/2016 .
1/12/2016 1129
1/1/2017 5512
1/2/2017 5516
1/3/2017 5514
1/4/2017 5520
1/5/2017 5525
1/6/2017 5529
1/7/2017 5522
1/8/2017 5519
1/9/2017 .
1/10/2017 5518
1/11/2017 5514
1/12/2017 .
;
proc sql;
create table want as
select *, ifn(anomaly=.,mean(anomaly),anomaly) as imputed_anomaly
from have
group by year(year) ;
quit;
year | anomaly | imputed_anomaly |
---|---|---|
01/12/2014 | . | 52.8 |
01/05/2014 | 55 | 55.0 |
01/08/2014 | 63 | 63.0 |
01/02/2014 | 43 | 43.0 |
01/04/2014 | . | 52.8 |
01/06/2014 | 65 | 65.0 |
01/10/2014 | 48 | 48.0 |
01/07/2014 | 75 | 75.0 |
01/01/2014 | 42 | 42.0 |
01/11/2014 | 42 | 42.0 |
01/09/2014 | 50 | 50.0 |
01/03/2014 | 45 | 45.0 |
01/02/2015 | 128 | 128.0 |
01/01/2015 | 125 | 125.0 |
01/03/2015 | 125 | 125.0 |
01/12/2015 | 123 | 123.0 |
01/11/2015 | 122 | 122.0 |
01/09/2015 | 142 | 142.0 |
01/10/2015 | . | 134.4 |
01/08/2015 | 150 | 150.0 |
01/07/2015 | 158 | 158.0 |
01/06/2015 | . | 134.4 |
01/05/2015 | 139 | 139.0 |
01/04/2015 | 132 | 132.0 |
01/06/2016 | 1151 | 1151.0 |
01/05/2016 | . | 1141.4 |
01/04/2016 | 1144 | 1144.0 |
01/07/2016 | 1159 | 1159.0 |
01/03/2016 | 1135 | 1135.0 |
01/02/2016 | 1139 | 1139.0 |
01/01/2016 | 1135 | 1135.0 |
01/12/2016 | 1129 | 1129.0 |
01/11/2016 | . | 1141.4 |
01/10/2016 | 1138 | 1138.0 |
01/09/2016 | 1140 | 1140.0 |
01/08/2016 | 1144 | 1144.0 |
01/12/2017 | . | 5518.9 |
01/10/2017 | 5518 | 5518.0 |
01/09/2017 | . | 5518.9 |
01/08/2017 | 5519 | 5519.0 |
01/11/2017 | 5514 | 5514.0 |
01/07/2017 | 5522 | 5522.0 |
01/06/2017 | 5529 | 5529.0 |
01/05/2017 | 5525 | 5525.0 |
01/04/2017 | 5520 | 5520.0 |
01/03/2017 | 5514 | 5514.0 |
01/02/2017 | 5516 | 5516.0 |
01/01/2017 | 5512 | 5512.0 |
PROC STDIZE will do this, with the METHOD=MEAN and REPONLY options.
Please provide data as a SAS data step. Do not provide data as screen captures.
You can't just say "it's not working". We don't know how to help if that's all the information you give us.
Show us the LOG from the code. Show us the output if it doesn't have the right answer.
Dear sir,
It did not replace it with the mean of each year.
proc stdize data=have out=imputed_have
method=mean reponly;
var anomaly;
run;
You did not use a BY statement.
Also, sometimes we confuse ourselves by calling a variable YEAR when it does not contain a year; it contains month/day/year. So you need to create a new variable in your data set that contains only the YEAR value, not the month/day/year. Let's call this new variable YEAR2.
Then, adding
BY YEAR2;
into PROC STDIZE will work.
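The steps described above could be sketched roughly as follows. This is a minimal sketch, not tested output from the thread: the data set name HAVE2 is a hypothetical name introduced here for illustration, YEAR2 is the helper variable suggested above, and the PROC SORT step is an added precaution so that BY-group processing sees the rows in order.

```sas
/* Assumption: HAVE is the data set from the DATA step posted earlier.  */
/* HAVE2 is a hypothetical name; YEAR2 holds only the calendar year.    */
data have2;
  set have;
  year2 = year(year);   /* extract the 4-digit year from the SAS date */
run;

/* BY-group processing expects the data sorted by the BY variable */
proc sort data=have2;
  by year2 year;
run;

/* REPONLY replaces missing values only, leaving other values as they are; */
/* METHOD=MEAN uses the mean, computed separately within each BY group     */
proc stdize data=have2 out=imputed_have method=mean reponly;
  by year2;
  var anomaly;
run;
```

With the BY statement in place, each missing ANOMALY value should be filled from its own year's mean rather than from the overall mean.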
data have;
input year :mmddyy10. anomaly;
format year mmddyy10.;
cards;
1/1/2014 42
1/2/2014 43
1/3/2014 45
1/4/2014 .
1/5/2014 55
1/6/2014 65
1/7/2014 75
1/8/2014 63
1/9/2014 50
1/10/2014 48
1/11/2014 42
1/12/2014 .
1/1/2015 125
1/2/2015 128
1/3/2015 125
1/4/2015 132
1/5/2015 139
1/6/2015 .
1/7/2015 158
1/8/2015 150
1/9/2015 142
1/10/2015 .
1/11/2015 122
1/12/2015 123
1/1/2016 1135
1/2/2016 1139
1/3/2016 1135
1/4/2016 1144
1/5/2016 .
1/6/2016 1151
1/7/2016 1159
1/8/2016 1144
1/9/2016 1140
1/10/2016 1138
1/11/2016 .
1/12/2016 1129
1/1/2017 5512
1/2/2017 5516
1/3/2017 5514
1/4/2017 5520
1/5/2017 5525
1/6/2017 5529
1/7/2017 5522
1/8/2017 5519
1/9/2017 .
1/10/2017 5518
1/11/2017 5514
1/12/2017 .
;
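data want;
/* Pass 1: read one calendar year (one BY group) and accumulate the count and sum of non-missing values */
do _n_=1 by 1 until(last.year);
set have;
by year groupformat; /* GROUPFORMAT groups on the formatted value (the year), not the raw date */
format year year.; /* changes only how the date is displayed; the stored date is unchanged */
_n=sum(n(anomaly),_n);
_sum=sum(anomaly,_sum);
end;
_mean=_sum/_n;
/* Pass 2: re-read the same _n_ rows and replace each missing value with that year's mean */
do _n_=1 to _n_;
set have;
if nmiss(anomaly) then anomaly=_mean;
output;
end;
drop _:; /* drop the helper variables _n, _sum and _mean */
run;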
year | anomaly |
---|---|
2014 | 42.0 |
2014 | 43.0 |
2014 | 45.0 |
2014 | 52.8 |
2014 | 55.0 |
2014 | 65.0 |
2014 | 75.0 |
2014 | 63.0 |
2014 | 50.0 |
2014 | 48.0 |
2014 | 42.0 |
2014 | 52.8 |
2015 | 125.0 |
2015 | 128.0 |
2015 | 125.0 |
2015 | 132.0 |
2015 | 139.0 |
2015 | 134.4 |
2015 | 158.0 |
2015 | 150.0 |
2015 | 142.0 |
2015 | 134.4 |
2015 | 122.0 |
2015 | 123.0 |
2016 | 1135.0 |
2016 | 1139.0 |
2016 | 1135.0 |
2016 | 1144.0 |
2016 | 1141.4 |
2016 | 1151.0 |
2016 | 1159.0 |
2016 | 1144.0 |
2016 | 1140.0 |
2016 | 1138.0 |
2016 | 1141.4 |
2016 | 1129.0 |
2017 | 5512.0 |
2017 | 5516.0 |
2017 | 5514.0 |
2017 | 5520.0 |
2017 | 5525.0 |
2017 | 5529.0 |
2017 | 5522.0 |
2017 | 5519.0 |
2017 | 5518.9 |
2017 | 5518.0 |
2017 | 5514.0 |
2017 | 5518.9 |