I have a dataset like this, how to access the each value in the last observation? Let's say, I want the last values for the varaibles var2, var3 and var4 whose values are 3, 7.5, 8. I know that we cannot use the last. beacuse my data should not be sorted or grouped. Please help.
1 | 0 | 1 | 6.0 | 9 |
4 | 5 | 8 | 8 | 7 |
3 | 6 | . | 8.0 | 11 |
4 | 5 | 4 | 9.0 | 12 |
6 | 5 | 5 | . | 10 |
3 | 5 | 3 | 7.5 | 8 |
....Feel free to start there next time.
Use PROC STDIZE instead, with the MISSING= and REPONLY options you can replace missing values with a variety of different options, 0, mean, median, etc.
See the documentation for reference.
proc stdize data=have out=want reponly missing=mean;
var var2-var5;
run;
What are you planning to do with said 'last' observation? Is the only way to identify this observation is that its the last record?
You could merge the data with itself, but keep only the last row in the second dataset. This is only one method, it depends on what you're trying to do.
My goal is to replace the missing values with last value of that record, so I need to access the last values
What if the last observation has a missing value for VAR3? Do you want to go back to the next-to-last
observation in that case?
My last observation wont be having missing values. Even its missing I want to replace with the missing itself or leave.
OK, here's a generic way:
data want;
if _n_=1 then set have nobs=_nobs_ point=_nobs_ (rename=(var1-var4 = replacement_1 - replacement_4));
set have;
*** Add any statements you want here to change VAR1 - VAR4 based on replacement variables;
run;
You can use this method to merge the record in. You can use COALESCE() to replace a value with antoher if missing.
If you don't know the number of obs, you can pre calculate it using a few different options.
data have;
input var1-var5;
cards;
1 0 1 6.0 9
4 5 8 8 7
3 6 . 8.0 11
4 5 4 9.0 12
6 5 5 . 10
3 5 3 7.5 8
;
run;
data want;
set have;
if _n_ = 1 then do;
set have (firstobs=6 rename=var1-var5 = nvar1-nvar5);
end;
run;
I appended the last to row to the original dataset, so I dont want to do a set statement again to redo it
The actual question is... I have a dataset with misssing values, now I have to replace the missing values in each colums with the average of its respective column.
My appraoch - I took the mean values in to the variables using proc means and then appended that row to the original dataset, I did this beacause if there are 200 variables I thought this would be easy. But I think this is not working.
Please tell me how would you achieve this?
....Feel free to start there next time.
Use PROC STDIZE instead, with the MISSING= and REPONLY options you can replace missing values with a variety of different options, 0, mean, median, etc.
See the documentation for reference.
proc stdize data=have out=want reponly missing=mean;
var var2-var5;
run;
sure. Thanks for the help!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.