Hi All,
I ran into a problem where I have the following table:
Year Year_L
2000 2000
2001
2002
2003
2004 2004
2005
2006
2007
2008 2008
2009 2009
2010
2011
and I want
Year Year_L
2000 2000
2001 2000
2002 2000
2003 2000
2004 2004
2005 2004
2006 2004
2007 2004
2008 2008
2009 2009
2010 2009
2011 2009
Basically I want all missing fields to take all previous latest Year_L.
Note that I cannot use lags since data is too disorganized to have it sorted (and it's not quite in my control to make them more organized)
I thought about using max(Year_L) but I'm not sure how to set the "By GROUPS" so that it would take current year > year.
Thank you so much!
Well, something like:
data want; set have (rename=(year_l=tmp)); retain year_l; if tmp ne "" then year_l=tmp; run;
Note, I assume your data is character format as you have not posted the test data in a datastep.
Well, something like:
data want; set have (rename=(year_l=tmp)); retain year_l; if tmp ne "" then year_l=tmp; run;
Note, I assume your data is character format as you have not posted the test data in a datastep.
From your given example, you would sort by col1 then apply that code. If there's no logically order, not sure how you could logically apply any condition. Try posting test data in the form of a datastep which accurately describes your problem. Follow this if you stuck:
Left for the day, will see responses tomorrow.
@chrischoi614 wrote:
I am wondering is there a way to make it work if its not in order? say if 2001 comes before the row of 2000.
If you have a more complex need you shoud provide example input, output and any actual rules for determining assignment.
For example I am not sure which variable you mean to address that comes before 2000.
It may be that the solution is to sort the data first and then apply the existing solution.
Is the case you are thinking like this?
Year Year_L
2001
2000 2000
2002
2003
2004 2004
2005
2006
If so, what value should Year_L have and how do we know that it should get that value?
@chrischoi614 wrote:
The variables actually mean Year and Year of loss. IF there is a loss, Year_L will be populated with the same year.
so since in 2000, there is a loss, in 2001, the latest loss up to 2001 is 2000. Therefore 2001 should have Year_L = 2000. Hope that explains the logic.
Thank you so much!
That really sounds like Sort the data and apply the previous solution should work.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.