turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- SAS Programming
- /
- SAS Procedures
- /
- MAX with a conditional By group?

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-26-2017 12:02 PM

Hi All,

I ran into a problem where I have the following table:

**Year Year_L**

2000 2000

2001

2002

2003

2004 2004

2005

2006

2007

2008 2008

2009 2009

2010

2011

and I want

**Year Year_L**

2000 2000

2001 2000

2002 2000

2003 2000

2004 2004

2005 2004

2006 2004

2007 2004

2008 2008

2009 2009

2010 2009

2011 2009

Basically I want all missing fields to take all previous latest Year_L.

Note that I cannot use lags since data is too disorganized to have it sorted (and it's not quite in my control to make them more organized)

I thought about using max(Year_L) but I'm not sure how to set the "By GROUPS" so that it would take current year > year.

Thank you so much!

Accepted Solutions

Solution

01-26-2017
12:58 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to chrischoi614

01-26-2017 12:09 PM

Well, something like:

data want; set have (rename=(year_l=tmp)); retain year_l; if tmp ne "" then year_l=tmp; run;

Note, I assume your data is character format as you have not posted the test data in a datastep.

All Replies

Solution

01-26-2017
12:58 PM

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to chrischoi614

01-26-2017 12:09 PM

Well, something like:

data want; set have (rename=(year_l=tmp)); retain year_l; if tmp ne "" then year_l=tmp; run;

Note, I assume your data is character format as you have not posted the test data in a datastep.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-26-2017 12:17 PM

Hi,

Thank you for your solution!

I am wondering is there a way to make it work if its not in order? say if 2001 comes before the row of 2000.

Thank you once again!

Thank you for your solution!

I am wondering is there a way to make it work if its not in order? say if 2001 comes before the row of 2000.

Thank you once again!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to chrischoi614

01-26-2017 12:25 PM

From your given example, you would sort by col1 then apply that code. If there's no logically order, not sure how you could logically apply any condition. Try posting test data in the form of a datastep which accurately describes your problem. Follow this if you stuck:

Left for the day, will see responses tomorrow.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to chrischoi614

01-26-2017 12:25 PM

chrischoi614 wrote:

I am wondering is there a way to make it work if its not in order? say if 2001 comes before the row of 2000.

If you have a more complex need you shoud provide example input, output and any actual rules for determining assignment.

For example I am not sure which variable you mean to address that comes before 2000.

It may be that the solution is to sort the data first and then apply the existing solution.

Is the case you are thinking like this?

**Year Year_L**

2001

2000 2000

2002

2003

2004 2004

2005

2006

If so, what value should Year_L have and how do we know that it should get that value?

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

01-26-2017 12:35 PM

The variables actually mean Year and Year of loss. IF there is a loss, Year_L will be populated with the same year.

so since in 2000, there is a loss, in 2001, the latest loss up to 2001 is 2000. Therefore 2001 should have Year_L = 2000. Hope that explains the logic.

Thank you so much!

so since in 2000, there is a loss, in 2001, the latest loss up to 2001 is 2000. Therefore 2001 should have Year_L = 2000. Hope that explains the logic.

Thank you so much!

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to chrischoi614

01-26-2017 12:41 PM

chrischoi614 wrote:

The variables actually mean Year and Year of loss. IF there is a loss, Year_L will be populated with the same year.

so since in 2000, there is a loss, in 2001, the latest loss up to 2001 is 2000. Therefore 2001 should have Year_L = 2000. Hope that explains the logic.

Thank you so much!

That really sounds like Sort the data and apply the previous solution should work.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

01-26-2017 12:59 PM

Thank you! I will probably just do that.