Hi there,
My question is how to find the nearest match in the same column. Please see the following for a quick illustration. I am new to SAS programming, and only have a preliminary guess that proc sql might do the work. What I am doing now is manually adjusting - it is painful and especially so for over 3,000 observations.
Firm | Year | Quarter | Value |
AA | 2000 | 1 | -100 |
AA | 2000 | 2 | 50 |
AA | 2000 | 3 | 50 |
AA | 2000 | 4 | 0 |
AA | 2001 | 1 | 0 |
AA | 2001 | 2 | -50 |
AA | 2001 | 3 | 100 |
AA | 2001 | 4 | -50 |
AA | 2002 | 1 | 0 |
AA | 2002 | 2 | 50 |
AA | 2002 | 3 | -150 |
AA | 2002 | 4 | 100 |
BB | 2000 | 1 | 0 |
BB | 2000 | 2 | 0 |
BB | 2000 | 3 | 0 |
BB | 2000 | 4 | 50 |
BB | 2001 | 1 | 150 |
BB | 2001 | 2 | 0 |
BB | 2001 | 3 | -100 |
BB | 2001 | 4 | -100 |
I want to find the nearest "Value" match that could add up to zero. For example, for firm AA in 1st quarter 2000, I want to match the nearest two numbers that could add up to 100. I don't want the 50 for firm AA in 2002Q2 nor firm BB 2000Q4. In addition, I also struggle with the case for firm BB, and have no idea how to perform the matching: the two negative numbers add up to -200, the two positive numbers add up to +200, and they maybe in same or different years. To help you understand better, please find the following table for what I have in mind at the end of the day:
Firm | Year | Quarter | Value | Value_matched | YQ_matched |
AA | 2000 | 1 | -100 | 50 | 2000Q2 |
AA | 2000 | 1 | -100 | 50 | 2000Q3 |
AA | 2001 | 2 | -50 | 100 | 2001Q3 |
AA | 2001 | 4 | -50 | 100 | 2001Q3 |
AA | 2002 | 3 | -150 | 50 | 2002Q2 |
AA | 2002 | 3 | -150 | 100 | 2002Q4 |
BB | 2001 | 3 | -100 | 50 | 2000Q4 |
BB | 2001 | 4 | -100 | 100 | 2001Q1 |
For the BB case, it can be 2001Q3 "-100" matched to "50" in 2000Q4, it is also fine if it matches to "100" in 2001Q1 - the order doesn't matter. Thanks in advance! Any help is really appreciated!
Regards,
Michael
You are missing some details.
Your output has
AA | 2000 | 1 | -100 | 50 | 2000Q2 |
AA | 2000 | 1 | -100 | 50 | 2000Q3 |
Why aren't any of the other quarters for AA 2000 included?
why isn't
AA | 2000 | 1 | -100 |
Matched to
AA | 2001 | 3 | 100 |
where the sum would be exactly 0?
There appear to be several other limitations that you have not described.
Sorry for the confusion. Let me explain:
The one you posted is not a match for me because I would like "nearest" match. By "nearest", I mean within +/- two quarters for a given firm, so 2000Q1 shouldn't be related to 2001Q3 - but your idea is exactly right, the following should also be a valid match (which is missing from the original post)
AA 2001 4 -50 0 2002Q1
AA 2001 4 -50 50 2002Q2
Another clarification is about the case of firm BB - they don't have common entry as the other matches. The reason I post it here is that I have no idea how I should deal with such case, and any ideas would be extremely appreciated. The original data is about cash flows - a story would be that I earn 50 in 2000Q4 and 100 in 2001Q1, and I spent 100 each in 2001Q3 and 2001Q4. I manually checked and the data is consistent with my story.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.