BookmarkSubscribeRSS Feed
yliu315
Calcite | Level 5

Hi there,

 

My question is how to find the nearest match in the same column. Please see the following for a quick illustration. I am new to SAS programming, and only have a preliminary guess that proc sql might do the work. What I am doing now is manually adjusting - it is painful and especially so for over 3,000 observations.

 

FirmYearQuarterValue
AA20001-100
AA2000250
AA2000350
AA200040
AA200110
AA20012-50
AA20013100
AA20014-50
AA200210
AA2002250
AA20023-150
AA20024100
BB200010
BB200020
BB200030
BB2000450
BB20011150
BB200120
BB20013-100
BB20014-100

 

I want to find the nearest "Value" match that could add up to zero. For example, for firm AA in 1st quarter 2000, I want to match the nearest two numbers that could add up to 100. I don't want the 50 for firm AA in 2002Q2 nor firm BB 2000Q4. In addition, I also struggle with the case for firm BB, and have no idea how to perform the matching: the two negative numbers add up to -200, the two positive numbers add up to +200, and they maybe in same or different years. To help you understand better, please find the following table for what I have in mind at the end of the day:

 

FirmYearQuarterValueValue_matchedYQ_matched
AA20001-100502000Q2
AA20001-100502000Q3
AA20012-501002001Q3
AA20014-501002001Q3
AA20023-150502002Q2
AA20023-1501002002Q4
BB20013-100502000Q4
BB20014-1001002001Q1

 

For the BB case, it can be 2001Q3 "-100" matched to "50" in 2000Q4, it is also fine if it matches to "100" in 2001Q1 - the order doesn't matter. Thanks in advance! Any help is really appreciated!

 

 

Regards,

Michael

2 REPLIES 2
ballardw
Super User

You are missing some details.

 

Your output has

AA 2000 1 -100 50 2000Q2
AA 2000 1 -100 50 2000Q3

 

Why aren't any of the other quarters for AA 2000 included?

why isn't

AA 2000 1 -100

Matched to

AA 2001 3 100

where the sum would be exactly 0?

 

There appear to be several other limitations that you have not described.

yliu315
Calcite | Level 5

Sorry for the confusion. Let me explain:

 

The one you posted is not a match for me because I would like "nearest" match. By "nearest", I mean within +/- two quarters for a given firm, so 2000Q1 shouldn't be related to 2001Q3 - but your idea is exactly right, the following should also be a valid match (which is missing from the original post) 

        AA        2001        4        -50        0        2002Q1 
        AA        2001        4        -50        50        2002Q2 

 

Another clarification is about the case of firm BB - they don't have common entry as the other matches. The reason I post it here is that I have no idea how I should deal with such case, and any ideas would be extremely appreciated. The original data is about cash flows - a story would be that I earn 50 in 2000Q4 and 100 in 2001Q1, and I spent 100 each in 2001Q3 and 2001Q4. I manually checked and the data is consistent with my story.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 640 views
  • 0 likes
  • 2 in conversation