BookmarkSubscribeRSS Feed
yliu315
Calcite | Level 5

Hi there,

 

My question is how to find the nearest match in the same column. Please see the following for a quick illustration. I am new to SAS programming, and only have a preliminary guess that proc sql might do the work. What I am doing now is manually adjusting - it is painful and especially so for over 3,000 observations.

 

FirmYearQuarterValue
AA20001-100
AA2000250
AA2000350
AA200040
AA200110
AA20012-50
AA20013100
AA20014-50
AA200210
AA2002250
AA20023-150
AA20024100
BB200010
BB200020
BB200030
BB2000450
BB20011150
BB200120
BB20013-100
BB20014-100

 

I want to find the nearest "Value" match that could add up to zero. For example, for firm AA in 1st quarter 2000, I want to match the nearest two numbers that could add up to 100. I don't want the 50 for firm AA in 2002Q2 nor firm BB 2000Q4. In addition, I also struggle with the case for firm BB, and have no idea how to perform the matching: the two negative numbers add up to -200, the two positive numbers add up to +200, and they maybe in same or different years. To help you understand better, please find the following table for what I have in mind at the end of the day:

 

FirmYearQuarterValueValue_matchedYQ_matched
AA20001-100502000Q2
AA20001-100502000Q3
AA20012-501002001Q3
AA20014-501002001Q3
AA20023-150502002Q2
AA20023-1501002002Q4
BB20013-100502000Q4
BB20014-1001002001Q1

 

For the BB case, it can be 2001Q3 "-100" matched to "50" in 2000Q4, it is also fine if it matches to "100" in 2001Q1 - the order doesn't matter. Thanks in advance! Any help is really appreciated!

 

 

Regards,

Michael

2 REPLIES 2
ballardw
Super User

You are missing some details.

 

Your output has

AA 2000 1 -100 50 2000Q2
AA 2000 1 -100 50 2000Q3

 

Why aren't any of the other quarters for AA 2000 included?

why isn't

AA 2000 1 -100

Matched to

AA 2001 3 100

where the sum would be exactly 0?

 

There appear to be several other limitations that you have not described.

yliu315
Calcite | Level 5

Sorry for the confusion. Let me explain:

 

The one you posted is not a match for me because I would like "nearest" match. By "nearest", I mean within +/- two quarters for a given firm, so 2000Q1 shouldn't be related to 2001Q3 - but your idea is exactly right, the following should also be a valid match (which is missing from the original post) 

        AA        2001        4        -50        0        2002Q1 
        AA        2001        4        -50        50        2002Q2 

 

Another clarification is about the case of firm BB - they don't have common entry as the other matches. The reason I post it here is that I have no idea how I should deal with such case, and any ideas would be extremely appreciated. The original data is about cash flows - a story would be that I earn 50 in 2000Q4 and 100 in 2001Q1, and I spent 100 each in 2001Q3 and 2001Q4. I manually checked and the data is consistent with my story.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 242 views
  • 0 likes
  • 2 in conversation