Create a table that sums the number of times that lenders lend together

Accepted Solution Solved
Reply
Contributor
Posts: 62
Accepted Solution

Create a table that sums the number of times that lenders lend together


Hi,

I have a table with a lot of variables.

It looks like that table:
PackageIDLender_nameLenderIDyear DealAmount
33Bank of Hong Kong7819951000000
33Citi7919951000000
33BMO8019951000000
56JPMorgan451997560000000
56TD701997560000000
56Citi791997560000000
56SunLife1231997560000000

I have twenty LenderID i'm interessed in.

My observations are from 1995 to 2012. I have 18 years under observation.

My data are grouped by year and by packageID. I want to know how many times by year two lenders are together in a package. For the purpose, I think that I can count the number of packages (i.e. We can see in the table, for example, that BMO and Citi are together in a package in 1995. So this is a count of one.)

I want to create a matrix with my twenty LenderID (one for each year) that counts at each junction the number of packages they are together in.

I have already used a code of distance (proc distance) which creates a matrix after I created a character variable with my numeric variable LenderID. I don't think this is the same process so this is the problem.

Thank you!


Accepted Solutions
Solution
‎01-20-2014 05:33 PM
Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

WindyBoo

That code makes the only simple solution - a join with itself.

the idea is to count the packages shared between pairs of *different* lenders. Of course these lenders are all coming from the same source (your.data in the code above) but there must be rules about how the pairs are brought together.

In the query above those lenders come in two streams - lefthand part of the pair comprising the X axis of your array and the righthand half which could be considered the Y axis. There is of course also the Year dimension to be created in the query. Package is the primary part of the join. Then the comparison of lenders ensures no counts appear on the "diagonal" and that the counts appear on only one side of the diagonal (the other side would only be a repeat).

For such a complex query I consider the code remarkably concise - the nature of a solution using "suitable" tools. But that is just imho.

peterC

View solution in original post


All Replies
Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

Proc sql ;

create matrix_elements as

select A.year, A.lender_id x, B.lender_id y

      , count( distinct A.package_id ) cases

from your.data A

join your.data B

on a.year = b.year

and a.package_id=b.package_id

and a.lender_id GT b.lender_id

group by 1, 2, 3

;

quit ;

Message was edited by: Peter Crawford 23Jan2014 fixing bug referred in later post  to make package_ID non-ambiguous in the count( distinct ........ )

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

Thank for your reply.

I have only one table. I don't need to merge two tables. So it considerably changes the code you suggest. 

Super User
Posts: 19,815

Re: Create a table that sums the number of times that lenders lend together

There isn't two tables, its the table joined to itself. This is a way to allow comparison to rows above or below the current row easily.

Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

Thank you Reeza

regards

Peter

Solution
‎01-20-2014 05:33 PM
Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

WindyBoo

That code makes the only simple solution - a join with itself.

the idea is to count the packages shared between pairs of *different* lenders. Of course these lenders are all coming from the same source (your.data in the code above) but there must be rules about how the pairs are brought together.

In the query above those lenders come in two streams - lefthand part of the pair comprising the X axis of your array and the righthand half which could be considered the Y axis. There is of course also the Year dimension to be created in the query. Package is the primary part of the join. Then the comparison of lenders ensures no counts appear on the "diagonal" and that the counts appear on only one side of the diagonal (the other side would only be a repeat).

For such a complex query I consider the code remarkably concise - the nature of a solution using "suitable" tools. But that is just imho.

peterC

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

Sorry, I didn't want to be impolite.

I began to ''play'' with SAS a few months ago so I have a lot to learn!

I have misunderstood the code you suggest. My bad.

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

Again, thank you for your replies!

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

Hi,

If you don't mind, I have an error  (it must clearly be my fault).

When I use ''group by'', I use ''group by'' by year only in my case?

It said ''ambiguous reference, column packageid is in more than one table'' and ''ambiguous reference, column year is in more than one table''.

I surely do something wrong.

(Thank you again ... You help me a lot!)

Super Contributor
Posts: 644

Re: Create a table that sums the number of times that lenders lend together

In Peter's original code the group by statement was

     group by 1, 2, 3

which means "group by new columns 1, 2, & 3 in the order given in the select statement".

It is equivalent to

     group by A.year, A.lender_id , B.lender_id

If you used either of these expressions you would not get the error because the group by statement states explicitly which of the duplicate references you are using in each case.

Richard

Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

my apologies to and appreciation to .

the ambiguous reference to packageid (perhaps) occurs in my count() function.

as this is an inner join on packageid, it does not matter which reference is used, but apparently one must be provided. Like:

count( distinct a.packageid )

That is based on my assumptions of the code.

For the elimination of doubt, please, when seeking help on an error, provide an extract from the saslog showing the error message with the proc or data step of code which caused the error.

Peter

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

I will take printscreens of the problem next time. It will be clearer.

Thank you for your replies! I really appreciate.

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

Hi,

I've put this code on the side for a few days but it is still problematic.

I'll show you my input and my output.

My code is:

proc sql;

create table matrix_elements as

select A.year, A.lenderID, B.lenderID, count(distinct a.packageID)

from pf4 A join pf4 B

on A.year = B.year

and A.packageID=B.packageID

and A.lenderID GT B.lenderID

group by 1,2,3;

quit;

My log gives me a warning: Variable LenderID already exists on the file work.matrix_elements.


My output is:

YearLenderID_TEMG001
1995785810
1995219761
199530309105
19953030917
1995303094
1995303091
19958490696
1995849069
1995849061
1995849061
199584906101
199511344270
199511344218
1995113442103
199511344282
19951391972
19951391971
19951391971
19951391973
19951391971
1995139643103
199513964318
19951396431
19951396431
1995139643133
1995139643112
19951396431
1995139643105
19951396433
1996785813
.........

I don't understand the final table. It is supposed to count the number of packageID by pair of two LenderIDs but I have just one LenderID identified by the second column. Who is the second LenderID?

Thank you in advance for any help you can provide Smiley Happy

Valued Guide
Posts: 2,177

Re: Create a table that sums the number of times that lenders lend together

nearly there.

PROC SQL allows you to assign a new name when you select the columns to output.

It needs only a very small change to your code: provide the new name Lender_ID2 for the second Lender_ID (see below).

Once you have solved that issue, you might want to assigned a name for the column holding the results of the COUNT() function. The default has been generated as "_TEMG001". In italics below I suggest calling it COUNTS:

select A.year, A.lenderID, B.lenderID as Lender_ID2, count(distinct a.packageID) as counts

from pf4 A join pf4 B


In the code I first offered, both Lender_ID were renamed, one to X the other to Y:


select A.year, A.lender_id x, B.lender_id y

      , count( distinct A.package_id ) cases

Message was edited by: Peter Crawford

sorry about my casual assumption of shortcuts invalid in SAS SQL, but allowed in Teradata SQL.

Teradata does not demand that " AS " in " B.lenderID AS Lender_ID2" 

(that mistake comes from spending too much development time in Teradata and  "off-SAS" Smiley Wink )

Contributor
Posts: 62

Re: Create a table that sums the number of times that lenders lend together

No problem. It was a detail (the ''as'').

Thank you so much!! It works! I'm very happy.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 14 replies
  • 510 views
  • 7 likes
  • 4 in conversation