BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Nici
Obsidian | Level 7

Good evening SAS community,

 

I have once again run into a problem which seems pretty complex to me and I have no idea how to solve it. I have also not found anything on this forum that comes close to my problem but it is really hard to put in words so I might've missed it. If this is the case I'm extremely sorry. Once again I'm pretty new to SAS and thus do not have a good understanding of how to properly display my problem here.

 

I have the following data:

 

ID; DATE; PERMNO; RETURN; CAPITALIZATION;

1 19990202 a 0.01 400

1 19990202 b 0.04 100

1 19990203 a 0.01 400

1 19990203 b 0.04 100

2 20000101 c 0.01 800

2 20000102 c 0.02 800

3 19300812 d 0.03 750

3 19300813 d 0.03 750

 

The main problem is that some of my IDs have multiple permnos for the same date (see observations 1+2 & 3+4).

What I want to happen is that the observations where this is the case are combined to one in the following way:

-ID: stays the same

-Date: stays the same

-Permno: it doesn't really matter which one is taken, either the first or the second one, whatever is easier I guess

-Return: Now here comes the funny part... 😄 what I want to happen is: A new return is calculated from the old ones in the following way (a.return*(a.capitalization/(a.capitalization+b.capitalization))+b.return*(b.capitalization/(a.capitalization+b.capitalization)))

-Capitalization: take the sum of the two capitalization (a.capitalization+b.capitalization)

 

I have a basic idea on how to identify the specific observations where this problem is apparent. I thought of something like:

 

if ID=lag(ID) AND date=lag(date) AND permno^=lag(permno)

then...

 

and now I do not know how to go on. I have no idea how to properly combine two observations, especially including the calculation of a new return.

 

Thank you very much in advance for helping me out. SAS can be so hard to understand as a newbie sometimes..

 

Have a great evening.

 

Best regards

Nici

 

PS. I'm running on SAS 9.4

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This:

data have;
input ID DATE:yymmdd8. PERMNO $ RETURN CAPITALIZATION;
format date yymmdd10.;
datalines;
1 19990202 a 0.01 400
1 19990202 b 0.04 100
1 19990203 a 0.01 400
1 19990203 b 0.04 100
2 20000101 c 0.01 800
2 20000102 c 0.02 800
3 19300812 d 0.03 750
3 19300813 d 0.03 750
;

proc sql;
create table want as
select
    id,
    date,
    min(permNo) as permNo,
    sum(return*capitalization)/ sum(capitalization) as return,
    sum(capitalization) as capitalization
from have
group by id, date;
select * from want;
quit;
                    ID        DATE  permNo      return  capitalization
              --------------------------------------------------------
                     1  1999-02-02  a            0.016             500
                     1  1999-02-03  a            0.016             500
                     2  2000-01-01  c             0.01             800
                     2  2000-01-02  c             0.02             800
                     3  1930-08-12  d             0.03             750
                     3  1930-08-13  d             0.03             750
PG

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

would  multiple permnos  be sets of two or

 

can it be any number of distinct values a,b,c, and so on ?

Nici
Obsidian | Level 7
PGstats already provided a sufficient solution 🙂
I still want to thank you very much for your answer and your time looking through my problem.

BR
Nici
PGStats
Opal | Level 21

This:

data have;
input ID DATE:yymmdd8. PERMNO $ RETURN CAPITALIZATION;
format date yymmdd10.;
datalines;
1 19990202 a 0.01 400
1 19990202 b 0.04 100
1 19990203 a 0.01 400
1 19990203 b 0.04 100
2 20000101 c 0.01 800
2 20000102 c 0.02 800
3 19300812 d 0.03 750
3 19300813 d 0.03 750
;

proc sql;
create table want as
select
    id,
    date,
    min(permNo) as permNo,
    sum(return*capitalization)/ sum(capitalization) as return,
    sum(capitalization) as capitalization
from have
group by id, date;
select * from want;
quit;
                    ID        DATE  permNo      return  capitalization
              --------------------------------------------------------
                     1  1999-02-02  a            0.016             500
                     1  1999-02-03  a            0.016             500
                     2  2000-01-01  c             0.01             800
                     2  2000-01-02  c             0.02             800
                     3  1930-08-12  d             0.03             750
                     3  1930-08-13  d             0.03             750
PG
Nici
Obsidian | Level 7
Amazing, thank you very much! Have a great weekend.

BR
Nici
Nici
Obsidian | Level 7

Hey @PGStats ,

 

I've been using your code successfully. Somehow I'm having problems now however. Before, as shown in your example, I received only one combined observation. Now the observations are still combined, however, I receive duplicates. If 2 observations where combined, I get the combined observation twice. Why is this the case? I'm really confused because the code worked perfectly before.

I also noticed the following message in the log: "The query requires remerging summary statistics back with the original data"

instead of this output

                    ID        DATE  permNo      return  capitalization
              --------------------------------------------------------
                     1  1999-02-02  a            0.016             500
                     1  1999-02-03  a            0.016             500
                     2  2000-01-01  c             0.01             800
                     2  2000-01-02  c             0.02             800
                     3  1930-08-12  d             0.03             750
                     3  1930-08-13  d             0.03             750

 I get the following one:

                    ID        DATE  permNo      return  capitalization
              --------------------------------------------------------
                     1  1999-02-02  a            0.016             500
1 1999-02-02 a 0.016 500 1 1999-02-03 a 0.016 500
1 1999-02-03 a 0.016 500 2 2000-01-01 c 0.01 800 2 2000-01-02 c 0.02 800 3 1930-08-12 d 0.03 750 3 1930-08-13 d 0.03 750

 

Thank you in advance for your help.

 

BR

Nici

PGStats
Opal | Level 21

The code I suggested will not generate that message. Remerging will occur when columns are named in the select clause that are not summarized or mentioned in the group by clause. Can't say more without seeing your code.

PG
Nici
Obsidian | Level 7

Hey,

 

I think I might know what the problem is then. The shown here was only a simplification of the actual data. This is how my data looks:

 

data have;
input pcusip date permno ret cap anncmt code shrout prc;
datalines;
1 19990202 a 0.01 400 19990209 3 100 4
1 19990202 b 0.04 100 19990209 3 100 1
1 19990203 a 0.01 400 19990209 3 100 4
1 19990203 b 0.04 100 19990209 3 100 1
2 20000101 c 0.01 800 20000310 3 400 2
2 20000102 c 0.02 800 20000310 3 400 2
3 19300812 d 0.03 750 19300810 3 750 1
3 19300813 d 0.03 750 19300810 3 750 1
;

 

My Code:

 

proc sql; /*Capital weight the returns of same CUSIP with multiple permnos on same date and combine the permnos to one*/
create table want as
select
pcusip,
date,
code,
anncmt,
sum(shrout) as shrout,
sum(prc) as prc,
min(permno) as permno, /*for the two cases in this sample the smaller permno is the one that existed first*/
sum(ret*cap)/ sum(cap) as ret,
sum(cap) as cap
from have
group by pcusip, date;
select * from want;
quit;

 

From your comment I guess the problem are the variables anncmt & code right? Since they are not part of any calculations but also are not in the group by statement. Is there any way around this problem?

 

Thank you for your help once again.

 

BR

nici

Nici
Obsidian | Level 7

@PGStatsAlright, I fixed it. Thank you again!

 

Have an amazing day

 

BR

nici

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1937 views
  • 3 likes
  • 3 in conversation