BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sjm
Fluorite | Level 6 sjm
Fluorite | Level 6

Dear All,

The dataset I am using has daily data for about 500 firms each year over a period of 15 years. I want to remove firms with a market capitalization (MKTCAP) of less than $5 million. To do this, I computed MKTCAP for each firm at the end of each year. Firms with MKTCAP < 5 are removed each year.

PART 1 of the code attached below, subsets the daily data to obtain only year-end observations.

data RET1;
	set RET;
	by year gvkey date;
	if last.gvkey;
run;

GVKEY refers to the firm's code and RET is the dataset with daily data. YEAR is the year variable created in SAS.

I then used PROC SQL to retain firms with MKTCAP>5 as follows:

proc sql;
create table WANT as
   select * from RET1
    where mktcap>5
  group by year
  order by gvkey, year;
quit;

While this did the trick of keeping only those firms with MKTCAP>5, I am however left with a dataset by GVKEY and YEAR. 

I am not sure how to merge WANT (with yearly obs) with the original dataset RET (with daily obs) to obtain the daily dataset with only MKTCAP>5

Would appreciate your help and advice please. Thanks.

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Amethyst | Level 16

So when we look on the example data for GVKEY = 272762,  you want to delete daily data for 2005(09DEC05 mktcap=1.62) and 2006(29DEC06 mktcap=1.8) but keep daily data for 2007(18DEC07 mktcap=28.45) and 2008(05DEC08 mktcap=11.38), right?

In that case the sql query should be:

proc sql;
create table WANT as
   select * from RET1
    where mktcap>5
  group by year, GVKEY
  order by gvkey, year;
quit;

 

Btw. check code for `WANT_DAILY_v2` it should already do what you want.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

12 REPLIES 12
yabwon
Amethyst | Level 16

Could you share some test data?

 

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sjm
Fluorite | Level 6 sjm
Fluorite | Level 6

Dear Yabwon.

 

Attached is the test data RET for three firms, one of which, has the first few years where MKTCAP<5. The other two firms are large MKTCAP firms where MKTCAP>5 for all years.  Thanks.

 

yabwon
Amethyst | Level 16

Hi,

 

Do I understand correct that you want to get something like:

proc sql;
create table WANT_DAILY as
   select RET.* from RET
    join
   WANT
   on RET.gvkey= WANT.gvkey 
    and RET.year = WANT.year
    order by year gvkey date;
quit;

or am I missing something?

If I'm right, alternative approach could be:

data WANT_DAILY_v2;

  do _N_=1 by 1 until(last.gvkey);  
	 set RET;
	 by year gvkey date;
	 if last.gvkey then 
    do;
      trigger = (mktcap>5);
    end;
  end;

  do _N_=1 to _N_;  
	 set RET;
	 if trigger then output;
  end;

run;

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sjm
Fluorite | Level 6 sjm
Fluorite | Level 6
Dear Bart,
I tried the two suggested codes on the full dataset but the final dataset (WANT_DAILY) still has firms with a mktcap of less than $5m.
yabwon
Amethyst | Level 16

So you want to remove given company if it has at leas one case with mktcap<5 ?

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sjm
Fluorite | Level 6 sjm
Fluorite | Level 6

Sorry, my message is probably not clear. Let me clarify.

 

I have around daily data on about 500 firms over a period of 15 years. What I need to is to form a final dataset where the firms included each year have a market cap greater than $5m. 

 

Let's say that there are 500 firms in 2015 and applying a simple rule, I find that 50 of them have a mktcap that is less than $5m.  This means that, for 2015, the dataset would have daily data on only the 400 firms with a mktcap greater than $5m.  The simple selection rule is to base the mktcap of the firm on its year-end value (last value of the year) and only select those with mktcap>$5m. 

 

By that token, a firm excluded in the earlier years when its mktcap < $5m, may be included in later years when its mktcap exceeds $5m.

 

So, the final dataset should only contain daily data on all firms with mktcap greater than $5m. 

 

I hope this is clearer. Thanks.

  

 

 

 

sjm
Fluorite | Level 6 sjm
Fluorite | Level 6
One minor correction to the message above: For the second sentence of the second para, the figure should be 450 firms not 400 firms.
yabwon
Amethyst | Level 16

So when we look on the example data for GVKEY = 272762,  you want to delete daily data for 2005(09DEC05 mktcap=1.62) and 2006(29DEC06 mktcap=1.8) but keep daily data for 2007(18DEC07 mktcap=28.45) and 2008(05DEC08 mktcap=11.38), right?

In that case the sql query should be:

proc sql;
create table WANT as
   select * from RET1
    where mktcap>5
  group by year, GVKEY
  order by gvkey, year;
quit;

 

Btw. check code for `WANT_DAILY_v2` it should already do what you want.

 

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



sjm
Fluorite | Level 6 sjm
Fluorite | Level 6
Sorry for the late reply. The solution works! Thanks.
mkeintz
PROC Star

You have DAILY data, but want to exclude a firm for the entire year if the end-of-year market cap is <5, yes?   If you have daily data, then you have about 200 records (i.e.for trading dates) for each firm-year, with corresponding daily MKTCAP.

 

Your daily data is sorted by YEAR GVKEY DATE.  So you need to get to the last record of each year for each gvkey, check it's market cap value, then re-read the same daily data, either outputting or not the roughly 200 daily records for that firm/year.

 

Something like this should work.

 

data want (drop=keep_dummy);
  do until (last.gvkey=1);
    set ret ;
    by year gvkey date;
  end;
  keep_dummy=(mktcap>5);
  do until (last.gvkey);
    set ret;
by year gvkey date; if keep_dummy=1 then output; end; run;

 

Note that if a gvkey only has, say, the first 9 months of data for a given year, then the mktcap criterion will depend on the end-of-September status, not end-of-December.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
sjm
Fluorite | Level 6 sjm
Fluorite | Level 6
Sorry for the late reply. Thanks.
ChrisNZ
Tourmaline | Level 20

I probably misunderstood something, but isn't this what you want?

data RET1;
	set RET;
	by year gvkey date;
	if last.gvkey and MKTCAP > 5;
run;

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2438 views
  • 0 likes
  • 4 in conversation