How do i find number of firms by cusip and each year?
thanks a lot.
Hello Dear Cynthia,
I am a beginner and undergraduate student who is trying to replicate a finance paper individually. First obstacle was i needed to learn sas by myself, and still learning. as far as i thought, cusip is an unique identifier which represents a firm or company.
this is a sample of my data set: from here, i want to get number of firms by cusip for each year. how may firms in a year. of course, i was not doing anything and i had tried to use count function;
proc sql;
select count(distinct cusip) as count_cusip, count(distinct names date) as count_names date from tmp1.msenames;
quit;
i have attached a sample data set.
thanks for you reply and feedback,
Best wishes,
sincerely Daniel
If you're just getting started, you may want to consider SAS UE. It has tasks and code snippets that make it easier to figure out how to do something. You're also trying to learn SAS and SQL at the same time which is just a touch more difficult, but still a good idea.
Regardless of your level, as Cynthia has mentioned, learning how to ask a good question, in programming and generally is an important skill.
Your code generates errors, did you try and resolve those? I don't download excel files, so I'm replicating your example using the SASHELP.STOCKS dataset, which is a SAS sample dataset available to everyone.
56 proc sql;
57 select count(distinct stock) as count_stock,
58 count(distinct stock date) as count_names date from sashelp.stocks;
____
22
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, '.', /, <, <=, <>, =, >, >=, ?, AND, CONTAINS,
EQ, EQT, GE, GET, GT, GTT, LE, LET, LIKE, LT, LTT, NE, NET, OR, ^=, |, ||, ~=.
ERROR 76-322: Syntax error, statement will be ignored.
The error here is telling you that you have too many arguments in your count() function. You can only have one variable listed.
Starting small, here's how to count each distinct stock in the file:
proc sql;
create table count_stocks as
select count(distinct stock) as count_stock from sashelp.stocks;
quit;
Then if you want stocks by date, you can add date to the SELECT and a GROUP statement, since you're doing analysis by groups.
proc sql;
create table count_stocks_by_date as
select date, count(distinct stock) as count_stock
from sashelp.stocks
group by date;
quit;
Hello Reeza,
Thanks for you advice, i will keep in mind. I tried to implement your simple code in my sas. but it does not work. what is the problem? too many arguments?
proc sql;
create table count_stocks_by_date as
select date, count(distinct cusip number) as count_stock
from tmp1.secnmd
group by effective date;
quit;
this is what i tried to do.
When programming, you have to use the same names for variables/columns that SAS recognizes. In you code, you referred to the same value as "date" and "effective_date" and used the phrase "cusip number" instead of "cusip". SAS should have given you grief in the log. Did you look at the log? SAS will tell you what problems it has with your code. Though, keep in mind that that can sometimes be a symptom, not the main cause, of the problem.
Hi:
And, if you are going to use SASHELP.STOCKS, there are ways to count without using PROC SQL. For example:
proc freq data=sashelp.stocks;
table stock / nocum ;
table date*stock / list;
format date year4.;
run;
proc tabulate data=sashelp.stocks;
class date stock;
table date,
stock;
format date year4.;
run;
proc report data=sashelp.stocks;
column date stock;
define date / group f=year4.;
define stock / across;
run;
proc means data=sashelp.stocks mean n nonobs;
class date stock;
var high;
format date year4.;
run;
In the PROC MEANS example, the MEAN of the variable HIGH will be calculated, as an additional example of the type of processing you can get without using SQL.
cynthia
Having looked at you dataset (just called "data" below), I think you might want to count the number of firms/cusip by year. If so, you need to get the year from the effect_date and group by that value named "eff_year" below.
proc sql;
select year(effect_date) as eff_year,
count(distinct cusip) as cusip_cnt
from data
group by eff_year;
i thank all of you from bottom of the heart.
Best wishes.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.