Hello everyone,
I have a large data set covering 15 years and more than 2,500 firms. When I started analyzing the data, I found that some companies do not have complete data for all 15 years, so I tried to use an IF statement as follows:
data cashflow;
set Radwan;
if year= 2000 2001 2002 .... ;
run;
Unfortunately, this does not work. Please help me with the right code to remove the firms that do not have data for my whole research period. The statement only works when I use a single year, as follows:
data cashflow;
set Radwan;
if year= 2000;
run;
How is your data structured? Please provide an example of what your data looks like.
Is the research period always from 2000 to 2015 or can it be any 15 consecutive years?
This is not the simplest problem, since SAS processes a single observation at a time. It sounds like you need a tool to examine a set of observations (all the observations for a company) to see what data is there. Does that sound right?
Have you examined your data to determine whether it might contain two observations for the same company/year combination?
What would you like the result to be ... throw out companies with missing values? Fill in missing values with zeros?
The planning is the harder part. The programming might not be simple, but it isn't as difficult as the planning.
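The duplicate check asked about above can be sketched in a few lines. This is only an illustration: it assumes the data set is named HAVE and that the firm identifier and year variables are named COMPANY and YEAR; the column name N_OBS is made up for the example.

```sas
/* List every company/year combination that appears more than once. */
/* HAVE, COMPANY and YEAR are assumed names; N_OBS is hypothetical.  */
proc sql;
   select company, year, count(*) as n_obs
   from have
   group by company, year
   having count(*) > 1;
quit;
```

If this prints no rows, each company has at most one observation per year, and a simple count of observations per company is enough to detect missing years.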
While PROC SQL can handle this in a straightforward way, my expertise is more in a DATA step:
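proc sort data=have;
   by company year;
run;

data want;
   count = 0;
   /* First pass: count the in-range observations for this company */
   do until (last.company);
      set have;
      where (2000 <= year <= 2015);
      by company;
      count + 1;
   end;
   /* Second pass: re-read the same observations and keep them only if all 16 years are present */
   do until (last.company);
      set have;
      where (2000 <= year <= 2015);
      by company;
      if count = 16 then output;
   end;
   drop count;
run;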
The top loop counts the number of observations for a company within the proper set of years. (Note that I allowed for a company having data for years outside the desired range, but counted only the observations within the proper set of years.)
The bottom loop reads the same observations, then outputs them if all the data is there. Note that the range you specified is 16 years, not 15 years.
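For comparison, the PROC SQL route mentioned earlier in this reply might look roughly like the sketch below. It is not a tested replacement for the DATA step, and it assumes the same names (HAVE, WANT, COMPANY, YEAR) and a numeric YEAR. Counting distinct years rather than rows means a duplicated company/year observation does not inflate the count.

```sas
/* Keep only companies that have all 16 years from 2000 through 2015. */
/* Selecting * together with GROUP BY makes SAS remerge the summary   */
/* count back onto the detail rows, which the log reports as a NOTE.  */
proc sql;
   create table want as
   select *
   from have
   where year between 2000 and 2015
   group by company
   having count(distinct year) = 16
   order by company, year;
quit;
```

No prior PROC SORT is needed here, because the GROUP BY clause does the grouping itself.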
Yes. Use CODE instead of COMPANY.
It looks like you haven't yet replaced "company" with "code" as the variable name.
This means that YEAR (in your data set) is character, not numeric. Change both WHERE statements:
where ("2000" <= year <= "2015");
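An alternative to quoting the bounds is to create a numeric copy of YEAR once and use that everywhere. The sketch below assumes YEAR holds four-character values such as "2004"; HAVE_NUM and YEAR_NUM are hypothetical names chosen for the example.

```sas
/* Confirm the variable type first: Type shows Char or Num. */
proc contents data=have;
run;

/* Assumption: YEAR is a 4-character string. HAVE_NUM and YEAR_NUM are hypothetical names. */
data have_num;
   set have;
   year_num = input(year, 4.);   /* numeric copy of the character year */
run;
```

With a numeric copy, the original unquoted comparison (2000 <= year_num <= 2015) works as written.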
Try using an IN condition:
data cashflow;
set Radwan;
if year in (2000, 2001, 2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015);
run;
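If YEAR is numeric and holds whole numbers, the same list can be written more compactly with an integer range inside the IN operator. This is a sketch under that assumption, using the data set names from the question.

```sas
/* Same subset as the explicit list: keeps observations from 2000 through 2015. */
/* Assumes YEAR is numeric with integer values.                                 */
data cashflow;
   set Radwan;
   if year in (2000:2015);
run;
```

Either form only drops observations outside the research period; it does not remove firms that are missing some of the years inside it.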