Radwan
Quartz | Level 8

Hello everyone,

I have a large data set covering 15 years and more than 2,500 firms. While analyzing the data, I found that some companies do not have complete data for all 15 years, so I tried to use an IF statement as follows:

data cashflow; set Radwan; 
if year= 2000 2001 2002 .... ; run;  

Unfortunately, this does not work. Please help me with the right code to remove the firms that do not have data for my entire research period. The IF statement only works when I use a single year, as follows:

data cashflow; set Radwan; 
if year= 2000;  run; 
PeterClemmensen
Tourmaline | Level 20

How is your data structured? Please provide an example of what your data looks like.

Is the research period always from 2000 to 2015, or can it be any 15 consecutive years?

Radwan
Quartz | Level 8
Yes, the research period is 2000 - 2015.
My data has three variables: code, year, and cashflow.
Astounding
PROC Star

This is not the simplest problem, since SAS processes a single observation at a time.  It sounds like you need a tool to examine a set of observations (all the observations for a company) to see what data is there.  Does that sound right?

Have you examined your data to determine whether it might contain two observations for the same company/year combination?

What would you like the result to be ... throw out companies with missing values?  Fill in missing values with zeros?

The planning is the harder part.  The programming might not be simple, but it isn't as difficult as the planning.
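One way to answer the duplicate question is a quick PROC SQL check. This is only a sketch: it assumes the data set is the RADWAN data set from the original post and that CODE identifies the company, as described above. If it prints no rows, every company/year combination occurs at most once.

```sas
/* Sketch: list company/year combinations that appear more than once. */
/* Assumes the data set RADWAN with variables CODE and YEAR.          */
proc sql;
   select code, year, count(*) as n_obs
      from Radwan
      group by code, year
      having count(*) > 1;
quit;
```

If duplicates do show up, they need to be resolved before counting years per company, since a simple observation count would otherwise overstate how many years a company has.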

Radwan
Quartz | Level 8
I have checked my data. As I said, some companies do not have data for all 15 years, so I need to delete those companies.
Astounding
PROC Star

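While PROC SQL can handle this in a straightforward way, my expertise is more in a DATA step:

proc sort data=have;
   by company year;
run;

data want;
   count = 0;
   /* First pass: count the in-range observations for this company */
   do until (last.company);
      set have;
      where (2000 <= year <= 2015);
      by company;
      count + 1;
   end;
   /* Second pass: re-read the same observations, keep complete companies */
   do until (last.company);
      set have;
      where (2000 <= year <= 2015);
      by company;   /* each SET needs its own BY so LAST.COMPANY is updated here */
      if count = 16 then output;
   end;
   drop count;
run;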

 

The top loop counts the number of observations for a company, within the proper set of years.  (Note that I allowed that a company might have data for years outside the desired range, but only counted those in the proper set of years.)

The bottom loop reads the same observations, then outputs them if all the data is there.  Note that the range you specified is 16 years, not 15 years.
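For comparison, the PROC SQL route mentioned above might look roughly like the following. Treat it as a sketch under the same assumptions as the DATA step (data set HAVE, numeric YEAR, company identifier COMPANY). It relies on PROC SQL's default behavior of remerging summary statistics with the detail rows, so it would not run as written if the NOREMERGE option is in effect.

```sas
/* Sketch: keep only companies that have all 16 years, 2000 through 2015. */
/* Assumes data set HAVE with numeric YEAR and identifier COMPANY.        */
proc sql;
   create table want as
   select *
      from have
      where year between 2000 and 2015
      group by company
      having count(distinct year) = 16
      order by company, year;
quit;
```

Counting distinct years rather than observations means a duplicated company/year row cannot make an incomplete company look complete.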

Radwan
Quartz | Level 8
Do you mean that COMPANY is my CODE variable?
I use CODE as the variable that identifies the companies.
Astounding
PROC Star

Yes.  Use CODE instead of COMPANY.

Radwan
Quartz | Level 8
I just tried this code, but I got an error message:

proc sort data=have;
by company year;
run;

data want;
count=0;
do until (last.company);
set have;
where (2000 <= year <= 2015);
by company;
count + 1;
end;
do until (last.company);
set have;
where (2000 <= year <= 2015);
if count = 16 then output;
end;
drop count;
run;
Astounding
PROC Star

It looks like you haven't yet replaced "company" with "code" as the variable name.

Radwan
Quartz | Level 8

[Attached screenshot: 111.png]

Astounding
PROC Star

This means that YEAR (in your data set) is character, not numeric.  Change both WHERE statements:

 

where ("2000" <= year <= "2015");
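If you want to confirm the variable type yourself, or would rather work with a numeric year, something along these lines may help. It is a sketch only: HAVE_NUM and YEAR_NUM are hypothetical names introduced here, and the conversion assumes YEAR holds four-digit values such as "2000".

```sas
/* Show the type (Char or Num) of every variable in HAVE */
proc contents data=have;
run;

/* Sketch: add a numeric copy of the character YEAR variable. */
/* HAVE_NUM and YEAR_NUM are hypothetical names.              */
data have_num;
   set have;
   year_num = input(year, 4.);
run;
```

With a numeric copy in place, the unquoted comparison (2000 <= year_num <= 2015) can be used instead of the quoted one.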

Radwan
Quartz | Level 8
Oh yes, you are right. Thanks.
VDD
Ammonite | Level 13

Try using an IN condition:

data cashflow; 
set Radwan; 
if year in (2000, 2001, 2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015); 
run; 
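A shorter way to write the same filter is the integer range form of the IN operator. This sketch assumes YEAR is numeric and holds whole-number values. Like the longer list, it only selects rows within the period and does not check whether a company has every year.

```sas
/* Sketch: same subsetting IF, using an integer range in the IN list. */
/* Assumes YEAR is a numeric variable with integer values.            */
data cashflow;
   set Radwan;
   if year in (2000:2015);
run;
```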
Radwan
Quartz | Level 8
This code works, but it does not exclude the companies that are missing data for some of the years.
Thanks.
