BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
amanjot_42
Fluorite | Level 6

Hi guys,

I am new to SAS!

Actually, I was trying to limit my sample into those firms which have only event_year or post-event sale values. Data for both pre and post events are in my sample. The event years are different for different firms.  If firms have both pre and post event sales, or only pre-event sales, I want to exclude such firms.

 Data look like:

Cusip Year Sales event_year

11000 2005 10  0

11000 2006  20  1

11000 2007  30  0

11000 2008  40  0

12000 2004  65  1

12000 2005  90  0

12000 2006  65  0

13000 2010  80  0

13000 2011  75  0 

13000 2012  95  1

14000 2008  65  0

14000 2009  45  0

14000 2010  70  1

14000 2011  90  0 

 

So according to the criteria, I want only cusip code 12000 because it has only post-event or event-year values. Please help me with the code!

Regards

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

1. Create list of firms where 1 is the first record but not the last record. 

2. Filter main list for list of firms only

 

*assumes data is sorted already;

*create list of firms;
data list_firms;
set have;
by cusip;

if first.cusip and event_year = 1 and not last.cusip then output;

keep cusip;
run;

proc sql;
create table want as
select *
from have 
where cusip in (select cusip from list_firms);
quit;

 

 

If you're not familiar with BY group processing or FIRST/LAST logic I highly recommend it. If this doesn't work please include some cases where it doesn't work, this does work with your test data. 

 


@amanjot_42 wrote:

Hi guys,

I am new to SAS!

Actually, I was trying to limit my sample into those firms which have only event_year or post-event sale values. Data for both pre and post events are in my sample. The event years are different for different firms.  If firms have both pre and post event sales, or only pre-event sales, I want to exclude such firms.

 Data look like:

Cusip Year Sales event_year

11000 2005 10  0

11000 2006  20  1

11000 2007  30  0

11000 2008  40  0

12000 2004  65  1

12000 2005  90  0

12000 2006  65  0

13000 2010  80  0

13000 2011  75  0 

13000 2012  95  1

14000 2008  65  0

14000 2009  45  0

14000 2010  70  1

14000 2011  90  0 

 

So according to the criteria, I want only cusip code 12000 because it has only post-event or event-year values. Please help me with the code!

Regards


 

View solution in original post

6 REPLIES 6
Reeza
Super User
So, as sorted, if the first record isn't 1 you don't want that firm?
amanjot_42
Fluorite | Level 6

Thank for your reply,

Here 1 means the year in which event took place, so I need those firms where event_year is 1 AND have in post-event years at least one observation as well.

In this process, I want to EXCLUDE those firms which have both pre-event and post-event observations, pre-event years and event_year observations only, as well as which have only pre-event observations, i.e. before the event_year.

Reeza
Super User

1. Create list of firms where 1 is the first record but not the last record. 

2. Filter main list for list of firms only

 

*assumes data is sorted already;

*create list of firms;
data list_firms;
set have;
by cusip;

if first.cusip and event_year = 1 and not last.cusip then output;

keep cusip;
run;

proc sql;
create table want as
select *
from have 
where cusip in (select cusip from list_firms);
quit;

 

 

If you're not familiar with BY group processing or FIRST/LAST logic I highly recommend it. If this doesn't work please include some cases where it doesn't work, this does work with your test data. 

 


@amanjot_42 wrote:

Hi guys,

I am new to SAS!

Actually, I was trying to limit my sample into those firms which have only event_year or post-event sale values. Data for both pre and post events are in my sample. The event years are different for different firms.  If firms have both pre and post event sales, or only pre-event sales, I want to exclude such firms.

 Data look like:

Cusip Year Sales event_year

11000 2005 10  0

11000 2006  20  1

11000 2007  30  0

11000 2008  40  0

12000 2004  65  1

12000 2005  90  0

12000 2006  65  0

13000 2010  80  0

13000 2011  75  0 

13000 2012  95  1

14000 2008  65  0

14000 2009  45  0

14000 2010  70  1

14000 2011  90  0 

 

So according to the criteria, I want only cusip code 12000 because it has only post-event or event-year values. Please help me with the code!

Regards


 

amanjot_42
Fluorite | Level 6

thanks a lot!

It worked well for me

novinosrin
Tourmaline | Level 20

Hi @amanjot_42 

 

data have;
input Cusip Year Sales event_year ;
cards;
11000 2005 10  0
11000 2006  20  1
11000 2007  30  0
11000 2008  40  0
12000 2004  65  1
12000 2005  90  0
12000 2006  65  0
13000 2010  80  0
13000 2011  75  0 
13000 2012  95  1
14000 2008  65  0
14000 2009  45  0
14000 2010  70  1
14000 2011  90  0 
;

proc sql;
create table want(drop=t) as
select *
from 
(select * , year<max((event_year=1)*year) as t
from have
group by cusip)
group by cusip
having not max(t);
quit;
Astounding
PROC Star
One pass through the data, sequentially:

data want;
set have;
by cusip;
if first.cusip then do;
if event_year=1 and last.cusip=0
then out_flag='Y';
else out_flag='N';
end;
retain out_flag;
if out_flag='Y':
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 603 views
  • 0 likes
  • 4 in conversation