Hello,
I have a very large dataset with several variables, including stock ticker and security ID. I am trying to include only observations where the stock ticker was constant over the time period I have in my dataset. Here are samples of 2 stocks:
ABSIE | 5001 |
ABSIE | 5001 |
ABSIE | 5001 |
ABSIE | 5001 |
ABSIE | 5001 |
ZZZZ | 5001 |
ZZZZ | 5001 |
ZZZZ | 5001 |
ZZZZ | 5001 |
ZZZZ | 5001 |
ZZZZ | 5001 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
AAPL | 101594 |
I would want to keep all observations with security ID 101594 and delete all observations with security ID 5001. So, if there is a "ZZZZ" present in any single observation with a security ID, I need to delete all observations with the security ID. In other words, every observation with the same security ID should have the same ticker.
Thanks so much, I really appreciate it!
How large is large?
Here is one approach but I make no claim for efficiency:
proc sql; create table want as select have.* from ( select security_id,count(ticker) as tcount from (select distinct ticker, security_id from have ) group by security_id having calculated tcount = 1 ) as a left join have on a.security_id=have.security_id ; quit;
the innermost part gets a set of single ticker and security_id, then count to see if there are more than one ticker associated with the security_id. Then use the security_id with a count of 1 to match to your whole data set to select the desired tickets (and any other data in the set).
untested:
proc sql;
create table want as
select *
from have
group by security_id
having max(ticker)=min(ticker);
quit;
This might work:
proc sql;
create table want as
select *
from have
group by security_id
having count(distinct ticker) = 1;
select * from want;
quit;
If you're interested in DATA step solutions, you could use a double-DoW loop. It requires two passes of the data, and the data must be sorted. You process the data in groups, by Security_ID, first to identify whether Ticker changes within the group, and then a second time to output the records for Security_ID groups where Ticker is constant.
data want;
do until(last.Security_ID);
set have;
by Security_ID Ticker;
*If ticker changes within security_ID, flag the security_ID;
if first.Ticker=1 and first.Security_ID=0 then _deleteme=1;
end;
do until(last.Security_ID);
set have;
by Security_ID;
if NOT (_deleteme) then output;
end;
drop _: ;
run;
w
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.