BookmarkSubscribeRSS Feed
ag8711
Calcite | Level 5

Hello,

 

I have a very large dataset with several variables, including stock ticker and security ID.  I am trying to include only observations where the stock ticker was constant over the time period I have in my dataset. Here are samples of 2 stocks:

TICKER SECURITY_ID
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ

5001

AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL

101594

I would want to keep all observations with security ID 101594 and delete all observations with security ID 5001. So, if there is a "ZZZZ" present in any single observation with a security ID, I need to delete all observations with the security ID. In other words, every observation with the same security ID should have the same ticker.

 

Thanks so much, I really appreciate it!

 

 

 

4 REPLIES 4
ballardw
Super User

How large is large?

 

Here is one approach but I make no claim for efficiency:

proc sql;
   create table want as
   select have.* 
   from (
         select security_id,count(ticker) as tcount
         from (select distinct ticker, security_id from have )
         group by security_id
         having calculated tcount = 1
        ) as a
        left join have 
        on a.security_id=have.security_id
   ;
quit;

the innermost part gets a set of single ticker and security_id, then count to see if there are more than one ticker associated with the security_id. Then use the security_id with a count of 1 to match to your whole data set to select the desired tickets (and any other data in the set).

 

Reeza
Super User

untested:

 

proc sql;
create table want as
select *
from have
group by security_id
having max(ticker)=min(ticker);
quit;
PGStats
Opal | Level 21

This might work:

 

proc sql;
create table want as
select *
from have
group by security_id
having count(distinct ticker) = 1;
select * from want;
quit;
PG
Quentin
Super User

If you're interested in DATA step solutions, you could use a double-DoW loop.  It requires two passes of the data, and the data must be sorted.  You process the data in groups, by Security_ID, first to identify whether Ticker changes within the group, and then a second time to output the records for Security_ID groups where Ticker is constant.

 

data want;
  do until(last.Security_ID);
    set have;
    by Security_ID Ticker;

    *If ticker changes within security_ID, flag the security_ID;
    if first.Ticker=1 and first.Security_ID=0 then _deleteme=1;
  end;

  do until(last.Security_ID);
    set have;
    by Security_ID;

    if NOT (_deleteme) then output;
  end;

  drop _: ;
run;

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 4070 views
  • 1 like
  • 5 in conversation