BookmarkSubscribeRSS Feed
ag8711
Calcite | Level 5

Hello,

 

I have a very large dataset with several variables, including stock ticker and security ID.  I am trying to include only observations where the stock ticker was constant over the time period I have in my dataset. Here are samples of 2 stocks:

TICKER SECURITY_ID
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ

5001

AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL

101594

I would want to keep all observations with security ID 101594 and delete all observations with security ID 5001. So, if there is a "ZZZZ" present in any single observation with a security ID, I need to delete all observations with the security ID. In other words, every observation with the same security ID should have the same ticker.

 

Thanks so much, I really appreciate it!

 

 

 

4 REPLIES 4
ballardw
Super User

How large is large?

 

Here is one approach but I make no claim for efficiency:

proc sql;
   create table want as
   select have.* 
   from (
         select security_id,count(ticker) as tcount
         from (select distinct ticker, security_id from have )
         group by security_id
         having calculated tcount = 1
        ) as a
        left join have 
        on a.security_id=have.security_id
   ;
quit;

the innermost part gets a set of single ticker and security_id, then count to see if there are more than one ticker associated with the security_id. Then use the security_id with a count of 1 to match to your whole data set to select the desired tickets (and any other data in the set).

 

Reeza
Super User

untested:

 

proc sql;
create table want as
select *
from have
group by security_id
having max(ticker)=min(ticker);
quit;
PGStats
Opal | Level 21

This might work:

 

proc sql;
create table want as
select *
from have
group by security_id
having count(distinct ticker) = 1;
select * from want;
quit;
PG
Quentin
Super User

If you're interested in DATA step solutions, you could use a double-DoW loop.  It requires two passes of the data, and the data must be sorted.  You process the data in groups, by Security_ID, first to identify whether Ticker changes within the group, and then a second time to output the records for Security_ID groups where Ticker is constant.

 

data want;
  do until(last.Security_ID);
    set have;
    by Security_ID Ticker;

    *If ticker changes within security_ID, flag the security_ID;
    if first.Ticker=1 and first.Security_ID=0 then _deleteme=1;
  end;

  do until(last.Security_ID);
    set have;
    by Security_ID;

    if NOT (_deleteme) then output;
  end;

  drop _: ;
run;

BASUG is hosting free webinars Next up: Jane Eslinger presenting PROC REPORT and the ODS EXCEL destination on Mar 27 at noon ET. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 2924 views
  • 1 like
  • 5 in conversation