SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to drop observations by group with a condition?

Reply
New Contributor
Posts: 3

How to drop observations by group with a condition?

Hello,

 

I have a very large dataset with several variables, including stock ticker and security ID.  I am trying to include only observations where the stock ticker was constant over the time period I have in my dataset. Here are samples of 2 stocks:

TICKER SECURITY_ID
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ABSIE5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ5001
ZZZZ

5001

AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL101594
AAPL

101594

I would want to keep all observations with security ID 101594 and delete all observations with security ID 5001. So, if there is a "ZZZZ" present in any single observation with a security ID, I need to delete all observations with the security ID. In other words, every observation with the same security ID should have the same ticker.

 

Thanks so much, I really appreciate it!

 

 

 

Super User
Posts: 13,008

Re: How to drop observations by group with a condition?

How large is large?

 

Here is one approach but I make no claim for efficiency:

proc sql;
   create table want as
   select have.* 
   from (
         select security_id,count(ticker) as tcount
         from (select distinct ticker, security_id from have )
         group by security_id
         having calculated tcount = 1
        ) as a
        left join have 
        on a.security_id=have.security_id
   ;
quit;

the innermost part gets a set of single ticker and security_id, then count to see if there are more than one ticker associated with the security_id. Then use the security_id with a count of 1 to match to your whole data set to select the desired tickets (and any other data in the set).

 

Super User
Posts: 22,823

Re: How to drop observations by group with a condition?

untested:

 

proc sql;
create table want as
select *
from have
group by security_id
having max(ticker)=min(ticker);
quit;
Esteemed Advisor
Posts: 5,392

Re: How to drop observations by group with a condition?

This might work:

 

proc sql;
create table want as
select *
from have
group by security_id
having count(distinct ticker) = 1;
select * from want;
quit;
PG
PROC Star
Posts: 1,428

Re: How to drop observations by group with a condition?

[ Edited ]

If you're interested in DATA step solutions, you could use a double-DoW loop.  It requires two passes of the data, and the data must be sorted.  You process the data in groups, by Security_ID, first to identify whether Ticker changes within the group, and then a second time to output the records for Security_ID groups where Ticker is constant.

 

data want;
  do until(last.Security_ID);
    set have;
    by Security_ID Ticker;

    *If ticker changes within security_ID, flag the security_ID;
    if first.Ticker=1 and first.Security_ID=0 then _deleteme=1;
  end;

  do until(last.Security_ID);
    set have;
    by Security_ID;

    if NOT (_deleteme) then output;
  end;

  drop _: ;
run;

Ask a Question
Discussion stats
  • 4 replies
  • 279 views
  • 1 like
  • 5 in conversation