BookmarkSubscribeRSS Feed
BCNAV
Quartz | Level 8

When aircraft fly in our airspace they are given a unique numeric ID called FLT_BIZ_UID. As messages are sent between the aircraft and ATC, they are stored (items stored include, the message, aircraft type and a bunch of other things). Every time there is a message by that specific flight a new FLT_BIZ_UID record is made (with the same numeric number of course as the flight is continuing on).

 

For example:

 

FLT_BIZ_UID     15 variables across    EDMONTON_DATALINK    GANDER_DATALINK     GANDER_VOICE  

123456                                                   1                                           0                                       0                           

123456                                                   1                                           0                                       0

333333                                                   1                                           0                                       0

333333                                                   1                                           0                                       0

333333                                                   1                                           0                                       1

 

So as you can see, there are multiple records for FLT_BIZ_UID based on the number of times the aircraft messages ATC

 

I would like to be able to generate a subset of this data.

 

I want all FLT_BIZ_UID observations that are all EDMONTON_DATALINK=1 GANDER_DATALINK=0 and GANDER_VOICE=0

 

So above, all 123456 records would be kept, while none of 333333 would be kept because the last observation of the group containing 333333 in the FLT_BIZ_UID has a GANDER_VOICE=1

 

Not sure how to approach this

 

thanks

8 REPLIES 8
SuryaKiran
Meteorite | Level 14

A sub-query might help. Create a query for finding all the FLT_BIZ_UID that are not in the condition your looking for and use that as a sub-query to eliminate those records.

 

Something like this (untested):

 

proc sql;
select * from have
where EDMONTON_DATALINK=1 and GANDER_DATALINK=0 and GANDER_VOICE=0
	and FLT_BIZ_UID not in (select  FLT_BIZ_UID from have where not (EDMONTON_DATALINK=1 and GANDER_DATALINK=0 and GANDER_VOICE=0));
quit;
Thanks,
Suryakiran
novinosrin
Tourmaline | Level 20

hi @BCNAV pretty straightforward with Boolean


data have;
input FLT_BIZ_UID  EDMONTON_DATALINK    GANDER_DATALINK     GANDER_VOICE ;
cards;
123456                                                   1                                           0                                       0                           
123456                                                   1                                           0                                       0
333333                                                   1                                           0                                       0
333333                                                   1                                           0                                       0
333333                                                   1                                           0                                       1
;

proc sql;
create table want as
select *
from have
group by FLT_BIZ_UID
having sum(EDMONTON_DATALINK=1 and GANDER_DATALINK=0 and GANDER_VOICE=0)=n(FLT_BIZ_UID);
quit;

 

 

novinosrin
Tourmaline | Level 20

The above can also be achieved with a Double DOW or an imposed interleave sorted by FLT_BIZ_UID too but lazy people like me who can't be bothered to type would prefer the above

mkeintz
PROC Star

Double DOW can appear busy, but here's a relatively brief equivalent:

 

data want (drop=_:);
  set have (in=in1) have (in=in2);
  by flt_biz_uid;
  if first.flt_biz_uid then call missing(_n1,_nqualify);
  _n1+in1;
  _nqualify+(in1=1 and EDMONTON_DATALINK=1 and GANDER_DATALINK=0 and GANDER_VOICE=0);
  if in2 and _n1=_nqualify;
run;

 

Unlike the proc sql solution, this requires data to be sorted by flt_biz_uid, but that tradeoff is that this would be faster for large datasets.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
novinosrin
Tourmaline | Level 20

" tradeoff is that this would be faster for large datasets." -- Fully agree 🙂 .

 

EDIT to make a point representing my classmates who are (beginners to  just intermediates):

Their concern:

"Hey Dude, Since you are regular in communities, can you let the community  know our concern i.e

-- Sophisticated  codes are lovely and we like it but makes us  go back to community over and over the very moment we run into a hiccup"--Our 2 cents mate!  DePaul CDM computing lab 

 

I guess yours and DOW etc would fall into high sophistication.  🙂

mkeintz
PROC Star

I would suggest the classmates, or any subscriber, treat Sas communities like any crowd-sourced, free resource.  It's a buffet - take only those offerings that are most suitable to your needs.  However, over my own subscription tenure, I've often only wanted a "non-sophisticated" solution to a problem, but when I occasionally had time, I was happy to see other solutions that stretched my understanding.

 

And Yo DePaul CDM computing lab :  There is no better status than student to justify putting forth a need for alternatives or more thorough explanations.   As far as I can tell - all sas-related comments are welcome on this forum.  So you don't need cover from @novinosrin - nobody will bite.  Probably this note is the harshest you'd ever get.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
DBailey
Lapis Lazuli | Level 10

similar to below but I read your requirements a bit differently:

 

proc sql;
select * from have
where FLT_BIZ_UID not in 
(select FLT_BIZ_UID from have where EDMONTON_DATALINK=0 or GANDER_DATALINK=1 or GANDER_VOICE=1); quit;
s_lassen
Meteorite | Level 14

I would do it like this:

data want;
  merge have(where=(not(Edmonton_datalink=1 and Gander_datalink=0 and Gander_voice=0)) in=wrong)
  have
  ;
  by FLT_BIZ_UID;
  if not wrong;
run;

 

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2701 views
  • 1 like
  • 6 in conversation