BookmarkSubscribeRSS Feed
BCNAV
Quartz | Level 8

I have a series of aviation messages that come in as text. I need to be able to use a "like" type of command.

 

At present I have 

 

(case when t1.MSG_RAW_TXT contains " CYQXZZZ" then 1 else 0 end) AS GANDER_VOICE

 

However, there can be messages in the MSG_RAW_TXT variable that are misidentified.  The ones I want ALWAYS have 6 numbers before it....for example EIAAYSYX 030800 CYQXZZZ would be a valid message to set GANDER_VOICE=1 because of the 030800 in front of CYQXZZZ. This example: EIAAYSYX CYQXZZZ would not be a valid message to set GANDER_VOICE=1.

 

I am not sure how this could be done in SAS.  A Microsoft SQL programmer at work here suggested something along the lines of:

 

Like ‘[0-9][0-9][0-9] [0-9] [0-9] [0-9] CYQXZZZ’

 

This would look for any sequence in the MSG_RAW_TXT variable that has a 6 digit number (each number is from 0-9 as stated above), then a space and then the CYQXZZZ.  In this general case we would set GANDER_VOICE=1

 

Thanks for any advice

3 REPLIES 3
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @BCNAV 

 

You can extract the wanted word with a regular expression like this:


data _null_;
	MSG_RAW_TXT = 'EIAAYSYX 030800 CYQXZZZ xyz';
	b = prxchange('s/(.*\b\d{6}) (\w+)(.*)/$2/',-1,MSG_RAW_TXT);
	put b=;
run;

It might be a better approach to extract the airport to a new variable.You could do that in a data step, and then use that  in your SQL, if for some reason you want the boolean variables.

 

 

mkeintz
PROC Star

You have a sequence of space separated words in a character variable, call it TXT.   You can

  1. Set gander=0.
  2. Find the word position W of "CYQXZZZ" in TXT  (the findw function).
  3. If  W>1 you can extract word number W-1 (the scan function), call it PRIOR_WRD
  4. If
    1. PRIOR_WRD is exactly 6 characters, and
    2. non of the characters is a non-digit
    3. then gander=1
data have;
  input txt $50.;
  put txt=;
datalines;
EIAAYSYX 030800 CYQXZZZ
EIAAYSYX CYQXZZZ
run;

data want (drop=prior_wrd);
  set have;
  w=findw(txt,'CYQXZZZ',' ','E');
  gander=0;
  if w>1 then do;
    prior_wrd=scan(txt,w-1);
    if length(prior_wrd)=6 and notdigit(trim(prior_wrd))=0 then gander=1;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PGStats
Opal | Level 21

Try

 

prxmatch("/\b\d{6}\sCYQXZZZ/o", t1.MSG_RAW_TXT) > 0 as GANDER_VOICE

 

Note: the "o" suffix in the pattern tells SAS that the pattern is a constant that should be compiled only once.

PG

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 897 views
  • 1 like
  • 4 in conversation