The SAS Output Delivery System and reporting techniques

PROC FORMAT with REGEXP

N/A
Posts: 0

PROC FORMAT with REGEXP

Hi,

Is it possible (and if so, how) to use regular expressions in the PROC FORMAT procedure?

I have the following data

DATA test;
INPUT variable $;
CARDS;
AAA
ABB
AC
BAA
BBB
BC
C
CCC
;
RUN;

and I would like to use something like this:
PROC FORMAT;
VALUE myformat
"A*"="USA"
"BA*"="Canada"
"BC*"="Mexico"
OTHER="rest of world";
RUN;

(If the string starts with "A", then "USA"; if it starts with "BA", then "Canada", ...)

Is there any way to manage this?

Thank you in advance!
Jaroslav
Valued Guide
Posts: 2,177

Re: PROC FORMAT with REGEXP

Posted in reply to deleted_user
"big birdie" had a project along the lines to use regex in user defined informats.
Lack of user interest/demand seems to have pushed it back.
However, what your example demonstrates can be achieved without regex, by using ranges:
PROC FORMAT;
VALUE $myformat
"A" - "AZ" ="USA"
"BA" - "BAZ" ="Canada"
"BC" - "BCZ" ="Mexico"
OTHER="rest of world";
RUN;
Here I used "Z", but if you are generating a cntlin= data set, you could use high values ('FF'x) instead.
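The cntlin= route might look like the sketch below. It is untested and rests on assumptions: the data set names prefixes and ctrl are made up for illustration, the session uses a single-byte encoding, and appending 'FF'x to each prefix is an adequate upper bound for the values you actually have.

```sas
/* Hypothetical prefix-to-label table; names are illustrative only. */
data prefixes;
   length prefix $8 label $20;
   infile datalines dsd;
   input prefix label;
   datalines;
A,USA
BA,Canada
BC,Mexico
;
run;

/* Build a CNTLIN= control data set: one range per prefix, plus OTHER. */
data ctrl;
   length fmtname $32 type $1 start end $9 label $20 hlo $1;
   retain fmtname 'myformat' type 'C';   /* TYPE='C' makes it a character format */
   set prefixes end=last;
   start = prefix;
   end   = cats(prefix, 'FF'x);          /* prefix followed by the high value */
   hlo   = ' ';
   output;
   if last then do;
      start = '**OTHER**';
      end   = '**OTHER**';
      hlo   = 'O';                       /* O marks the OTHER range */
      label = 'rest of world';
      output;
   end;
   keep fmtname type start end label hlo;
run;

proc format cntlin=ctrl;
run;
```

The advantage over typing the VALUE statement by hand is that the prefix list can come from any table, so adding a prefix means adding a row.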
PROC Star
Posts: 1,759

Re: PROC FORMAT with REGEXP

Posted in reply to deleted_user
why regexp?

PROC FORMAT;
VALUE $myformat
"A" -< "B"="USA"
"BA" -< "BB"="Canada" /* upper bound "BB", so values starting with BB fall through to OTHER */
"BC" - "BC["="Mexico"
OTHER="rest of world"; * [ comes after Z in the ASCII sequence;
RUN;

Can't test here, but I think this should work.
N/A
Posts: 0

Re: PROC FORMAT with REGEXP

My motivation to use regexp was to be able to define a format like

PROC FORMAT;
VALUE myformat2
".D*"="Daily data"
".M*"="Monthly data"
".A*"="Annual data";
RUN;

(depending on the second position of the string, without using an extra variable or reading the whole dataset for possible values and using them to define the format).

Does anybody know how to solve this 2nd example?

Thank you very much
Valued Guide
Posts: 2,177

Re: PROC FORMAT with REGEXP

Posted in reply to deleted_user
Use substr() to start looking at a specific position.
OK, regex allows you to look for a pattern at a non-specific position.
N/A
Posts: 0

Re: PROC FORMAT with REGEXP

I wanted to apply the format to an existing dataset without creating a new variable with substr... but it seems to be impossible...
Super Contributor
Posts: 3,174

Re: PROC FORMAT with REGEXP

Posted in reply to deleted_user
Associate your output format name to your existing SAS variable, using the SAS FORMAT statement?
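As a minimal sketch of that suggestion: the FORMAT statement can be attached at the procedure level, so the stored values stay untouched and no new variable is created. The example assumes the test data from the first post, a range-based $myformat similar to those proposed earlier in the thread, and an ASCII collating sequence. It has not been run.

```sas
/* Sample data from the original question; $ reads the values as character. */
data test;
   input variable $;
   datalines;
AAA
ABB
AC
BAA
BBB
BC
C
CCC
;
run;

/* Range-based character format (ASCII collating sequence assumed). */
proc format;
   value $myformat
      "A"  -< "B"  = "USA"
      "BA" -< "BB" = "Canada"
      "BC" -< "BD" = "Mexico"
      other        = "rest of world";
run;

/* The FORMAT statement applies the format for this step only. */
proc print data=test;
   format variable $myformat.;
run;

/* Procedures such as PROC FREQ group by the formatted value. */
proc freq data=test;
   tables variable;
   format variable $myformat.;
run;
```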

Scott Barry
SBBWorks, Inc.
PROC Star
Posts: 1,759

Re: PROC FORMAT with REGEXP

Posted in reply to deleted_user
I reckon it is possible.
Here are two ways you could still use the formatted value as you want, without reading the dataset beforehand:

1) bulldozer
=========
proc format ;
value $myformat
/* "AD" -< "AE" covers every value that starts with AD */
"AD" -< "AE" = 'daily' /* as many entries as possible prefixes */
"BD" -< "BE" = 'daily'
"CD" -< "CE" = 'daily'
"DD" -< "DE" = 'daily'
/* .... */
"AM" -< "AN" = 'monthly'
"BM" -< "BN" = 'monthly'
/* ... */
;
run;


2) view
========
proc format ;
value $myformat
"D" -< "E" = 'daily'
"M" -< "N" = 'monthly'
;
run;

/* The view derives MYVAR2 on the fly, so no extra copy of the data is stored */
data MYDATA_V/view=MYDATA_V;
set MYDATA;
MYVAR2=substr(MYVAR,2);
format MYVAR2 $myformat.;
run;

proc print data=MYDATA_V(drop=MYVAR);
run;
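A third possibility, not raised in the thread and offered only as an untested sketch: assuming SAS 9.3 or later, a function written with PROC FCMP can supply the format labels, which allows the decision to depend on the second character without a view or an extra variable. The names freqlbl, $freq and work.funcs are hypothetical; MYDATA and MYVAR are taken from the example above.

```sas
/* Assumes SAS 9.3 or later, where an FCMP function can act as a format label. */
/* freqlbl, $freq and work.funcs are hypothetical names for this sketch.       */
proc fcmp outlib=work.funcs.fmt;
   function freqlbl(code $) $ 12;
      length second $1 result $12;
      /* Look at the second character only; guard against short values. */
      second = ' ';
      if length(code) >= 2 then second = substr(code, 2, 1);
      if second = 'D' then result = 'Daily data';
      else if second = 'M' then result = 'Monthly data';
      else if second = 'A' then result = 'Annual data';
      else result = code;   /* unmatched values are shown as they are */
      return (result);
   endsub;
run;

/* Tell SAS where to find the compiled function. */
options cmplib=work.funcs;

/* Every value is routed through the function. */
proc format;
   value $freq (default=12) other=[freqlbl()];
run;

proc print data=MYDATA;
   format MYVAR $freq.;
run;
```

Calling a function for every value is likely slower than a plain range lookup, so the range-based formats above remain the simpler choice when the prefixes are known in advance.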