I want to create a function called is_date in FCMP and use it in various places within our application.
function is_date(name $);
if input(name, ?? yymmddn.)=. then out=0; else out=1;
return(out);
endsub;
Though this works in DATA Step, I am getting an error message in FCMP. I tried browsing the documentation and could not find any such limitation in FCMP.
Error Message
ERROR 22-322: Expecting a format name.
ERROR 200-322: The symbol is not recognized and will be ignored.
Any suggestion is appreciated
Thanks
Selva.
Please show an example of the data step you reference.
The Input function is not the same as the Input statement.
Using the input function in a data step similar to your example:
87 data example;
88 name = '20200815';
89 y = input(name,?? yymmddn.);
                     --------
                     48
ERROR 48-59: The informat YYMMDDN was not found or could not be loaded.
90 run;
There is no informat yymmddn.
You might try the yymmdd. informat, possibly testing the length of Name to pick the correct width: yymmdd. defaults to a width of 6, so you may want yymmdd8. or another width.
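A minimal sketch of that length test in a DATA step. The sample values and the variable d are made up for illustration, and the three widths handled here are an assumption about which layouts need support:

```sas
data _null_;
  length name $10;
  do name = '20200815', '2020-08-15', '200815', '2020081';
    /* Choose the informat width from the trimmed length of the value */
    select (length(name));
      when (6)  d = input(name, ?? yymmdd6.);
      when (8)  d = input(name, ?? yymmdd8.);
      when (10) d = input(name, ?? yymmdd10.);
      otherwise d = .;   /* unsupported length: treat as not a date */
    end;
    is_date = not missing(d);
    putlog name= d= yymmdd10. is_date=;
  end;
run;
```

The ?? modifier suppresses the invalid-data notes and keeps _ERROR_ at 0, so a missing d is the only signal that the value did not read as a date.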
"Doesn't work" is awfully vague.
Are there errors in the log? Post the code and log in a code box opened with the <> icon to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the <> icon or attached as text to show exactly what you have and that we can test code against.
Sure. I will try to explain my scenario in detail
1) This is the data step (log) - That works
26 data _null_;
27 ** Valid Date **;
28 name = '20201001';
29 if input(name, ?? yymmdd8.) > . then is_date = 'YES'; else is_date = 'NO';
30 putlog 'Name : ' name 'Is_date : ' is_date;
31
32 ** InValid Date **;
33 name = '2020100A';
34 if input(name, ?? yymmdd8.) > . then is_date = 'YES'; else is_date = 'NO';
35 putlog 'Name : ' name 'Is_date : ' is_date;
36
37 run;
Name : 20201001 Is_date : YES
Name : 2020100A Is_date : NO
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
2) The same/similar code in FCMP (log) - Compilation Failed
26 proc fcmp outlib=work.temp;
27 function isdate(name $);
28 if input(name, ?? yymmdd8.) > . then is_date=1; else is_date=0;
_
22
200
ERROR 22-322: Expecting a format name.
ERROR 200-322: The symbol is not recognized and will be ignored.
29 return(is_date);
30 endsub;
31 run;
3) FCMP code without the ?? modifier in the INPUT function - Compiles fine
This log has the FCMP step plus two data steps calling the function, one with a valid date and one with an invalid date. The invalid date produces an error.
25 GOPTIONS ACCESSIBLE;
26 proc fcmp outlib=work.temp.funcs;
27 function isdate(name $);
28 if input(name, yymmdd8.) > . then is_date=1; else is_date=0;
29 return(is_date);
30 endsub;
31 run;
NOTE: Function isdate saved to work.temp.funcs.
NOTE: PROCEDURE FCMP used (Total process time):
real time 0.01 seconds
cpu time 0.02 seconds
32 quit;
33 options cmplib=work.temp;
34 data _null_;
35 ** Valid Date **;
36 name = '20201001';
37 if isdate(name) then is_date = 'YES'; else is_date = 'NO';
38 putlog 'Name : ' name 'Is_date : ' is_date;
39 run;
Name : 20201001 Is_date : YES
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
40
41 data _null_;
42 ** InValid Date **;
43 name = '2020100A';
44 if isdate(name) then is_date = 'YES'; else is_date = 'NO';
45 putlog 'Name : ' name 'Is_date : ' is_date;
46 run;
ERROR: An illegal argument is used in the function call in function 'INPUTN' in statement number 2 at line 5 column 4.
The statement was:
0 (5:4) ##dbl2 = INPUT( name="2020100A", YYMMDD8.(105312000) ) > .
ERROR: Exception occurred during subroutine call.
Name : 2020100A Is_date : NO
name=2020100A is_date=NO _ERROR_=1 _N_=1
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Hope this helps.
Thanks
Selva.
Functions created by FCMP can be used in places other than a data step. So I think what is going on here is that the FCMP compiler is more restrictive than the data step INPUT function in order to stay compatible with those other uses.
It may help to describe exactly what you are trying to accomplish with this function.
Are you expecting to test a number of different informats until you find one that reads a value as a date?
You could take advantage of the fact that the ANYDT... series of informats do not trigger errors for unknown values.
proc fcmp outlib=work.temp.funcs;
function isdate(name $);
if inputn(name, 'anydtdte32.') > . then is_date=1; else is_date=0;
return(is_date);
endsub;
run;
options cmplib=work.temp;
data _null_;
do name = '20201010 ','20201010A';
if isdate(name) then is_date = 'YES'; else is_date = 'NO';
putlog 'Name : ' name 'Is_date : ' is_date;
end;
run;
But now your function is going to be limited to just a few types of data you can test.
Personally I have not found that defining functions is worth the effort or the extra maintenance costs.
%macro is_date(varname_or_quoted_value);
ifc(missing(input(&varname_or_quoted_value,??yymmdd10.)),'NO ','YES')
%mend is_date;
data _null_;
do name = '20201010 ','20201010A';
is_date = %is_date(name);
putlog 'Name : ' name 'Is_date : ' is_date;
end;
run;
My requirement is to create a generic process to ingest files with a dynamic schema. In other words, I want this process to read schema metadata (name, datatype, length, is_mandatory, etc.), read the corresponding file, validate it, and accept or reject the records.
I had this written entirely in macro code, with a set of macros generating the code first. But I had a hard time getting someone coming from another language background to understand the macro code. So, to improve readability, I thought of rewriting it with less macro code and more data step and FCMP.
But, whatever I have done so far, I am still not happy 🙂
See below for my approach (removed a bunch of details)
** Load Schema information for a file;
data schema;
infile datalines delimiter='~' dsd;
input vname :$40. vtype :$10. vlen :8. mandatory :8.;
datalines;
name~string~20~1
age~number~3~1
regdate~date~20~1
;
run;
** Create few macro variables to hold the list of fields and related statements **;
data schema;
set schema end=last;
format cols i_stmt $255.;
retain cols i_stmt;
cols = catx(' ',trim(cols),cats("'",vname,"'"));
i_stmt = catx(' ',i_stmt,vname,cats(':$',vlen,'.'));
if last then do;
call symputx('cols',cols,'G');
call symputx('istmt',i_stmt,'G');
end;
drop cols i_stmt;
run;
%put &cols;
%put &istmt;
** Read the data file as all character data type **;
data indata;
infile datalines delimiter='~' dsd;
input &istmt;
datalines;
John~34~20080101
Jack~27~20161001
Jill~4X1~20000101
~35~20100101
Rob~58~19970A01
Bob~26~20180101
run;
**Few Functions **;
proc fcmp outlib=work.temp.funcs;
** Get field attributes of a given field **;
subroutine getattr(vname $, rc, vtype $, vlen, mandatory );
outargs rc, vtype, vlen, mandatory;
declare hash attr(dataset:'work.schema');
rc=attr.definekey('vname');
rc=attr.definedata('vtype', 'vlen', 'mandatory');
rc=attr.definedone();
rc=attr.find();
endsub;
** Check for missing;
function isblank(name $);
return(missing(name));
endsub;
** Check for valid number;
function isnumber(name $);
return(prxmatch('/^[+-]?((\d+(\.\d*)?)|(\.\d+))$/',trim(name)) > 0);
endsub;
** Check for leap year;
function isleap(yr);
return ((mod(yr,4) = 0) and ((mod(yr,100) ne 0) or (mod(yr,400) = 0)));
endsub;
** Check for valid date;
function isdate(name $);
if isnumber(name) = 0 then return(0);
dt = input(name,best.);
if dt > 99999999 or dt < 15000000 then return(0);
y = int(dt/10000); m = mod(int(dt/100),100); d = mod(dt,100);
put y m d;
if y > 9999 or y < 1500 then return(0);
if m < 1 or m > 12 then return(0);
if d < 1 or d > 31 then return(0);
if m in (4,6,9,11) then return(d <= 30);
if m = 2 then if isleap(y) then return(d <= 29); else return(d <= 28);
return(1);
endsub;
run;
quit;
options cmplib=work.temp;
** Evaluate the input data and validate data;
data evaluate (drop=col: rc vtype vlen mandatory cvalue);
set indata;
array col $20. col1-col3 (&cols);
length vtype $10. failreason $32.;
do over col;
cvalue = vvaluex(col);
** the getattr function will give me the vtype vlen and mandatory for the given col;
call getattr(col,rc,vtype,vlen,mandatory);
if rc = 0 then do;
** some sample validation **;
if mandatory then do;
if isblank(cvalue) then do;
failreason = 'Missing '||col;
putlog cvalue ' is blank...';
continue;
end;
end;
if vtype = 'number' then do;
putlog 'cvalue = ' cvalue;
if not isnumber(cvalue) then do;
failreason = 'Invalid '||col;
continue;
end;
end;
if vtype = 'date' then do;
if not isdate(cvalue) then do;
failreason = 'Invalid '||col;
continue;
end;
end;
end;
end;
run;
Thanks for all your assistance.
Have fun learning how to use FCMP and let us know what you discover, especially anything cool with Python.
But it really seems this whole thread is an XY problem.
Just read the data into a tall skinny table and then join it with the SCHEMA based on the column number. For that you can use the POINT= option. If you had column headers, you could instead use a hash object for the schema and join on the name.
data schema;
infile datalines dlm='~' dsd truncover;
column+1;
input vname :$40. vtype :$10. vlen mandatory ;
datalines;
name~string~20~1
age~number~3~1
regdate~date~20~1
;
data indata;
infile datalines dlm='~' dsd truncover length=ll column=cc;
row+1;
position=1;
do column=1 by 1 until(cc>ll);
input value :$100. @;
output;
position=cc;
end;
datalines4;
John~34~20080101
Jack~27~20161001
Jill~4X1~20000101
~35~20100101
Rob~58~19970A01
Bob~26~20180101
name~10~2019-12-31~Extra
;;;;
data check;
set indata ;
length failreason $32 ;
p=column;
if column <= ncols then do;
set schema point=p nobs=ncols;
if mandatory and missing(value) then failreason='Missing mandatory field';
else if vtype='number' and missing(input(value,??comma32.)) then
failreason='Invalid number';
else if vtype='date' and missing(input(value,??yymmdd10.)) then
failreason='Invalid date';
if length(value) > vlen then failreason=catx(' ',failreason,'Value too long');
end;
else do;
failreason='Too many values on line';
call missing(of vname vtype vlen mandatory);
end;
run;
proc print ;
where not missing(failreason);
run;
Obs  row  position  column  value     failreason               vname    vtype   vlen  mandatory
  8    3         6       2  4X1       Invalid number           age      number     3          1
 10    4         1       1            Missing mandatory field  name     string    20          1
 15    5         8       3  19970A01  Invalid date             regdate  date      20          1
 22    7        20       4  Extra     Too many values on line                      .          .
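For the column-header case mentioned above, a join on the name could be sketched with a hash object as follows. This assumes the SCHEMA data set from the previous step already exists; the tall table INDATA_NAMED and its sample rows are hypothetical and only illustrate the shape of the input:

```sas
/* Hypothetical tall table: one observation per (row, vname, value) */
data indata_named;
  infile datalines dlm='~' dsd truncover;
  input row vname :$40. value :$100.;
datalines;
1~name~John
1~age~34
1~regdate~20080101
2~name~Jill
2~age~4X1
2~regdate~20000101
;

data check_named;
  if _n_ = 1 then do;
    if 0 then set schema;   /* define vtype, vlen, mandatory in the PDV */
    declare hash s(dataset: 'schema');
    s.defineKey('vname');
    s.defineData('vtype', 'vlen', 'mandatory');
    s.defineDone();
  end;
  set indata_named;
  length failreason $32;
  /* find() returns 0 when vname exists in the schema */
  if s.find() ne 0 then failreason = 'Unknown column';
  else if mandatory and missing(value) then failreason = 'Missing mandatory field';
  else if vtype = 'number' and missing(input(value, ?? comma32.)) then
    failreason = 'Invalid number';
  else if vtype = 'date' and missing(input(value, ?? yymmdd10.)) then
    failreason = 'Invalid date';
run;
```

The validation rules mirror the POINT= version; only the lookup changes from observation number to key.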
Note that, as a rule of thumb, code generated by macros will perform better than functions. A function call is a fairly expensive operation that the processor repeats on every single call, whereas dynamically generated code needs the extra effort only once.
Thanks.
Sadly proc FCMP is not quite finished. It is an internal tool that was made available to users. That's very useful, don't get me wrong, but it's not fully QA'ed.
Another example is how PUT statement output leaks to different destinations (the PUTLOG statement is unsupported for some reason):
options cmplib=WORK.FUNCS;
proc fcmp outlib=WORK.FUNCS.MATH;
function aa(F $ ) $32767;
put 'sssssssss';
return('s');
endsub;
run;
data t; s=aa('a'); run; * goes to log;
ods _all_ close;
ods listing;
proc print data=sashelp.class;run; * goes to listing;
%put %sysfunc(aa(a)); * goes to listing;
ods _all_ close;
ods html file="&wdir\t.htm";
proc print data=sashelp.class;run; * goes to html;
%put %sysfunc(aa(s)); * goes to listing;
ods _all_ close;
proc print data=sashelp.class;run; * goes nowhere;
%put %sysfunc(aa(s)); * goes to listing;
or if you use put '%p'; the value of an internal fcmp pointer is returned instead of the %p string.
or this.
Small niggles, but quite a few rough edges. @yabwon do you want to add your experience? @Casey_SAS Any comment?
Hi @ChrisNZ,
thanks for bringing it up.
First of all, I love FCMP; it is one of my favourite procedures, and I'm using it extensively in my various packages, e.g. DFA or BasePlus. But there are some "annoying features" which make it inconvenient sometimes.
A few related to arrays or hash tables:
options cmplib = _null_;
proc fcmp outlib = work.functions.package;
subroutine ERROR_IN_RESIZING(
);
array TEMP[1] / nosymbols;
static TEMP .;
call dynamic_array(TEMP, 17);
T = dim(TEMP);
put "want 17 -> " " dim(TEMP)=" T;
call dynamic_array(TEMP, 65535);
T = dim(TEMP);
put "want 65535 -> " " dim(TEMP)=" T;
call dynamic_array(TEMP, 42);
T = dim(TEMP);
put "want 42 -> " " dim(TEMP)=" T;
return;
endsub;
run;
options cmplib=work.functions;
data test;
call ERROR_IN_RESIZING();
run;
data have;
input country $ val;
cards;
PL 17
PL 42
US 13
;
run;
options cmplib = _null_;
proc fcmp outlib = work.f.p;
function test(county $);
length county $ 8;
declare hash lkup(dataset:"work.have", MULTIDATA: "Y");
rc = lkup.defineKey("county");
rc = lkup.defineData(all:"Y");
rc = lkup.defineDone();
declare hiter iter('lkup');
NUMBER = lkup.num_items();
put NUMBER;
return (.);
endsub;
run;
quit;
option dlcreatedir;
libname user "%sysfunc(pathname(work))/user";
data inUser;
do x = 1,2,3;
output;
end;
run;
data inUser2;
do x = 0,1,2,3,4;
output;
end;
run;
options cmplib = ();
proc fcmp outlib = work.f.p;
function test(x);
length x 8;
declare hash H(dataset:"inUser");
_RC_ = H.defineKey("x");
_RC_ = H.defineDone();
_RC_ = H.check();
return(not _RC_);
endsub;
run;
options cmplib = work.f;
data T2;
set inUser2;
if test(x) = 1 then output;
run;
options cmplib = _null_;
proc fcmp outlib = work.f.f;
subroutine sub2(val, x, h1, h2, h3, h4, h5, h6, h7);
outargs val;
array HH[1] / nosymbols;
call dynamic_array(HH,x); /* change array size */
HH[1] = h1;
HH[2] = h2;
HH[3] = h3;
HH[4] = h4;
HH[5] = h5;
HH[6] = h6;
HH[7] = h7;
val = SUM(of HH[*]) ;
return;
endsub;
run;
options cmplib = work.f;
data _null_;
val = 0;
x = 7;
call sub2(val, x, 10,20,30,40,50,60,70);
valSum=sum(10,20,30,40,50,60,70);
put "sub2) " _all_;
run;
From the list of "features I wish were available": function arguments could have default values (key=value), as macro parameters do.
All the best
Bart