BookmarkSubscribeRSS Feed
AJ_Brien
Quartz | Level 8

Hello mentors,

 

I have a question regarding check for empty formats and would appreciate your guidance 

 

I have datasets that contain ‘id’ values only and I have created separate formats from those id datasets.

There are 6 datasets: table 1-6, they all contain a column called id (char) such that:

-same id value could be present in multiple tables

-the tables are being created by some other process, so they could be blank.

 

%macro crt_frmt(ds);
proc sort data=&ds. nodupkey;
by id;
run;

data &ds._format;
set &ds.;

fmtname = "$&ds._format";
type    = 'c';
start   = strip(id);
end     = strip(id);
label   = 'YES';

run;

proc format cntlin= &ds._format;
run;
%mend;

%crt_frmt(table1);
%crt_frmt(table2);
%crt_frmt(table3);
%crt_frmt(table4);
%crt_frmt(table5);
%crt_frmt(table6);

I then use these table formats to flag observations and in the later part of the code delete and do other things my master dataset using these formats:

 

data master;
set master;
if put(id,$table1_format.) = "YES" then flag=1;
if put(id,$table2_format.) = "YES" then flag=2;
if put(id,$table3_format.) = "YES" then flag=3;
if put(id,$table4_format.) = "YES" then flag=4;
if put(id,$table5_format.) = "YES" then flag=5;
if put(id,$table6_format.) = "YES" then flag=6;
run;

proc freq data=master;
table flag /list missing;
run;

The issue is, there are times when some tables are empty, so that creates an empty format. When I create flags in the ‘master’ datastep, it gives me an error:

"ERROR: The format $table1_format was not found or could not be loaded"

 

Which is understandable since it was an empty dataset. But how do I conditionally run the put statements in my master dataset so that they run only when formats are present?

 

I tried adding this within the macro:

 

%let dsid=%sysfunc(open(&ds.));
%let &ds._format_nobs=%sysfunc(attrn(&dsid,nlobs));
%let dsid=%sysfunc(close(&dsid));
%put &ds._format_nobs= &ds._format_nobs;

Such that in the master dataset I could do:

 

data master;
set master;
if table1_format_nobs>0 then do;
if put(id,$table1_format.) = "YES" then flag=1;
end;
:
:

But this obviously isn’t right, since table1_format_nobs is a macro variable and cannot be referenced this way.

 

Appreciate the help and time!

 

Thank you.

8 REPLIES 8
ballardw
Super User

Instead of conditionally executing data step statements perhaps set a specific format for the formatted value.

%macro crt_frmt(ds);
%if %sysfunc(exist(&ds.)) %then %do;

   proc sort data=&ds. nodupkey;
   by id;
   run;

   data &ds._format;
   set &ds.;

   fmtname = "$&ds._format";
   type    = 'c';
   start   = strip(id);
   end     = strip(id);
   label   = 'YES';

   run;

   proc format cntlin= &ds._format;
   run;
%end;
%else %do;
/* code here to make sure you have a $&ds._format that will work as needed*/
/* maybe */
proc format;
   value  "$&ds._format"
   other = "NO"
   ;
run;
%end;
%mend;

You didn't provide much example of how that format is used so …

AJ_Brien
Quartz | Level 8
Thank you , that makes sense. But even if I do that, I will still get an error when this step executes since this datastep expects a format to be present but it might not be available sometimes.

data master;
set master;
if put(id,$table1_format.) = "YES" then flag=1;
if put(id,$table2_format.) = "YES" then flag=2;
if put(id,$table3_format.) = "YES" then flag=3;
if put(id,$table4_format.) = "YES" then flag=4;
if put(id,$table5_format.) = "YES" then flag=5;
if put(id,$table6_format.) = "YES" then flag=6;
run;

Is there a way to maybe execute each if condition in this datastep so that we first check to see if the fomat exists or not, If it does not, then it does not execute this condition if put(id,$table1_format.) = "YES" then flag=1;
AJ_Brien
Quartz | Level 8
Another way this format will be used is:

data test1 test2 test3 test4 test5 test6;
set master;
if put(id,$table1_format.) = "YES" then output test1;
if put(id,$table2_format.) = "YES" then output test2;
if put(id,$table3_format.) = "YES" then output test3;
if put(id,$table4_format.) = "YES" then output test4;
if put(id,$table5_format.) = "YES" then output test5;
if put(id,$table6_format.) = "YES" then output test6;
run;

There’s one more place where these formats are being used:

data remove;
set master;
if put(id,$table1_format.) = "YES" then delete;
if put(id,$table2_format.) = "YES" then delete;
if put(id,$table3_format.) = "YES" then delete;
if put(id,$table4_format.) = "YES" then delete;
if put(id,$table5_format.) = "YES" then delete;
if put(id,$table6_format.) = "YES" then delete;
run;
AJ_Brien
Quartz | Level 8
these format can definitely be empty sometimes, and I'm trying to figure out a way to know when each different format is null before they end up being executed in an 'if' condition and end up throwing an error.
Tom
Super User Tom
Super User

Your data step that is creating the dataset to feed to PROC FORMAT is wrong.

You have the SET statement as the first statement. That means that if the input is empty the data step stops before it can ever output anything.

So add an OTHER choice to your format definition.  Then empty datasets do not cause the format to not be defined.

Try something like this:

data &ds._format;
  retain fmtname "$&ds._format";
  retain type  'c';
  retain hlo ' ';
  retain label 'YES';
  if eof then do ;
    label='NO';
    hlo='O';
    output;
  end;
  set &ds. (rename=(id=start)) end=eof;
  output;
run;
AJ_Brien
Quartz | Level 8
Thank you for your response.

so this does create a non empty format
I'm also reading articles to learn how to create 'other' option when we create formats from a dataset, but noticed that other is mostly used when formats are created through proc formats and not through datasets: https://documentation.sas.com/?docsetId=proc&docsetTarget=n1e19y6lrektafn1kj6nbvhus59w.htm&docsetVer...
So if I try doing something like this:
%macro crt_frmt (ds);
data &ds._format;
retain fmtname "$&ds._format";
retain type 'c';
retain hlo ' ';
retain label 'YES';
if eof then do ;
label='NO';
hlo='O';
output;
end;
set &ds. (rename=( id=start)) end=eof;
output;
run;

proc format;
value "$&ds._format"
other = "NO"
;

%mend;

It gives this error even though it does create $table1_format successfully for an empty table1. So value "$&ds._format" would feel like a valid statement.
MPRINT(crt_frmt): value "$table1_format" other = "NO" ;
ERROR 22-322: Syntax error, expecting one of the following: a name, $.

ERROR 200-322: The symbol is not recognized and will be ignored.

Complete log:
%CRT_FRMT(table1);
MLOGIC(CRT_FRMT): Beginning execution.
MLOGIC(CRT_FRMT): Parameter DS has value table1
SYMBOLGEN: Macro variable DS resolves to table1
MPRINT(CRT_FRMT): data table1_format;
SYMBOLGEN: Macro variable DS resolves to table1
MPRINT(CRT_FRMT): retain fmtname "$table1_format";
MPRINT(CRT_FRMT): retain type 'c';
MPRINT(CRT_FRMT): retain hlo ' ';
MPRINT(CRT_FRMT): retain label 'YES';
MPRINT(CRT_FRMT): if eof then do ;
MPRINT(CRT_FRMT): label='NO';
MPRINT(CRT_FRMT): hlo='O';
MPRINT(CRT_FRMT): output;
MPRINT(CRT_FRMT): end;
SYMBOLGEN: Macro variable DS resolves to table1
MPRINT(CRT_FRMT): set table1 (rename=(id=start)) end=eof;
MPRINT(CRT_FRMT): output;
MPRINT(CRT_FRMT): run;

NOTE: There were 0 observations read from the data set WORK.TABLE1.
NOTE: The data set WORK.TABLE1_FORMAT has 1 observations and 5 variables.
NOTE: Compressing data set WORK.TABLE1_FORMAT increased size by 100.00 percent.
Compressed is 2 pages; un-compressed would require 1 pages.
MPRINT(CRT_FRMT): proc format;
SYMBOLGEN: Macro variable DS resolves to table1
NOTE: Line generated by the macro variable "DS".
26 "$table1_format
_________________________
22
200
MPRINT(CRT_FRMT): value "$table1_format" other = "NO" ;
ERROR 22-322: Syntax error, expecting one of the following: a name, $.

ERROR 200-322: The symbol is not recognized and will be ignored.

NOTE: The previous statement has been deleted.
MLOGIC(CRT_FRMT): Ending execution.
Tom
Super User Tom
Super User

To create a format from a dataset use the CNTLIN= option on the PROC FORMAT statement.

proc format cntlin=&ds._format;
run;
AJ_Brien
Quartz | Level 8
well I didn't figure out how to use the 'other' option, but I did find out how to make it work.
All I did was create a dummy dataset with a dummy id value and append that to &ds dataset, such that in the event that the dataset is empty, the format is still created with an id value. That helped solve the error. thank you everyone for your guidance and time 🙂

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1221 views
  • 0 likes
  • 3 in conversation