I am trying to create a table where column A = 'Y', if column A does not exist then where column B= 'Y'. Some months the source table has column A and other months it has column B.
I attempted to use the coalescec function but this does no work if the both columns are not present.
PROC SQL;
CREATE TABLE TBL_&DATE. AS
SELECT
NAME,
ADDRESS,
PRODUCT
FROM
SRC_&DATE.
WHERE
COALESCEC(COL_A,COL_B) = 'Y'
;QUIT;
I am using SAS EG 7.1
SQL's rigid/declarative structure is unsuited to dealing with varying inputs.
data TBL_&date.;
length COL_A COL_B $1;
set SRC_&date.;
COL=coalescec(COL_A, COL_B)
if COL = 'Y';
keep NAME ADDRESS PRODUCT COL;
run;
try below code
PROC SQL;
CREATE TABLE TBL_&DATE. AS
SELECT
NAME,
ADDRESS,
PRODUCT
FROM
SRC_&DATE.
WHERE
COL_A='Y' or COL_B= 'Y'
;QUIT;
A simple macro could do the trick:
%macro extract(date);
%local dsid varname;
%let dsid=%sysfunc(open(SRC_&date,i));
%if &dsid=0 %then %do;
%put ERROR: TABLE SRC_&date could not be opened;
%return;
%end;
%if %sysfunc(varnum(&dsid,COL_A)) %then
%let varname=COL_A;
%else
%let varname=COL_B;
%let dsid=%sysfunc(close(&dsid));
PROC SQL;
CREATE TABLE TBL_&DATE. AS
SELECT
NAME,
ADDRESS,
PRODUCT
FROM
SRC_&DATE.
WHERE
&varname='Y';
QUIT;
%mend;
%extract(02JUL2020);
The macro checks if COL_A is present on the input table and uses that if it is, otherwise COL_B.
There is a big difference between "Column A does not exist" and "the value of column a is missing".
"Does not exist" to me means that the variable is not in the data set and any reference attempting to use the variable will be an error, and rightfully so.
Your process should be fixed so that the variable name is consistent. I would spend more time tracking down why and fixing it much earlier in the process. When I see one issue like this I usually see multiples. And each causes a different set of headaches.
If you are "reading" external data from a source that cannot have consistent column headers, typical examples are spreadsheets with poor control over contents, with Proc Import then you need to address how the data is read. Proc Import guesses each and every time it is called to read a file. And differences will appear such as variable types changing, variable names as you have seen, lengths of variables. All of these make programming difficult. Since you have mentioned sometimes one name and sometimes another then that strongly implies you need to control the data read as you go forward to prevent problems.
Depending on your SAS version, this code may work for you
%let date=&sysdate;
data src_&date;
NAME='Some Name';
ADDRESS='Some Address';
PRODUCT='Some Product';
COL_A='Y';
*COL_B='Y';
run;
%let g_dynWhere=;
options mprint;
PROC SQL NOPRINT;
SELECT STRIP(name)||"='Y'"
INTO :g_dynWhere separated by ' OR '
FROM dictionary.columns
WHERE libname='WORK'
AND memname=upcase("SRC_&DATE")
AND name LIKE 'COL_%'
AND memtype='DATA';
/*%put &=g_dynWhere;*/
CREATE TABLE TBL_&DATE. AS
SELECT
NAME,
ADDRESS,
PRODUCT
FROM
SRC_&DATE.
%if (%superq(g_dynWhere) NE ) %then
%do;
WHERE &g_dynWhere
%end;
;
QUIT;
Back in 2018, @ChrisHemedinger posted the following Blog about SAS 9.4 M5. The code above would examine the structure of your source table and dynamically build the condition.
That condition then in turn, would conditionally be used in the Create Table Query.
Hope this helps,
Ahmed
I think you got the most of it. Here is my break down
1. Create sample/dummy data to simulate your initial situation, and provide working code.
2. dictionary.columns is a table that contains the records about the structure of the table. Think of it, as a table representation of the SQL Describe tablename statement.
3. Create Macro variable (g_dynWhere) that would hold a 'OR' separated string values of Col_A='Y' OR Col_B='Y'. Col_A and Col_B are values stored in the name column.
4. The condition
%if (%superq(g_dynWhere) NE ) %then
%do;
WHERE &g_dynWhere
%end;
Would only add the WHERE clause if the resolved value of the macro variable (g_dynWhere) was not blank. i.e. at least one of the two columns (Col_A, Col_B) existed in the Source Table with the required value.
Hope this clarify it for you
Ahmed
I would suggest this kind of approach (a view and coalescec with a default value) :
data t1;
do id = "a", "b", "c", "";
a = id;
output;
end;
run;
data t2;
do id = "a", "b", "c", "";
b = id;
output;
end;
run;
proc sql;
create table tt
(
table char(2),
id char(1),
c char(12),
whichColumn char(12)
);
quit;
%macro insert(t);
data _t / view=_t;
length a b $12;
set &t;
run;
proc sql;
insert into tt (table, id, c, whichColumn)
select
"&t." as table,
id,
coalescec(a, b, "x") as c,
choosec(1 + whichc(calculated c, a, b), "None", "Col a", "Col b") as whichColumn
from _t;
drop view _t;
quit;
%mend insert;
%insert(t1);
%insert(t2);
proc print data=tt noobs; run;
table id c whichColumn t1 a a Col a t1 b b Col a t1 c c Col a t1 x None t2 a a Col b t2 b b Col b t2 c c Col b t2 x None
SQL's rigid/declarative structure is unsuited to dealing with varying inputs.
data TBL_&date.;
length COL_A COL_B $1;
set SRC_&date.;
COL=coalescec(COL_A, COL_B)
if COL = 'Y';
keep NAME ADDRESS PRODUCT COL;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.