DATA Step, Macro, Functions and more

PROC SQL Case colulmn exists or does not exists

Accepted Solution Solved
Reply
Contributor
Posts: 21
Accepted Solution

PROC SQL Case colulmn exists or does not exists

Hello. I would like to create a program something like this:

 

Proc sql;

create table TARGET as

Select

  case when column1 exists then get the value of column 1. Otherwise null end as COL1,

  case when column2 exists then get the value of column 2. Otherwise null end as COL2,

.....

From SOURCE;

quit;

 

May I know how I can convert this to sas/proc sql code? Thanks.


Accepted Solutions
Solution
‎06-08-2018 04:24 AM
Super User
Posts: 10,211

Re: PROC SQL Case colulmn exists or does not exists


@iSAS wrote:
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.

See this example, using sashelp.class as a guinea pig:

/* list of columns needed */
data columns_needed;
input name :$32. type :$1.;
name = upcase(name);
cards;
weight n
age n
sex c
;
run;

proc sort data=columns_needed;
by name;
run;

/* create some example data */
data have;
set sashelp.class (drop=age);
run;

/* retrieve metadata */
proc sql;
create table columns as
select upcase(name) as name
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE'
order by name;
quit;

/* compare existing columns with needed, and dynamically create a data step to create missing columns */
data _null_;
merge
  columns (in=have)
  columns_needed (in=need)
  end=eof
;
by name;
if _n_ = 1
then call execute('data want; set have;');
if need and not have
then do;
  if type = 'n'
  then call execute(name !! " = .;");
  else call execute(name !! " = '';");
end;
if eof then call execute('run;');
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Posts: 10,211

Re: PROC SQL Case colulmn exists or does not exists

Since the existence of a column will be known before the step, I'd retrieve that information from dictionary.columns and create code selectively.

If, OTOH, you are looking for missing values, the solution would be different. Which of the two is it?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 21

Re: PROC SQL Case colulmn exists or does not exists

Posted in reply to KurtBremser
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.
Solution
‎06-08-2018 04:24 AM
Super User
Posts: 10,211

Re: PROC SQL Case colulmn exists or does not exists


@iSAS wrote:
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.

See this example, using sashelp.class as a guinea pig:

/* list of columns needed */
data columns_needed;
input name :$32. type :$1.;
name = upcase(name);
cards;
weight n
age n
sex c
;
run;

proc sort data=columns_needed;
by name;
run;

/* create some example data */
data have;
set sashelp.class (drop=age);
run;

/* retrieve metadata */
proc sql;
create table columns as
select upcase(name) as name
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE'
order by name;
quit;

/* compare existing columns with needed, and dynamically create a data step to create missing columns */
data _null_;
merge
  columns (in=have)
  columns_needed (in=need)
  end=eof
;
by name;
if _n_ = 1
then call execute('data want; set have;');
if need and not have
then do;
  if type = 'n'
  then call execute(name !! " = .;");
  else call execute(name !! " = '';");
end;
if eof then call execute('run;');
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 21

Re: PROC SQL Case colulmn exists or does not exists

Posted in reply to KurtBremser
Hello KurtBremser. I tried it and it worked!Thank you so much for the big help
Super User
Posts: 10,211

Re: PROC SQL Case colulmn exists or does not exists

Now you should heed @RW9's advice. If there's actually data hidden in the column names, then a proper modeling of the input data will make all this unnecessary.

Or you simply do a transpose of your dataset:

data in;
input id col_1 col_2 col_4;
cards;
1 3 5 6
2 7 2 5
;
run;

proc transpose
  data=in
  out=out (rename=(_name_=colname col1=value))
;
by ID; /* add other variables that never change */
var col_:;
run;

and you have normalized dataset that needs no correction in code at all.

When you now merge that by colname to your dataset of needed columns (or any other dataset that has the required structure), the missing values will appear on their own.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Super User
Posts: 9,599

Re: PROC SQL Case colulmn exists or does not exists

Really a good idea to post test data in the form of a datastep and what you want out.  Now I am guessing that you don't mean, then the column is present or not, but when the data within the column is present or not?  If so your code is nearly there (note how I use the code window - its the {i} above post are):

proc sql;
  create table target as
  select case when col1 ne "" then col1 else "" end as col1,
         case when col2 ne "" then col2 else "" end as col2
  from have;
quit;

Note this assumes the variables are character, numeric would be ne .  However there doesn't seem to be any point to this code.  Simply:

proc sql;
  create table target as
  select col1,
         col2
  from have;
quit;

Will achieve exactly the same thing, you don't need to set missing's explicitly.  Maybe you meant something else, but I cant' tell from what you posted. 

Contributor
Posts: 21

Re: PROC SQL Case colulmn exists or does not exists

Hello RW9. Thank you for the response.

My issue is somewhat different: Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.
Super User
Super User
Posts: 9,599

Re: PROC SQL Case colulmn exists or does not exists

Then your problem is not with coding, but understanding the data and the modelling, and not having proper documentation on the data transfer - all common issues.  Why does the data structure change like this?  If indeed these columns change each time, then they should not be columns - they should be rows of data.  Columns are a fixed structure which can be agreed up front, and programmed against, rows, expand and shrink as the data needs.  This is basic data modelling.  So the sender should identify the data which changes and apply this in rows not columns, then you and the sender can agree a data transfer document which details the fixed structure - they can program the export and you the import based on the consistent agreed document.  

The alternative is that you try to fix this time, then next time they do something else, maybe they wont call them col next time, maybe they wont number them like that, thus you repeat this exercise every import.  A slight change nullifies this:

From 

Col1  Col2  Col3 Col4

a       b        c       d

 

To

Col     Result

1        a

2        b

3        c

4        d

This one can have 3 removed without any change to the import program or your coding which uses this data.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 155 views
  • 0 likes
  • 3 in conversation