BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
iSAS
Quartz | Level 8

Hello. I would like to create a program something like this:

 

Proc sql;

create table TARGET as

Select

  case when column1 exists then get the value of column 1. Otherwise null end as COL1,

  case when column2 exists then get the value of column 2. Otherwise null end as COL2,

.....

From SOURCE;

quit;

 

May I know how I can convert this to sas/proc sql code? Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

@iSAS wrote:
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.

See this example, using sashelp.class as a guinea pig:

/* list of columns needed */
data columns_needed;
input name :$32. type :$1.;
name = upcase(name);
cards;
weight n
age n
sex c
;
run;

proc sort data=columns_needed;
by name;
run;

/* create some example data */
data have;
set sashelp.class (drop=age);
run;

/* retrieve metadata */
proc sql;
create table columns as
select upcase(name) as name
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE'
order by name;
quit;

/* compare existing columns with needed, and dynamically create a data step to create missing columns */
data _null_;
merge
  columns (in=have)
  columns_needed (in=need)
  end=eof
;
by name;
if _n_ = 1
then call execute('data want; set have;');
if need and not have
then do;
  if type = 'n'
  then call execute(name !! " = .;");
  else call execute(name !! " = '';");
end;
if eof then call execute('run;');
run;

View solution in original post

8 REPLIES 8
Kurt_Bremser
Super User

Since the existence of a column will be known before the step, I'd retrieve that information from dictionary.columns and create code selectively.

If, OTOH, you are looking for missing values, the solution would be different. Which of the two is it?

iSAS
Quartz | Level 8
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.
Kurt_Bremser
Super User

@iSAS wrote:
Hello KurtBremser. My problem is the first suggestion you've mentioned.

I'm receiving a source file and its column/s changes consistently. Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.

See this example, using sashelp.class as a guinea pig:

/* list of columns needed */
data columns_needed;
input name :$32. type :$1.;
name = upcase(name);
cards;
weight n
age n
sex c
;
run;

proc sort data=columns_needed;
by name;
run;

/* create some example data */
data have;
set sashelp.class (drop=age);
run;

/* retrieve metadata */
proc sql;
create table columns as
select upcase(name) as name
from dictionary.columns
where libname = 'WORK' and memname = 'HAVE'
order by name;
quit;

/* compare existing columns with needed, and dynamically create a data step to create missing columns */
data _null_;
merge
  columns (in=have)
  columns_needed (in=need)
  end=eof
;
by name;
if _n_ = 1
then call execute('data want; set have;');
if need and not have
then do;
  if type = 'n'
  then call execute(name !! " = .;");
  else call execute(name !! " = '';");
end;
if eof then call execute('run;');
run;
iSAS
Quartz | Level 8
Hello KurtBremser. I tried it and it worked!Thank you so much for the big help
Kurt_Bremser
Super User

Now you should heed @RW9's advice. If there's actually data hidden in the column names, then a proper modeling of the input data will make all this unnecessary.

Or you simply do a transpose of your dataset:

data in;
input id col_1 col_2 col_4;
cards;
1 3 5 6
2 7 2 5
;
run;

proc transpose
  data=in
  out=out (rename=(_name_=colname col1=value))
;
by ID; /* add other variables that never change */
var col_:;
run;

and you have normalized dataset that needs no correction in code at all.

When you now merge that by colname to your dataset of needed columns (or any other dataset that has the required structure), the missing values will appear on their own.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Really a good idea to post test data in the form of a datastep and what you want out.  Now I am guessing that you don't mean, then the column is present or not, but when the data within the column is present or not?  If so your code is nearly there (note how I use the code window - its the {i} above post are):

proc sql;
  create table target as
  select case when col1 ne "" then col1 else "" end as col1,
         case when col2 ne "" then col2 else "" end as col2
  from have;
quit;

Note this assumes the variables are character, numeric would be ne .  However there doesn't seem to be any point to this code.  Simply:

proc sql;
  create table target as
  select col1,
         col2
  from have;
quit;

Will achieve exactly the same thing, you don't need to set missing's explicitly.  Maybe you meant something else, but I cant' tell from what you posted. 

iSAS
Quartz | Level 8
Hello RW9. Thank you for the response.

My issue is somewhat different: Example, today I will receive the file with Col1,Col2, Col3, Col4 and tomorrow I will only receive Col1,Col2, and Col4. Thus I would like to create a program that:
If a certain column exists on the current source file, it will get the data of that column. Otherwise, it will just put null.
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then your problem is not with coding, but understanding the data and the modelling, and not having proper documentation on the data transfer - all common issues.  Why does the data structure change like this?  If indeed these columns change each time, then they should not be columns - they should be rows of data.  Columns are a fixed structure which can be agreed up front, and programmed against, rows, expand and shrink as the data needs.  This is basic data modelling.  So the sender should identify the data which changes and apply this in rows not columns, then you and the sender can agree a data transfer document which details the fixed structure - they can program the export and you the import based on the consistent agreed document.  

The alternative is that you try to fix this time, then next time they do something else, maybe they wont call them col next time, maybe they wont number them like that, thus you repeat this exercise every import.  A slight change nullifies this:

From 

Col1  Col2  Col3 Col4

a       b        c       d

 

To

Col     Result

1        a

2        b

3        c

4        d

This one can have 3 removed without any change to the import program or your coding which uses this data.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5900 views
  • 0 likes
  • 3 in conversation