SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to create flexible variable/column creation in a dataset

Reply
Occasional Contributor
Posts: 6

How to create flexible variable/column creation in a dataset

Hi, I hope somebody can help me or give some idea how to create a dataset where the number of variables will be determine by number of records retrieve from another dataset.

I have the following textfile (this is just a sample; the actual layout has more than 50 records) which I've imported to a temporary dataset called Layout_Format

Report_TypeGroup_Description Group_Detail Group_Order Detail_Order
============================ ============ =========== ============
Credit     Fixed Expances    Rent         C01         C0101
Credit     Fixed Expances    Car Loan     C01         C0102
Credit     Fixed Expances    Student Loan C01         C0103
Credit     Float Expances    Food         C02         C0201
Credit     Float Expances    Traveling    C02         C0202
Credit     Float Expances    Bills        C02         C0203
Debit      Fixed Income      Salary       D01         D0101
Debit      Non-Fixed Income  Bonus        D02         D0201

I want to create another table as name Report_Layout, it will contain fixed variable; DATE, NAME, and additional variable based on variable "DETAIL_ORDER" from Layout_Format dataset. In this case it will become DATE, NAME, C0101,C0102, C0103,C0201,C0202, C0203, D0101, D0201. I need to become flexible because subject to changes with rules & regulation or user requirement, the layout will be add or remove.

I have another textfile that will contain Date, Name, Account_Code, Amount. This textfile will be joined with Code_Structure that has the following variable; Account_Code, Account_Description, Group_Detail, Group_Type

Thank you in advance. Smiley Happy

Respected Advisor
Posts: 3,908

Re: How to create flexible variable/column creation in a dataset

Super Contributor
Posts: 349

Re: How to create flexible variable/column creation in a dataset

Hi,

In DI Studio, the column names cannot be named dynamically.

Alternative method: Use transpose transformation: inside the transpose transformation properties, in the options tab where you assign your ID and VAR values, click on the "Additional options" on the left and set the "Update the metadata for the target tables" to yes

But, each time you run your job you will get a prompt that asks if you want to update the metadata, click yes , your ID values that you want to transpose will be in the output table.

Hope this helps.

Thanks,

Shiva

Respected Advisor
Posts: 3,908

Re: How to create flexible variable/column creation in a dataset

Hi shivas

Have you ever used this automatic table metadata updating? I've taken so far distance from it as I can't see how such a changing metadata table could be reasonably used later on. So whenever dynamic variable creation can't be avoided, I take refuge to user written code.

I normally try to keep tables "long" so I wouldn't change the structure of the table as the OT asks for but would prefer to have a stable set of columns.

Super User
Posts: 5,260

Re: How to create flexible variable/column creation in a dataset

I've used this feature (delicately) in development only, just to have column metadata populate initially. In that sense, not very dynamically...Smiley Wink

Agree on the usability of this feature, can't see the use of populate metadata tables dynamically with new/deleted column names. Somewhere these columns needs to me mapped, either to succeeding transformations, stored process, information maps or whatever. Try to automate that...

Data never sleeps
Respected Advisor
Posts: 3,908

Re: How to create flexible variable/column creation in a dataset

Thanks Linus. That's what I'm thinking as well.

Really curious to learn if someone has a real use case where it makes sense to use this automatic table metadata update in a regular production job.

Occasional Contributor
Posts: 6

Re: How to create flexible variable/column creation in a dataset

Thanks for the reply.

As for the dynamic column creation will be created in temporary table; 1 step before I add into the main table.

Will it be a problem if I append the transpose data which might not have more than one of the columns into a fix table. Example I only have the following in my transpose table:

  1. Date
  2. Name
  3. D0102
  4. C0101
  5. C0103
  6. C0202

Whereelse my final table must have the following column :

  1. Date
  2. Name
  3. D0101
  4. D0102
  5. D0103
  6. D0104
  7. D0201
  8. D0202
  9. D0203
  10. C0101
  11. C0102
  12. C0103
  13. C0201
  14. C0202
  15. C0203

I make the final table with fixed column because the formula on the final report is based on those columns.

Respected Advisor
Posts: 3,908

Re: How to create flexible variable/column creation in a dataset

Is this still something you look for an answer? Or have you found a solution in-between?

I've given your question some thought and I believe you shouldn't use "update metadata" at all. Proc Transpose is a very tricky procedure for metadata driven development and I believe SAS R&D has probabely not though it through when they've implemented this "update metadata" option. It might be that there is some very special use case covered requested by a customer which might have driven this - but I still can't see how one could reasonably use this option.

For your case: It appears that you know which columns proc transpose could generate for you. So why don't you just define all possible columns in your target table. The Transform node generating the Proc Transpose statement will still create a physical table only containing the actual columns but then in the next node where you have the loader you can map all columns from source to target (as here you're using metadata information).

The metadata table object and the underlying physical table won't be in sync so you willl most likely get some kind of error if selecting "view data" on the metadata table.

I believe most (if not all) loader generated code will deal nicely with a physical table which has less columns than it's corresponding metadata table object. Just give it a try.

Ask a Question
Discussion stats
  • 7 replies
  • 587 views
  • 0 likes
  • 4 in conversation