10-28-2016 03:24 PM - edited 10-28-2016 03:24 PM
We're trying to incorporate Data Management Studio into our process by cleaning up data before doing our transformations in DI Studio.
Source data comes from Excel, and we use a User Written Code node in DI Studio to run a PROC IMPORT step that turns the Excel spreadsheets into SAS datasets. The output datasets look like the attached image, but we want them to look as follows:
ID Name Date
1 John Fall 2013
2 Juan Fall 2013
3 Kara Fall 2013
So, my questions are...
1) Is there a way to do this in the proc import statement? Create a new column and provide the value for it?
2) If not, can this be done in DataFlux Data Management Studio?
The idea is that we convert the spreadsheets into SAS datasets, DataFlux cleans the data, and then DI Studio continues processing the clean datasets.
I can do the above in DI Studio (and probably the data standardization with lookup tables too), but again, we want to do the cleaning and standardizing in DataFlux if possible.
Are we going about this the wrong way?
10-29-2016 11:42 PM
Please post your PROC IMPORT code. You should be able to get the desired output by using either the RANGE, STARTCOL, or STARTROW statements in your PROC IMPORT code. Check the following link to confirm what options are available for your type of Excel workbook:
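As a rough sketch of the RANGE approach, assuming DBMS=EXCEL and assuming the real header row sits a few rows down the sheet: the cell range A3:C200 and the output name work.fall2013 are made up for illustration, since the actual layout is only visible in the attached image.

```sas
/* Hypothetical layout: column headers in A3:C3, data rows below them. */
/* Adjust the range to match the real worksheet.                       */
proc import datafile='S:\path\file.xlsx'
    out=work.fall2013              /* assumed output name */
    dbms=excel replace;
    range="sffall2013$A3:C200";    /* assumed sheet$range; skips the rows above row 3 */
    getnames=yes;                  /* first row of the range supplies the column names */
run;
```

Which statements are accepted depends on the DBMS= value, so check this against the documentation for your engine before relying on it.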
10-31-2016 09:40 AM - edited 10-31-2016 09:42 AM
Ok, the code is below. If I could output all the sheets into one dataset and capture the semester/year value from one cell to populate a new column, that would be a big help. I'm currently outputting each sheet as its own dataset.
libname cbsch 'S:\path\';

%macro pim(sheet);
    /* Import one worksheet into its own dataset, named after the sheet */
    proc import datafile='S:\path\file.xlsx'
        out=cbsch.CB_SCH_&sheet
        dbms=excel replace;
        sheet="&sheet";
        mixed=yes;
        getnames=yes;
    run;
%mend pim;

%pim(sffall2013);
%pim(sfspring2014);
%pim(sfsummer2014);
%pim(sfFY14);
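One possible way to get all sheets into a single dataset with the extra column is to add a DATA step and PROC APPEND to the macro. This is only a sketch: the macro name pim_all, the term parameter, the combined dataset name CB_SCH_ALL, and the label values are all hypothetical, and it passes the semester/year in as a parameter instead of reading it from a cell. It also assumes the sheets do not already contain a column named Date.

```sas
libname cbsch 'S:\path\';

/* Hypothetical macro: import one sheet, tag every row with a term label, */
/* and append the result to one combined dataset.                         */
%macro pim_all(sheet, term);
    proc import datafile='S:\path\file.xlsx'
        out=work._sheet                /* temporary dataset, overwritten on each call */
        dbms=excel replace;
        sheet="&sheet";
        mixed=yes;
        getnames=yes;
    run;

    data work._sheet;
        length Date $20;               /* new character column; assumes no existing Date column */
        set work._sheet;
        Date = "&term";                /* same label on every row of this sheet */
    run;

    /* FORCE lets the append proceed when column lengths differ between sheets */
    proc append base=cbsch.CB_SCH_ALL data=work._sheet force;
    run;
%mend pim_all;

/* Labels below are guesses based on the sheet names */
%pim_all(sffall2013, Fall 2013);
%pim_all(sfspring2014, Spring 2014);
%pim_all(sfsummer2014, Summer 2014);
```

PROC APPEND creates the base dataset on the first call and adds rows on later calls, so rerunning the job without deleting CB_SCH_ALL first would duplicate the rows.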