02-04-2016 03:29 PM
I would like to code out an import I was able to perform using the Import Wizard in SAS Enterprise Guide 9.4.
It allowed me to:
import a CSV file
set the first observation row
get the names for use as column names
specify where the next observation will start
define how many observations to get thereafter
My data is really fun and looks something like the following:
Start Stats, type, location, value, quanity value,
,new, Central, 100, 500,
End Stats, type, location, value, quantiy value,
,used, Central, 50, 10,
The end result is to:
populate the column names using each Stats line
omit the dash lines
populate the data portion into table values.
Later I will merge all the items into one table, with one line item for each file.
I have several tables and the column names can change, so getting the column names is very important.
I'm unable to get the Wizard to show its import method, so I can't understand what's going on there to springboard my project.
02-04-2016 03:37 PM
If you're looking for the Import Data task to give you reusable code, check the following options.
The resulting code should be something that will run as-is in its own program, as long as the CSV file that you reference is in a path that the SAS session can get to.
02-04-2016 04:49 PM
Yes it does, and no it doesn't...
The resulting code doesn't show the method used to get the column names.
It only shows the hard-coded outcomes of evaluating the column names.
This is great, but it doesn't lend itself well to a dynamic process.
It's still a step forward, though.
Previously the file was only represented by DATALINES4.
02-04-2016 05:07 PM
Well...no, the column names are part of the code.
If you want SAS to guess at the column names you can use PROC IMPORT instead, but it can guess incorrectly, which sets up issues down the road. You may also run into issues with where EG is located versus where your files are located, i.e., the server not being able to see your local directory.
02-04-2016 05:13 PM
I would like to use the PROC IMPORT method and have SAS guess at the names.
I've tried without success... my column names continue to populate as VARx, and I'm not able to pinpoint where the data starts so I can omit the dash line.
I was hoping to mimic the import wizard and modified as needed to produce the final outcome.
02-04-2016 05:20 PM
Have you specified GETNAMES=Yes?
The options are here:
I think you're also interested in NAMEROW and DATAROW options as mentioned in the GUI.
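For reference, a minimal PROC IMPORT sketch using those options might look like the following. The path, output dataset name, and row numbers are placeholders you would adjust to your file:

```sas
/* Minimal PROC IMPORT sketch - path, OUT= name, and row numbers are placeholders */
proc import datafile="C:\Test\test.csv"
    dbms=csv
    out=work.test
    replace;
    getnames=yes;      /* read column names from the header row */
    datarow=2;         /* first row of actual data */
    guessingrows=1000; /* scan more rows before guessing column types */
run;
```

GUESSINGROWS is worth raising from the default if early rows are not representative of the whole file.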
02-04-2016 05:35 PM
Yes, so using code like:
proc import datafile='C:test.CSV' dbms=csv out=work.test replace;
    getnames=yes;
run;
the data as provided before still shows VARx for all the column names.
My first guess is that SAS is protecting itself from values that have spaces.
02-04-2016 05:43 PM
02-05-2016 08:29 AM
SAS can handle var names with spaces if you specify OPTIONS VALIDVARNAME=ANY. Then you can reference the variable with a special literal notation in code, like this:
length 'my variable'n 8;
But not sure that's what is getting in your way here. If you can post a sample data file with a few records (beyond what you've already supplied), I'm sure that someone will help with an approach. I suspect the answer is not PROC IMPORT but a DATA step that reads and "fixes" the data lines on the way in. Or two passes: one to clean the file and produce a more typical CSV, and the second to run the PROC IMPORT that will determine column attributes.
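The two-pass idea could be sketched like this (file paths are placeholders, and the dash-line filter is an assumption based on the sample shown earlier): a DATA step copies only the usable lines to a clean CSV via _INFILE_, then PROC IMPORT reads the cleaned file and determines the column attributes.

```sas
/* Pass 1: copy only header and data lines to a clean CSV.              */
/* Paths are placeholders; adjust the filtering rule to your real file. */
data _null_;
    infile "C:\Test\raw.csv" lrecl=10000 truncover;
    file "C:\Test\clean.csv" lrecl=10000;
    input;
    if index(_infile_, '----') then delete; /* drop separator dash lines */
    put _infile_;                           /* write the line unchanged  */
run;

/* Pass 2: let PROC IMPORT determine column names and types */
proc import datafile="C:\Test\clean.csv" dbms=csv
    out=work.clean replace;
    getnames=yes;
run;
```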
02-05-2016 12:24 PM
02-05-2016 12:48 PM
That is very possible to read using a DATA step, but it will require a completely custom approach. It's a far stretch to call that a CSV file; it does not comply with any variation of CSV I've encountered. I don't see how the Import tasks can help you at all.
Here's what I suggest: create another topic in the Base Programming board with a title like "Help reading in text file with custom layout". Post the example layout and ask for help with the technique.
Who knows? Someone might solve it for you. More likely, people will point you to examples of using the INPUT statement and @ to control the input position, and other necessary tricks.
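As a rough illustration of that kind of custom DATA step, assuming the layout shown earlier (alternating "Start Stats"/"End Stats" header lines and data lines that begin with a comma), something like this could keep the data rows while remembering which block they came from. The path and variable lengths are placeholders, and this has not been tested against the real files:

```sas
/* Sketch only: assumes header lines begin with "Start Stats"/"End Stats" */
/* and data lines begin with a comma, as in the sample posted above.      */
data stats;
    infile "C:\Test\test.csv" dlm=',' dsd truncover;
    length section $5 type $20 location $20;
    retain section;
    input @;                            /* hold the line so we can inspect it */
    if _infile_ =: 'Start Stats' then do;
        section = 'Start';              /* remember which block we are in */
        delete;                         /* the header line itself is not data */
    end;
    else if _infile_ =: 'End Stats' then do;
        section = 'End';
        delete;
    end;
    else input dummy $ type $ location $ value quantity;
    drop dummy;                         /* empty field before the leading comma */
run;
```

A fully dynamic version would also have to capture the header text to rename the columns afterwards, which is where the real work lies.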
02-05-2016 01:01 PM
You're attempting to fix a broken process with code.
That output comes from a reporting system that probably stores the data in a better format.
Go back to the source and ask for datasets, not text files.
Fix the process instead of working around it; yes, it's more work.
And yes, it's not always possible.
02-05-2016 02:57 PM - last edited on 02-05-2016 03:05 PM by ChrisHemedinger
Unfortunately, I'm working in an established process without wiggle room to request changes. The method I've come up with, by combining stuff from other posts, is something like the following. First, grab all the directory information from the home folder where the CSV files live:
%let dirname = C:\Test;

filename DIRLIST pipe "dir /B &dirname\*.csv";

data dirlist;
    length fname $256;
    infile dirlist length=reclen;
    input fname $varying256. reclen;
run;

/* Loop through each file, grabbing only specific data lines.
   This does not allow for files where column counts may be more or fewer
   than what I've specified, so I've over-requested columns. */
data rep_date;
    length myfilename $100;
    set dirlist;
    filepath = "&dirname\"||fname;
    /* change firstobs to ensure that the column header is the first
       observation in the new table */
    infile dummy filevar=filepath dlm='2C0D'x dsd missover lrecl=10000
        firstobs=3 obs=3 end=done;
    do while(not done);
        myfilename = filepath;
        input F1  : $CHAR96. F2  : $CHAR33. F3  : $CHAR18. F4  : $CHAR55.
              F5  : $CHAR11. F6  : $CHAR11. F7  : $CHAR11. F8  : $CHAR11.
              F9  : $CHAR11. F10 : $CHAR11. F11 : $CHAR11. F12 : $CHAR11.
              F13 : $CHAR11.;
        output;
    end;
run;

/* Assemble all the pieces. This method does not create variables for the
   column names, so it does not allow for varying column names. */
proc sql;
    create table final as (
        select s.batch_id
             , s.extract_date
             , s.submit_date
             , s.system_name
             , s.record_count
             , s.encounter_count
             , s.svc_from_date
             , s.svc_to_date
             , a.F1 as rep_date
             , case when c.f2 like '%SUB%'     then c1.f2 end as enc_SUBMIT
             , case when c.f3 like '%PROC%'    then c1.f3 end as enc_PROCES
             , case when c.f4 like '%CLOSE%'   then c1.f4 end as enc_CLOSED
             , case when c.f5 like '%DUP%'     then c1.f5 end as enc_DUPLICATE
             , case when c.f6 like '%ACCEPT%'  then c1.f6 end as enc_ACCEPTED
             , case when c.f7 like '%INFOR%'   then c1.f7 end as enc_INFORMATION
             , case when c.f8 like '%JECTED%'  then c1.f8 end as enc_REJECTED
             , case when c.f9 like '%PART RE%' then c1.f9 end as enc_PART_REJECT
             , case when d.f2 like '%SUB%'     then d1.f2 end as diag_SUBMIT
             , case when d.f3 like '%PROC%'    then d1.f3 end as diag_PROCES
             , case when d.f4 like '%CLOSE%'   then d1.f4 end as diag_CLOSED
             , case when d.f5 like '%DUP%'     then d1.f5 end as diag_DUPLICATE
             , case when d.f6 like '%ACCEPT%'  then d1.f6 end as diag_ACCEPTED
             , case when d.f7 like '%INFOR%'   then d1.f7 end as DIAG_INFORMATION
             , case when d.f8 like '%JECTED%'  then d1.f8 end as DIAG_REJECTED
             , case when d.f9 like '%PART RE%' then d1.f9 end as DIAG_PART_REJECT
        from rep_date a
        inner join batch b     on a.fname = b.fname
        inner join submitted s on b.f2 = s.batch_id
        inner join (select * from enc  where f1 is not null) c
            on a.fname = c.fname
        inner join (select * from enc  where f1 is null and f2 not like '%-%') c1
            on a.fname = c1.fname
        inner join (select * from diag where f1 is not null) d
            on a.fname = d.fname
        inner join (select * from diag where f1 is null and f2 not like '%-%') d1
            on a.fname = d1.fname
        where b.f2 not like '%BATCH%'
          and b.f2 not like '%-%'
    );
quit;