Hi, I'm doing an update on an existing legacy process so don't want to do too much of an overhaul.
I've got a CSV I've got setup at an delimited external file linked to a file reader. One of the text fields can contain commas which would obviously ordinarily cause issues as it would see it as a new column and split it there, unless it was wrapped in quotes...
Unfortunately the way the file is provided they only wrap that field in quotes if it contains a comma, but due to this I believe DI is seeing other values unquoted in the same field and not then not treating them properly, anyway around this?
So as an example
Row | Customer | Text |
1 | 17 | hello |
2 | 23 | |
3 | 56 | "hello, person" |
The Text field for Row 3 is being read in as
"hello |
Rather than the full value. Ideally I'd like to stick with the file reader transformation but will switch to code if needed, but ideally not. I'm fairly sure it would work if all values in that field were quoted but as it's only quoted if it contains a comma it's not doing it.
Thanks
Is the example in your original post the actual data that is having trouble?
Can you share the actual text file that causes the trouble? And the code that DI generated to read it.
A CSV file to contain what you shared should look like:
Row,Customer,Text 1,17,hello 2,23, 3,56,"hello, person"
And the SAS code to read it should look like:
data want;
infile csv dsd truncover firstobs=2;
input Row Customer Text :$20.;
run;
Let's try it.
options parmcards=csv;
filename csv temp;
parmcards4;
Row,Customer,Text
1,17,hello
2,23,
3,56,"hello, person"
;;;;
data want;
infile csv dsd truncover firstobs=2;
input Row Customer Text :$20.;
run;
proc print;
run;
Result
In normal SAS code that is what the DSD option on the INFILE statement does.
If the code that the DI wizard generated does not have the DSD option on the INFILE statement then try using this method to modify the INFILE statement.
https://documentation.sas.com/doc/en/etlug/4.904/p04pztrut3whjln1ejyho313zx4b.htm
Thanks, I've had a look though and the infile statement already has dsd as an option.
infile "&Filename."
lrecl = 256
delimiter = ','
dsd
missover
firstobs = 2;
Then something else is happening.
Perhaps the LRECL is too short and that is why the value is truncated? If it does not see the closing quote that would explain why the opening quote is there.
Perhaps the quotes are not actually quotes? Perhaps they are "Smart" quotes (or really stupid quotes) that WORD likes to insert?
Is the example in your original post the actual data that is having trouble?
Can you share the actual text file that causes the trouble? And the code that DI generated to read it.
A CSV file to contain what you shared should look like:
Row,Customer,Text 1,17,hello 2,23, 3,56,"hello, person"
And the SAS code to read it should look like:
data want;
infile csv dsd truncover firstobs=2;
input Row Customer Text :$20.;
run;
Let's try it.
options parmcards=csv;
filename csv temp;
parmcards4;
Row,Customer,Text
1,17,hello
2,23,
3,56,"hello, person"
;;;;
data want;
infile csv dsd truncover firstobs=2;
input Row Customer Text :$20.;
run;
proc print;
run;
Result
I'm back sorry, so it's sorted now, but I'm not overly pleased why. Essentially I tried tweaking the external file metadata but no joy, however if I copied it then repointed it at some of the data in a new file it worked. I then decided to make new external file metadata objects pointing at the original files, I used identical field names, formats etc. I verified the infile code was identical, but for some reason the new ones work fine and the old ones don't. I've been through every single thing I can check and they look identical, but one loads in the full text and one doesn't. I'm baffled but at least it's solved.
I had two different files with different layouts and both had the same issue and this solved it for both. Just another 'SASism' to deal with.
Thanks for the help.
Does DI actually let you see the code it generated and the SAS log from when it ran the generated code?
Can you see any difference in the two versions?
I made a new job and added the old and new metadata objects each linked to a file reader.
Old metadata object, doesn't work
data work.W1C68GP0 / view = work.W1C68GP0 ;
infile &Filename.
lrecl = 256
delimiter = ','
dsd
missover
firstobs = 2;
;
attrib Period length = $100;
attrib OrgCode length = $9;
attrib Quarter length = 8;
attrib Allocation length = $150;
attrib SpendServices length = $150;
attrib SpendPharma length = $150;
attrib SpendOther length = $150;
attrib SetQuitDateLastYear length = $150;
attrib SetQuitDateThisYear length = $150;
attrib UnknownQuitStatusThisYear length = $150;
attrib SuccessfulQuittersLastYear length = $150;
attrib SuccessfulQuittersThisYear length = $150;
attrib BreachComments length = $1000
format = $char1000.
informat = $char1000.;
input Period OrgCode Quarter Allocation SpendServices SpendPharma SpendOther
SetQuitDateLastYear SetQuitDateThisYear UnknownQuitStatusThisYear
SuccessfulQuittersLastYear SuccessfulQuittersThisYear BreachComments;
run;
New metadata object, does work
data work.W1CAEHBW / view = work.W1CAEHBW ;
infile &Filename.
lrecl = 1250
delimiter = ','
dsd
missover
firstobs = 2;
;
attrib Period length = $100;
attrib OrgCode length = $9;
attrib Quarter length = 8;
attrib Allocation length = $150;
attrib SpendServices length = $150;
attrib SpendPharma length = $150;
attrib SpendOther length = $150;
attrib SetQuitDateLastYear length = $150;
attrib SetQuitDateThisYear length = $150;
attrib UnknownQuitStatusThisYear length = $150;
attrib SuccessfulQuittersLastYear length = $150;
attrib SuccessfulQuittersThisYear length = $150;
attrib BreachComments length = $1000
format = $char1000.
informat = $char1000.;
input Period OrgCode Quarter Allocation SpendServices SpendPharma SpendOther
SetQuitDateLastYear SetQuitDateThisYear UnknownQuitStatusThisYear
SuccessfulQuittersLastYear SuccessfulQuittersThisYear BreachComments;
run;
The code looks identical but you can see in the results one works, one doesn't.
Only difference is in the way it is specifying the filename.
In the NEW one the code is adding quotes around the value of the macro variable FILENAME.
In the OLD one the code is just using the value of the macro variable directly.
Does the macro variable FILENAME end up having different values between the two jobs.
Notice that both are not defining a dataset. Instead they are defining a view. So if the file the data step is referencing changes between the different times you try to use the view then the results will be different.
I don't know how DI works, but how can you check the results without running a "job"? Are you sure you aren't just looking at some old version of the data from some earlier run?
Also if there was already a dataset named work.W1C6U328 and you tried to run that OLD data step to define a view named work.W1C6U328 it would fail. Or if there was already a view named work.W1C6U328 and you ran that OLD data step with a value of FILENAME that caused the INFILE statement to be invalid syntax then the older view would not be replaced.
I updated the code in the previous post. You can preview the data outside a job on it's own but there's no log then. I updated both metadata objects to point at the actual physical file rather than a macro variable and got the exact same results (I left it as &filename. in the sample code as I can't post filenames/paths here). I noticed the Record lengths were different, although that wasn't the issue as longer records without commas were being read in fine (the comma was at position 77), however, updating the length seems to have forced some kind of refresh and that metadata object now works too.
DI has always sadly been the blacksheep of the SAS family (even though it's great) and can have bugs like this sometimes where it can get caught up and have weird metadata bugs, often coming back the next day can solve them but this one stuck around, I was hesitant to start making changes to a process that's run for 5+ years, but altering the file metadata seems to have done it (I'll stick with the new metadata objects anyway to be safe). I found a bug in DI years ago that after a lot of back and forth SAS accepted DI was generating incorrect code and for years kept been told it was being fixed in the next version and I don't think it ever was, I gave up after a few years of asking and our Admin have implemented a workaround.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.