"6/23/2015 8:55:36 AM,6/23/2015 7:57:55 AM,test,A B,""C, D "",E-MA,F,Personal G,G one,test - TWO THREE,""I LIKE APPLE"",""ONE, TWO"",FCB,6/27/2015 - 6/27/2016,6/23/2015,BIZ,Personal,MA,NY,Personal,Group,""NYC Ins. Companies"",,,""$NYC NY-MA, (Group)"",""$NYC NY-MA, (Group)"",,,,"
The data looks like this but with 10k rows. I was first trying to use PROC IMPORT but it didn't recognize the comma(,) as a delimiter in the file. Then I tried DATA STEP INFILE but still didn't work out the problem.
Does anyone have any experience importing files with text qualifiers?
Thank you.
-----------------------------------------------------------------------------------------------------------------
Update
I first attached .txt data and below is the .csv data.
Raw data,
6/23/2015 8:28:23 AM,6/23/2015 7:31:48 AM,PERSON0,PERSON1,"LNAME, FNAME ",DT,AREA0,TYP_CD,CD_NEW,CD_R,"DISPOSTION","LNAME, FNAME",DATE,DATE1,TYPE1,TYPE2,AREA,AREA1,TYPE3,NAME,NAME,,,SERIES,SERIES2,,TYPE5,, |
Expected data,
Var1 | Var2 | Var3 | Var4 | Var5 | Var6 | Var7 | Var8 | Var9 | Var10 | Var11 | Var12 | Var13 | Var14 | Var15 | Var16 | Var17 | Var18 | Var19 | Var20 | Var21 | Var22 | Var23 | Var24 | Var25 | Var26 | Var27 | Var28 | Var29 | Var30 |
6/23/2015 8:28 | 6/23/2015 7:31 | PERSON0 | PERSON1 | LNAME, FNAME | DT | AREA0 | TYP_CD | CD_NEW | CD_R | DISPOSTION | LNAME, FNAME | DATE | DATE1 | TYPE1 | TYPE2 | AREA | AREA1 | TYPE3 | NAME | NAME | SERIES | SERIES2 | TYPE5 |
Thank you again.
Did you specify DLM="," and dSD in your import code?
Yes, I specify the DLM=',' and DSD option.
The proliferation of two double=quotes ( "") in that data tends to make SAS see each of these rows below as a single variable, when using DLM=',' and DSD
specifies that when data values are enclosed in quotation marks, delimiters
within the value are treated as character data
so NONE of your commas are seen as delimiters:
"6/23/2015 8:55:36 AM,6/23/2015 7:57:55 AM,test,A B,"
"C, D "
",E-MA,F,Personal G,G one,test - TWO THREE,"
"I LIKE APPLE"
","
"ONE, TWO"
",FCB,6/27/2015 - 6/27/2016,6/23/2015,BIZ,Personal,MA,NY,Personal,Group,"
"NYC Ins. Companies"
",,,"
"$NYC NY-MA, (Group)"
","
"$NYC NY-MA, (Group)"
",,,,"
The double quotes are text qualifiers and I guess my issue is how to deal with them.
The csv file is sort of different and it didn't treat comma as delimiter either.
I would expect a single double quote as a text identifier, which is also what SAS expects. Your very first value starts with a single double quote and starts looking for the next double quote to close the first variable, which occurs after a comma, so the comma cannot be treated as a delimiter. And since there is another " in what seems to be an inappropriate location that continues.
Also, examine your "csv" file with a text editor, not a spreadsheet. You may see things that a spreadsheet hides.
Suggest you might share what you would expect the resulting parsed, column-oriented data row to look like -- post an AFTER-parsing data representation. For example you have a quoted, character-type first column revealed, but that data-represented is not terminated with a closing double-quote, as far as I see with your pasted data.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.