how to read a comma delimited file with human error "enter"

New Contributor
Posts: 3

Hi,

I typically don't deal with "unstructured" text. I know there is a solution to this, and I'd appreciate any guidance and help.

I have a comma-delimited file in which string values are enclosed in double quotes and numeric values are not.

The first line holds the variable names, y1, y2, y3, where y1 and y2 are strings and y3 is numeric.

As you can see in the second data row, there is a human error: someone pressed "enter" inside a quoted value, which split the record across two lines.

 

v1: (with human "enter")

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3

 

v2: (this is how it's supposed to be)

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,gsdfj",3

 

v3: (output desired)

y1      y2         y3
ASLG    SDF        5
asldhl  ser,gsdfj  3

 

Would someone please help me? Is there any way I can read the data from v1 into v3?

 

Thanks!

 

Joanne


Accepted Solutions
Solution
‎12-27-2016 09:33 PM
Respected Advisor
Posts: 4,649

Re: how to read a comma delimited file with human error "enter"

You could do:

 

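data test;
    /* Skip the header row; truncover stops INPUT from moving to the
       next line when a line is shorter than 200 characters. */
    infile datalines firstobs=2 truncover;
    length y1 $12 y2 $12 y3 8 q $200;
    /* Keep appending physical lines until the double quotes balance. */
    do until (mod(countc(q,'"'),2)=0);
        input line $200.;
        q = cats(q, line);
    end;
    /* q: ignore commas inside quotes; r: remove the enclosing quotes. */
    y1 = scan(q,1,",","qr");
    y2 = scan(q,2,",","qr");
    /* ?? suppresses notes and errors for values that are not numeric. */
    y3 = input(scan(q,3,",","qr"), ?? best.);
    drop line q;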
datalines;
"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3
;

proc print data=test; run;

PG


All Replies
Respected Advisor
Posts: 3,892

Re: how to read a comma delimited file with human error "enter"

If this is a one-off exercise and you don't have too much data, then I would either send the data back to the sender and ask them to fix it, or read the data into a single string, count the double quotes, and then go into the lines with odd counts and fix them manually.
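The quote-counting check could be sketched roughly as below. This is only an illustration: the path 'c:\temp\test.txt' is a placeholder, and the dataset name odd_quotes and the variables line_no, n_quotes and text are made up for the example.

```sas
/* Assumption: 'c:\temp\test.txt' is a placeholder for the raw file. */
data odd_quotes;
    infile 'c:\temp\test.txt' truncover lrecl=32767;
    length text $200;                  /* assumed to be long enough to show a line */
    input;                             /* load the whole line into _infile_ */
    line_no  = _n_;                    /* physical line number in the file */
    n_quotes = countc(_infile_, '"');  /* number of double quotes on the line */
    text     = _infile_;
    if mod(n_quotes, 2) = 1;           /* keep only lines with an odd quote count */
run;

proc print data=odd_quotes noobs;
run;
```

Each line that is listed either opens a quoted value it never closes or closes one opened on an earlier line, so those are the places to inspect by hand.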

 

If the manual enters are not end-of-line delimiters (e.g. under Windows the end of line would be CRLF, but a "manual" enter could be just an LF), then the issue wouldn't be that big, and SAS would still treat the record as being on a single line. You would then just have to remove the LF programmatically.
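If that turns out to be the case, removing the LF might look something like the sketch below. It assumes the file really uses CRLF as its record terminator and that the stray breaks are bare LF characters. The path is a placeholder and the variable lengths are guesses. The SCAN calls with the "qr" modifiers are borrowed from the accepted solution.

```sas
/* Assumptions: records end in CRLF, stray "enters" are bare LF ('0A'x),
   and 'c:\temp\test.txt' is a placeholder path. */
data want;
    infile 'c:\temp\test.txt' termstr=crlf firstobs=2 truncover lrecl=32767;
    length y1 $12 y2 $12 y3 8 q $200;
    input;                              /* load the whole record into _infile_ */
    q  = compress(_infile_, '0A'x);     /* drop any embedded line feeds */
    y1 = scan(q, 1, ",", "qr");         /* q: keep quoted commas; r: strip quotes */
    y2 = scan(q, 2, ",", "qr");
    y3 = input(scan(q, 3, ",", "qr"), ?? best.);
    drop q;
run;
```

If the stray breaks are full CRLF pairs instead, this will not help, and joining lines until the quotes balance is the safer route.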

I normally use Notepad++ to see what the file really looks like. Just open the text file and under View/Show Symbol select "Show All Characters".

 

Ideally: Post an attachment with your data and also tell us under which OS you're running SAS.

Super User
Posts: 17,829

Re: how to read a comma delimited file with human error "enter"

What happens when you import now? 

Please post your code. If you are using a data step, make sure to specify the DSD option.

I would consider a FIND-REPLACE ALL in a text editor such as Notepad++

Super User
Posts: 9,681

Re: how to read a comma delimited file with human error "enter"



data want;
infile 'c:\temp\test.txt' recfm=n dsd dlm='0A2C'x;
input (y1 y2 y3) (:$20.) @@;
run;
