Hi,

I don't usually work with "unstructured" text. I know there is a solution to this, and I'd appreciate any guidance and help.

I have a comma-delimited file like the one shown below. String values are enclosed in double quotes; numeric values are not quoted.

The first line holds the variable names y1, y2, y3, where y1 and y2 are strings and y3 is numeric.

As you can see in the second data row, someone pressed Enter by mistake, which split the record across two lines.

 

v1: (with human "enter")

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3

 

v2: (this is how it's supposed to be)

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,gsdfj",3

 

v3: (desired output)

y1      y2         y3
ASLG    SDF        5
asldhl  ser,gsdfj  3

 

Could someone please help? Is there any way to read the data in v1 and get the output shown in v3?

 

Thanks!

 

Joanne

1 ACCEPTED SOLUTION

PGStats
Opal | Level 21

You could do:

 

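data test;
/* firstobs=2 skips the header row; truncover stops INPUT from
   moving to the next line when a line is shorter than 200 characters */
infile datalines firstobs=2 truncover;
length y1 $12 y2 $12 y3 8 q $200;
/* Keep appending physical lines until the double quotes balance (even
   count), so a record split inside a quoted value is joined again */
do until (mod(countc(q,'"'),2)=0);
    input line $200.;
    q = cats(q, line);
end;
/* q: ignore commas inside quotes; r: remove the quotes from the result */
y1 = scan(q,1,",","qr");
y2 = scan(q,2,",","qr");
/* ?? suppresses log messages if the third field is not numeric */
y3 = input(scan(q,3,",","qr"), ?? best.);
drop line q;
datalines;
"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3
;

proc print data=test; run;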

PG


4 REPLIES 4
Patrick
Opal | Level 21

If this is a one-off exercise and you don't have too much data, I would either send the data back to the sender and ask them to fix it, or read the data into a single string, count the double quotes, and then fix the lines with odd counts manually.
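The quote-counting check described above could be sketched as follows. This is only an illustration: the fileref rawcsv and the path c:\temp\test.txt are placeholders, not something the original poster supplied.

```sas
/* Placeholder path (assumption): point this at your own file */
filename rawcsv 'c:\temp\test.txt';

data _null_;
    infile rawcsv truncover;
    input;    /* load the current physical line into _INFILE_ */
    /* An odd number of double quotes suggests a record broken inside a quoted value */
    if mod(countc(_infile_, '"'), 2) = 1 then
        put 'Line ' _n_ 'has an odd number of double quotes: ' _infile_;
run;
```

With the v1 sample, both halves of the broken record (physical lines 3 and 4) would be written to the log, since they contain three quotes and one quote respectively.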

 

If the manual Enters are not end-of-line delimiters (for example, under Windows the end of line is CRLF, but a "manual" Enter could be just an LF), then the issue is much smaller: SAS would still treat the record as a single line, and you would only have to remove the LF programmatically.

I normally use Notepad++ to see what the file really looks like. Open the text file and, under View/Show Symbol, select "Show All Characters".
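If the file really does use CRLF for true record ends and a bare LF for the stray Enter, removing the LF programmatically might look like the sketch below. Both of those are assumptions about the file, as is the path c:\temp\test.txt. The parsing reuses the SCAN call with the q and r modifiers from the accepted solution.

```sas
data want;
    /* termstr=crlf: only CRLF ends a record, so a bare LF stays inside it */
    infile 'c:\temp\test.txt' termstr=crlf firstobs=2 truncover;
    length y1 $12 y2 $12 y3 8 q $200;
    input;                             /* load the whole record into _INFILE_ */
    q = compress(_infile_, '0A'x);     /* delete embedded LF characters */
    y1 = scan(q, 1, ',', 'qr');        /* q: ignore commas inside quotes; r: remove the quotes */
    y2 = scan(q, 2, ',', 'qr');
    y3 = input(scan(q, 3, ',', 'qr'), ?? best.);
    drop q;
run;
```

If the stray Enter turns out to be a full CRLF, this will not help and the quote-counting loop from the accepted solution is the safer route.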

 

Ideally: Post an attachment with your data and also tell us under which OS you're running SAS.

Reeza
Super User

What happens when you import now? 

Please post your code. If you are using a data step, make sure to specify the DSD option.

I would also consider a Find/Replace All in a text editor such as Notepad++.

Ksharp
Super User


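data want;
/* recfm=n reads the file as a byte stream, so line breaks are not record boundaries;
   dlm='0A2C'x makes both LF ('0A'x) and comma ('2C'x) delimiters;
   dsd keeps delimiters that sit inside quoted values */
infile 'c:\temp\test.txt' recfm=n dsd dlm='0A2C'x;
/* all three fields are read as character; @@ holds the input position across iterations */
input (y1 y2 y3) (:$20.) @@;
run;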
