
Hi,

I typically don't deal with "unstructured" text, but I know there must be a solution to this. I'd appreciate any guidance and help.

I have a comma-delimited file in which string values are enclosed in double quotes and numeric values are not.

The first line holds the variable names y1, y2, y3, where y1 and y2 are strings and y3 is numeric.

As you can see in the second data row, someone mistakenly pressed Enter inside a quoted value, which split the record across two lines.

 

v1: (with the accidental line break)

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3

 

v2: (this is how it's supposed to be)

"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,gsdfj",3

 

v3:(output desired)

y1      y2         y3
ASLG    SDF        5
asldhl  ser,gsdfj  3

 

Could someone please help? Is there any way to read the data in v1 so that it ends up as v3?

 

Thanks!

 

Joanne

Accepted solution
PGStats
Opal | Level 21

You could do:

 

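data test;
infile datalines firstobs=2 truncover;   /* firstobs=2 skips the header line */
length y1 $12 y2 $12 y3 8 q $200;
/* keep appending physical lines until the double quotes are balanced */
do until (mod(countc(q,'"'),2)=0);
    input line $200.;
    q = cats(q, line);
end;
/* q: ignore commas inside quotes; r: remove the quotes from the result */
y1 = scan(q,1,",","qr");
y2 = scan(q,2,",","qr");
y3 = input(scan(q,3,",","qr"), ?? best.);   /* ?? suppresses invalid-data notes */
drop line q;
datalines;
"y1","y2","y3"
"ASLG","SDF",5
"asldhl", "ser,g
sdfj",3
;

proc print data=test; run;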

PG

Patrick
Opal | Level 21

If this is a one-off exercise and you don't have too much data, I would either send the data back to the sender and ask them to fix it, or read the data into a single string, count the double quotes, and then fix the lines with odd counts manually.
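
A minimal sketch of the quote-counting check described above. It assumes the raw file sits at c:\temp\test.txt (the path is an assumption, borrowed from another reply in this thread) and that no line is longer than 200 characters. It only reports the suspicious lines; it does not fix them:

```sas
data _null_;
    infile 'c:\temp\test.txt' truncover;   /* assumed path */
    input line $char200.;                  /* read the whole physical line */
    /* an odd number of double quotes means a quoted value is split here */
    if mod(countc(line, '"'), 2) = 1 then
        put 'Odd quote count on line ' _n_ ': ' line;
run;
```

Both halves of a broken record have an odd count, so the log should list the line where the quoted value opens as well as the line where it closes.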

 

If the manual line breaks are not the file's end-of-line delimiters (e.g. under Windows the end of line would be CRLF, but a "manual" Enter could be just an LF), then the issue isn't that big, because SAS would still treat the record as a single line. You would then just have to remove the LF programmatically.
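
If the file really does use CRLF as its record terminator and the stray breaks are bare LFs, the removal could look like the sketch below. It assumes the path c:\temp\test.txt and a Windows-style file, neither of which is confirmed in this thread. TERMSTR=CRLF tells SAS to end records only at CRLF, and COMPRESS then drops the embedded LF ('0A'x). The SCAN calls are the same as in the accepted solution:

```sas
data want;
    infile 'c:\temp\test.txt' termstr=crlf firstobs=2 truncover;  /* assumed path */
    length y1 $12 y2 $12 y3 8 line $200;
    input line $char200.;               /* the record may contain a bare LF */
    line = compress(line, '0A'x);       /* remove the stray line feed */
    y1 = scan(line, 1, ',', 'qr');      /* q: ignore commas inside quotes, r: strip quotes */
    y2 = scan(line, 2, ',', 'qr');
    y3 = input(scan(line, 3, ',', 'qr'), ?? best.);
    drop line;
run;
```

If the stray breaks turn out to be full CRLFs, this will not help, and joining lines until the quotes balance is the safer route.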

I normally use Notepad++ to see what the file really looks like. Just open the text file and, under View > Show Symbol, select "Show All Characters".

 

Ideally, post an attachment with your data and also tell us which OS you're running SAS on.

Reeza
Super User

What happens when you import now?

Please post your code. If you're using a data step, make sure to specify the DSD option.

I would also consider a find-and-replace-all in a text editor such as Notepad++.

Ksharp
Super User


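/* RECFM=N reads the file as one continuous stream with no record boundaries. */
/* DLM='0A2C'x makes line feed ('0A'x) and comma ('2C'x) both delimiters;     */
/* DSD keeps delimiters that sit inside quoted strings as part of the value.  */
data want;
infile 'c:\temp\test.txt' recfm=n dsd dlm='0A2C'x;
input (y1 y2 y3) (:$20.) @@;   /* all three variables are read as character */
run;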
