BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Andrea_Peng
Obsidian | Level 7

I got a datasets with some cells contain multiple lines. Like row3 under SrreComm column, there are two lines (see picture below. When importing the data, sas will automatically read this one cell as 2 records. I tried to use TERMSTR=CRLF but did not work. Anyone know how to deal with it?

 

Thanks,

Andrea

 

My codes:

 

data WORK.TUMOUR2 ;
%let _EFIERR_ = 0;
infile '&dir2.\&foldername.\tumour_noclass_MA.csv' delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 TERMSTR=CRLF;
informat DDtlODsc $89. ;
informat SrceComm $255. ;
informat COUNT best12.;
informat tumour_type $9. ;
informat indicator_type $10. ;
informat MA best12. ;
format DDtlODsc $89. ;
format SrceComm $255. ;
format COUNT best12. ;
format tumour_type $9. ;
format indicator_type $10. ;
format MA best12. ;
input DDtlODsc $ SrceComm $ COUNT tumour_type $ indicator_type $ MA;
if _ERROR_ then call symputx('_EFIERR_',1);
run;

Capture.JPG

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20

Notepad is useless for this as it doesn't display non-printable characters.

You need to identify what character is causing the issue.

Run this

data TEST3;
  infile "&dir2.\&foldername.\tumour_noclass_MA.csv" recfm=n;
  input X $1. @@;
  putlog X $1. @3 X $hex2.;
run; 

It will print one line par character in the log, together with its hexadecimal code.

Look after 133,,, see what you have.

Regardless, you should clean your data *before* you export it to CSV, and remove non-printable characters from strings if they are unneeded, like this:

STRING=compress(STRING,,'wk');

Non-printable characters will always mess up CSV files.

 

 

 

View solution in original post

11 REPLIES 11
Reeza
Super User

What happens if you remove the MISSOVER, should be TRUNCOVER anyways and DSD. 

 

Usually this gets read in properly in my experience, but I'm assuming you created this CSV from Excel for some reason?

Are you sure that comment field doesn't have comma's in it?

 

Andrea_Peng
Obsidian | Level 7

Hi Reeza,

 

Yes. The data comes from an excel file, which was from SAS! Some cells do have cooma but the major problem seems like coming from the split lines instead of comma. Not quite understand the reason. I swap missover to truncover now and put ignoredoseof. Get nothing! And no error in the log.

 

Andrea

Reeza
Super User

Did you figure out the character that's causing the issue?

Is it 

'041'x?

BrunoMueller
SAS Super FREQ

Hi

 

To read a file that has LF,  x"0a" as part of their data value, you can use the option TERMSTR=CRLF, this was alread metioned.

 

If you are using SAS Enterprise Guide, by default you will not see the text that comes after the LF. Better do a Proc PRINT from your data.

 

The following code:

filename xcsv "c:\temp\sample.csv" termstr=crlf;
proc import file=xcsv out=want dbms=csv replace ;
run;
proc print data=want;
run;

Will read a file that looks like this:

Snap8.png

 

 

Andrea_Peng
Obsidian | Level 7

Hi Bruno,

 

An exellent idea! But SAS Enterprise Guide seems can't find the file. I will have a look the problem.

 

Cheers,

Andrea

Reeza
Super User

Most likely EG is on the server and the file is on your local drive. 

 

To see 'non-printable' characters you need to use a decent text editor, ie NotePad++ or TextWranger are the two that I use. 

Andrea_Peng
Obsidian | Level 7

Hi Reeze,

 

I just used Chris's idea, remove the non-printable characters and exported the file again! It is sovled!

 

Thanks for all you guys suggestions!

 

Cheers,

Andrea

Andrea_Peng
Obsidian | Level 7

Hi Reeze,

 

I use notepad to look at the data, finding there is a quote when data is split into 2 lines. I suspect it is the quote but not sure, I will have a try.

 

Capture.JPG

 

Cheers,

Andrea

ChrisNZ
Tourmaline | Level 20

Notepad is useless for this as it doesn't display non-printable characters.

You need to identify what character is causing the issue.

Run this

data TEST3;
  infile "&dir2.\&foldername.\tumour_noclass_MA.csv" recfm=n;
  input X $1. @@;
  putlog X $1. @3 X $hex2.;
run; 

It will print one line par character in the log, together with its hexadecimal code.

Look after 133,,, see what you have.

Regardless, you should clean your data *before* you export it to CSV, and remove non-printable characters from strings if they are unneeded, like this:

STRING=compress(STRING,,'wk');

Non-printable characters will always mess up CSV files.

 

 

 

Andrea_Peng
Obsidian | Level 7

Hi Chris,

 

I just removed the non-printable characters and exported the file again! It works well now!

 

Thanks,

Andrea

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 3663 views
  • 1 like
  • 4 in conversation