BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Yura2301
Quartz | Level 8

Hi all,

I have a problem with exporting data into CSV file. When I'm trying to export data using DATA step it works fine, but it puts blank spaces when string is empty. This case can be fixed after I add DSD file option but another problem occurs, it's additional quotation marks. And I don't need them.
Data example:
options missing='';
data test;
a="";
b="sad";
c="";
d='"asd" asd';
run;
Export using data step:
data _null_;
set test;
FILE "....\test.csv" DLM="#" LRECL=32767;
PUT a b c d;
run;
I also tried PROC EXPORT but it works the same way as DATA STEP with DSD option.

1 ACCEPTED SOLUTION

Accepted Solutions
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, so its the extra space thats the problem, this below should solve that.  However do bear in mind my other points about delimiter - anyone looking at the file will expect CSV data which yours isn't.

data test;
    col1='test';
    col2='';
    col3='"test" data';
run;
data _null_;
  set test;
  file "s:temp\rob\test.csv" dlm="," lrecl=32767;
  line=cats(col1,",",col2,",",col3);
  put line;
run;

View solution in original post

13 REPLIES 13
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Firstly it is not recommend to use the file extension CSV = Comma Separated Variable file, when you are using a has symbol as a delimiter.  TBH its not a good idea to use a has as delimiter at all.

 

Secondly I am not sure what the question is in reference to.  In the example you provide, you explicitly add quotes into the text string:

d='"asd" asd';

Means that the variable d contains the string: "asd" asd

Which when written to the file, looks like that.  Are you saying you want to string quotes out from the variable, then just add in the datastep:

data _null_;
  set test;
  file "s:temp\rob\test.csv" dlm="," lrecl=32767;
  d=compress(d,'"');  /* Single quote, double quote, single quote */
  put a b c d;
run;

This will remove double quotes from your data.

Yura2301
Quartz | Level 8

Sorry for this misunderstanding. Let me explain again.
I have input data like this. All character values. col2 is empty string and col3 contains quotation marks as a part of the string.

Untitled.png

You can use this script to generate this dataset:

data test;
    col1='test';
    col2='';
    col3='"test" data';
run;

I used following export script:

data _null_;
    set test;
    FILE "....\test.csv" DLM="#" LRECL=32767;
    PUT col1 col2 col3;
run;

I run my export script without DSD option and get this result (Blank space at col2 where string is empty):

test# #"test" data

I've also tried to use DSD option:

data _null_;
    set test;
    FILE "....\test.csv" DLM="#" DSD LRECL=32767;
    PUT col1 col2 col3;
run;

I've got the following result after I run my export script with DSD option (No space at col2 but additional quotation marks at col3):

test##"""test"" data"

The problem is that I need neither first nor second result. I need result the same as input data. Like this:

test##"test" data

Can you help me to reach this result?

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Ah, so its the extra space thats the problem, this below should solve that.  However do bear in mind my other points about delimiter - anyone looking at the file will expect CSV data which yours isn't.

data test;
    col1='test';
    col2='';
    col3='"test" data';
run;
data _null_;
  set test;
  file "s:temp\rob\test.csv" dlm="," lrecl=32767;
  line=cats(col1,",",col2,",",col3);
  put line;
run;
Yura2301
Quartz | Level 8

Hi RW9,

 

Thanks for your answer.

But I have one more issue that appeared with your solution.

I have a dataset that contains more columns. You can create it using this script.

data test;
	col1 = 0;
	col2 = 1;
	col3 = 2;
	col4 = 3;
	col5 = '';
	col6 = 'ThisIsTestStringThatIsCuttedSomehow';
	col7 = '';
	col8 = 4567;
	col9 = 8;
	col10 = .;
	col11 = 'TestString';
	col12 = 9;
	col13 = 10;
	col14 = 'TestStringWith"Quotation"Marks';
run;

So, if I use your solution it cuts out part of the string in column "col6".

Funny thing is when I change value of the "col1" to something else except 0, string in "col6" will not be cutted.

Below is the script that I used:

option missing='';
data tmp1;
	length line $32767;
	set test;
	file "...\test.csv" dlm="," lrecl=32767;
	line = cats(col1,",",col2,",",col3,",",col4,",",col5,",",col6,",",col7,",",col8,",",col9,",",col10,",",col11,",",col12,",",col13,",",col14);
	put line;
run;

Hopefully, you have some other solution or know how to fix that.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

It is  because your length statement is invalid, lengths can only be $2000 max, the below works.  What I would say however is does the space really matter that much?  I just think your creating a lot of work for yourself over such a small matter.  You could simply do:

data _null_;
	set test;
	file "s:\temp\rob\test.csv" dlm="," lrecl=32767;
	put (_all_) (~);
run;

The space doesn't really matter that much.  The reason that it comes out is strings are alway one space. 

 

data test;
	col1 = 0;
	col2 = 1;
	col3 = 2;
	col4 = 3;
	col5 = '';
	col6 = 'ThisIsTestStringThatIsCuttedSomehow';
	col7 = '';
	col8 = 4567;
	col9 = 8;
	col10 = .;
	col11 = 'TestString';
	col12 = 9;
	col13 = 10;
	col14 = 'TestStringWith"Quotation"Marks';
run;

data _null_;
	length line $2000;
	set test;
	file "s:\temp\rob\test.csv" dlm="," lrecl=32767;
	line = cats(col1,",",col2,",",col3,",",col4,",",col5,",",col6,",",col7,",",col8,",",col9,",",col10,",",col11,",",col12,",",col13,",",col14);
	put line;
run;
Yura2301
Quartz | Level 8

Hi RW9,

 

It works because you lost "missing" option

option missing='';

After you add this option, it will cut the "col6". Really don't know why is it so...

RW9
Diamond | Level 26 RW9
Diamond | Level 26

No, I get exactly the same result with or without the missing.

 

That being said though Patrick does raise a good point, why not just use proc export?  If you do that, and then import it again, the data is correct , i.e. your text string still has " " in it.  The reason is puts additional quotes around that is so that it can handle the quotes in the text.  So if you do this, you get the same input as output:

data test;
	col1 = 0;
	col2 = 1;
	col3 = 2;
	col4 = 3;
	col5 = '';
	col6 = 'ThisIsTestStringThatIsCuttedSomehow';
	col7 = '';
	col8 = 4567;
	col9 = 8;
	col10 = .;
	col11 = 'TestString';
	col12 = 9;
	col13 = 10;
	col14 = 'TestStringWith"Quotation"Marks';
  output; output; output; output; output;
run;
PROC EXPORT DATA=work.test
  OUTFILE="s:\temp\rob\test2.csv"
  REPLACE
  DBMS=CSV;
  PUTNAMES=NO;
run;
proc import datafile="s:\temp\rob\test2.csv" out=test2;
run;

 

Yura2301
Quartz | Level 8

@RW9 the problem is that I'm using CSV file not for import into SAS, I'm using it in other system, that's why I need so special result.

I'll use you solution with CATS function without MISSING option and then just remove ",.," with regular expresion.

Thanks everybody for your answer.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Well, I am glad you have a solution.  The . is created because the variables are numeric.  You could just convert them to character.  As for the other system, it shouldn't matter to it if there is a space there, though I can understand it probably doesn't like "." as that is a SAS specific thing.  You could of course do:

line=compress(cats(col1,col2,col3),".");

To get rid of the dot, though as always, the cats string is doing an implicit conversion to character on your numeric values.  I advise against letting it guess this and put it in explicitly:

line=compress(cats(put(col1,3.),put(col2,3.),col3),".");

In the above you can set the format of the number to whatever you want, and its clear that it is number to character conversion.

ballardw
Super User

 

Line = catx(',', col1,col2,col3,col4 ..., col14);  /* I'm too lazy this morning to type out all the variable names*/

Yura2301
Quartz | Level 8

I've tried already this way. "Catx" removes all missing strings. In my example it removes "col5" and "col7", so looks looks like this:

0,1,2,3,ThisIsTestStringThatIsCuttedSomehow,4567,8,.,TestString,9,10,TestStringWith"Quotation"Marks

It breaks the file struckture. And also if you add option missing='' it will cut the "col6"...

Patrick
Opal | Level 21

@Yura2301 PROC EXPORT should give you what you're after without a lot of coding. It also deals with the quoting for csv's.

 


PROC EXPORT DATA=work.test
  OUTFILE="c:\temp\test2.csv"
  REPLACE
  DBMS=CSV;
  PUTNAMES=NO;
run;
Yura2301
Quartz | Level 8

@Patrick this solution works fine except quotation marks. That's a string I have:

TestStringWith"Quotation"Marks

And what I've got:

"TestStringWith""Quotation""Marks"

But I'm really don't need these additional quotation marks.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 11983 views
  • 0 likes
  • 4 in conversation