BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Arbpg
Calcite | Level 5

Hello,

I need to create a CSV pipe delimited file, including header rows, with no spaces for missing fields, from a SAS dataset.

I used proc export and almost everything in the CSV file was correct except for a few records.  The original data has a few records where the text term has quotation marks.  Example:

StudyCountryRecordVisitnamePageTerm
001US00123B6BaselineMEDICAL PROBLEM STOMACH
001US00123B7Visit 1MEDICAL PROBLEM "HEAD"

and the CSV file is getting incorrectly output to look like this:

Study|Country|Record|Visitname|Page|Term

001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

001|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""

where everything is correct except for the additional quotes surrounding the term field.

Is there a way that proc export can output the data to look like this? --As it is originally.

Study|Country|Record|Visitname|Page|Term

001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"

I've also tried using both ods as well as a file csv statement with specifying a delimiter but have not found the right comination of statements.  In these cases, the header row did not output and an additional space was added for missing fields (this was the file csv where I used a data _null_ and put(all) statement ) or the delimiter didn't work properly and every field was put in quotations (used an options (delimiter= '|') in the ods statement.

Any suggestions greatly appreciated.

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

You can get around that by grabbing the variable names using proc sql.  e.g.:

data medical;

  length Study 8 Country Record Visitname $8 Page 8 Term $30 ;

  infile cards dsd dlm='|';

  input Study Country Record Visitname Page Term;

  file log dsd dlm='|' ;

  put (study -- term) (:);

cards;

001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"

run;

proc sql noprint;

  select name

    into :names separated by "|"

      from dictionary.columns

        where libname="WORK" and

              memname="MEDICAL"

  ;

quit;

filename csv "location\output\Medical_problems.csv";

data _null_;

  set medical;  * from a previous data step, where I have thousands of records;

  file csv dlm = '|';

  if _n_ eq 1 then put "&names.";

  put (_all_) (:);

run;

View solution in original post

9 REPLIES 9
Reeza
Super User

What was your export code?

Arbpg
Calcite | Level 5

My original code was:

proc export data=medical;

outfile="location\output\Medical_Problems.csv";

dbms=dlm replace;

delimiter = '|';

run;


Reeza
Super User

You're not creating a CSV file, because CSV stands for Comma Separated Value.

Try using the menu to do a proc export to DELIMITED TXT file and click options to specify your delimiter, which is the | symbol.  You'll get the Data _NULL_ statements in the log.

Copy those statements,remove the DSD option from the file statement and re-run the code to get what you want.

Or write the code from scratch does work.

Tom
Super User Tom
Super User

It is putting the quotes around the values that contain quotes to prevent confusion. SAS will read that string and automatically remove the extra quoting.

You could try outputting without the DSD option. That will remove the quoting, but it will also cause missing values to display differently.

data have ;

  length Study 8 Country Record Visitname $8 Page 8 Term $30 ;

  infile cards dsd dlm='|';

  input Study -- Term;

  file log dsd dlm='|' ;

  put (study -- term) (:);

cards;

001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

001|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""

run;

With DSD option the missing values are represented as adjacent delimiters and strings with quotes or embedded delimiters are quoted.

1|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

1|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""

Without DSD option the missing values are printed and quoting of problematic strings is not done.

1|USA|00123B6|Baseline|.|MEDICAL PROBLEM STOMACH

1|USA|00123B7|Visit 1|.|MEDICAL PROBLEM "HEAD"

Arbpg
Calcite | Level 5

I've gotten close with this code:

filename csv "location\output\Medical_problems.csv"

data _null_;

set medical;  * from a previous data step, where I have thousands of records;

file csv dlm = ' |';

put (_all_) (:);

run;

except that the column header row is not output

art297
Opal | Level 21

You can get around that by grabbing the variable names using proc sql.  e.g.:

data medical;

  length Study 8 Country Record Visitname $8 Page 8 Term $30 ;

  infile cards dsd dlm='|';

  input Study Country Record Visitname Page Term;

  file log dsd dlm='|' ;

  put (study -- term) (:);

cards;

001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH

001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"

run;

proc sql noprint;

  select name

    into :names separated by "|"

      from dictionary.columns

        where libname="WORK" and

              memname="MEDICAL"

  ;

quit;

filename csv "location\output\Medical_problems.csv";

data _null_;

  set medical;  * from a previous data step, where I have thousands of records;

  file csv dlm = '|';

  if _n_ eq 1 then put "&names.";

  put (_all_) (:);

run;

Tom
Super User Tom
Super User

One easy way to get the header line is to let PROC EXPORT do it for you.

proc export data=medical (obs=0);

outfile='location\output\Medical_Problems.csv';

dbms=dlm replace;

delimiter = '|';

run;

Then you can use the MOD option on the FILE statement in the step that generates the actual data lines.

file 'location\output\Medical_Problems.csv' mod dlm='|' ;

I am still not sure why the extra quotes are a problem.  If the target system does not automatically remove them then you can use the equivalent of the SAS dequote() to remove them.

Also if you are outputting without the DSD option then you need to watch out for values with | in them.  If that value is not quoted then the target file will not be interpretable.

Arbpg
Calcite | Level 5

Sorry for the delay in replying.  I hadn't been able to get back to working on this until today.  Using the sql code worked great to get the variable names, in conjuction with the data _null_ step, and the quotes came out as needed (only if originally in the data).

Thank you everyone for your help and input.  I truly appreciate it!

Tom
Super User Tom
Super User

According to Wikipedia strings with double quote characters should be quoted, just as in your example.

Comma-separated values - Wikipedia, the free encyclopedia

They cite http://tools.ietf.org/html/rfc4180

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 11842 views
  • 3 likes
  • 4 in conversation