Hello,
I need to create a CSV pipe delimited file, including header rows, with no spaces for missing fields, from a SAS dataset.
I used proc export and almost everything in the CSV file was correct except for a few records. The original data has a few records where the text term has quotation marks. Example:
Study | Country | Record | Visitname | Page | Term |
---|---|---|---|---|---|
001 | US | 00123B6 | Baseline | MEDICAL PROBLEM STOMACH | |
001 | US | 00123B7 | Visit 1 | MEDICAL PROBLEM "HEAD" |
and the CSV file is getting incorrectly output to look like this:
Study|Country|Record|Visitname|Page|Term
001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
001|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""
where everything is correct except for the additional quotes surrounding the term field.
Is there a way that proc export can output the data to look like this? --As it is originally.
Study|Country|Record|Visitname|Page|Term
001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"
I've also tried using both ods as well as a file csv statement with specifying a delimiter but have not found the right comination of statements. In these cases, the header row did not output and an additional space was added for missing fields (this was the file csv where I used a data _null_ and put(all) statement ) or the delimiter didn't work properly and every field was put in quotations (used an options (delimiter= '|') in the ods statement.
Any suggestions greatly appreciated.
Thank you!
You can get around that by grabbing the variable names using proc sql. e.g.:
data medical;
length Study 8 Country Record Visitname $8 Page 8 Term $30 ;
infile cards dsd dlm='|';
input Study Country Record Visitname Page Term;
file log dsd dlm='|' ;
put (study -- term) (:);
cards;
001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"
run;
proc sql noprint;
select name
into :names separated by "|"
from dictionary.columns
where libname="WORK" and
memname="MEDICAL"
;
quit;
filename csv "location\output\Medical_problems.csv";
data _null_;
set medical; * from a previous data step, where I have thousands of records;
file csv dlm = '|';
if _n_ eq 1 then put "&names.";
put (_all_) (:);
run;
What was your export code?
My original code was:
proc export data=medical;
outfile="location\output\Medical_Problems.csv";
dbms=dlm replace;
delimiter = '|';
run;
You're not creating a CSV file, because CSV stands for Comma Separated Value.
Try using the menu to do a proc export to DELIMITED TXT file and click options to specify your delimiter, which is the | symbol. You'll get the Data _NULL_ statements in the log.
Copy those statements,remove the DSD option from the file statement and re-run the code to get what you want.
Or write the code from scratch does work.
It is putting the quotes around the values that contain quotes to prevent confusion. SAS will read that string and automatically remove the extra quoting.
You could try outputting without the DSD option. That will remove the quoting, but it will also cause missing values to display differently.
data have ;
length Study 8 Country Record Visitname $8 Page 8 Term $30 ;
infile cards dsd dlm='|';
input Study -- Term;
file log dsd dlm='|' ;
put (study -- term) (:);
cards;
001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
001|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""
run;
With DSD option the missing values are represented as adjacent delimiters and strings with quotes or embedded delimiters are quoted.
1|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
1|USA|00123B7|Visit 1||"MEDICAL PROBLEM ""HEAD"""
Without DSD option the missing values are printed and quoting of problematic strings is not done.
1|USA|00123B6|Baseline|.|MEDICAL PROBLEM STOMACH
1|USA|00123B7|Visit 1|.|MEDICAL PROBLEM "HEAD"
I've gotten close with this code:
filename csv "location\output\Medical_problems.csv"
data _null_;
set medical; * from a previous data step, where I have thousands of records;
file csv dlm = ' |';
put (_all_) (:);
run;
except that the column header row is not output
You can get around that by grabbing the variable names using proc sql. e.g.:
data medical;
length Study 8 Country Record Visitname $8 Page 8 Term $30 ;
infile cards dsd dlm='|';
input Study Country Record Visitname Page Term;
file log dsd dlm='|' ;
put (study -- term) (:);
cards;
001|USA|00123B6|Baseline||MEDICAL PROBLEM STOMACH
001|USA|00123B7|Visit 1||MEDICAL PROBLEM "HEAD"
run;
proc sql noprint;
select name
into :names separated by "|"
from dictionary.columns
where libname="WORK" and
memname="MEDICAL"
;
quit;
filename csv "location\output\Medical_problems.csv";
data _null_;
set medical; * from a previous data step, where I have thousands of records;
file csv dlm = '|';
if _n_ eq 1 then put "&names.";
put (_all_) (:);
run;
One easy way to get the header line is to let PROC EXPORT do it for you.
proc export data=medical (obs=0);
outfile='location\output\Medical_Problems.csv';
dbms=dlm replace;
delimiter = '|';
run;
Then you can use the MOD option on the FILE statement in the step that generates the actual data lines.
file 'location\output\Medical_Problems.csv' mod dlm='|' ;
I am still not sure why the extra quotes are a problem. If the target system does not automatically remove them then you can use the equivalent of the SAS dequote() to remove them.
Also if you are outputting without the DSD option then you need to watch out for values with | in them. If that value is not quoted then the target file will not be interpretable.
Sorry for the delay in replying. I hadn't been able to get back to working on this until today. Using the sql code worked great to get the variable names, in conjuction with the data _null_ step, and the quotes came out as needed (only if originally in the data).
Thank you everyone for your help and input. I truly appreciate it!
According to Wikipedia strings with double quote characters should be quoted, just as in your example.
Comma-separated values - Wikipedia, the free encyclopedia
They cite http://tools.ietf.org/html/rfc4180
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.