I tried to import a file in csv format in sas but not without problem.
First I did it normally like:
proc import datafile=reffile
out=test_Eems
dbms=csv
replace;
getnames=yes;
run;
without success and I saw in the log that there could be talk of delimiter ';'.
So I added delimiter ';'.
proc import datafile=reffile
out=test_Eems
dbms=csv
replace;
delimiter=";" ;
getnames=yes;
run;
I get a better result but not always correct because a column (called Bericht) contains all kinds of characters also the semicolon.
Of course I get an error in the log.
What surprises me is that I could open the same file without problem in Excel or a google sheet but not in Sas.
I have converted the same file to an excel file using excel and I could easily read and open it in sas.
My question to you is if I could open the file in Excel and google sheet then it should also work in Sas.
I would like your help with that.
Below are the file that I could not import in csv format but in excel and the log that I got when I tried with csv format
I suspect your file as embedded LF characters?
Try telling SAS to use TERMSTR=CRLF when reading the file.
data test;
infile reffile dsd dlm=';' firstobs=2 termstr=crlf truncover;
input (var1-var12) (:$200.);
run;
If you want to continue to use PROC IMPORT to make GUESSES about what the variables are and what types of data the variables have then you can add the TERMSTR= option to the FILENAME statement that defined the fileref REFFILE you used in your code:
filename reffile 'physical filename' termstr=crlf ;
If I use CSV2DS macro to guess how to read the file it defines the variables this way:
1089 +data test2; 1090 + infile CSV termstr=crlf dlm=';' dsd truncover firstobs=2 ; 1091 + length Datum 8 Url $341 Sentiment $8 Type $7 Discussie_lengte 8 Views 8 1092 + Auteur $81 Volgers 8 Bron $9 Titel $172 Bericht $9010 Labels $1 1093 + ; 1094 + informat Datum anydtdtm. ; 1095 + format Datum datetime19. ; 1096 + label Discussie_lengte='Discussie lengte' ; 1097 + input Datum -- Labels ; 1098 +run;
I suspect your file as embedded LF characters?
Try telling SAS to use TERMSTR=CRLF when reading the file.
data test;
infile reffile dsd dlm=';' firstobs=2 termstr=crlf truncover;
input (var1-var12) (:$200.);
run;
If you want to continue to use PROC IMPORT to make GUESSES about what the variables are and what types of data the variables have then you can add the TERMSTR= option to the FILENAME statement that defined the fileref REFFILE you used in your code:
filename reffile 'physical filename' termstr=crlf ;
If I use CSV2DS macro to guess how to read the file it defines the variables this way:
1089 +data test2; 1090 + infile CSV termstr=crlf dlm=';' dsd truncover firstobs=2 ; 1091 + length Datum 8 Url $341 Sentiment $8 Type $7 Discussie_lengte 8 Views 8 1092 + Auteur $81 Volgers 8 Bron $9 Titel $172 Bericht $9010 Labels $1 1093 + ; 1094 + informat Datum anydtdtm. ; 1095 + format Datum datetime19. ; 1096 + label Discussie_lengte='Discussie lengte' ; 1097 + input Datum -- Labels ; 1098 +run;
Thans for your answer.
I like it and it all works.
But I am still a beginner and the original file(Eems_Dollard.csv) had a starting column called "Zoekopdracht".
When i apply your answer to the original file it does not work and I do not know why.
If you succeed I would like to hear from you.
Thanks again for the effort.
The best way to read this file into SAS is to use the Data Step.
Thank you for your answer.
I like it and it all works.
When i apply your answer to the original file it does not work and I do not know why.
If you succeed I would like to hear from you.
Thanks again for the effort.
You have an additional column in this file , here's the revised code:
data test;
infile "c:\temp\Eems_Dollard.csv" dlm=';' dsd missover firstobs=2;
input Zoekopdracht : $ 200. Datnum : $15. Url : $100. Sentiment : $10. Type : $10.
Discussie_Lengte : 8. Views : 8. Auteur : $25. Volgers : 8.
Bron : $15. Title : $200. Bericht : $300. Labels : $10.;
run;
proc print data=test;
run;
Thanks for your quick reply.
But it doesn't work completely, look at the screen shot(in pdf formaat).
Thanks again.
Using the SAS EG import wizard to generate data step code appears to work.
Below the generated code with some manual changes applied (like longer character variables so they hopefully are also sufficiently long for all your data and not only the sample).
DATA WORK.test_Eems;
LENGTH
Datum 8
Url $ 500
Sentiment $ 8
Type $ 7
Discussie_lengte 8
Views 8
Auteur $ 250
Volgers 8
Bron $ 9
Titel $ 250
Bericht $ 1500
Labels $ 10 ;
LABEL
Discussie_lengte = "Discussie lengte" ;
FORMAT
Datum date9.
Url $500.
Sentiment $8.
Type $7.
Discussie_lengte best32.
Views best32.
Auteur $250.
Volgers best32.
Bron $9.
Titel $250.
Bericht $1500.
Labels $10. ;
INFORMAT
Datum ddmmyy20.
Url $500.
Sentiment $8.
Type $7.
Discussie_lengte best32.
Views best32.
Auteur $250.
Volgers best32.
Bron $9.
Titel $250.
Bericht $1500.
Labels $10. ;
INFILE 'C:\temp\test_Eems.csv'
LRECL=30000
ENCODING="UTF-8"
TERMSTR=CRLF
DLM=';'
truncover
DSD ;
INPUT
Datum
Url
Sentiment
Type
Discussie_lengte
Views
Auteur
Volgers
Bron
Titel
Bericht
Labels
;
RUN;
Thank you for your answer.
I like it and it all works.
But I am still a beginner and the original file(Eems_Dollard.csv) had a starting column called "Zoekopdracht".
When i apply your answer to the original file it does not work and I do not know why.
If you succeed I would like to hear from you.
Thanks again for the effort.
Where are you getting these files? Perhaps you should talk to the providers and get them to clarify what type of files they are producing?
If you look at the first few bytes of that file:
data _null_; infile "C:\downloads\Eems_Dollard.csv" lrecl=100 recfm=f obs=5; input; list; run;
It seems to be using LF only as the end of line marker. Which means that if there are ANY other embedded LF characters (like you had in the other file) then there is no way for SAS to tell the difference between a real END OF LINE marker and these other LF characters.
You might try seeing if the extra LF characters are at least nested inside of double quotes by running this macro, %replace_crlf ,to make a new version of the file where such embedded LF (and or CR) characters are either removed or replaced with some other character.
That seems to work fine for the supplied file.
filename csv "C:\downloads\Eems_Dollard.csv";
filename csv2 temp;
%replace_crlf(csv,csv2,lf=' ',cr=)
%csv2ds(csv2,dlm=';',out=want,replace=YES)
Making a file that can be read by SAS.
5 +data want; 6 + infile CSV2 dlm=';' dsd truncover firstobs=2 ; 7 + length Zoekopdracht $34 Datum 8 Url $341 Sentiment $8 Type $7 8 + Discussie_lengte 8 Views 8 Auteur $81 Volgers 8 Bron $9 Titel $172 9 + Bericht $9010 Labels $1 10 + ; 11 + informat Datum anydtdtm. ; 12 + format Datum datetime19. ; 13 + label Discussie_lengte='Discussie lengte' ; 14 + input Zoekopdracht -- Labels ; 15 +run; NOTE: The infile CSV2 is: (system-specific pathname), (system-specific file attributes) NOTE: 2124 records were read from the infile (system-specific pathname). The minimum record length was 175. The maximum record length was 9208. NOTE: The data set WORK.WANT has 2124 observations and 13 variables. NOTE: DATA statement used (Total process time): real time 0.02 seconds cpu time 0.01 seconds
Thanks for your quick reply.
As I said I am still a beginner so it will be hard enough for me.
But I have already learned something from you so thanks again.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.