- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
hi all:
I am trying to read in a CSV file with DLM="|" , there is only one column
but I would need to a SAS file with different columns
Question:
should I use data step or PROC import ?
thanks a ton.
/*===please see the code ====*/
filename myfile "C:\test.csv" lrecl=10000;
data want2;
infile myfile dlm="|" dsd truncover firstobs=2;
input
STUDYID $
SUBJECT $
DEVICEID $
PLACE $
DATE iso8601.
TOTALSITMIN
RUNMINUTES ;
run;
/*========log error: ==========*/
NOTE: Invalid data for DATE in line 2 20-38.
NOTE: EQUAL SIGN not found.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 001|20-1|C1| Wrist|2020-02-12T00:00:00|696|168| 47
STUDYID=001 SUBJECT=20-1 DEVICEID=C1 PLACE=Wrist DATE= INFORMAT= TOTALSITMIN=696 RUNMINUTES=168 _ERROR_=1 _N_=1
NOTE: Invalid data for DATE in line 3 20-38.
/*===please see attached file test.csv==*/
STUDYID|SUBJECT|DEVICEID|PLACE|DATE|TOTALSITMIN|RUNMINUTES| |
001|20-1|C1| Wrist|2020-02-12T00:00:00|696|168| |
001|21-5|C1| Wrist|2020-02-12T00:00:00||69| |
001|25-5|C1||2020-02-12T00:00:00|1066|69| |
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would recommend trying PROC IMPORT if you're having issues.
Then you can at least check the code generated in the log to see if it's close to yours or where it's different.
proc import data=want2 datafile="C:\test.csv" dbms=dlm replace;
delimiter='|';
guessingrows=max;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I would recommend trying PROC IMPORT if you're having issues.
Then you can at least check the code generated in the log to see if it's close to yours or where it's different.
proc import data=want2 datafile="C:\test.csv" dbms=dlm replace;
delimiter='|';
guessingrows=max;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
hi Reeza:
that is a great solution and explanation and it worked.
thanks so much.
/*===================*/
filename myfile "C;\test.cs" lrecl=10000;
proc import datafile=myfile
out=want
/*****DBMS= specifies that the input file is a delimited external file.
If a ASCII file is delimited by other than a space, you need to explicitly specify delimiter using DELIMITER (or DLM) option with using various characters or combinations of characters as a delimiter.
For example,
DLM='09'x for Tab, DLM='&' for ampersand , DLM='^' for caret, delimiter="|" for pipe which is a space delimited ASCII text file,******/
dbms=dlm
replace;
delimiter='|'; /*specify here the delimiter*/
guessingrows=max;
run;
/*added to make the same format of DATE variable */
data want2;
set want ;
format Date e8601dt19.;
run;
/***please see the difference***/
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There is no ISO informat, and even if there was I doubt it would allow reading of 8,601 bytes. Looks like you want the E8610DT informat instead.
Do not use FORMATTED input mode when reading a delimited file. If you read a fixed number of bytes from the line you might:
1) Read the delimiter in front of the value.
2) Read the delimiter after the value and possibly part of the following value.
3) Not read the full value, which could mess up this variable and reading of the next variable.
If you want to include an informat specification in the INPUT statement then always include the : modifier so that the variable is read using LIST MODE so that the delimiters are honored.
input
STUDYID $
SUBJECT $
DEVICEID $
PLACE $
DATE :E8601DT.
TOTALSITMIN
RUNMINUTES
;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi Tom:
I updated the code per your suggestion and added format of DATE , here is the result. It works too.
thanks again.
/**====CODE====**/
filename myfile "C:\test.csv" lrecl=10000;
data want2noformat;
infile myfile dlm="|" dsd truncover firstobs=2;
input
STUDYID $
SUBJECT $
DEVICEID $
PLACE $
DATE :E8601DT.
TOTALSITMIN
RUNMINUTES ;
run;
data wantfomat;
infile myfile dlm="|" dsd truncover firstobs=2;
input
STUDYID $
SUBJECT $
DEVICEID $
PLACE $
DATE :E8601DT.
TOTALSITMIN
RUNMINUTES ;
format date e8601dt.;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Just to be clear that variable does not contain DATE values. It contains DATETIME values. Using a variable name of DATE will likely cause confusion. SAS stores dates as number of days and datetime as number of seconds.