hello all,
I am reading a simple .csv file in SAS separately by following two codes, but both fail. I don't why I can't do it. could
the first log is:
NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M3 MBCS3170)
Licensed to UNIVERSITY OF NEWCASTLE UPON TYNE, Site 70117723.
NOTE: This session is executing on the X64_7PRO platform.
NOTE: Updated analytical products:
SAS/STAT 14.1
SAS/ETS 14.1
SAS/OR 14.1
SAS/IML 14.1
SAS/QC 14.1
NOTE: Additional host information:
X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation
NOTE: SAS initialization used:
real time 8.97 seconds
cpu time 0.82 seconds
NOTE: AUTOEXEC processing beginning; file is C:\Users\B\autoexec.sas.
NOTE: Libref SASDATA was successfully assigned as follows:
Engine: V9
Physical Name: R:\L\SASDATA
NOTE: AUTOEXEC processing completed.
1 data P.Publications2 ;
2 infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2 ! firstobs = 2 ;
3 input pat_publn_id :29.
4 publn_auth :$29.
5 publn_nr :$29.
6 publn_nr_original :$29.
7 publn_kind :$29.
8 appln_id :29.
9 publn_date :YYMMDD10.
10 publn_lg :$29.
11 publn_first_grant:29.
12 publn_claims :29. ;
13 format publn_date :YYMMDDd10. ;
14 run ;
NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
Filename=R:\L\P\tls211_pat_publn.csv,
RECFM=V,LRECL=131068,File Size (bytes)=532,
Last Modified=11 May 2018 16:47:59 o'clock,
Create Time=11 May 2018 16:47:58 o'clock
NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2 390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;" ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr= publn_nr_original=
publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3 390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;" ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr= publn_nr_original=
publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4 900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;" ";0;0 62
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5 900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;" ";0;0 63
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6 900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;" ";0;0 63
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7 900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;" ";0;0 62
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
The minimum record length was 62.
The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.54 seconds
cpu time 0.07 seconds
the second is:
1 data P.Publications2 ;
2 infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2 ! firstobs = 2 ;
3 input pat_publn_id :29.
4 publn_auth :$29.
5 publn_nr :$29.
6 publn_nr_original :$29.
7 publn_kind :$29.
8 appln_id :29.
9 publn_date :YYMMDD10.
10 publn_lg :$29.
11 publn_first_grant:29.
12 publn_claims :29. ;
13 format publn_date :YYMMDDd10. ;
14 run ;
NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
file encoding does not match the session encoding. Cross Environment Data Access will be
used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
Filename=R:\L\P\tls211_pat_publn.csv,
RECFM=V,LRECL=131068,File Size (bytes)=532,
Last Modified=11 May 2018 16:47:59 o'clock,
Create Time=11 May 2018 16:47:58 o'clock
NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2 390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;" ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr= publn_nr_original=
publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3 390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;" ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr= publn_nr_original=
publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4 900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;" ";0;0 62
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5 900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;" ";0;0 63
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6 900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;" ";0;0 63
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7 900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;" ";0;0 62
pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=.
publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
The minimum record length was 62.
The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.54 seconds
cpu time 0.07 seconds
could you please give me some suggestion about how to read pat_publn_id in SAS? thanks
The reason both of these fail is the same. You are actually attempting to read the data as fixed column input not delimited. And the data shown indicates that the data is semicolon delimited, not CSV. So your delimiter is wrong.
And why the -bleep- do you have all of the informats attempting to read 29 columns???
See if this works any better.
data P.Publications2 ; infile 'R/L/P/tls211_pat_publn.csv' DLM = ';' DSD missover lrecl = 32767 firstobs = 2 ; informat pat_publn_id 29. publn_auth $29. publn_nr $29. publn_nr_original $29. publn_kind $29. appln_id 29. publn_date YYMMDD10. publn_lg $29. publn_first_grant 29. publn_claims 29. ; format publn_date YYMMDDd10. ; input pat_publn_id publn_auth $ publn_nr $ publn_nr_original $ publn_kind $ appln_id publn_date publn_lg $ publn_first_grant publn_claims ; run ;
Looks like your TXT file is delimited with simi-colon not comma.
@France wrote:
hello all,
I am reading a simple .csv file in SAS separately by following two codes, but both fail. I don't why I can't do it. could
the first log is:
NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA. NOTE: SAS (r) Proprietary Software 9.4 (TS1M3 MBCS3170) Licensed to UNIVERSITY OF NEWCASTLE UPON TYNE, Site 70117723. NOTE: This session is executing on the X64_7PRO platform. NOTE: Updated analytical products: SAS/STAT 14.1 SAS/ETS 14.1 SAS/OR 14.1 SAS/IML 14.1 SAS/QC 14.1 NOTE: Additional host information: X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation NOTE: SAS initialization used: real time 8.97 seconds cpu time 0.82 seconds NOTE: AUTOEXEC processing beginning; file is C:\Users\B\autoexec.sas. NOTE: Libref SASDATA was successfully assigned as follows: Engine: V9 Physical Name: R:\L\SASDATA NOTE: AUTOEXEC processing completed. 1 data P.Publications2 ; 2 infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767 2 ! firstobs = 2 ; 3 input pat_publn_id :29. 4 publn_auth :$29. 5 publn_nr :$29. 6 publn_nr_original :$29. 7 publn_kind :$29. 8 appln_id :29. 9 publn_date :YYMMDD10. 10 publn_lg :$29. 11 publn_first_grant:29. 12 publn_claims :29. ; 13 format publn_date :YYMMDDd10. ; 14 run ; NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance. NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is: Filename=R:\L\P\tls211_pat_publn.csv, RECFM=V,LRECL=131068,File Size (bytes)=532, Last Modified=11 May 2018 16:47:59 o'clock, Create Time=11 May 2018 16:47:58 o'clock NOTE: Invalid data for pat_publn_id in line 2 1-32. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--- 2 390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;" ";0;0 72 pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1 NOTE: Invalid data for pat_publn_id in line 3 1-32. 3 390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;" ";0;0 72 pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2 NOTE: Invalid data for pat_publn_id in line 4 1-62. 4 900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;" ";0;0 62 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3 NOTE: Invalid data for pat_publn_id in line 5 1-63. 5 900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;" ";0;0 63 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4 NOTE: Invalid data for pat_publn_id in line 6 1-63. 6 900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;" ";0;0 63 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5 NOTE: Invalid data for pat_publn_id in line 7 1-62. 7 900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;" ";0;0 62 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6 NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'. The minimum record length was 62. The maximum record length was 72. NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables. NOTE: DATA statement used (Total process time): real time 0.54 seconds cpu time 0.07 seconds
the second is:
1 data P.Publications2 ; 2 infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767 2 ! firstobs = 2 ; 3 input pat_publn_id :29. 4 publn_auth :$29. 5 publn_nr :$29. 6 publn_nr_original :$29. 7 publn_kind :$29. 8 appln_id :29. 9 publn_date :YYMMDD10. 10 publn_lg :$29. 11 publn_first_grant:29. 12 publn_claims :29. ; 13 format publn_date :YYMMDDd10. ; 14 run ; NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the file encoding does not match the session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce performance. NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is: Filename=R:\L\P\tls211_pat_publn.csv, RECFM=V,LRECL=131068,File Size (bytes)=532, Last Modified=11 May 2018 16:47:59 o'clock, Create Time=11 May 2018 16:47:58 o'clock NOTE: Invalid data for pat_publn_id in line 2 1-32. RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--- 2 390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;" ";0;0 72 pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1 NOTE: Invalid data for pat_publn_id in line 3 1-32. 3 390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;" ";0;0 72 pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2 NOTE: Invalid data for pat_publn_id in line 4 1-62. 4 900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;" ";0;0 62 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3 NOTE: Invalid data for pat_publn_id in line 5 1-63. 5 900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;" ";0;0 63 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4 NOTE: Invalid data for pat_publn_id in line 6 1-63. 6 900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;" ";0;0 63 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5 NOTE: Invalid data for pat_publn_id in line 7 1-62. 7 900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;" ";0;0 62 pat_publn_id=. publn_auth= publn_nr= publn_nr_original= publn_kind= appln_id=. publn_date=. publn_lg= publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6 NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'. The minimum record length was 62. The maximum record length was 72. NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables. NOTE: DATA statement used (Total process time): real time 0.54 seconds cpu time 0.07 seconds
could you please give me some suggestion about how to read pat_publn_id in SAS? thanks
The reason both of these fail is the same. You are actually attempting to read the data as fixed column input not delimited. And the data shown indicates that the data is semicolon delimited, not CSV. So your delimiter is wrong.
And why the -bleep- do you have all of the informats attempting to read 29 columns???
See if this works any better.
data P.Publications2 ; infile 'R/L/P/tls211_pat_publn.csv' DLM = ';' DSD missover lrecl = 32767 firstobs = 2 ; informat pat_publn_id 29. publn_auth $29. publn_nr $29. publn_nr_original $29. publn_kind $29. appln_id 29. publn_date YYMMDD10. publn_lg $29. publn_first_grant 29. publn_claims 29. ; format publn_date YYMMDDd10. ; input pat_publn_id publn_auth $ publn_nr $ publn_nr_original $ publn_kind $ appln_id publn_date publn_lg $ publn_first_grant publn_claims ; run ;
The only change you really needed to make was to DLM
DLM = ';'
Your original input statement with colon INFORMAT modifiers is a correct alternative to using an INFORMAT statement.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.