BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
France
Quartz | Level 8

hello all,

 

I am reading a simple .csv file in SAS separately by following two codes, but both fail. I don't why I can't do it. could

the first log is:

 

NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M3 MBCS3170)
      Licensed to UNIVERSITY OF NEWCASTLE UPON TYNE, Site 70117723.
NOTE: This session is executing on the X64_7PRO  platform.



NOTE: Updated analytical products:

      SAS/STAT 14.1
      SAS/ETS 14.1
      SAS/OR 14.1
      SAS/IML 14.1
      SAS/QC 14.1

NOTE: Additional host information:

 X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation

NOTE: SAS initialization used:
      real time           8.97 seconds
      cpu time            0.82 seconds


NOTE: AUTOEXEC processing beginning; file is C:\Users\B\autoexec.sas.

NOTE: Libref SASDATA was successfully assigned as follows:
      Engine:        V9
      Physical Name: R:\L\SASDATA

NOTE: AUTOEXEC processing completed.

1    data P.Publications2 ;
2    infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2  ! firstobs = 2 ;
3    input pat_publn_id :29.
4    publn_auth :$29.
5    publn_nr :$29.
6    publn_nr_original :$29.
7    publn_kind :$29.
8    appln_id :29.
9    publn_date :YYMMDD10.
10   publn_lg :$29.
11   publn_first_grant:29.
12   publn_claims :29. ;
13   format publn_date :YYMMDDd10. ;
14   run ;

NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
      file encoding does not match the session encoding. Cross Environment Data Access will be
      used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
      Filename=R:\L\P\tls211_pat_publn.csv,
      RECFM=V,LRECL=131068,File Size (bytes)=532,
      Last Modified=11 May 2018 16:47:59 o'clock,
      Create Time=11 May 2018 16:47:58 o'clock

NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2         390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3         390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4         900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5         900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6         900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7         900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
      The minimum record length was 62.
      The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.54 seconds
      cpu time            0.07 seconds


the second is:

1    data P.Publications2 ;
2    infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2  ! firstobs = 2 ;
3    input pat_publn_id :29.
4    publn_auth :$29.
5    publn_nr :$29.
6    publn_nr_original :$29.
7    publn_kind :$29.
8    appln_id :29.
9    publn_date :YYMMDD10.
10   publn_lg :$29.
11   publn_first_grant:29.
12   publn_claims :29. ;
13   format publn_date :YYMMDDd10. ;
14   run ;

NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
      file encoding does not match the session encoding. Cross Environment Data Access will be
      used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
      Filename=R:\L\P\tls211_pat_publn.csv,
      RECFM=V,LRECL=131068,File Size (bytes)=532,
      Last Modified=11 May 2018 16:47:59 o'clock,
      Create Time=11 May 2018 16:47:58 o'clock

NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2         390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3         390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4         900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5         900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6         900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7         900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
      The minimum record length was 62.
      The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.54 seconds
      cpu time            0.07 seconds


could you please give me some suggestion about how to read pat_publn_id in SAS? thanks

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The reason both of these fail is the same. You are actually attempting to read the data as fixed column input not delimited. And the data shown indicates that the data is semicolon delimited, not CSV. So your delimiter is wrong.

And why the -bleep- do you have all of the informats attempting to read 29 columns???

 

See if this works any better.

data P.Publications2 ;
   infile 'R/L/P/tls211_pat_publn.csv' DLM = ';' DSD missover lrecl = 32767
   firstobs = 2 ;
   informat 
      pat_publn_id 29.
      publn_auth $29.
      publn_nr $29.
      publn_nr_original $29.
      publn_kind $29.
      appln_id 29.
      publn_date YYMMDD10.
      publn_lg $29.
      publn_first_grant 29.
      publn_claims 29. 
   ;
   format publn_date YYMMDDd10. ;
   input 
      pat_publn_id 
      publn_auth $
      publn_nr $
      publn_nr_original $
      publn_kind $
      appln_id 
      publn_date 
      publn_lg $
      publn_first_grant 
      publn_claims  
  ;
run ;

 

View solution in original post

4 REPLIES 4
data_null__
Jade | Level 19

Looks like your TXT file is delimited with simi-colon not comma.

 


@France wrote:

hello all,

 

I am reading a simple .csv file in SAS separately by following two codes, but both fail. I don't why I can't do it. could

the first log is:

 

NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA.
NOTE: SAS (r) Proprietary Software 9.4 (TS1M3 MBCS3170)
      Licensed to UNIVERSITY OF NEWCASTLE UPON TYNE, Site 70117723.
NOTE: This session is executing on the X64_7PRO  platform.



NOTE: Updated analytical products:

      SAS/STAT 14.1
      SAS/ETS 14.1
      SAS/OR 14.1
      SAS/IML 14.1
      SAS/QC 14.1

NOTE: Additional host information:

 X64_7PRO WIN 6.1.7601 Service Pack 1 Workstation

NOTE: SAS initialization used:
      real time           8.97 seconds
      cpu time            0.82 seconds


NOTE: AUTOEXEC processing beginning; file is C:\Users\B\autoexec.sas.

NOTE: Libref SASDATA was successfully assigned as follows:
      Engine:        V9
      Physical Name: R:\L\SASDATA

NOTE: AUTOEXEC processing completed.

1    data P.Publications2 ;
2    infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2  ! firstobs = 2 ;
3    input pat_publn_id :29.
4    publn_auth :$29.
5    publn_nr :$29.
6    publn_nr_original :$29.
7    publn_kind :$29.
8    appln_id :29.
9    publn_date :YYMMDD10.
10   publn_lg :$29.
11   publn_first_grant:29.
12   publn_claims :29. ;
13   format publn_date :YYMMDDd10. ;
14   run ;

NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
      file encoding does not match the session encoding. Cross Environment Data Access will be
      used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
      Filename=R:\L\P\tls211_pat_publn.csv,
      RECFM=V,LRECL=131068,File Size (bytes)=532,
      Last Modified=11 May 2018 16:47:59 o'clock,
      Create Time=11 May 2018 16:47:58 o'clock

NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2         390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3         390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4         900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5         900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6         900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7         900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
      The minimum record length was 62.
      The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.54 seconds
      cpu time            0.07 seconds


the second is:

1    data P.Publications2 ;
2    infile 'R:/L/P/tls211_pat_publn.csv' DLM = ',' DSD missover lrecl = 32767
2  ! firstobs = 2 ;
3    input pat_publn_id :29.
4    publn_auth :$29.
5    publn_nr :$29.
6    publn_nr_original :$29.
7    publn_kind :$29.
8    appln_id :29.
9    publn_date :YYMMDD10.
10   publn_lg :$29.
11   publn_first_grant:29.
12   publn_claims :29. ;
13   format publn_date :YYMMDDd10. ;
14   run ;

NOTE: Data file P.PUBLICATIONS2.DATA is in a format that is native to another host, or the
      file encoding does not match the session encoding. Cross Environment Data Access will be
      used, which might require additional CPU resources and might reduce performance.
NOTE: The infile 'R:/L/P/tls211_pat_publn.csv' is:
      Filename=R:\L\P\tls211_pat_publn.csv,
      RECFM=V,LRECL=131068,File Size (bytes)=532,
      Last Modified=11 May 2018 16:47:59 o'clock,
      Create Time=11 May 2018 16:47:58 o'clock

NOTE: Invalid data for pat_publn_id in line 2 1-32.
RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+---
2         390735978;"HK";"1053433";"09/465,054";"A1";275562685;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=054";"A1";275562685;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=1
NOTE: Invalid data for pat_publn_id in line 3 1-32.
3         390753387;"HK";"1092990";"60/523,466";"A1";275562719;2010-03-26;"  ";0;0 72
pat_publn_id=. publn_auth=466";"A1";275562719;2010-03-2 publn_nr=  publn_nr_original=
publn_kind=  appln_id=. publn_date=. publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=2
NOTE: Invalid data for pat_publn_id in line 4 1-62.
4         900093015;"JP";"1174460";"";"A ";930063756;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=3
NOTE: Invalid data for pat_publn_id in line 5 1-63.
5         900063480;"JP";"03291571";"";"A ";930090062;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=4
NOTE: Invalid data for pat_publn_id in line 6 1-63.
6         900091165;"JP";"61193139";"";"A ";930101529;9999-12-31;"  ";0;0 63
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=5
NOTE: Invalid data for pat_publn_id in line 7 1-62.
7         900122268;"JP";"5861053";"";"A ";930111753;9999-12-31;"  ";0;0 62
pat_publn_id=. publn_auth=  publn_nr=  publn_nr_original=  publn_kind=  appln_id=. publn_date=.
publn_lg=  publn_first_grant=. publn_claims=. _ERROR_=1 _N_=6
NOTE: 6 records were read from the infile 'R:/Lipeng_Wang/PATSTAT/tls211_pat_publn.csv'.
      The minimum record length was 62.
      The maximum record length was 72.
NOTE: The data set P.PUBLICATIONS2 has 6 observations and 10 variables.
NOTE: DATA statement used (Total process time):
      real time           0.54 seconds
      cpu time            0.07 seconds


could you please give me some suggestion about how to read pat_publn_id in SAS? thanks

 


 

ballardw
Super User

The reason both of these fail is the same. You are actually attempting to read the data as fixed column input not delimited. And the data shown indicates that the data is semicolon delimited, not CSV. So your delimiter is wrong.

And why the -bleep- do you have all of the informats attempting to read 29 columns???

 

See if this works any better.

data P.Publications2 ;
   infile 'R/L/P/tls211_pat_publn.csv' DLM = ';' DSD missover lrecl = 32767
   firstobs = 2 ;
   informat 
      pat_publn_id 29.
      publn_auth $29.
      publn_nr $29.
      publn_nr_original $29.
      publn_kind $29.
      appln_id 29.
      publn_date YYMMDD10.
      publn_lg $29.
      publn_first_grant 29.
      publn_claims 29. 
   ;
   format publn_date YYMMDDd10. ;
   input 
      pat_publn_id 
      publn_auth $
      publn_nr $
      publn_nr_original $
      publn_kind $
      appln_id 
      publn_date 
      publn_lg $
      publn_first_grant 
      publn_claims  
  ;
run ;

 

France
Quartz | Level 8
many thanks for your help.
data_null__
Jade | Level 19

The only change you really needed to make was to DLM

 

DLM = ';'

Your original input statement with colon INFORMAT modifiers is a correct alternative to using an INFORMAT statement.

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1113 views
  • 3 likes
  • 3 in conversation