Spartan63611
Calcite | Level 5

I am new to SAS.

I am attempting to import a .tsv dataset and I get an error message.

I included a screenshot of the log, the import command, and a sample of the data in WordPad.

I greatly appreciate any help. 

Thank you,

proc import datafile='C:\2605SAS\IMDB\Unzippeddata\name.basics.tsv'
            out=name
            dbms=dlm
            replace;
    datarow=10;
    delimiter='09'x;
run;

Log:

116 dbms=dlm
117 replace;
118 datarow=10;
119 delimiter='09'x;
120 run;

121 /**********************************************************************
122 * PRODUCT: SAS
123 * VERSION: 9.4
124 * CREATOR: External File Interface
125 * DATE: 08NOV18
126 * DESC: Generated SAS Datastep Code
127 * TEMPLATE SOURCE: (None Specified.)
128 ***********************************************************************/
129 data WORK.NAME ;
130 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
131 infile 'C:\2605SAS\IMDB\Unzippeddata\name.basics.tsv' delimiter='09'x MISSOVER DSD
131! lrecl=32767 firstobs=10 ;
132 informat nconst $9. ;
133 informat primaryName $19. ;
134 informat birthYear best32. ;
135 informat deathYear $4. ;
136 informat primaryProfession $37. ;
137 informat knownForTitles $39. ;
138 format nconst $9. ;
139 format primaryName $19. ;
140 format birthYear best12. ;
141 format deathYear $4. ;
142 format primaryProfession $37. ;
143 format knownForTitles $39. ;
144 input
145 nconst $
146 primaryName $
147 birthYear
148 deathYear $
149 primaryProfession $
150 knownForTitles $
151 ;
152 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */
153 run;

NOTE: The infile 'C:\2605SAS\IMDB\Unzippeddata\name.basics.tsv' is:
Filename=C:\2605SAS\IMDB\Unzippeddata\name.basics.tsv,
RECFM=V,LRECL=32767,
File Size (bytes)=539959919,
Last Modified=November 08, 2018 07:20:56,
Create Time=November 08, 2018 07:20:56

NOTE: Invalid data for birthYear in line 84 23-24.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----

84 CHAR nm0000083.Alan Miller.\N.\N.editor,writer,director.tt0969216,tt0170560,tt0424773 80
ZONE 66333333304666246666705405406667672776767266766767077333333327733333332773333333
NUMR ED000008391C1E0D9CC529CE9CE95494F2C729452C492534F29440969216C440170560C440424773
nconst=nm0000083 primaryName=Alan Miller birthYear=. deathYear=\N
primaryProfession=editor,writer,director knownForTitles=tt0969216,tt0170560,tt0424773 _ERROR_=1
_N_=75
NOTE: Invalid data for birthYear in line 95 21-22.

95 CHAR nm0000094.J. Reifel.\N.\N.writer.tt0118631,tt0117915,tt0116030,tt0118886 72
ZONE 663333333042256666605405407767670773333333277333333327733333332773333333
NUMR ED00000949AE025965C9CE9CE97294529440118631C440117915C440116030C440118886
nconst=nm0000094 primaryName=J. Reifel birthYear=. deathYear=\N primaryProfession=writer
knownForTitles=tt0118631,tt0117915,tt0116030,tt0118886 _ERROR_=1 _N_=86
NOTE: Invalid data for birthYear in line 648 24-25.


RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----

648 CHAR nm0000647.Alan Smithee.\N.\N.director,actor,writer.tt0118577,tt2076794,tt0086491,tt011651
ZONE 66333333304666256676660540540667667672667672776767077333333327733333332773333333277333333
NUMR ED000064791C1E03D948559CE9CE9492534F2C134F2C7294529440118577C442076794C440086491C44011651
90 4 90
nconst=nm0000647 primaryName=Alan Smithee birthYear=. deathYear=\N
primaryProfession=director,actor,writer knownForTitles=tt0118577,tt2076794,tt0086491,tt0116514
_ERROR_=1 _N_=639
NOTE: Invalid data for birthYear in line 713 23-24.

713 CHAR nm0000712.Steve Cohen.\N.\N.assistant_director,miscellaneous,director.tt0089218,tt0098844
ZONE 66333333305767624666605405406776776675667667672667666666667726676676707733333332773333333
NUMR ED000071293456503F85E9CE9CE91339341E4F492534F2CD9335CC1E5F53C492534F29440089218C440098844
90 ,tt0143213,tt0363685 109
nconst=nm0000712 primaryName=Steve Cohen birthYear=. deathYear=\N
primaryProfession=assistant_director,miscellaneous,dire
knownForTitles=tt0089218,tt0098844,tt0143213,tt0363685 _ERROR_=1 _N_=704
NOTE: Invalid data for birthYear in line 1269 24-25.

1269CHAR nm0001269.Cynthia Gibb.\N.\N.actress,soundtrack.tt0091886,tt0107065,tt0099385,tt0083412 87
ZONE 663333333047676662466605405406677677276766776660773333333277333333327733333332773333333
NUMR ED0001269939E4891079229CE9CE91342533C3F5E44213B9440091886C440107065C440099385C440083412
nconst=nm0001269 primaryName=Cynthia Gibb birthYear=. deathYear=\N
primaryProfession=actress,soundtrack knownForTitles=tt0091886,tt0107065,tt0099385,tt0083412
_ERROR_=1 _N_=1260


 
Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @Spartan63611,

 

It looks like unknown values of birthyear are coded as "\N" in the data. The same is true for deathyear. However, based on the first 20 data rows, SAS decided to treat birthyear as a numeric variable (for which \N is invalid) and deathyear as a character variable. So, a first step towards a solution is to insert the statement

guessingrows=200;

into the PROC IMPORT step.
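
For illustration, here is a minimal sketch of how the step from the question might look with that statement added. The file path, output name and other options are copied from the original post. The second step is only an assumption about a possible follow-up: it presumes that both year variables are imported as character once more rows are scanned, and the names name_clean, birthYear_num and deathYear_num are hypothetical.

```sas
/* Sketch: the PROC IMPORT step from the question plus GUESSINGROWS=. */
proc import datafile='C:\2605SAS\IMDB\Unzippeddata\name.basics.tsv'
            out=name
            dbms=dlm
            replace;
    delimiter='09'x;     /* tab-delimited file */
    datarow=10;          /* kept as in the original post */
    guessingrows=200;    /* scan 200 rows instead of the default 20 */
run;

/* Assumed follow-up, not part of the suggested fix:                  */
/* if birthYear and deathYear are now character, derive numeric       */
/* copies. The ?? modifier suppresses the "Invalid data" notes, so    */
/* "\N" simply becomes a missing value.                               */
data name_clean;                                /* hypothetical name */
    set name;
    birthYear_num = input(birthYear, ?? 4.);    /* hypothetical name */
    deathYear_num = input(deathYear, ?? 4.);    /* hypothetical name */
run;
```

With a file of this size, the first "\N" in a column may appear later than row 200, so a larger GUESSINGROWS= value could be needed; this depends on the data and was not tested here.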

7 REPLIES 7
PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS community 🙂 I don't see any attached files?

Spartan63611
Calcite | Level 5

Thanks for the notice, I will upload them again 🙂

Spartan63611
Calcite | Level 5
It's strange, I uploaded the screenshots but they do not show in the posted version...
Any idea what I could be doing wrong?
ballardw
Super User

@Spartan63611 wrote:
It's strange, I uploaded the screenshots but they do not show in the posted version...
Any idea what I could be doing wrong?

One thing: do not post screenshots of logs. The log is plain text, so copy the text from the log and paste it into a code box opened using the forum's {I} or "running man" icon.

That way, if someone wants to comment on a specific part of the code, they can copy, paste and correct, highlight or insert comments in the correct place.

 

Did you use the PHOTO icon in the forum to paste pictures?

Spartan63611
Calcite | Level 5
Yes, I used photos.
I will simply copy and paste like you suggested, thank you.
Spartan63611
Calcite | Level 5
Thank you

