BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Nietzsche
Lapis Lazuli | Level 10

Hi I am reading the official SAS specialist prep guide, on page 39, there is an example on importing space-delimited .txt files.

 

Nietzsche_0-1667560393531.png

 

I did the exact same code as in the example like this

options validvarname=v7;

filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt'
lrecl=100;
proc import datafile=stdata
dbms=dlm
out=states
replace;
delimiter=' ';
getnames=yes;
run;

proc print data=states;
run;

and I got lots of empty variables like this 

Nietzsche_1-1667560506363.png

 

I had a look at the .txt file itself and it has lots of spaces after the last data in very line, which I think is the reason why the empty variables were created.

 

how can I prevent SAS from creating the empty variables? 

 

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

That file has fixed length 80 byte records. 

So TONS of extra spaces.

Since you told PROC IMPORT to treat each space as a delimiter that means there are tons of empty values there.

40   data _null_;
41    infile 'c:\downloads\state_data.txt' ;
42    input;
43    list;
44   run;

NOTE: The infile 'c:\downloads\state_data.txt' is:
      Filename=c:\downloads\state_data.txt,
      RECFM=V,LRECL=32767,File Size (bytes)=486,
      Last Modified=04Nov2022:17:53:09,
      Create Time=04Nov2022:17:53:09

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         Region State Capital Bird                                                        80
2         South Georgia Atlanta 'Brown Thrasher'                                           80
3         South 'North Carolina' Raleigh Cardinal                                          80
4         North Connecticut Hartford Robin                                                 80
5         West Washington Olympia 'American Goldfinch'                                     80
6         Midwest Illinois Springfield Cardinal                                            80
NOTE: 6 records were read from the infile 'c:\downloads\state_data.txt'.
      The minimum record length was 80.
      The maximum record length was 80.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

I suspect that the file used by the training did NOT have those spaces padded on the end of the lines.

 

If you were using the plain OLD normal SAS program editor you could just open that file and resave it and the spurious trailing spaces would not be written.  The newer "enchanced" editors from SAS make that harder.

 

Otherwise used the data step I posted before the create a version that does not have the extra spaces.

50   data _null_;
51    infile 'c:\downloads\state_data.fixed' ;
52    input;
53    list;
54   run;

NOTE: The infile 'c:\downloads\state_data.fixed' is:
      Filename=c:\downloads\state_data.fixed,
      RECFM=V,LRECL=32767,File Size (bytes)=227,
      Last Modified=04Nov2022:18:03:26,
      Create Time=04Nov2022:17:59:44

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         Region State Capital Bird 25
2         South Georgia Atlanta 'Brown Thrasher' 38
3         South 'North Carolina' Raleigh Cardinal 39
4         North Connecticut Hartford Robin 32
5         West Washington Olympia 'American Goldfinch' 44
6         Midwest Illinois Springfield Cardinal 37
NOTE: 6 records were read from the infile 'c:\downloads\state_data.fixed'.
      The minimum record length was 25.
      The maximum record length was 44.

View solution in original post

13 REPLIES 13
ballardw
Super User

What does you log show when you run the proc import code?

 

What program did you use to look at the TXT file and see the trailing spaces?

 

Nietzsche
Lapis Lazuli | Level 10

Hi Ballardw:

 

here is the log when I ran my code

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 70         
 71         options validvarname=v7;
 72         
 73         filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt'
 74         lrecl=100;
 75         proc import datafile=stdata
 76         dbms=dlm
 77         out=states
 78         replace;
 79         delimiter=' ';
 80         getnames=yes;
 81         run;
 
 NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to 
 WORK.PARMS.PARMS.SLIST.
 Number of names found is greater than number of variables found. 
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Name  is not a valid SAS name.
 Problems were detected with provided names.  See LOG. 
 82          /**********************************************************************
 83          *   PRODUCT:   SAS
 84          *   VERSION:   9.4
 85          *   CREATOR:   External File Interface
 86          *   DATE:      04NOV22
 87          *   DESC:      Generated SAS Datastep Code
 88          *   TEMPLATE SOURCE:  (None Specified.)
 89          ***********************************************************************/
 90             data WORK.STATES    ;
 91             %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
 92             infile STDATA delimiter = ' ' MISSOVER DSD  firstobs=2 ;
 93                informat Region $7. ;
 94                informat State $16. ;
 95                informat Capital $11. ;
 96                informat Bird $20. ;
 97                informat VAR5 $1. ;
 98                informat VAR6 $1. ;
 99                informat VAR7 $1. ;
 100               informat VAR8 $1. ;
 101               informat VAR9 $1. ;
 102               informat VAR10 $1. ;
 103               informat VAR11 $1. ;
 104               informat VAR12 $1. ;
 105               informat VAR13 $1. ;
 106               informat VAR14 $1. ;
 107               informat VAR15 $1. ;
 108               informat VAR16 $1. ;
 109               informat VAR17 $1. ;
 110               informat VAR18 $1. ;
 111               informat VAR19 $1. ;
 112               informat VAR20 $1. ;
 113               informat VAR21 $1. ;
 114               informat VAR22 $1. ;
 115               informat VAR23 $1. ;
 116               informat VAR24 $1. ;
 117               informat VAR25 $1. ;
 118               informat VAR26 $1. ;
 119               informat VAR27 $1. ;
 120               informat VAR28 $1. ;
 121               informat VAR29 $1. ;
 122               informat VAR30 $1. ;
 123               informat VAR31 $1. ;
 124               informat VAR32 $1. ;
 125               informat VAR33 $1. ;
 126               informat VAR34 $1. ;
 127               informat VAR35 $1. ;
 128               informat VAR36 $1. ;
 129               informat VAR37 $1. ;
 130               informat VAR38 $1. ;
 131               informat VAR39 $1. ;
 132               informat VAR40 $1. ;
 133               informat VAR41 $1. ;
 134               informat VAR42 $1. ;
 135               informat VAR43 $1. ;
 136               informat VAR44 $1. ;
 137               informat VAR45 $1. ;
 138               informat VAR46 $1. ;
 139               informat VAR47 $1. ;
 140               informat VAR48 $1. ;
 141               informat VAR49 $1. ;
 142               informat VAR50 $1. ;
 143               informat VAR51 $1. ;
 144               informat VAR52 $1. ;
 145               format Region $7. ;
 146               format State $16. ;
 147               format Capital $11. ;
 148               format Bird $20. ;
 149               format VAR5 $1. ;
 150               format VAR6 $1. ;
 151               format VAR7 $1. ;
 152               format VAR8 $1. ;
 153               format VAR9 $1. ;
 154               format VAR10 $1. ;
 155               format VAR11 $1. ;
 156               format VAR12 $1. ;
 157               format VAR13 $1. ;
 158               format VAR14 $1. ;
 159               format VAR15 $1. ;
 160               format VAR16 $1. ;
 161               format VAR17 $1. ;
 162               format VAR18 $1. ;
 163               format VAR19 $1. ;
 164               format VAR20 $1. ;
 165               format VAR21 $1. ;
 166               format VAR22 $1. ;
 167               format VAR23 $1. ;
 168               format VAR24 $1. ;
 169               format VAR25 $1. ;
 170               format VAR26 $1. ;
 171               format VAR27 $1. ;
 172               format VAR28 $1. ;
 173               format VAR29 $1. ;
 174               format VAR30 $1. ;
 175               format VAR31 $1. ;
 176               format VAR32 $1. ;
 177               format VAR33 $1. ;
 178               format VAR34 $1. ;
 179               format VAR35 $1. ;
 180               format VAR36 $1. ;
 181               format VAR37 $1. ;
 182               format VAR38 $1. ;
 183               format VAR39 $1. ;
 184               format VAR40 $1. ;
 185               format VAR41 $1. ;
 186               format VAR42 $1. ;
 187               format VAR43 $1. ;
 188               format VAR44 $1. ;
 189               format VAR45 $1. ;
 190               format VAR46 $1. ;
 191               format VAR47 $1. ;
 192               format VAR48 $1. ;
 193               format VAR49 $1. ;
 194               format VAR50 $1. ;
 195               format VAR51 $1. ;
 196               format VAR52 $1. ;
 197            input
 198                        Region  $
 199                        State  $
 200                        Capital  $
 201                        Bird  $
 202                        VAR5  $
 203                        VAR6  $
 204                        VAR7  $
 205                        VAR8  $
 206                        VAR9  $
 207                        VAR10  $
 208                        VAR11  $
 209                        VAR12  $
 210                        VAR13  $
 211                        VAR14  $
 212                        VAR15  $
 213                        VAR16  $
 214                        VAR17  $
 215                        VAR18  $
 216                        VAR19  $
 217                        VAR20  $
 218                        VAR21  $
 219                        VAR22  $
 220                        VAR23  $
 221                        VAR24  $
 222                        VAR25  $
 223                        VAR26  $
 224                        VAR27  $
 225                        VAR28  $
 226                        VAR29  $
 227                        VAR30  $
 228                        VAR31  $
 229                        VAR32  $
 230                        VAR33  $
 231                        VAR34  $
 232                        VAR35  $
 233                        VAR36  $
 234                        VAR37  $
 235                        VAR38  $
 236                        VAR39  $
 237                        VAR40  $
 238                        VAR41  $
 239                        VAR42  $
 240                        VAR43  $
 241                        VAR44  $
 242                        VAR45  $
 243                        VAR46  $
 244                        VAR47  $
 245                        VAR48  $
 246                        VAR49  $
 247                        VAR50  $
 248                        VAR51  $
 249                        VAR52  $
 250            ;
 251            if _ERROR_ then call symputx('_EFIERR_',1);  /* set ERROR detection macro variable */
 252            run;
 
 NOTE: The infile STDATA is:
       Filename=/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt,
       Owner Name=u62043935,Group Name=oda,
       Access Permission=-rw-r--r--,
       Last Modified=30 October 2022 15:10:58,
       File Size (bytes)=486
 
 NOTE: 5 records were read from the infile STDATA.
       The minimum record length was 80.
       The maximum record length was 80.
 NOTE: The data set WORK.STATES has 5 observations and 52 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              11079.00k
       OS Memory           38440.00k
       Timestamp           04/11/2022 07:31:57 PM
       Step Count                        24  Switch Count  2
       Page Faults                       0
       Page Reclaims                     230
       Page Swaps                        0
       Voluntary Context Switches        11
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           272
       
 
 5 rows created in WORK.STATES from STDATA.
   
   
   
 NOTE: WORK.STATES data set was successfully created.
 NOTE: The data set WORK.STATES has 5 observations and 52 variables.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.10 seconds
       user cpu time       0.08 seconds
       system cpu time     0.04 seconds
       memory              11079.00k
       OS Memory           38440.00k
       Timestamp           04/11/2022 07:31:57 PM
       Step Count                        24  Switch Count  58
       Page Faults                       0
       Page Reclaims                     5310
       Page Swaps                        0
       Voluntary Context Switches        392
       Involuntary Context Switches      0
       Block Input Operations            8
       Block Output Operations           384
       
 
 253        
 254        proc print data=states;
 255        run;
 
 NOTE: There were 5 observations read from the data set WORK.STATES.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.10 seconds
       user cpu time       0.10 seconds
       system cpu time     0.01 seconds
       memory              6698.71k
       OS Memory           37808.00k
       Timestamp           04/11/2022 07:31:57 PM
       Step Count                        25  Switch Count  0
       Page Faults                       0
       Page Reclaims                     1505
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           24
       
 
 256        
 257        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 268        

I used SAS studio on OnDemand to see the .txt file, you can see when I highlight the data you can see that there are many spaces after the text. 

 

Nietzsche_0-1667590288604.png

 

 

 

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
Anne_A
Quartz | Level 8

Hello @Nietzsche ,

One thing you can do (as suggested in this paper) is copy-paste from the log the code (a data step using the infile statement) generated by PROC IMPORT and adapt it so that it suits your needs (it's basically a lot of lines to delete, and some row numbers to remove):

data WORK.STATES;

	%let _EFIERR_ = 0; 	/* set the ERROR detection macro variable */
	infile STDATA delimiter=' ' MISSOVER DSD firstobs=2;
	
	informat Region $7.;
	informat State $16.;
	informat Capital $11.;
	informat Bird $20.;
	
	format Region $7.;
	format State $16.;
	format Capital $11.;
	format Bird $20.;
	
	input Region $  State $  Capital $   Bird $;

	if _ERROR_ then call symputx('_EFIERR_', 1); /* set ERROR detection macro variable */
run;

HTH

 

Anne.

Tom
Super User Tom
Super User

Why would you use PROC IMPROT to read a text file with only four variables?

Typing the data step your self will be less code than the PROC IMPORT code.  

Plus you then have complete control over how the variables are defined.

data states;
  input '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' dsd dlm=' ' truncover firstobs=2;
  input region :$20.  state :$30. capital :$30. bird :$40. ;
run;

If you want to remove the trailing spaces from the file here is a simple data step.

filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' ;
filename fixed temp;
data _null_;
  infile stdata truncover;
  file fixed;
  input line $char100.;
  len=lengthn(line);
  put line $varying100. len ;
run;
proc import datafile=fixed 
...
Nietzsche
Lapis Lazuli | Level 10

Because I am learning and I am just following the book right now. Chapter 4.2 is about PROC IMPORT and the book is just demonstrating how to import various types of file formats with PROC IMPORT. 

 

The result printed on the book did not have any empty variables where as I did, hence why I created this thread, I am sure there better ways of doing certain things in SAS, but that is not the point, I just wish to know why I have many empty variables and book does not, and if it due to empty spaces after the text in the .txt, how can do eliminate it with PROC IMPORT.

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
Reeza
Super User

@Nietzsche wrote:

, I just wish to know why I have many empty variables and book does not, and if it due to empty spaces after the text in the .txt, how can do eliminate it with PROC IMPORT.


I suspect the End of Line character in the text file is not the same as the system you're running the code. 

https://chortle.ccsu.edu/finiteautomata/Section09/sect09_9.html

 

I would explicitly specify the TERMSTR option on the FILE statement to fix this within PROC IMPORT.

https://documentation.sas.com/doc/en/pgmsascdc/v_032/lestmtsref/n15o12lpyoe4gfn1y1vcp6xs6966.htm

 

If you'd like assistance confirming this is an issue please upload the text file.

Nietzsche
Lapis Lazuli | Level 10

I have attached the .txt file below.

SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
Tom
Super User Tom
Super User

That file has fixed length 80 byte records. 

So TONS of extra spaces.

Since you told PROC IMPORT to treat each space as a delimiter that means there are tons of empty values there.

40   data _null_;
41    infile 'c:\downloads\state_data.txt' ;
42    input;
43    list;
44   run;

NOTE: The infile 'c:\downloads\state_data.txt' is:
      Filename=c:\downloads\state_data.txt,
      RECFM=V,LRECL=32767,File Size (bytes)=486,
      Last Modified=04Nov2022:17:53:09,
      Create Time=04Nov2022:17:53:09

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         Region State Capital Bird                                                        80
2         South Georgia Atlanta 'Brown Thrasher'                                           80
3         South 'North Carolina' Raleigh Cardinal                                          80
4         North Connecticut Hartford Robin                                                 80
5         West Washington Olympia 'American Goldfinch'                                     80
6         Midwest Illinois Springfield Cardinal                                            80
NOTE: 6 records were read from the infile 'c:\downloads\state_data.txt'.
      The minimum record length was 80.
      The maximum record length was 80.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

I suspect that the file used by the training did NOT have those spaces padded on the end of the lines.

 

If you were using the plain OLD normal SAS program editor you could just open that file and resave it and the spurious trailing spaces would not be written.  The newer "enchanced" editors from SAS make that harder.

 

Otherwise used the data step I posted before the create a version that does not have the extra spaces.

50   data _null_;
51    infile 'c:\downloads\state_data.fixed' ;
52    input;
53    list;
54   run;

NOTE: The infile 'c:\downloads\state_data.fixed' is:
      Filename=c:\downloads\state_data.fixed,
      RECFM=V,LRECL=32767,File Size (bytes)=227,
      Last Modified=04Nov2022:18:03:26,
      Create Time=04Nov2022:17:59:44

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1         Region State Capital Bird 25
2         South Georgia Atlanta 'Brown Thrasher' 38
3         South 'North Carolina' Raleigh Cardinal 39
4         North Connecticut Hartford Robin 32
5         West Washington Olympia 'American Goldfinch' 44
6         Midwest Illinois Springfield Cardinal 37
NOTE: 6 records were read from the infile 'c:\downloads\state_data.fixed'.
      The minimum record length was 25.
      The maximum record length was 44.
Reeza
Super User

That's probably worth raising to SAS e-learning and they'll usually update the materials. 

 

 

Nietzsche
Lapis Lazuli | Level 10
is there a forum for that? One of the file mentioned in the prep guide is missing from the data set. How can I request one?
SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
Nietzsche
Lapis Lazuli | Level 10
btw, chapter 7 in the prep guide will discuss DATA step in detail, I hope to be able to study that chapter today.
SAS Base Programming (2022 Dec), Preparing for SAS Advanced Programming (Cancelled).
Tom
Super User Tom
Super User

Once you realize which variables are extra it is simple enough to remove them.

data states_fixed;
  set states;
  drop var: ;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2283 views
  • 3 likes
  • 5 in conversation