Hi I am reading the official SAS specialist prep guide, on page 39, there is an example on importing space-delimited .txt files.
I did the exact same code as in the example like this
options validvarname=v7; filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' lrecl=100; proc import datafile=stdata dbms=dlm out=states replace; delimiter=' '; getnames=yes; run; proc print data=states; run;
and I got lots of empty variables like this
I had a look at the .txt file itself and it has lots of spaces after the last data in very line, which I think is the reason why the empty variables were created.
how can I prevent SAS from creating the empty variables?
That file has fixed length 80 byte records.
So TONS of extra spaces.
Since you told PROC IMPORT to treat each space as a delimiter that means there are tons of empty values there.
40 data _null_; 41 infile 'c:\downloads\state_data.txt' ; 42 input; 43 list; 44 run; NOTE: The infile 'c:\downloads\state_data.txt' is: Filename=c:\downloads\state_data.txt, RECFM=V,LRECL=32767,File Size (bytes)=486, Last Modified=04Nov2022:17:53:09, Create Time=04Nov2022:17:53:09 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 Region State Capital Bird 80 2 South Georgia Atlanta 'Brown Thrasher' 80 3 South 'North Carolina' Raleigh Cardinal 80 4 North Connecticut Hartford Robin 80 5 West Washington Olympia 'American Goldfinch' 80 6 Midwest Illinois Springfield Cardinal 80 NOTE: 6 records were read from the infile 'c:\downloads\state_data.txt'. The minimum record length was 80. The maximum record length was 80. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
I suspect that the file used by the training did NOT have those spaces padded on the end of the lines.
If you were using the plain OLD normal SAS program editor you could just open that file and resave it and the spurious trailing spaces would not be written. The newer "enchanced" editors from SAS make that harder.
Otherwise used the data step I posted before the create a version that does not have the extra spaces.
50 data _null_; 51 infile 'c:\downloads\state_data.fixed' ; 52 input; 53 list; 54 run; NOTE: The infile 'c:\downloads\state_data.fixed' is: Filename=c:\downloads\state_data.fixed, RECFM=V,LRECL=32767,File Size (bytes)=227, Last Modified=04Nov2022:18:03:26, Create Time=04Nov2022:17:59:44 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 Region State Capital Bird 25 2 South Georgia Atlanta 'Brown Thrasher' 38 3 South 'North Carolina' Raleigh Cardinal 39 4 North Connecticut Hartford Robin 32 5 West Washington Olympia 'American Goldfinch' 44 6 Midwest Illinois Springfield Cardinal 37 NOTE: 6 records were read from the infile 'c:\downloads\state_data.fixed'. The minimum record length was 25. The maximum record length was 44.
What does you log show when you run the proc import code?
What program did you use to look at the TXT file and see the trailing spaces?
Hi Ballardw:
here is the log when I ran my code
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 70 71 options validvarname=v7; 72 73 filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' 74 lrecl=100; 75 proc import datafile=stdata 76 dbms=dlm 77 out=states 78 replace; 79 delimiter=' '; 80 getnames=yes; 81 run; NOTE: Unable to open parameter catalog: SASUSER.PARMS.PARMS.SLIST in update mode. Temporary parameter values will be saved to WORK.PARMS.PARMS.SLIST. Number of names found is greater than number of variables found. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Name is not a valid SAS name. Problems were detected with provided names. See LOG. 82 /********************************************************************** 83 * PRODUCT: SAS 84 * VERSION: 9.4 85 * CREATOR: External File Interface 86 * DATE: 04NOV22 87 * DESC: Generated SAS Datastep Code 88 * TEMPLATE SOURCE: (None Specified.) 89 ***********************************************************************/ 90 data WORK.STATES ; 91 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */ 92 infile STDATA delimiter = ' ' MISSOVER DSD firstobs=2 ; 93 informat Region $7. ; 94 informat State $16. ; 95 informat Capital $11. ; 96 informat Bird $20. ; 97 informat VAR5 $1. ; 98 informat VAR6 $1. ; 99 informat VAR7 $1. ; 100 informat VAR8 $1. ; 101 informat VAR9 $1. ; 102 informat VAR10 $1. ; 103 informat VAR11 $1. ; 104 informat VAR12 $1. ; 105 informat VAR13 $1. ; 106 informat VAR14 $1. ; 107 informat VAR15 $1. ; 108 informat VAR16 $1. ; 109 informat VAR17 $1. ; 110 informat VAR18 $1. ; 111 informat VAR19 $1. ; 112 informat VAR20 $1. ; 113 informat VAR21 $1. ; 114 informat VAR22 $1. ; 115 informat VAR23 $1. ; 116 informat VAR24 $1. ; 117 informat VAR25 $1. ; 118 informat VAR26 $1. ; 119 informat VAR27 $1. ; 120 informat VAR28 $1. ; 121 informat VAR29 $1. ; 122 informat VAR30 $1. ; 123 informat VAR31 $1. ; 124 informat VAR32 $1. ; 125 informat VAR33 $1. ; 126 informat VAR34 $1. ; 127 informat VAR35 $1. ; 128 informat VAR36 $1. ; 129 informat VAR37 $1. ; 130 informat VAR38 $1. ; 131 informat VAR39 $1. ; 132 informat VAR40 $1. ; 133 informat VAR41 $1. ; 134 informat VAR42 $1. ; 135 informat VAR43 $1. ; 136 informat VAR44 $1. ; 137 informat VAR45 $1. ; 138 informat VAR46 $1. ; 139 informat VAR47 $1. ; 140 informat VAR48 $1. ; 141 informat VAR49 $1. ; 142 informat VAR50 $1. ; 143 informat VAR51 $1. ; 144 informat VAR52 $1. ; 145 format Region $7. ; 146 format State $16. ; 147 format Capital $11. ; 148 format Bird $20. ; 149 format VAR5 $1. ; 150 format VAR6 $1. ; 151 format VAR7 $1. ; 152 format VAR8 $1. ; 153 format VAR9 $1. ; 154 format VAR10 $1. ; 155 format VAR11 $1. ; 156 format VAR12 $1. ; 157 format VAR13 $1. ; 158 format VAR14 $1. ; 159 format VAR15 $1. ; 160 format VAR16 $1. ; 161 format VAR17 $1. ; 162 format VAR18 $1. ; 163 format VAR19 $1. ; 164 format VAR20 $1. ; 165 format VAR21 $1. ; 166 format VAR22 $1. ; 167 format VAR23 $1. ; 168 format VAR24 $1. ; 169 format VAR25 $1. ; 170 format VAR26 $1. ; 171 format VAR27 $1. ; 172 format VAR28 $1. ; 173 format VAR29 $1. ; 174 format VAR30 $1. ; 175 format VAR31 $1. ; 176 format VAR32 $1. ; 177 format VAR33 $1. ; 178 format VAR34 $1. ; 179 format VAR35 $1. ; 180 format VAR36 $1. ; 181 format VAR37 $1. ; 182 format VAR38 $1. ; 183 format VAR39 $1. ; 184 format VAR40 $1. ; 185 format VAR41 $1. ; 186 format VAR42 $1. ; 187 format VAR43 $1. ; 188 format VAR44 $1. ; 189 format VAR45 $1. ; 190 format VAR46 $1. ; 191 format VAR47 $1. ; 192 format VAR48 $1. ; 193 format VAR49 $1. ; 194 format VAR50 $1. ; 195 format VAR51 $1. ; 196 format VAR52 $1. ; 197 input 198 Region $ 199 State $ 200 Capital $ 201 Bird $ 202 VAR5 $ 203 VAR6 $ 204 VAR7 $ 205 VAR8 $ 206 VAR9 $ 207 VAR10 $ 208 VAR11 $ 209 VAR12 $ 210 VAR13 $ 211 VAR14 $ 212 VAR15 $ 213 VAR16 $ 214 VAR17 $ 215 VAR18 $ 216 VAR19 $ 217 VAR20 $ 218 VAR21 $ 219 VAR22 $ 220 VAR23 $ 221 VAR24 $ 222 VAR25 $ 223 VAR26 $ 224 VAR27 $ 225 VAR28 $ 226 VAR29 $ 227 VAR30 $ 228 VAR31 $ 229 VAR32 $ 230 VAR33 $ 231 VAR34 $ 232 VAR35 $ 233 VAR36 $ 234 VAR37 $ 235 VAR38 $ 236 VAR39 $ 237 VAR40 $ 238 VAR41 $ 239 VAR42 $ 240 VAR43 $ 241 VAR44 $ 242 VAR45 $ 243 VAR46 $ 244 VAR47 $ 245 VAR48 $ 246 VAR49 $ 247 VAR50 $ 248 VAR51 $ 249 VAR52 $ 250 ; 251 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */ 252 run; NOTE: The infile STDATA is: Filename=/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt, Owner Name=u62043935,Group Name=oda, Access Permission=-rw-r--r--, Last Modified=30 October 2022 15:10:58, File Size (bytes)=486 NOTE: 5 records were read from the infile STDATA. The minimum record length was 80. The maximum record length was 80. NOTE: The data set WORK.STATES has 5 observations and 52 variables. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.00 seconds system cpu time 0.00 seconds memory 11079.00k OS Memory 38440.00k Timestamp 04/11/2022 07:31:57 PM Step Count 24 Switch Count 2 Page Faults 0 Page Reclaims 230 Page Swaps 0 Voluntary Context Switches 11 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 272 5 rows created in WORK.STATES from STDATA. NOTE: WORK.STATES data set was successfully created. NOTE: The data set WORK.STATES has 5 observations and 52 variables. NOTE: PROCEDURE IMPORT used (Total process time): real time 0.10 seconds user cpu time 0.08 seconds system cpu time 0.04 seconds memory 11079.00k OS Memory 38440.00k Timestamp 04/11/2022 07:31:57 PM Step Count 24 Switch Count 58 Page Faults 0 Page Reclaims 5310 Page Swaps 0 Voluntary Context Switches 392 Involuntary Context Switches 0 Block Input Operations 8 Block Output Operations 384 253 254 proc print data=states; 255 run; NOTE: There were 5 observations read from the data set WORK.STATES. NOTE: PROCEDURE PRINT used (Total process time): real time 0.10 seconds user cpu time 0.10 seconds system cpu time 0.01 seconds memory 6698.71k OS Memory 37808.00k Timestamp 04/11/2022 07:31:57 PM Step Count 25 Switch Count 0 Page Faults 0 Page Reclaims 1505 Page Swaps 0 Voluntary Context Switches 0 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 24 256 257 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 268
I used SAS studio on OnDemand to see the .txt file, you can see when I highlight the data you can see that there are many spaces after the text.
Hello @Nietzsche ,
One thing you can do (as suggested in this paper) is copy-paste from the log the code (a data step using the infile statement) generated by PROC IMPORT and adapt it so that it suits your needs (it's basically a lot of lines to delete, and some row numbers to remove):
data WORK.STATES;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile STDATA delimiter=' ' MISSOVER DSD firstobs=2;
informat Region $7.;
informat State $16.;
informat Capital $11.;
informat Bird $20.;
format Region $7.;
format State $16.;
format Capital $11.;
format Bird $20.;
input Region $ State $ Capital $ Bird $;
if _ERROR_ then call symputx('_EFIERR_', 1); /* set ERROR detection macro variable */
run;
HTH
Anne.
Why would you use PROC IMPROT to read a text file with only four variables?
Typing the data step your self will be less code than the PROC IMPORT code.
Plus you then have complete control over how the variables are defined.
data states;
input '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' dsd dlm=' ' truncover firstobs=2;
input region :$20. state :$30. capital :$30. bird :$40. ;
run;
If you want to remove the trailing spaces from the file here is a simple data step.
filename stdata '/home/u62043935/Specialist_Prep_Guide/cert/state_data.txt' ;
filename fixed temp;
data _null_;
infile stdata truncover;
file fixed;
input line $char100.;
len=lengthn(line);
put line $varying100. len ;
run;
proc import datafile=fixed
...
Because I am learning and I am just following the book right now. Chapter 4.2 is about PROC IMPORT and the book is just demonstrating how to import various types of file formats with PROC IMPORT.
The result printed on the book did not have any empty variables where as I did, hence why I created this thread, I am sure there better ways of doing certain things in SAS, but that is not the point, I just wish to know why I have many empty variables and book does not, and if it due to empty spaces after the text in the .txt, how can do eliminate it with PROC IMPORT.
@Nietzsche wrote:
, I just wish to know why I have many empty variables and book does not, and if it due to empty spaces after the text in the .txt, how can do eliminate it with PROC IMPORT.
I suspect the End of Line character in the text file is not the same as the system you're running the code.
https://chortle.ccsu.edu/finiteautomata/Section09/sect09_9.html
I would explicitly specify the TERMSTR option on the FILE statement to fix this within PROC IMPORT.
https://documentation.sas.com/doc/en/pgmsascdc/v_032/lestmtsref/n15o12lpyoe4gfn1y1vcp6xs6966.htm
If you'd like assistance confirming this is an issue please upload the text file.
I have attached the .txt file below.
That file has fixed length 80 byte records.
So TONS of extra spaces.
Since you told PROC IMPORT to treat each space as a delimiter that means there are tons of empty values there.
40 data _null_; 41 infile 'c:\downloads\state_data.txt' ; 42 input; 43 list; 44 run; NOTE: The infile 'c:\downloads\state_data.txt' is: Filename=c:\downloads\state_data.txt, RECFM=V,LRECL=32767,File Size (bytes)=486, Last Modified=04Nov2022:17:53:09, Create Time=04Nov2022:17:53:09 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 Region State Capital Bird 80 2 South Georgia Atlanta 'Brown Thrasher' 80 3 South 'North Carolina' Raleigh Cardinal 80 4 North Connecticut Hartford Robin 80 5 West Washington Olympia 'American Goldfinch' 80 6 Midwest Illinois Springfield Cardinal 80 NOTE: 6 records were read from the infile 'c:\downloads\state_data.txt'. The minimum record length was 80. The maximum record length was 80. NOTE: DATA statement used (Total process time): real time 0.01 seconds cpu time 0.01 seconds
I suspect that the file used by the training did NOT have those spaces padded on the end of the lines.
If you were using the plain OLD normal SAS program editor you could just open that file and resave it and the spurious trailing spaces would not be written. The newer "enchanced" editors from SAS make that harder.
Otherwise used the data step I posted before the create a version that does not have the extra spaces.
50 data _null_; 51 infile 'c:\downloads\state_data.fixed' ; 52 input; 53 list; 54 run; NOTE: The infile 'c:\downloads\state_data.fixed' is: Filename=c:\downloads\state_data.fixed, RECFM=V,LRECL=32767,File Size (bytes)=227, Last Modified=04Nov2022:18:03:26, Create Time=04Nov2022:17:59:44 RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 1 Region State Capital Bird 25 2 South Georgia Atlanta 'Brown Thrasher' 38 3 South 'North Carolina' Raleigh Cardinal 39 4 North Connecticut Hartford Robin 32 5 West Washington Olympia 'American Goldfinch' 44 6 Midwest Illinois Springfield Cardinal 37 NOTE: 6 records were read from the infile 'c:\downloads\state_data.fixed'. The minimum record length was 25. The maximum record length was 44.
That's probably worth raising to SAS e-learning and they'll usually update the materials.
Once you realize which variables are extra it is simple enough to remove them.
data states_fixed;
set states;
drop var: ;
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.