@larryn3 wrote:
Attached is a word file which should show more clearly the way the columns are aligned.
Actually a word processor document is likely to be worse than plain text. If by any chance your data contains TAB characters, the "alignment" almost certainly will not be what should be read. Also, non-visible formatting characters can sneak into the file, making it a headache to copy the data out and read it.
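A quick way to check the TAB concern is to scan the raw file from SAS. This is only a sketch: 'myfile.txt' is a placeholder path, not a file from this thread.

```sas
/* 'myfile.txt' is a hypothetical path; point it at your own file */
data _null_;
  infile 'myfile.txt' truncover;
  input;                                  /* load the line into _INFILE_ */
  if index(_infile_, '09'x) > 0 then
    put 'TAB found on line ' _n_;         /* '09'x is the TAB character */
run;

/* If TABs are present, EXPANDTABS replaces each one with blanks */
/* out to the next tab stop before the line is read              */
data _null_;
  infile 'myfile.txt' expandtabs truncover;
  input;
  list;                                   /* show the expanded line in the log */
run;
```

EXPANDTABS assumes tab stops every 8 columns, so whether the expanded columns match the intended layout depends on how the file was produced.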
If the file is text, then post the actual text. Don't retype it into the forum. Open a text box using the </> icon, copy lines from the file after opening it in a plain text editor such as Notepad or even the SAS editor, and then paste those lines into the text box.
File 1
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1 N22OR44DLICHT, MxxAdK, b x26 OWNEdd RY) adLATINUM 81 01/2002 B Y N 161 304gg438g97
2 134PO5TEAT, PffAfU, AdsfO x26 CHIEF sf323+R ALfdL 335gr 81 06/2007 NA Y N 161 3f5g77r
File 2
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8
1 M11Af N43 GRsfsgANTfOR yy50 OWdaNERd- PLATsd(NsdaY) 81 03/2009 D Y 161 N x-xf3$x$5xxx
2 PLfsf33ATINUM CRdsfT yy50 GENE223RAL Pada2sARTNER - P 81 LATIsdaNU CRE23 09/2005 NA Y 161 N www
You are going to have to figure out the pattern yourself, I am afraid.
Is there no documentation provided that explains why the different files use different widths for the fields?
Read the first line (the one with C1 ... in it) and use it to calculate the starting locations for this version of the file.
Let's convert one of your example files into a physical file we can reference in the example code.
options parmcards=txt ;
filename txt temp;
parmcards4;
C1 C2 C3 c4 c5 c6 c7 c8
aaaa z8z 23 bbbbbbbbbbbbb s;lfjslf;sjl;dkfjs y ls;dkfjl ;ds n abc xyz
;;;;
Now let's read that first line and pick out the variable names and their start and end columns.
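data start;
  length col start end 8 name next $32;
  /* Read only the header line; CC tracks the column pointer */
  infile txt obs=1 column=cc truncover;
  start=1;
  input name @;
  do col=1 by 1 until(next=' ');
    input next @ ;
    /* After list input, CC sits two columns past the end of NEXT, */
    /* so this gives the column just before NEXT starts            */
    end = cc - lengthn(next) - 2;
    output;
    name=next;
    start = end+1;
  end;
  /* Write the header line to the log for checking */
  list;
  drop next;
run;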
Result:
Obs    col    start    end    name
  1      1        1     13    C1
  2      2       14     26    C2
  3      3       27     79    C3
  4      4       80    119    c4
  5      5      120    143    c5
  6      6      144    165    c6
  7      7      166    171    c7
  8      8      172    173    c8
Note that the end column for that last one might be a little off as the length of the NAME is probably shorter than the length of the values in the last column.
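One possible way to handle that last column is to extend its END value to the length of the longest data line. The sketch below is not part of the original answer. It assumes the TXT fileref and the START dataset from the steps above, and that the file has at least one data line; the names RECLEN and MAXLEN are made up for the illustration.

```sas
/* Find the longest data line in the file */
data _null_;
  infile txt firstobs=2 length=reclen end=eof;
  input;                              /* LENGTH= variable is set by INPUT */
  retain maxlen 0;
  maxlen = max(maxlen, reclen);
  if eof then call symputx('maxlen', maxlen);
run;

/* Stretch the END of the last column out to that length */
data start;
  set start end=eof;
  if eof then end = max(end, &maxlen);
run;
```

Because the later step reads with TRUNCOVER, an END value that runs past the end of a shorter line does no harm.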
We can use that to write the input statement to read your file.
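filename code temp;
/* Write an INPUT statement with one "name $ start-end" line per column */
data _null_;
  set start end=eof;
  file code;
  if _n_=1 then put 'input ' ;
  put @3 name '$' start '-' end ;
  if eof then put ';' ;
run;
/* Read the data lines (skipping the header) with the generated statement */
data want;
  infile txt firstobs=2 truncover ;
  %include code / source2;
run;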
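Note that the generated INPUT statement reads every column as character. If some columns hold numbers, they can be converted in a later step. This is a minimal sketch: treating c7 as numeric is purely an assumption for the example, and WANT_NUM and C7_NUM are made-up names.

```sas
data want_num;
  set want;
  /* c7 is only an example column; ?? suppresses the notes and */
  /* errors for values that are not valid numbers              */
  c7_num = input(c7, ?? 32.);
run;
```

Values that cannot be read as numbers come out as missing, so it is worth comparing C7 and C7_NUM on a few rows before dropping the character version.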
Thank you very much for all your help and the time you spent on this. I'm going to study your suggestion in the next couple of days. It is possible I may need to follow up.
Works great!
Now I'm going to try to modify the code, since my input file doesn't have headers. I was thinking of using the first row as a header. Since that row obviously won't contain legitimate variable names, I would possibly just pick the first 3 or 4 characters of each value and put a character in front of each in case it starts with a digit.
Thank you again for your help. And I'm going to mark this as an accepted solution.
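For the headerless case described above, the renaming part could look something like the sketch below. It assumes START was built by running the header-reading step against the first data row, so NAME holds data values rather than real names; START_NAMED and SAFE are made-up names, and the 'c' prefix is an arbitrary choice.

```sas
data start_named;
  set start;
  length safe $32;
  /* keep only letters and digits, then at most the first 4 characters */
  safe = substrn(compress(name, , 'kad'), 1, 4);
  /* a letter plus the column number keeps each name valid and unique */
  name = cats('c', col, '_', safe);
  drop safe;
run;
```

The code-generation step would then read START_NAMED instead of START, and the final step would drop FIRSTOBS=2 so that the first row is still read as data. This only works if no value in the first row contains an embedded blank, since the column boundaries are found by splitting that row on blanks.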