Hi All,
My requirement here is ,I am having very large flat/text file from which i need to get variable names and its positions which i dont know what that falt files consists of..i mean i need table structure with its column names and its attributes. i tried with proc import/wizard and infile as well but its not working could you please help me.
Hello @Shwethav
Under similar circumstances I would try to open the file in a spreadsheet and get an understanding of what is there in the file and possibly make necessary changes save as an excel workbook (.xlsx) and then try to import it.
If the file is very big, , I would use Unix/Linux shell to extract the first few rows and then examine it.
I would like to remind that data preparation is an important step in any analytics projects and one should allocate time for it for in scenarios like this.
This is the code i have used to read the flatfile
PROC IMPORT
DATAFILE=".........\Documents\SEPA\lhbml430.txt"
OUT=sepa
DBMS=dlm
REPLACE;
GETNAMES=no;
delimiter='7F'x;
RUN;
There is neither ERROR message nor output generated.its only throwing the note message as below:
Am having the program template to read this flat file data in mainframes but only selected variables are mentioned in that program,my task here is to get another 5 more variables in to the output from this input(flat file)for which i dont know those variable position's to bring the data for it :.
I have FTP 'd the flat file from the mainframes to my local system, but unable to view it to get my required columns positions.
Hence i just want to know the columns positions/structure of flat file not importing the data from it.
First step is fix your unbalanced quotes. It looks from the photograph of the text in your SAS log (why did you post text as a picture?) that you are using some interface to SAS, like Enterprise Guide or SAS/Studio. Restart your SAS session (you should be able to do this without closing the program you are using to interface to SAS) so that you are starting without any user submitted code. That should fix the unbalanced quotes that are generating those NOTEs in the log.
Second is take a look the file you have gotten. You can just use a simple SAS dataset to look at the first few lines.
data _null_;
infile ".........\Documents\SEPA\lhbml430.txt" obs=10;
input;
list;
run;
Make sure that it looks right. If your "mainframe" is an actual IBM mainframe then you also should make sure you have copied the file properly. If the fields in the file contain binary data (like packed decimal format) then make sure to transfer the file as binary and not text. But this means if some of the fields are text you might need to use the $EBCDIC informat to read them.
First, fix the unbalanced quotes by restarting the SAS server session.
Next, take a good look at the file on the mainframe to determine if it contains only text or some binary data. Lots of MF software will write numbers in binary or packed decimal format (PD = 2 decimal digits per byte).
If it is pure text, you can use FTP/SFTP in text mode, if not, transfer the file in binary mode.
If you used binary, use the $EBCDIC informat for text, and the S370... informats for the numbers (S370FPD5.2 will read 9 digits, 2 of which are the decimal fraction, the last 4 bits contain the sign). Also note that you will probably have a file with Fixed Block organization, which means no line separators and reading with RECFM=F.
If the file was transmitted as text, use a text editor (Notepad++) to inspect it. This has a hex mode which allows you to verify if the delimiter is in fact '7F'x.
And: getting files from mainframes means there IS documentation for it. MF programmers live that way. They eat, drink, breathe and secrete documentation. The least thing you can get is a PL/1 "include" that contains the code for setting up the record structure.
Please have a a clean restart of the SAS EG and replace the delimiter as shown.
It should work
delimiter='7F'x
with
delimiter='09'x
Open the file in text editor that doesn't wrap long lines, copy the first 10 lines. On the forum open a text box using the </> icon. Paste the text.
That way we have something to see an may have more to work with.
Import expects exactly one row or line to hold the column headings and the first one at that. If your file is such that the "column header" information is on two or more rows then your file won't be read "correctly" by Import.
I can only second what @Cynthia_sas said: you MUST get documentation for files that you are given for import into SAS. This documentation (or "data dictionary") has to include the technical layout (fixed vs. variable records, are there CR and/or LF separators, are there delimiters, or do the columns have fixed widths) and a description for each column.
Without that, it is guesswork; start with opening the file in a suitable text editor (I prefer Notepad++). From there you may be able to get the column names, if they are included. The presence of delimiters may make it possible to let PROC IMPORT make its own guesses, which may even be close to the intended reality. Or the layout is simple and obvious (e.g. 2021-05-06 is clearly a date) so that you can write proper code right away.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.