I have data in below format as text file
MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST
1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
Here, first record is correct, in second and third record there is an extra pipe symbol(after second and first column), this is data error. but i need to solve this issue in SAS while importing. please anybody help.
i need output like below
MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP|1|O|Up|SASHELP.FSP.OK.SLIST
1038|CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
If the extra | only occurs in the second field then perhaps you could do this:
options parmcards=csv;
filename csv temp;
parmcards4;
MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST
1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
1036|CL_DOWN||||SASHELP.FSP.OK.SLIST
;;;;
data xx;
infile csv dsd dlm='|' truncover firstobs=2;
length MSGID $8 MNEMONIC $28 LINENO 8 LEVEL $1 TEXT $10 PBUTTONS $107 ;
length dummy $20 ;
input msgid MNEMONIC @;
if countc(_infile_,'|')>5 then do;
input dummy @;
MNEMONIC=cats(MNEMONIC,'|',dummy);
end;
input lineno -- pbuttons;
drop dummy ;
run;
proc print;
run;
File import code,
/* --------------------------------------------------------------------
Code generated by a SAS task
Generated on Monday, May 8, 2023 at 3:20:00 PM
By task: Import Data Wizard
Source file: C:\Users\PL4K760\Documents\SAMPLE.txt
Server: Local File System
Output data: WORK.SAMPLE_0000
Server: SASApp
Note: In preparation for running the following code, the Import
Data wizard has used internal routines to transfer the source data
file from the local file system to SASApp. There is no SAS code
available to represent this action.
-------------------------------------------------------------------- */
DATA WORK.SAMPLE_0000;
LENGTH
MSGID 8
MNEMONIC $ 28
LINENO $ 7
LEVEL $ 1
TEXT $ 119
PBUTTONS $ 20
F7 $ 20 ;
FORMAT
MSGID BEST4.
MNEMONIC $CHAR28.
LINENO $CHAR7.
LEVEL $CHAR1.
TEXT $CHAR119.
PBUTTONS $CHAR20.
F7 $CHAR20. ;
INFORMAT
MSGID BEST4.
MNEMONIC $CHAR28.
LINENO $CHAR7.
LEVEL $CHAR1.
TEXT $CHAR119.
PBUTTONS $CHAR20.
F7 $CHAR20. ;
INFILE '/saswork1/SAS_work28BB00001BF6_prodsascom5l/#LN00048'
LRECL=156
ENCODING="LATIN1"
TERMSTR=CRLF
DLM='7F'x
MISSOVER
DSD ;
INPUT
MSGID : ?? BEST4.
MNEMONIC : $CHAR28.
LINENO : $CHAR7.
LEVEL : $CHAR1.
TEXT : $CHAR119.
PBUTTONS : $CHAR20.
F7 : $CHAR20. ;
RUN;
output im getting is
here 1, 2,4 and 5th record is bad, i need record like 3rd
That's odd, seemed to work for me.
Tom's answer is better though 🙂
DATA WORK.SAMPLE_0000;
LENGTH MSGID 8 MNEMONIC $ 28 LINENO $ 7 LEVEL $ 1 TEXT $ 119 PBUTTONS $ 20
F7 $ 20;
FORMAT MSGID BEST4.
MNEMONIC $CHAR28.
LINENO $CHAR7.
LEVEL $CHAR1.
TEXT $CHAR119.
PBUTTONS $CHAR20.
F7 $CHAR20.;
INFORMAT MSGID BEST4.
MNEMONIC $CHAR28.
LINENO $CHAR7.
LEVEL $CHAR1.
TEXT $CHAR119.
PBUTTONS $CHAR20.
F7 $CHAR20.;
INFILE cards DLM='|' TRUNCOVER;
INPUT MSGID
MNEMONIC
LINENO
LEVEL
TEXT
PBUTTONS
F7 ;
cards;
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST
1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
;
;
;;
run;
As long as the extra delimiters are just appearing at the end (or beginning) of a field you can just remove the DSD option and SAS will treat the adjacent delimiters as one delimiter. Just like it treats spaces in normal LIST MODE input.
First let's get your lines of text into a file we can use to play with.
options parmcards=csv;
filename csv temp;
parmcards4;
MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST
1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
;;;;
Your IMPORT task wrote some really ugly SAS code.
Do you know if it also made that ugly text file? Is the original file a text file or did the import task make the text file for you?
Write your own data step and the code can be much simpler and clearer.
You can use your knowledge of what the variables are to define them properly.
For example that last one looks like a four level SAS object name. Since a LIBREF can only be 8 character and the maximum length for any other name is 32 characters the maximum length it could be is 8+32*3+3 = 107.
The first one looks like an ID variable so it should be CHARACTER since arithmetic on an ID has no meaning.
LINENO is probably a number.
So your data step could look like:
data want;
infile csv dlm='|' truncover firstobs=2;
length MSGID $8 MNEMONIC $28 LINENO 8 LEVEL $1 TEXT $10 PBUTTONS $107 ;
input msgid -- pbuttons;
run;
Result
@kuppusamy wrote:
I have records like
1036|CL_DOWN||||SASHELP.FSP.OK.SLIST
continuous missing values also there ,
in this case what should i do ? please help
Find the author of the file and get them to re-create it in a usable format.
The values that contain delimiters should have been in quotes, then the DSD option would work properly.
1037|"CL_UP|"|1|O|Up|SASHELP.FSP.OK.SLIST
1038|"|CL_LEFT"|1|O|Left|SASHELP.FSP.OK.SLIST
Without that the file is not parsable.
Personally I would just fix it by hand. You are lucky in this case that the file only has a few variables.
You could use a program to split the records into those with the right number of | and those with the wrong number. Then you only have to edit the smaller number of corrupt lines.
filename csv 'have.txt';
filename good 'good.txt';
filename bag 'bad.txt';
data _null_;
infile csv ;
input;
if _n_=1 or countc(_infile_,'|')=5 then do;
file good;
put _infile_;
end;
if _n_=1 or countc(_infile_,'|') ne 5 then do;
file bad;
put _infile_;
end;
run;
6774 data _null_; 6775 infile bad; 6776 input; 6777 put _infile_; 6778 run; NOTE: The infile BAD is: (system-specific pathname), (system-specific file attributes) MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS 1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST 1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST NOTE: 3 records were read from the infile (system-specific pathname). The minimum record length was 39. The maximum record length was 43. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds 6779 6780 data _null_; 6781 infile good; 6782 input; 6783 put _infile_; 6784 run; NOTE: The infile GOOD is: (system-specific pathname), (system-specific file attributes) MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS 1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST 1036|CL_DOWN||||SASHELP.FSP.OK.SLIST NOTE: 3 records were read from the infile (system-specific pathname). The minimum record length was 36. The maximum record length was 42. NOTE: DATA statement used (Total process time): real time 0.00 seconds cpu time 0.00 seconds
If the extra | only occurs in the second field then perhaps you could do this:
options parmcards=csv;
filename csv temp;
parmcards4;
MSGID|MNEMONIC|LINENO|LEVEL|TEXT|PBUTTONS
1036|CL_DOWN|1|O|Down|SASHELP.FSP.OK.SLIST
1037|CL_UP||1|O|Up|SASHELP.FSP.OK.SLIST
1038||CL_LEFT|1|O|Left|SASHELP.FSP.OK.SLIST
1036|CL_DOWN||||SASHELP.FSP.OK.SLIST
;;;;
data xx;
infile csv dsd dlm='|' truncover firstobs=2;
length MSGID $8 MNEMONIC $28 LINENO 8 LEVEL $1 TEXT $10 PBUTTONS $107 ;
length dummy $20 ;
input msgid MNEMONIC @;
if countc(_infile_,'|')>5 then do;
input dummy @;
MNEMONIC=cats(MNEMONIC,'|',dummy);
end;
input lineno -- pbuttons;
drop dummy ;
run;
proc print;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.