Actually, I have uploaded the raw data. I copy-pasted it earlier; now I have provided it in that file.
I know you have uploaded an attachment.
What I mean is: what output do you need?
Your data are very messy. I do not know at which column an observation starts and at which column it ends.
The three records in the attachment include one record that has only 20 variables and two records that have 21 variables. Unlike your previous example, "^" is used as the delimiter rather than "|".
In the first record the last (20th) variable was a date, in the second record the last (21st) variable contained the string "ok", and the last variable in the third record (21st) was a date.
Either you didn't send the actual data, the data is (like Ksharp suggested) too messed up to work with, and/or you haven't provided enough info so that anyone could have a clue as to what you want to achieve.
Here is a program that works for the three observations that you posted in the attachment above. You might need to adjust the record and variable lengths for files with longer records.
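/* Input file, expected number of variables per record, and delimiter */
%let infile='c:\downloads\Raw_Data.txt';
%let nvars=21;
%let dlm=^;
filename tmpfile2 temp;
data _null_;
  infile &infile lrecl=1000 end=eof length=inchar ;
  file tmpfile2 lrecl=2000 ;
  length outline $2000 inline $1000 ;
  length outchar 8;
  length nbar1 nbar2 8;
  retain outline;
  retain outchar 0 ;
  /* Keep reading physical lines until the next one would reach NVARS delimiters */
  do i=1 by 1 until( eof or (nbar1+nbar2 >= &nvars)) ;
    input inline $varying1000. inchar;
    /* Count delimiters already collected (nbar1) and in the new line (nbar2) */
    nbar1 = lengthn(compress(outline,"&dlm",'K'));
    nbar2 = lengthn(compress(inline,"&dlm",'K'));
    /* Still short of NVARS delimiters: treat the new line as a continuation */
    if nbar1 + nbar2 < &nvars then do;
      substr(outline,outchar+1)=inline;
      outchar+inchar;
    end;
  end;
  putlog 'Line ' _n_ 'has ' outchar 'characters.';
  put outline $varying2000. outchar;
  /* The line that was not appended starts the next record */
  outline=inline; outchar=inchar;
run;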
You will need to write a program that knows something about the actual variables to read the transformed text file.
Here is a program that just reads in the 21 variables as character strings and dumps them to the log.
data check ;
infile tmpfile2 dlm="&dlm" dsd truncover ;
length col1-col&nvars $200.;
input col1-col&nvars;
put (_N_ _all_) (=/);
run;
Here are the results.
_N_=1
col1=Is the time taken
col2=9865
col3=
col4=
col5=
col6=COOL
col7=dealt was good it is ok for us
col8=56
col9=ok
col10=AEFG
col11=
col12=13-aug-1999
col13=ok for it
col14=this is not prosable
col15=Ecoreco provides the fulsize reduction
col16=Registration link opens on at 3.00 pm. Any registration attempts before the specified
col17=
col18=
col19=
col20=
col21=
_N_=2
col1=REQ1
col2=REQ2
col3=REQ3
col4=REQ4
col5=REQ5
col6=REQ6
col7=13-JUNE-1980
col8=REQ8
col9=REQ9
col10=REQ10
col11=REQ11
col12=13-MAR-1997
col13=REQ13
col14=REQ14
col15=REQ15
col16=WE AREOK
col17=REQ17
col18=REQ18
col19=REQ19
col20=12-FEB-2011
col21=UP
_N_=3
col1=REQ1
col2=REQ2
col3=REQ3
col4=REQ4
col5=REQ5
col6=REQ6
col7=13-JUNE-1980
col8=REQ8
col9=REQ9
col10=REQ10
col11=REQ11
col12=13-MAR-1997
col13=REQ13
col14=REQ14
col15=REQ15
col16=WE AREOK
col17=REQ17
col18=REQ18
col19=REQ19
col20=REQ20
col21=12-FEB-2011
NOTE: 3 records were read from the infile TMPFILE2.
The minimum record length was 140.
The maximum record length was 256.
One or more lines were truncated.
NOTE: The data set WORK.CHECK has 3 observations and 21 variables.
Hi Tom, my actual data has much longer records. When I tried the code on my own data it did not work. I have attached my data structure; can you help me?
If the output you want is like Tom's:
data want(keep=temp);
  infile 'c:\Raw_Data.txt' eof=last;
  length temp _temp $ 2000;
  retain temp _temp;
  input;
  temp=cats(temp,_infile_);
  if countc(temp,'^') ge 21 then do;
    temp=_temp;
    output;
    temp=_infile_;
  end;
  _temp=temp;
  return;
  last: output;
run;
Ksharp
data want(keep=temp);
  infile 'c:\Raw_Data.txt' eof=last;
  length temp _temp $ 2000;
  retain temp _temp;
  input;
  temp=cats(temp,_infile_);
  if countc(temp,'^') ge 21 then do;
    temp=_temp;
    output;
    temp=_infile_;
  end;
  _temp=temp;
  return;
  last: output;
run;
data want(keep=var:);
  set want;
  array _v{21} $ 100 var1-var21;
  do i=1 to 21;
    _v{i}=scan(temp,i,'^','m');
  end;
run;
Ksharp
data want(keep=temp);
  infile 'c:\Raw_Data.txt' eof=last;
  length temp _temp $ 2000;
  retain temp _temp;
  input;
  temp=cats(temp,_infile_);
  if countc(temp,'^') ge 21 then do;
    temp=_temp;
    output;
    temp=_infile_;
  end;
  _temp=temp;
  return;
  last: output;
run;
Hi Ksharp, I used the code above. When I ran it, temp and _temp held only 256 characters (I checked with the LENGTH function), so the data after that point is missing, even though both temp and _temp are declared with length 2000. Why?
You need to specify lengths for your data file records and your variables.
Add LRECL option to your INFILE statement.
Increase the length of the TEMP _TEMP variables.
If you have 21 variables at 4000 characters per variable, then you might need to set the length to 85000. That would work for the LRECL option, but it is too long for a data step variable, whose maximum length is 32767. Most likely no given record will have the maximum length for every variable, so you could try setting the length of the temp strings to 32767.
Let me emphasize again that the real issue here is the process that is generating the file. If you have any control over that then you can eliminate this problem by fixing the way that the data file is written.
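To illustrate the LRECL point above, here is a minimal sketch. The fileref demo and the 300-character test line are made up for the example, and lrecl=256 is set explicitly to mimic a session whose default record length is 256. The first read should report 256 characters in _INFILE_ and the second should report 300.

```sas
/* Hypothetical test file: a single line of 300 characters */
filename demo temp;
data _null_;
  file demo lrecl=2000;
  length longtext $300;
  longtext = repeat('x', 299);   /* REPEAT returns n+1 copies, so 300 x's */
  put longtext;
run;

/* Read with a 256-byte record length: the line is cut off */
data _null_;
  infile demo lrecl=256;
  input;
  short = lengthn(_infile_);
  putlog short=;
run;

/* Read with a larger record length: the whole line is available */
data _null_;
  infile demo lrecl=32767;
  input;
  full = lengthn(_infile_);
  putlog full=;
run;
```

Declaring temp and _temp as $2000 does not help if _INFILE_ has already been cut at the record length, which is why the LRECL option has to be raised as well.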
I agree with Tom. You should add lrecl=32767 to set the logical record length.
I am stunned that you have 21 variables and each variable has a length of 2000-4000. That is really horrible.
I doubt that you actually use that full length.
In the data you posted, there is no sign that any variable is 2000 characters long.
data want(keep=temp);
  infile 'c:\Raw_Data.txt' eof=last lrecl=32767;
  length temp _temp $ 32767;
  retain temp _temp;
  input;
  temp=compbl(cats(temp,_infile_));
  if countc(temp,'^') ge 21 then do;
    temp=_temp;
    output;
    temp=compbl(_infile_);
  end;
  _temp=temp;
  return;
  last: output;
run;
data want(keep=var:);
  set want;
  array _v{21} $ 2000 var1-var21;
  do i=1 to 21;
    _v{i}=scan(temp,i,'^','m');
  end;
run;
Ksharp
Reg: Comma and Quotes
data want(keep=temp);
infile 'c:\Raw_Data.txt' eof=last lrecl=32767;
length temp _temp $ 32767;
retain temp _temp;
input;
temp=compbl(cats(temp,_infile_));
if countc(temp,'",')ge 10 then do;temp=_temp;output;temp=compbl(_infile_);end;
_temp=temp;
return;
last: output;
run;
data want3(keep=var:);
set want;
array _v{10} $ 800 var1-var10;
do i=1 to 10;
_v{i}=scan(temp,i,'",','m');
end;
run;
I used this code, but I now have new data that uses quotes and a comma as the delimiter, with 10 variables.
The data has another problem: a record can spill onto 2-3 lines. In the example below, lines 2 and 3 together hold
the data for only 10 variables, but the record continues onto the next line.
"acv","1000036513","","Te_ADDR","507 Main, PLAZA, BUDH MARG","","HYd","","","IN"
"acv","1000036513","","Te_ADDR",
"507 Main, PLAZA, BUDH MARG","","HYd","","","IN"
"acv","1000036513","","Te_ADDR","507 Main, PLAZA, BUDH MARG","","HYd","","","IN"
"acv","1000
036513","","Te_ADDR","507 Main, PLAZA, BUDH MARG","","HYd","","","IN"
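One possible approach to the split quoted records above is sketched here. It is only a sketch under stated assumptions: every field is enclosed in double quotes, no value contains a quote character, and each record has exactly 10 fields, so a record is complete once 20 quote characters have been collected. The fileref raw and the sample lines are a stand-in for the real file.

```sas
/* Hypothetical sample file mirroring the posted lines */
filename raw temp;
data _null_;
  file raw;
  put '"acv","1000036513","","Te_ADDR","507 Main, PLAZA, BUDH MARG","","HYd","","","IN"';
  put '"acv","1000036513","","Te_ADDR",';
  put '"507 Main, PLAZA, BUDH MARG","","HYd","","","IN"';
  put '"acv","1000';
  put '036513","","Te_ADDR","507 Main, PLAZA, BUDH MARG","","HYd","","","IN"';
run;

%let nvars=10;

data want(keep=var:);
  infile raw lrecl=32767 end=eof;
  length temp $32767;
  array _v{&nvars} $800 var1-var&nvars;
  retain temp;
  input;
  /* Glue physical lines together until the record is complete */
  temp = cats(temp, _infile_);
  /* Complete when all quote characters are present (assumption: 2 per field) */
  if countc(temp, '"') >= 2*&nvars or eof then do;
    do i = 1 to &nvars;
      /* Q ignores commas inside quotes, M keeps empty fields; DEQUOTE strips the quotes */
      _v{i} = dequote(scan(temp, i, ',', 'mq'));
    end;
    output;
    temp = ' ';
  end;
run;
```

Note that CATS strips leading and trailing blanks, so if a line break falls exactly on a blank inside a value, that blank is lost. Counting quote characters rather than commas avoids being misled by the commas inside the address field.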