06-13-2013 08:13 AM
I've got a big problem.
The output of a security scan is a flat file. The variables are delimited with semicolons.
But the variable "state" contains all the ports of a host, so it is longer than 32767 characters. How can I read all the characters? Splitting it into several variables would be OK, but I don't know how to do that.
host $ 200
state $ 32767
note $ 200
os $ 200;
state = "port/status"
note = "notiz";
os $CHAR200. ;
host : $CHAR200.
state : $CHAR32767.
note : $CHAR200.
os : $CHAR200.
F5 : $1.
F6 : $1.;
06-13-2013 10:02 AM
There is probably a better way, but you can combine input styles in a data step (i.e. delimited and formatted).
The maximum number of new variables may be an issue. If lrecl is close to correct, then there might be 20 or more extra variables of length 30,000 that need to be defined.
I can't test this right now, but try replacing state with state1-state25 $30000. (without a colon modifier) in the input statement, and probably in the length statement as well.
Interesting issue ... I hope others have ideas as well!
06-13-2013 11:10 AM
To deal with a dynamic number of variables, transpose the data as you read it.
Have the scan function (maybe in combination with other functions, depending on the layout of the state variable) read each port number from the input buffer (_INFILE_), and then do an output.
Host, note and os will be repeated on several records - if that's ok depends on what you intend to do with file after the import.
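A minimal sketch of the transpose-while-reading idea, written as a byte stream so that no single value has to hold the whole state field. Everything here is an assumption for illustration: the fileref SCAN and the path scan_result.txt are hypothetical, the layout is taken to be host;state;note;os as in the first post, the ports inside state are taken to be comma-separated, records are taken to end with a line feed, and no field is assumed to contain a quoted semicolon. It keeps only host and port; note and os would need the same treatment.

```sas
filename SCAN "scan_result.txt" recfm=n;   /* hypothetical path, read as a byte stream */

data PORTS (keep=HOST PORT);
  infile SCAN;
  length HOST PORT $200 C $1;
  retain HOST PORT ' ' FIELD 1 N 0;        /* FIELD = current field number, N = chars stored */
  input C $char1.;                         /* one byte per iteration */

  if C = '0D'x then return;                /* ignore carriage returns */

  if C = '0A'x then do;                    /* end of record: flush and reset */
    if FIELD = 2 and PORT ne ' ' then output;
    HOST = ' '; PORT = ' '; FIELD = 1; N = 0;
  end;
  else if C = ';' then do;                 /* end of field */
    if FIELD = 2 and PORT ne ' ' then output;
    PORT = ' '; FIELD = FIELD + 1; N = 0;
  end;
  else if FIELD = 2 and C = ',' then do;   /* end of one port entry */
    if PORT ne ' ' then output;
    PORT = ' '; N = 0;
  end;
  else if FIELD in (1, 2) and N < 200
          and not (N = 0 and C = ' ') then do;   /* skip leading blanks */
    N = N + 1;
    if FIELD = 1 then substr(HOST, N, 1) = C;
    else substr(PORT, N, 1) = C;
  end;
run;
```

Each output row carries one port entry plus the host it belongs to, so the number of ports per host no longer matters. Reading one byte at a time is slow on large files; it is meant to show the logic, not to be fast.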
06-13-2013 11:16 AM
Try reading the file using infile. Something like:
infile <file> options;
*here define a try like block*
*then look at the length of the read line*
if length(_infile_) > 32700 then do;
<You might have to store the max length of the _infile_ into a macro var here and use it after str1>
str1 = input(substr(_infile_,1,32700), $32767.);
str2 = input(substr(_infile_,32701,<maxlength>),$32767.);
end;
else
str1 = input(_infile_,$32767.);
Of course, the code is a rough draft and not complete. I was just trying to give you an idea.
06-14-2013 08:32 AM
I am afraid you may have to read the file (or at least part of it) as a byte stream to go beyond the 32k lrecl limitation.
This is done by setting the RECFM option appropriately.
filename TEST 'f:\temp\test.txt' recfm=n;
data _null_; *create one very long record;
length A $500;
do CHAR=33 to 120;
put A @;
length A $256;
input A $256. @; *read very long record 256 bytes at a time, until end reached;
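The fragment above seems to have lost some lines in the post, so here is a fuller sketch of the same streaming idea. It is a reconstruction under assumptions, not the original code: the REPEAT/BYTE call that fills A, the FILE and INFILE statements, and the data set name CHUNKS are all guesses, and the test file is deliberately written to the WORK directory (via pathname(work)) so that it cannot overwrite real data. The chunk size is 500 here so that the 44,000-byte record divides evenly.

```sas
/* Scratch file in the WORK directory so no real data is touched */
filename TEST "%sysfunc(pathname(work))/long_record_demo.txt" recfm=n;

/* Create one very long record: 88 characters, each repeated 500 times */
data _null_;
  length A $500;
  file TEST;
  do CHAR = 33 to 120;
    A = repeat(byte(CHAR), 499);   /* REPEAT returns n+1 copies */
    put A $char500. @;
  end;
run;

/* Read the record back as a byte stream, 500 bytes per iteration */
data CHUNKS;
  length A $500;
  infile TEST;
  input A $char500.;
  CHUNK_NO = _n_;                  /* position of this chunk in the stream */
run;
```

Note that the first step WRITES to the file behind TEST. To read an existing file, only the second step is needed, with the fileref pointing at that file.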
06-17-2013 02:05 AM
Thanks to all of you for your ideas.
Achilles gave me an idea; I will try it soon.
Here is a sample. The ports string can be up to about 200k:
Host: 10.131.113.131 (SAP00931.lan.****.de) Ports: 80/open/tcp//http//Microsoft IIS httpd 6.0/, 1042/open/tcp//msrpc//Microsoft Windows RPC/, 2555/open/tcp/////, 2580/open/tcp//tributary?///, 3389/open/tcp//ms-wbt-server//Microsoft Terminal Service/, 4999/open/tcp//hfcs-manager?/// Ignored State: closed (8997) OS: Microsoft Windows Server 2003 SP1 or SP2
I have now tried the following:
do count=1 to 20;
if(length(_infile_) > 32767 then do;
str&count = input(substr(_infile_,(count*32767,((count+1)*32767),$32767.);
str21 = input(_infile_$32767.);
But it gave me several errors:
ERROR: The _INFILE_ variable cannot be referenced by
the INPUT statement.
WARNING: Apparent symbolic reference COUNT not resolved.
ERROR 388-185: Expecting an arithmetic operator.
ERROR 180-322: Statement is not valid or it is used
out of proper order.
ERROR 76-322: Syntax error, statement will be ignored.
ERROR 160-185: No matching IF-THEN clause.
ERROR 78-322: Expecting a ','.
161-185: No matching DO/SELECT statement.
Why is the COUNT Variable not resolved?
06-17-2013 03:02 AM
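Because you never defined the COUNT macro variable that you use in str&count. The do loop creates a data step variable named count, which is not the same thing as the macro variable &count.
Also, your then do; doesn't have a closing end;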
You ought to be more careful and properly check your code before posting here and asking questions to people who will use their time and skill to help you.
Anyway, the logic you are trying to use is flawed as _infile_ cannot be longer than lrecl, i.e. 32k max.
Again, the only way to read such a long record afaik is to stream it.
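On the str&count point, a small sketch of the usual data step alternative: an array lets a loop variable pick str1, str2 and so on by index, with no macro variable involved. The data set name DEMO, the variable LONG_TEXT and the sizes are made up for illustration, and the text is short on purpose, so this shows only the indexing and does nothing about the record length limit.

```sas
data DEMO;
  length LONG_TEXT $100;
  array STR{4} $25;                       /* creates STR1-STR4 */
  LONG_TEXT = repeat('abcde', 19);        /* 20 copies = 100 characters */
  do COUNT = 1 to dim(STR);
    /* third argument of SUBSTR is a length, not an end position */
    STR{COUNT} = substr(LONG_TEXT, (COUNT - 1) * 25 + 1, 25);
  end;
  drop COUNT;
run;
```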
06-17-2013 03:36 AM
Your code above overwrote my file and filled it with something like !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!""""""""""""""""""""""""""""""""""""""""""""""""§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§ and so on...
Glad I had a backup.
I have now submitted a ticket to SAS Support.
06-17-2013 04:34 AM
Did you read the comment *create one very long record; ?
Why did you recreate the sample data?
Did you even try to understand what the code does?
06-17-2013 04:10 AM
You did not define macro variable &count.
data _null_;
 file 'c:\temp\x.txt' lrecl=200000;
 input x $char1. @@;
 retain max;
 if x=':' then do; x='='; n=0; end;
 else if x=',' then do;
  n+1;
  max=max(max,n);
  name=cats('p',n,'=');
  put +(-1) ' ' name @;
  call symputx('max',max);
  return;
 end;
 put +(-1) x @;
cards4;
Host: 10.131.113.131 (SAP00931.lan.****.de) Ports: 80/open/tcp//http//Microsoft IIS httpd 6.0/, 1042/open/tcp//msrpc//Microsoft Windows RPC/, 2555/open/tcp/////, 2580/open/tcp//tributary?///, 3389/open/tcp//ms-wbt-server//Microsoft Terminal Service/, 4999/open/tcp//hfcs-manager?/// Ignored State: closed (8997) OS: Microsoft Windows Server 2003 SP1 or SP2
;;;;
run;
data have;
 infile 'c:\temp\x.txt' lrecl=200000;
 input (host ports p1 - p&max State os) (= $2000.) ;
run;
06-17-2013 04:29 AM
lrecl=200000 ? Wow, that's great!
When was the 32k limit lifted?
The online doc for lrecl in 9.3 states:
06-17-2013 06:38 AM
The documentation for the LRECL system option states 32 KB max, but the doc for the LRECL filename option states 1 GB max for Windows and UNIX.
I missed that change. I wonder why the discrepancy between the 2 lengths. That discrepancy sure threw me off in any case.
No need for streaming then, all good.
Sorry for misleading you DJDaniel.
Thanks for the update KSharp.
Mmm, weird again: it looks like SAS did a half-baked enhancement. lrecl can be longer, but then the variable _infile_ causes errors.
The following code should be valid but throws an error. What's going on, SAS?
data T (compress=yes);
infile TEST lrecl=500000 pad missover termstr=CRLF dlm=';' dsd;
ERROR: The LRECL / LINESIZE for infile TEST exceeds the maximum allowable length for an _INFILE_ or _INFILE_= variable (32,767).
The DATA STEP will not be executed.
So all is not so clear, it seems.
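One reading of that message is that it is the _INFILE_ variable, not the long LRECL by itself, that is capped at 32,767. A sketch that could be used to check this: it writes a single 100,000-character record to a scratch file in the WORK directory and reads it back with list input, never referencing _INFILE_. The fileref LONGREC, the file name and the data set TOKENS are made up for the test, and whether it runs cleanly may depend on the SAS version and platform.

```sas
/* Scratch file in WORK; LRECL is set once on the fileref */
filename LONGREC "%sysfunc(pathname(work))/lrecl_demo.txt" lrecl=200000;

/* Write one record of 20,000 tokens: 'abcd,' x 20000 = 100,000 characters */
data _null_;
  file LONGREC;
  do I = 1 to 20000;
    put 'abcd,' @;      /* trailing @ keeps writing on the same line */
  end;
  put;                  /* release the line */
run;

/* Read the tokens back one per row, with no reference to _INFILE_ */
data TOKENS;
  infile LONGREC dlm=',';
  length TOKEN $8;
  input TOKEN @@;       /* double trailing @ holds the line across iterations */
run;
```

Comparing the row count of TOKENS with the 20,000 tokens written would show whether the whole record was read.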