DanielKaiser
Pyrite | Level 9

Hi all,

I've got a big problem.

I got a flat file as the output of a security scan. The variables are delimited with semicolons.

But the variable "state" contains all the ports of a host, so it is longer than 32767 characters. How can I read all the characters? Splitting it into several variables would be OK, but I don't know how to do that.

DATA WORK.outG1;
    LENGTH
        host             $ 200
        state            $ 32767
        note             $ 200
        os               $ 200;
    DROP
        F5
        F6 ;
    LABEL
        state            = "port/status"
        note             = "notiz";
    FORMAT
        host             $CHAR200.
        state            $CHAR32767.
        note             $CHAR200.
        os               $CHAR200. ;
    INFORMAT
        host             $CHAR200.
        state            $CHAR32767.
        note             $CHAR200.
        os               $CHAR200.;
    INFILE 'C:\Users\sy053.LAN.002\AppData\Local\Temp\SEG3112\outG-7619b19402784883a2ffb82139d30210.txt'
        LRECL=700000
        ENCODING="WLATIN1"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD;
    INPUT
        host             : $CHAR200.
        state            : $CHAR32767.
        note             : $CHAR200.
        os               : $CHAR200.
        F5               : $1.
        F6               : $1.;
RUN;

29 REPLIES
esjackso
Quartz | Level 8

There is probably a better way, but you can combine input types in a data step (i.e. delimited and formatted).

The maximum number of new variables may be an issue. If lrecl is close to correct, then there might be an extra 20-plus variables of length 30,000 that need to be defined.

I can't test this right now, but try replacing state in the input statement (and probably the length statement) with state1-state25 $30000. without a colon modifier.

Interesting issue ... I hope others have ideas as well!

EJ

LinusH
Tourmaline | Level 20

To deal with a dynamic number of variables, transpose the data as you read it.

Have the scan function (maybe in combination with other functions, depending on the layout of the state variable) read each port number from the input buffer (_INFILE_), and then do an output.

Host, note and os will be repeated on several records; whether that's OK depends on what you intend to do with the file after the import.
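
A minimal sketch of this transpose-on-read idea, not a tested solution for the real file. It assumes a record laid out as `Host: ... Ports: p1, p2, ... OS: ...`; the in-line record, the data set name ports_long and the variable names are made up for illustration. Because it relies on _INFILE_, it only helps while a record fits in the input buffer, so very long records would still need a different read strategy.

```sas
/* Sketch: one output row per port, with host repeated on every row.  */
/* The datalines record and all names below are illustrative only.    */
data ports_long (keep=host port);
    infile datalines truncover;
    length host $200 portlist $2000 port $200;
    input;                                    /* load the record into _INFILE_ */
    p_start = index(_infile_, 'Ports:');      /* start of the port list label  */
    p_end   = index(_infile_, 'OS:');         /* first label after the list    */
    host     = strip(substr(_infile_, 6, p_start - 6));
    portlist = substr(_infile_, p_start + 6, p_end - p_start - 6);
    do i = 1 to countw(portlist, ',');        /* comma is the only delimiter   */
        port = strip(scan(portlist, i, ','));
        output;                               /* one row per port              */
    end;
datalines;
Host: 10.0.0.1 Ports: 80/open/tcp//http///, 22/open/tcp//ssh/// OS: Linux
;
```

For the made-up record this should give two rows, one per port, each carrying the same host value.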

Data never sleeps
Achilles
Calcite | Level 5

Try reading the file using infile. Something like:

data work.outgl;
     length ...
     format ...
     informat...
     infile <file> options;
     /* here define a try-like block */
     input;   /* a bare INPUT loads the record into _infile_ */
     /* then look at the length of the line that was read */
     if length(_infile_) > 32700 then do;
          /* you might have to store the max length of _infile_ in a macro var here and use it after str1 */
          str1 = input(substr(_infile_,1,32700), $32767.);
          str2 = input(substr(_infile_,32701,<maxlength>),$32767.);
     end;
     else do;
          str1 = input(_infile_,$32767.);
     end;
     .............
     ..........
run;

Of course, the code is a rough draft and not complete. I was just trying to give you an idea.

ChrisNZ
Tourmaline | Level 20

I am afraid you may have to read the file (or at least part of it) as a byte stream to go beyond the 32k lrecl limitation.

This is done by setting the RECFM option appropriately.

filename TEST 'f:\temp\test.txt' recfm=n; *recfm=n reads and writes the file as a byte stream without record boundaries;

*this step writes to the file that TEST points to, so use a scratch path and not your real data;
data _null_; *create one very long record;
  file TEST;
  length A $500;
  do CHAR=33 to 120;
    A=repeat(byte(CHAR),500);
    put A @;
  end;
  A=byte(121);
  put A;
run;

data OUT;
  infile TEST;
  length A $256;
  do until(find(A,byte(121)));
    input A $256. @; *read very long record 256 bytes at a time, until end reached;
    output;
  end;
run;

Ksharp
Super User

Posting a sample file would be a better way to explain your question.

Ksharp

benhaz
Calcite | Level 5

Hi Sir,

I am trying to concatenate all the rows within each group. I have written code that gives me the desired result on the last row of each group (using last.), but the problem is that it only keeps up to 32767 characters. I am looking for a solution where, once the character limit of 32767 is reached, the remaining characters go into the next variable (column). I am using the code below, but it does not split the value into different variables when the limit is reached. I have attached the result. Any help will be appreciated. Many thanks in advance.

 

data part4 (keep=DOC_NUMBER original_variable count);
  set part3;
  BY DOC_NUMBER;
  if FIRST.DOC_NUMBER then
    Count = 0;
  Count + 1;
run;

data part5;
  length concatenated_field $ 32767;
  retain concatenated_field;
  set part4;
  by DOC_NUMBER;
  if first.DOC_NUMBER then
    do;
      concatenated_field = original_variable;
    end;
  else
    do;
      concatenated_field = catx(', ', concatenated_field, original_variable);
    end;
run;
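
One possible way to spill the concatenation into further columns is to keep an array of maximum-length variables and move to the next element when the current one is about to overflow. The following is only a sketch under assumptions: the tiny part4_demo input, the names chunk1-chunk5, idx and truncated, and the choice of five overflow columns are all made up for illustration, and the input is assumed to be sorted by DOC_NUMBER.

```sas
/* Made-up stand-in for part4, already sorted by DOC_NUMBER */
data part4_demo;
    input DOC_NUMBER original_variable :$20.;
    datalines;
1 alpha
1 beta
2 gamma
;

/* Sketch: fill chunk1 first, then chunk2, and so on (five columns is arbitrary) */
data part5_demo (keep=DOC_NUMBER chunk1-chunk5 truncated);
    array chunk{5} $32767 chunk1-chunk5;
    retain chunk1-chunk5 idx truncated;
    set part4_demo;
    by DOC_NUMBER;
    if first.DOC_NUMBER then do;
        call missing(of chunk{*});   /* start each group with empty columns */
        idx = 1;
        truncated = 0;
    end;
    /* Move on when the current column cannot hold ', ' plus the new value */
    if idx <= dim(chunk) then do;
        if lengthn(chunk{idx}) + lengthn(original_variable) + 2 > 32767 then idx + 1;
    end;
    if idx <= dim(chunk) then
        chunk{idx} = catx(', ', chunk{idx}, original_variable);
    else truncated = 1;              /* flag groups that needed more columns */
    if last.DOC_NUMBER then output;  /* one row per group */
run;
```

With the demo input, every value fits in chunk1, so the overflow columns stay empty. Values are never split in the middle; a whole value moves to the next column, which wastes a little space but keeps each item intact.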

  

DanielKaiser
Pyrite | Level 9

Thanks to all of you for your ideas.

Achilles gave me an idea; I will try it soon.

Here is a sample. The ports string can be up to about 200k:

Host: 10.131.113.131 (SAP00931.lan.****.de) Ports: 80/open/tcp//http//Microsoft IIS httpd 6.0/, 1042/open/tcp//msrpc//Microsoft Windows RPC/, 2555/open/tcp/////, 2580/open/tcp//tributary?///, 3389/open/tcp//ms-wbt-server//Microsoft Terminal Service/, 4999/open/tcp//hfcs-manager?/// Ignored State: closed (8997) OS: Microsoft Windows Server 2003 SP1 or SP2

EDIT:

I have now tried the following:

  %macro count;
     DATA test;
    INFILE 'W:\Prz-IT63-LINUX-SCHWACHSTELLENANALYSE\#IT63kd\DEA\Scan01\sv_10.131.112.x-142.x_ex_AIX-AS400\outG';
           input _infile_;
           do count=1 to 20;
                if(length(_infile_) > 32767 then do;
                str&count = input(substr(_infile_,(count*32767,((count+1)*32767),$32767.);
           end;
           else;
                str21 = input(_infile_$32767.);
           end;
     run;
%mend;
%count;

But it gave me several errors:

ERROR: The _INFILE_ variable cannot be referenced by the INPUT statement.
WARNING: Apparent symbolic reference COUNT not resolved.
ERROR 388-185: Expecting an arithmetic operator.
ERROR 180-322: Statement is not valid or it is used out of proper order.
ERROR 76-322: Syntax error, statement will be ignored.
ERROR 160-185: No matching IF-THEN clause.
ERROR 78-322: Expecting a ','.
ERROR 161-185: No matching DO/SELECT statement.

Why is the COUNT variable not resolved?

ChrisNZ
Tourmaline | Level 20

Because you never defined the COUNT macro variable that you use in str&count.

Also, then do; doesn't have a closing end;

You ought to be more careful and properly check your code before posting here and asking questions to people who will use their time and skill to help you.

Anyway, the logic you are trying to use is flawed as _infile_ cannot be longer than lrecl, i.e. 32k max.

Again, the only way to read such a long record afaik is to stream it.
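
To illustrate the point about &count: macro references are resolved while the step is being compiled, before the DO loop runs, so a DATA step loop counter cannot build variable names like str&count. An array indexed by the counter is the usual substitute. The sketch below uses a made-up 36-character string and an arbitrary chunk size of 10; it only shows the indexing pattern and does not get around the record-length problem described above.

```sas
/* Sketch: split a string into fixed-size pieces with an array.       */
/* The toy string, the names and the chunk size of 10 are assumptions */
data chunks (keep=str1-str4);
    length line $40;
    array str{4} $10 str1-str4;
    line = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    do count = 1 to dim(str);
        start = (count - 1) * 10 + 1;        /* first position of this piece */
        if start <= length(line) then
            str{count} = substr(line, start, 10);
    end;
run;
```

Here str{count} refers to str1, str2 and so on at run time, which is what the str&count expression was trying to do.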

DanielKaiser
Pyrite | Level 9

Your code above overwrote my file and filled it with something like !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!""""""""""""""""""""""""""""""""""""""""""""""""§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§§ and so on...
Glad I had a backup.

I have now submitted a ticket to SAS Support.

ChrisNZ
Tourmaline | Level 20

Did you read the comment *create one very long record; ?

Why did you recreate the sample data?

Did you even try to understand what the code does?

DanielKaiser
Pyrite | Level 9

I tried, but I understood about, uh... nothing. Sorry.

Ksharp
Super User

You did not define macro variable &count.

data _null_;
file 'c:\temp\x.txt' lrecl=200000;
input x $char1. @@;     
retain max;
if x=':' then do;x='=';     n=0;end;
 else if x=',' then do;
                      n+1; max=max(max,n) ;
                           name= cats('p',n,'=');
                           put +(-1) ' ' name @;
                           call symputx('max',max);
                           return;
                         end;
put +(-1) x @;
cards4;
Host: 10.131.113.131 (SAP00931.lan.****.de) Ports: 80/open/tcp//http//Microsoft IIS httpd 6.0/, 1042/open/tcp//msrpc//Microsoft Windows RPC/, 2555/open/tcp/////, 2580/open/tcp//tributary?///, 3389/open/tcp//ms-wbt-server//Microsoft Terminal Service/, 4999/open/tcp//hfcs-manager?/// Ignored State: closed (8997) OS: Microsoft Windows Server 2003 SP1 or SP2
;;;;
run;

data have;
 infile 'c:\temp\x.txt'  lrecl=200000;
 input (host ports p1 - p&max State os) (= $2000.)  ;
run;

Ksharp

ChrisNZ
Tourmaline | Level 20

lrecl=200000 ? Wow, that's great!

When was the 32k limit lifted?

The online doc for lrecl in 9.3 states:

Range:1–32767
