tabraz
Fluorite | Level 6

I have a few files in a directory (all in the same format).

Sample file (all files have the same layout, just with different records; files attached):

 

 

File name.................................: abcdeeffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff.txt
created.....................: 01JAN1999 16:25:17
number..............................: 1234
name.................................: test
number1............................: 0010
xx..........................: xx
type...........................: 01
tmt ........................: 01
initials.........................: ha


Start time and date.......................: 06OCT1999 16:31:25
task one...........................: 06OCT1999 16:31:38
task two...........................: 06OCT1999 16:32:42

 


Available ............: 40

casr10......: 200.0
case1...........: 38
case2...........: 7

 


New[(xx)^2]:type1
                                         wrong     right
                                     one  two    one      two
AAA (1.1 - 6.0Hz):   1.101 1.101  1.111   1.105
BBB (1.2 - 8.5Hz):   1.782 1.101  1.121   1.106
CCC (1.3 - 12.5Hz): 1.403 1.101  1.131  1.107
DDD (1.4 - 30.0Hz): 1.714 1.101  1.141  1.108


New[(xx)^2]: type2
                                        wrong  right
                                  one two one two
AAA (1.1 - 6.0Hz):   1.101   1.101  2.111  6.105
BBB (1.2 - 8.5Hz):   1.782   1.101 3.121  7.106
CCC (1.3 - 12.5Hz): 1.403  1.101 4.131  8.107
DDD (1.4 - 30.0Hz): 1.714  1.101  5.141  9.108

completed at.....................: 06OCT1999 16:45:28

 

 

I am reading all files from a directory and trying to read the fields below:

#3 number as old_number

#5 number1 as new_num

#7 type

#12 time time8.

 

Then I want to read the fields below:

 

New[(xx)^2]:type1
                                         wrong     right
                                     one  two    one      two
AAA (1.1 - 6.0Hz):   1.101 1.101  1.111   1.105
BBB (1.2 - 8.5Hz):   1.782 1.101  1.121   1.106
CCC (1.3 - 12.5Hz): 1.403 1.101  1.131  1.107
DDD (1.4 - 30.0Hz): 1.714 1.101  1.141  1.108


New[(xx)^2]: type2
                                        wrong  right
                                  one two one two
AAA (1.1 - 6.0Hz):   1.101   1.101  2.111  6.105
BBB (1.2 - 8.5Hz):   1.782   1.101 3.121  7.106
CCC (1.3 - 12.5Hz): 1.403  1.101 4.131  8.107
DDD (1.4 - 30.0Hz): 1.714  1.101  5.141  9.108

completed at.....................: 06OCT1999 16:45:28

 

 

AAA_Type1_Right_ONE    AAA_Type1_Right_TWO
1.111                  1.105

AAA_Type2_Right_ONE    AAA_Type2_Right_TWO
2.111                  6.105

BBB_Type1_Right_ONE    BBB_Type1_Right_TWO
1.121                  1.106

BBB_Type2_Right_ONE    BBB_Type2_Right_TWO
3.121                  7.106

 

(and same for CCC and DDD for both type1 and type2)

 

 

 

   

 

 

 

The final output should also have the fields below, in addition to the fields mentioned above:

path, filename, nc, size, date1, dname, old_number, new_number, type, time, and the fields for AAA, BBB, CCC, DDD as mentioned above.

 

 

My piece of code for reading all file names from the directory into a SAS dataset:

 

%let x=test;
%let DELIM = ' ' ;
%let DELIM2 = '_' ;

filename DIRLIST pipe "dir /-c /q /t:c %bquote("C:\Users\Tanwar\Desktop\&x\*.txt")";
data xxx (DROP=dir_rec line);
  length path filename $255 line $1024 nc $50;
  retain path ;
  infile DIRLIST length=reclen ;
  input line $varying1024. reclen ;
  if reclen = 0 then delete ;
  if scan(line,1,&DELIM)='Volume'|scan(line,1,&DELIM)='Total'|scan(line,2,&DELIM)='File(s)'|scan(line,2,&DELIM)='Dir(s)' then delete;
  dir_rec=upcase(scan(line,1,&DELIM))='DIRECTORY';
  if dir_rec then path=left(substr(line,length("Directory of")+2)) ;
  else do ;
    size = input( scan( line, 3, &DELIM ) , best. ) ;
    filename = scan( line, 5, &DELIM ) ;
    if filename in ( '.' '..' ) then delete ;
    if find(substr(filename,1,5),"_")>0 then do;
      if find("&x","_")>0 then date1 = abs(input( scan( line, 6, &DELIM2 ), date9. ));
      if find("&x","_")<=0 and find("&x","FPS")<=0 then date1 = abs(input( scan( line, 5, &DELIM2 ), date9. ));
      if find("&x","FPS")>0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
    end;
    if find(substr(filename,1,5),"_")=0 then do;
      if find("&x","_")>0 then date1 = abs(input( scan( line, 5, &DELIM2 ), date9. ));
      if find("&x","_")<=0 and find("&x","FPS")<=0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
      if find("&x","FPS")>0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
    end;
  end ;
  nc="1";
run ;

 

data yyyy;
set xxx;
if missing(filename) or find(filename,'Copy')>0 then delete;
dname=trim(left(path))||"\"||trim(left(filename));
run;

 

Please help me with this.

 

Kurt_Bremser
Super User

Does every file have a single instance of this report, or could there be a sequence of reports in one file?

 

The basic algorithm looks like this

- retain all your variables

- read a line

- if line starts with "File name", and file_name > " ", output; and reset all vars. set file_name to scan(line,2,':')

- check line for appearance of keywords, and extract information as with file_name

- if line starts with one of AAA,BBB,CCC,DDD, take scan(line,2,':') and then use input(scan(...,3),5.1) and input(scan(...,4),5.) to get the respective numerical values

- at EOF, output (last group)

 

You can read all text files in a directory in one sweep, using wildcards in the infile statement. The above algorithm can deal with that.
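The steps above, combined with the wildcard read, could be sketched roughly as follows. This is only a sketch under assumptions: the folder C:\temp\reports, the dataset name want and the variables fname and src_file are placeholders of mine, and only the first "number" field is extracted to keep it short.

```sas
/* Sketch only: path, dataset name and variable names are assumptions */
data want (drop=line);
  length fname src_file $260 line $200 old_number $4;
  retain src_file old_number;
  /* the wildcard reads every .txt file in the folder in one sweep;   */
  /* FILENAME= holds the file currently being read and is not written */
  /* to the output dataset, so it is copied into src_file below       */
  infile 'C:\temp\reports\*.txt' truncover filename=fname end=eof;
  input line $200.;
  if index(line,'File name') = 1 then do;
    /* a new report starts: write the previous one, then reset */
    if src_file > ' ' then output;
    call missing(old_number);
    src_file = fname;
  end;
  /* the first line starting with "number" fills old_number */
  else if index(line,'number') = 1 and old_number = ' ' then
    old_number = strip(scan(line,2,':'));
  /* END= becomes true on the last record of the last file */
  if eof then output;
run;
```

The remaining keywords and the AAA to DDD rows would be handled with further conditions of the same kind before the final output.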

 

tabraz
Fluorite | Level 6

Hi ,

 

Thanks for your response on this.

 

Yes, every file has a single instance of this report. The file format and sequence remain the same for all files.

Kurt_Bremser
Super User

Here is a very crude piece of code, tested with the contents of test1.txt:

data want;
infile '$HOME/sascommunity/tabraz.txt' truncover end=eof;
input line $200.;
length
  type_flag $5
  file_name $100
  old_number
  new_num $4
  type $2
  time 8
  aaa_type1_right_one
  aaa_type1_right_two
  aaa_type2_right_one
  aaa_type2_right_two
  bbb_type1_right_one
  bbb_type1_right_two
  bbb_type2_right_one
  bbb_type2_right_two
  ccc_type1_right_one
  ccc_type1_right_two
  ccc_type2_right_one
  ccc_type2_right_two
  ddd_type1_right_one
  ddd_type1_right_two
  ddd_type2_right_one
  ddd_type2_right_two 8
;
format time datetime19.;
retain
  type_flag
  file_name 
  old_number
  new_num
  type
  time
  aaa_type1_right_one
  aaa_type1_right_two
  aaa_type2_right_one
  aaa_type2_right_two
  bbb_type1_right_one
  bbb_type1_right_two
  bbb_type2_right_one
  bbb_type2_right_two
  ccc_type1_right_one
  ccc_type1_right_two
  ccc_type2_right_one
  ccc_type2_right_two
  ddd_type1_right_one
  ddd_type1_right_two
  ddd_type2_right_one
  ddd_type2_right_two
;
if index(line,"File name") = 1
then do;
  if file_name > ' '
  then do;
    output;
    old_number = ' ';
  end;
  file_name = strip(scan(line,2,':'));
end;
select (substr(line,1,3));
  when ('AAA') do;
    line = scan(line,2,':');
    select (type_flag);
      when ('type1') do;
        aaa_type1_right_one = input(scan(line,3,'09'x),5.);
        aaa_type1_right_two = input(scan(line,4,'09'x),5.);
      end;
      when ('type2') do;
        aaa_type2_right_one = input(scan(line,3,'09'x),5.);
        aaa_type2_right_two = input(scan(line,4,'09'x),5.);
      end;
      otherwise;
    end;
  end;
  otherwise do;
    if index(line,'number') = 1
    then do;
      if old_number = ' '
      then old_number = strip(scan(line,2,':'));
      else new_num = strip(scan(line,2,':'));
    end;
    if index(line,'New[(xx)^2]') = 1 then type_flag = strip(scan(line,2,':'));
    if index(line,'created') = 1
    then do;
      line = strip(substr(line,indexc(line,':')+1));
      substr(line,10,1) = ':';
      time = input(line,datetime19.);
    end;
    if index(line,'type') = 1 then type = strip(scan(line,2,':'));
  end;
end;
if eof then output;
run;

One can pack the 'AAA' branch into a macro, so that it has to be written only once.
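A minimal sketch of such a macro, assuming the same tab-delimited layout as in the code above. The macro name band, the dataset demo and the hand-made input line are my own placeholders, not part of the tested code.

```sas
/* Sketch only: macro name, dataset name and the test line are assumptions */
%macro band(b);
  when ("%upcase(&b)") do;
    line = scan(line,2,':');
    select (type_flag);
      when ('type1') do;
        &b._type1_right_one = input(scan(line,3,'09'x),5.);
        &b._type1_right_two = input(scan(line,4,'09'x),5.);
      end;
      when ('type2') do;
        &b._type2_right_one = input(scan(line,3,'09'x),5.);
        &b._type2_right_two = input(scan(line,4,'09'x),5.);
      end;
      otherwise;
    end;
  end;
%mend band;

data demo;
  length line $200 type_flag $5;
  type_flag = 'type1';
  /* one hand-made report line; '09'x is the tab character */
  line = 'AAA (1.1 - 6.0Hz):' || '09'x || '1.101' || '09'x || '1.101'
         || '09'x || '1.111' || '09'x || '1.105';
  select (substr(line,1,3));
    /* each call generates one WHEN branch */
    %band(aaa)
    %band(bbb)
    %band(ccc)
    %band(ddd)
    otherwise;
  end;
run;
```

With that hand-made line, scan() picks the third and fourth tab-separated values, 1.111 and 1.105, for the two aaa_type1 variables. In the real step the four macro calls would replace the hand-written 'AAA' branch inside the existing select block.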

tabraz
Fluorite | Level 6

Thanks for the reply.

When I adapted the script to read more fields from the actual file, it gave wrong output. I tried to debug it but was not able to find a solution.

I am sharing a replica of the actual file.

Fields that I need in the output:

 

S_number as  S_number 

Sub_number as sub_number 

occ number as occc

imp as imp

imp start as imp_start

imp end as imp_end 

aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two

aaa_toto_limit_one
aaa_toto_limit_two

aaa_pipi_limit_one
aaa_pipi_limit_two

aaa_rkrk_limit_one
aaa_rkrk_limit_two

 

for example     aaa_abli_limit_one  will have      1.1 

                      aaa_abli_limit_one  will have        1.1

 

 

 

same for BBB,CCC,DDD.

 

However, I am still trying to solve this. Your help is much appreciated.

tabraz
Fluorite | Level 6

I am sharing the sample code.

However, it is not the actual code I was using, as that was not saved.

 

data want;
infile 'C:\Desktop\test\test5.txt' truncover end=eof;
input line $200.;
length
type_flag $5
file_name $100
S_number
Sub_number
occ $4
imp
imp_start
imp_end
aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two
/*aaa_toto _limit_one*/
/*aaa_toto_limit_two*/
/*aaa_pipi_limit_one*/
/*aaa_pipi_limit_two*/
/*aaa_rkrk_limit_one*/
/*aaa_rkrk _limit_two*/
/**/
/*bbb_abli_limit_one*/
/*bbb_abli_limit_two*/
/*bbb_joyl _limit_one*/
/*bbb_joyl _limit_two*/
/*bbb_toto _limit_one*/
/*bbb_toto_limit_two*/
/*bbb_pipi_limit_one*/
/*bbb_pipi_limit_two*/
/*bbb_rkrk_limit_one*/
/*bbb_rkrk _limit_two*/
/**/
/*ccc_abli_limit_one*/
/*ccc_abli_limit_two*/
/*ccc_joyl _limit_one*/
/*ccc_joyl _limit_two*/
/*ccc_toto _limit_one*/
/*ccc_toto_limit_two*/
/*ccc_pipi_limit_one*/
/*ccc_pipi_limit_two*/
/*ccc_rkrk_limit_one*/
/*ccc_rkrk _limit_two*/
/**/
/*ddd_abli_limit_one*/
/*ddd_abli_limit_two*/
/*ddd_joyl _limit_one*/
/*ddd_joyl _limit_two*/
/*ddd_toto _limit_one*/
/*ddd_toto_limit_two*/
/*ddd_pipi_limit_one*/
/*ddd_pipi_limit_two*/
/*ddd_rkrk_limit_one*/
/*ddd_rkrk _limit_two*/
8.
;
format imp_start datetime19.;
format imp_end datetime19.;
retain
type_flag
file_name
S_number
Sub_number
occ
imp
imp_start
imp_end
aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two
aaa_toto_limit_one
aaa_toto_limit_two
aaa_pipi_limit_one
aaa_pipi_limit_two
aaa_rkrk_limit_one
aaa_rkrk_limit_two
/**/
/*bbb_abli_limit_one*/
/*bbb_abli_limit_two*/
/*bbb_joyl _limit_one*/
/*bbb_joyl _limit_two*/
/*bbb_toto _limit_one*/
/*bbb_toto_limit_two*/
/*bbb_pipi_limit_one*/
/*bbb_pipi_limit_two*/
/*bbb_rkrk_limit_one*/
/*bbb_rkrk _limit_two*/
/**/
/*ccc_abli_limit_one*/
/*ccc_abli_limit_two*/
/*ccc_joyl _limit_one*/
/*ccc_joyl _limit_two*/
/*ccc_toto _limit_one*/
/*ccc_toto_limit_two*/
/*ccc_pipi_limit_one*/
/*ccc_pipi_limit_two*/
/*ccc_rkrk_limit_one*/
/*ccc_rkrk _limit_two*/
/**/
/*ddd_abli_limit_one*/
/*ddd_abli_limit_two*/
/*ddd_joyl _limit_one*/
/*ddd_joyl _limit_two*/
/*ddd_toto _limit_one*/
/*ddd_toto_limit_two*/
/*ddd_pipi_limit_one*/
/*ddd_pipi_limit_two*/
/*ddd_rkrk_limit_one*/
/*ddd_rkrk _limit_two*/
;
if index(line,"File name") = 1
then do;
if file_name > ' '
then do;
output;
S_number = ' ';
end;
file_name = strip(scan(line,2,':'));
end;
select (substr(line,1,5));
  when ('abli') do;
    line = scan(line,3,':');
    select (type_flag);
      when ('delta') do;
        aaa_abli_limit_one = input(scan(line,3,'09'x),5.);
        aaa_abli_limit_two = input(scan(line,4,'09'x),5.);
      end;
      when ('joyl') do;
        aaa_joyl_limit_one = input(scan(line,3,'09'x),5.);
        aaa_joyl_limit_two = input(scan(line,4,'09'x),5.);
      end;
      when ('toto_') do;
        aaa_toto_limit_one = input(scan(line,3,'09'x),5.);
        aaa_toto_limit_two = input(scan(line,4,'09'x),5.);
      end;
      when ('pipi') do;
        aaa_pipi_limit_one = input(scan(line,3,'09'x),5.);
        aaa_pipi_limit_two = input(scan(line,4,'09'x),5.);
      end;
      when ('rkrk') do;
        aaa_rkrk_limit_one = input(scan(line,3,'09'x),5.);
        aaa_rkrk_limit_two = input(scan(line,4,'09'x),5.);
      end;
      /* same for other BBB,CCC,DDD */
      otherwise;
    end;
  end;
otherwise do;
if index(line,'S_number ') = 1
then do;
  if S_number = ' '
  then S_number = strip(scan(line,2,':'));
end;
if index(line,'Subject number') = 1
then Sub_number = strip(scan(line,5,':'));
if index(line,'play [(uV)^2]') = 1 then type_flag = strip(scan(line,2,':'));
if index(line,'Date and time created') = 1
then do;
line = strip(substr(line,indexc(line,':')+1));
substr(line,10,1) = ':';
imp_start = input(line,datetime19.);
substr(line,11,1) = ':';
imp_start = input(line,datetime19.);
substr(line,19,1) = ':';
imp = input(line,datetime19.);
end;
if index(line,'Occasion') = 1 then occ = strip(scan(line,2,':'));
end;
end;
if eof then output;
run;

Kurt_Bremser
Super User

When I wrote

 

"Use the "little running man" icon to post code, so we can safely copy/paste and run it."

 

I didn't do it just because I was incredibly bored. That subwindow preserves all formatting, while the main posting window virtually eliminates it, making the code next to unreadable.

tabraz
Fluorite | Level 6

Hi,

 

I managed to read the file 🙂

 

Thanks for your help.

 

Regards

 
