tabraz
Fluorite | Level 6

I have a few files in a directory, all in the same format.

Sample file (all files have the same layout, only the records differ; files attached):

 

 

File name.................................: abcdeeffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff.txt
created.....................: 01JAN1999 16:25:17
number..............................: 1234
name.................................: test
number1............................: 0010
xx..........................: xx
type...........................: 01
tmt ........................: 01
initials.........................: ha


Start time and date.......................: 06OCT1999 16:31:25
task one...........................: 06OCT1999 16:31:38
task two...........................: 06OCT1999 16:32:42

 


Available ............: 40

casr10......: 200.0
case1...........: 38
case2...........: 7

 


New[(xx)^2]:type1
                                         wrong     right
                                     one  two    one      two
AAA (1.1 - 6.0Hz):   1.101 1.101  1.111   1.105
BBB (1.2 - 8.5Hz):   1.782 1.101  1.121   1.106
CCC (1.3 - 12.5Hz): 1.403 1.101  1.131  1.107
DDD (1.4 - 30.0Hz): 1.714 1.101  1.141  1.108


New[(xx)^2]: type2
                                        wrong  right
                                  one two one two
AAA (1.1 - 6.0Hz):   1.101   1.101  2.111  6.105
BBB (1.2 - 8.5Hz):   1.782   1.101 3.121  7.106
CCC (1.3 - 12.5Hz): 1.403  1.101 4.131  8.107
DDD (1.4 - 30.0Hz): 1.714  1.101  5.141  9.108

completed at.....................: 06OCT1999 16:45:28

 

 

I am reading all files from a directory and trying to read the fields below:

#3 number as old_number

#5 number1 as new_num

#7 type

#12 time (format time8.)

 

Then I want to read the fields below:

 

New[(xx)^2]:type1
                                         wrong     right
                                     one  two    one      two
AAA (1.1 - 6.0Hz):   1.101 1.101  1.111   1.105
BBB (1.2 - 8.5Hz):   1.782 1.101  1.121   1.106
CCC (1.3 - 12.5Hz): 1.403 1.101  1.131  1.107
DDD (1.4 - 30.0Hz): 1.714 1.101  1.141  1.108


New[(xx)^2]: type2
                                        wrong  right
                                  one two one two
AAA (1.1 - 6.0Hz):   1.101   1.101  2.111  6.105
BBB (1.2 - 8.5Hz):   1.782   1.101 3.121  7.106
CCC (1.3 - 12.5Hz): 1.403  1.101 4.131  8.107
DDD (1.4 - 30.0Hz): 1.714  1.101  5.141  9.108

completed at.....................: 06OCT1999 16:45:28

 

 

The output variables should look like this:

AAA_Type1_Right_ONE    AAA_Type1_Right_TWO
1.111                  1.105

AAA_Type2_Right_ONE    AAA_Type2_Right_TWO
2.111                  6.105

BBB_Type1_Right_ONE    BBB_Type1_Right_TWO
1.121                  1.106

BBB_Type2_Right_ONE    BBB_Type2_Right_TWO
3.121                  7.106

(and the same for CCC and DDD, for both type1 and type2)

 

 

 

   

 

 

 

The final output should also have the fields below, in addition to the fields mentioned above:

path, filename, nc, size, date1, dname, old_number, new_number, type, time, and the fields for AAA, BBB, CCC, DDD as mentioned above.

 

 

My code for reading the list of files in the directory into a SAS dataset:

 

%let x=test;
%let DELIM = ' ' ;
%let DELIM2 = '_' ;

filename DIRLIST pipe "dir /-c /q /t:c %bquote("C:\Users\Tanwar\Desktop\&x\*.txt")";
data xxx (DROP=dir_rec line);
length path filename $255 line $1024 nc $50;
retain path ;
infile DIRLIST length=reclen ;
input line $varying1024. reclen ;
if reclen = 0 then delete ;
if scan(line,1,&DELIM)='Volume'|scan(line,1,&DELIM)='Total'|scan(line,2,&DELIM)='File(s)'|scan(line,2,&DELIM)='Dir(s)' then delete;
dir_rec=upcase(scan(line,1,&DELIM))='DIRECTORY';
if dir_rec then path=left(substr(line,length("Directory of")+2)) ;
else do ;
size = input( scan( line, 3, &DELIM ) , best. ) ;
filename = scan( line, 5, &DELIM ) ;
if filename in ( '.' '..' ) then delete ;
if find(substr(filename,1,5),"_")>0 then do;
if find("&x","_")>0 then date1 = abs(input( scan( line, 6, &DELIM2 ), date9. ));
if find("&x","_")<=0 and find("&x","FPS")<=0 then date1 = abs(input( scan( line, 5, &DELIM2 ), date9. ));
if find("&x","FPS")>0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
end;
if find(substr(filename,1,5),"_")=0 then do;
if find("&x","_")>0 then date1 = abs(input( scan( line, 5, &DELIM2 ), date9. ));
if find("&x","_")<=0 and find("&x","FPS")<=0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
if find("&x","FPS")>0 then date1 = abs(input( scan( line, 4, &DELIM2 ), date9. ));
end;
end ;
nc="1";
run ;

 

data yyyy;
set xxx;
if missing(filename) or find(filename,'Copy')>0 then delete;
dname=trim(left(path))||"\"||trim(left(filename));
run;

 

Please help me with this.

 

Kurt_Bremser
Super User

Does every file have a single instance of this report, or could there be a sequence of reports in one file?

 

The basic algorithm looks like this

- retain all your variables

- read a line

- if line starts with "File name", and file_name > " ", output; and reset all vars. set file_name to scan(line,2,':')

- check line for appearance of keywords, and extract information as with file_name

- if line starts with one of AAA,BBB,CCC,DDD, take scan(line,2,':') and then use input(scan(...,3),5.1) and input(scan(...,4),5.) to get the respective numerical values

- at EOF, output (last group)

 

You can read all text files in a directory in one sweep, using wildcards in the infile statement. The above algorithm can deal with that.
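
A minimal sketch of that idea, assuming the directory from the question and the sample layout shown there. The dataset name `headers` and the variable `source_file` are made up for illustration, and only three header fields are extracted; the value rows would be handled the same way.

```sas
/* Sketch: read every .txt file in one pass through a wildcard INFILE. */
/* The path is the one from the question; adjust it as needed.         */
data headers (keep=source_file old_number new_num type);
  length fname source_file $260 line $200 old_number new_num $4 type $2;
  retain source_file old_number new_num type;
  infile "C:\Users\Tanwar\Desktop\test\*.txt"
         truncover filename=fname end=eof;
  input line $200.;

  /* A "File name" line starts a new report: write out the previous one */
  if index(line, 'File name') = 1 then do;
    if source_file > ' ' then output;
    call missing(old_number, new_num, type);
    /* FILENAME= variables are not written to the dataset, so copy it */
    source_file = fname;
  end;
  /* test "number1" before "number", because both start with "number" */
  else if index(line, 'number1') = 1 then new_num = strip(scan(line, 2, ':'));
  else if index(line, 'number') = 1 then old_number = strip(scan(line, 2, ':'));
  else if index(line, 'type') = 1 then type = strip(scan(line, 2, ':'));

  /* END= becomes true on the last record of the last file */
  if eof then output;
run;
```

Because `source_file` holds the full path of each input file, it could later be matched against a directory listing such as the `dname` variable built in the question.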

 

tabraz
Fluorite | Level 6

Hi ,

 

Thanks for your response on this.

 

Yes, every file has a single instance of this report. The file format and sequence remain the same for all files.

Kurt_Bremser
Super User

Here is a very crude piece of code, tested with the contents of test1.txt:

data want;
infile '$HOME/sascommunity/tabraz.txt' truncover end=eof;
input line $200.;
length
  type_flag $5
  file_name $100
  old_number
  new_num $4
  type $2
  time 8
  aaa_type1_right_one
  aaa_type1_right_two
  aaa_type2_right_one
  aaa_type2_right_two
  bbb_type1_right_one
  bbb_type1_right_two
  bbb_type2_right_one
  bbb_type2_right_two
  ccc_type1_right_one
  ccc_type1_right_two
  ccc_type2_right_one
  ccc_type2_right_two
  ddd_type1_right_one
  ddd_type1_right_two
  ddd_type2_right_one
  ddd_type2_right_two 8
;
format time datetime19.;
retain
  type_flag
  file_name 
  old_number
  new_num
  type
  time
  aaa_type1_right_one
  aaa_type1_right_two
  aaa_type2_right_one
  aaa_type2_right_two
  bbb_type1_right_one
  bbb_type1_right_two
  bbb_type2_right_one
  bbb_type2_right_two
  ccc_type1_right_one
  ccc_type1_right_two
  ccc_type2_right_one
  ccc_type2_right_two
  ddd_type1_right_one
  ddd_type1_right_two
  ddd_type2_right_one
  ddd_type2_right_two
;
if index(line,"File name") = 1
then do;
  if file_name > ' '
  then do;
    output;
    old_number = ' ';
  end;
  file_name = strip(scan(line,2,':'));
end;
select (substr(line,1,3));
  when ('AAA') do;
    line = scan(line,2,':');
    select (type_flag);
      when ('type1') do;
        aaa_type1_right_one = input(scan(line,3,'09'x),5.);
        aaa_type1_right_two = input(scan(line,4,'09'x),5.);
      end;
      when ('type2') do;
        aaa_type2_right_one = input(scan(line,3,'09'x),5.);
        aaa_type2_right_two = input(scan(line,4,'09'x),5.);
      end;
      otherwise;
    end;
  end;
  otherwise do;
    if index(line,'number') = 1
    then do;
      if old_number = ' '
      then old_number = strip(scan(line,2,':'));
      else new_num = strip(scan(line,2,':'));
    end;
    if index(line,'New[(xx)^2]') = 1 then type_flag = strip(scan(line,2,':'));
    if index(line,'created') = 1
    then do;
      line = strip(substr(line,indexc(line,':')+1));
      substr(line,10,1) = ':';
      time = input(line,datetime19.);
    end;
    if index(line,'type') = 1 then type = strip(scan(line,2,':'));
  end;
end;
if eof then output;
run;

One can pack the 'AAA' branch into a macro, so that it has to be written only once.
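
A possible sketch of such a macro, reusing the file path, the tab delimiter ('09'x) and the variable naming from the data step above. The macro names `band` and `bandvars` and the dataset name `want_values` are invented here, and the step only collects the sixteen values, not the header fields.

```sas
/* Helper: the four value variables that belong to one row label */
%macro bandvars(prefix);
  &prefix._type1_right_one &prefix._type1_right_two
  &prefix._type2_right_one &prefix._type2_right_two
%mend bandvars;

/* One WHEN branch of the SELECT, generated per row label */
%macro band(prefix);
  when ("%upcase(&prefix)") do;
    line = scan(line, 2, ':');
    select (type_flag);
      when ('type1') do;
        &prefix._type1_right_one = input(scan(line, 3, '09'x), 5.);
        &prefix._type1_right_two = input(scan(line, 4, '09'x), 5.);
      end;
      when ('type2') do;
        &prefix._type2_right_one = input(scan(line, 3, '09'x), 5.);
        &prefix._type2_right_two = input(scan(line, 4, '09'x), 5.);
      end;
      otherwise;
    end;
  end;
%mend band;

data want_values;
  infile '$HOME/sascommunity/tabraz.txt' truncover end=eof;
  input line $200.;
  length type_flag $5
         %bandvars(aaa) %bandvars(bbb) %bandvars(ccc) %bandvars(ddd) 8;
  retain type_flag
         %bandvars(aaa) %bandvars(bbb) %bandvars(ccc) %bandvars(ddd);
  select (substr(line, 1, 3));
    %band(aaa)
    %band(bbb)
    %band(ccc)
    %band(ddd)
    otherwise do;
      /* remember which block (type1 or type2) the next rows belong to */
      if index(line, 'New[(xx)^2]') = 1 then type_flag = strip(scan(line, 2, ':'));
    end;
  end;
  if eof then output;
  drop line type_flag;
run;
```

The macro calls are written without a trailing semicolon because each call expands to complete statements inside the SELECT group.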

tabraz
Fluorite | Level 6

Thanks for the reply.

When I tuned the script to read more records from the actual file, it gave the wrong output. I tried to debug it but could not find a solution.

I am sharing a replica of the actual file.

Fields that I need in the output:

 

S_number as S_number

Sub_number as sub_number

occ number as occ

imp as imp

imp start as imp_start

imp end as imp_end

aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two

aaa_toto_limit_one
aaa_toto_limit_two

aaa_pipi_limit_one
aaa_pipi_limit_two

aaa_rkrk_limit_one
aaa_rkrk_limit_two

For example, aaa_abli_limit_one will have 1.1
and aaa_abli_limit_two will have 1.1

The same applies to BBB, CCC, DDD.

 

However, I am still trying to solve this. Your help is much appreciated.

tabraz
Fluorite | Level 6

I am sharing the sample code.

However, it is not the actual code I was using, as that was not saved.

 

data want;
infile 'C:\Desktop\test\test5.txt' truncover end=eof;
input line $200.;
length
type_flag $5
file_name $100
S_number
Sub_number
occ $4
imp
imp_start
imp_end
aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two
/*aaa_toto _limit_one*/
/*aaa_toto_limit_two*/
/*aaa_pipi_limit_one*/
/*aaa_pipi_limit_two*/
/*aaa_rkrk_limit_one*/
/*aaa_rkrk _limit_two*/
/**/
/*bbb_abli_limit_one*/
/*bbb_abli_limit_two*/
/*bbb_joyl _limit_one*/
/*bbb_joyl _limit_two*/
/*bbb_toto _limit_one*/
/*bbb_toto_limit_two*/
/*bbb_pipi_limit_one*/
/*bbb_pipi_limit_two*/
/*bbb_rkrk_limit_one*/
/*bbb_rkrk _limit_two*/
/**/
/*ccc_abli_limit_one*/
/*ccc_abli_limit_two*/
/*ccc_joyl _limit_one*/
/*ccc_joyl _limit_two*/
/*ccc_toto _limit_one*/
/*ccc_toto_limit_two*/
/*ccc_pipi_limit_one*/
/*ccc_pipi_limit_two*/
/*ccc_rkrk_limit_one*/
/*ccc_rkrk _limit_two*/
/**/
/*ddd_abli_limit_one*/
/*ddd_abli_limit_two*/
/*ddd_joyl _limit_one*/
/*ddd_joyl _limit_two*/
/*ddd_toto _limit_one*/
/*ddd_toto_limit_two*/
/*ddd_pipi_limit_one*/
/*ddd_pipi_limit_two*/
/*ddd_rkrk_limit_one*/
/*ddd_rkrk _limit_two*/
8
;
format imp_start datetime19.;
format imp_end datetime19.;
retain
type_flag
file_name
S_number
Sub_number
occ
imp
imp_start
imp_end
aaa_abli_limit_one
aaa_abli_limit_two
aaa_joyl_limit_one
aaa_joyl_limit_two
aaa_toto_limit_one
aaa_toto_limit_two
aaa_pipi_limit_one
aaa_pipi_limit_two
aaa_rkrk_limit_one
aaa_rkrk_limit_two
/**/
/*bbb_abli_limit_one*/
/*bbb_abli_limit_two*/
/*bbb_joyl _limit_one*/
/*bbb_joyl _limit_two*/
/*bbb_toto _limit_one*/
/*bbb_toto_limit_two*/
/*bbb_pipi_limit_one*/
/*bbb_pipi_limit_two*/
/*bbb_rkrk_limit_one*/
/*bbb_rkrk _limit_two*/
/**/
/*ccc_abli_limit_one*/
/*ccc_abli_limit_two*/
/*ccc_joyl _limit_one*/
/*ccc_joyl _limit_two*/
/*ccc_toto _limit_one*/
/*ccc_toto_limit_two*/
/*ccc_pipi_limit_one*/
/*ccc_pipi_limit_two*/
/*ccc_rkrk_limit_one*/
/*ccc_rkrk _limit_two*/
/**/
/*ddd_abli_limit_one*/
/*ddd_abli_limit_two*/
/*ddd_joyl _limit_one*/
/*ddd_joyl _limit_two*/
/*ddd_toto _limit_one*/
/*ddd_toto_limit_two*/
/*ddd_pipi_limit_one*/
/*ddd_pipi_limit_two*/
/*ddd_rkrk_limit_one*/
/*ddd_rkrk _limit_two*/
;
if index(line,"File name") = 1
then do;
if file_name > ' '
then do;
output;
S_number = ' ';
end;
file_name = strip(scan(line,2,':'));
end;
select (substr(line,1,5));
when ('abli') do;
line = scan(line,3,':');
select (type_flag);
/* every WHEN ... DO group needs its own END */
when ('delta') do;
aaa_abli_limit_one = input(scan(line,3,'09'x),5.);
aaa_abli_limit_two = input(scan(line,4,'09'x),5.);
end;
when ('joyl') do;
aaa_joyl_limit_one = input(scan(line,3,'09'x),5.);
aaa_joyl_limit_two = input(scan(line,4,'09'x),5.);
end;
when ('toto_') do;
aaa_toto_limit_one = input(scan(line,3,'09'x),5.);
aaa_toto_limit_two = input(scan(line,4,'09'x),5.);
end;
when ('pipi') do;
aaa_pipi_limit_one = input(scan(line,3,'09'x),5.);
aaa_pipi_limit_two = input(scan(line,4,'09'x),5.);
end;
when ('rkrk') do;
aaa_rkrk_limit_one = input(scan(line,3,'09'x),5.);
aaa_rkrk_limit_two = input(scan(line,4,'09'x),5.);
end;
/* same for BBB, CCC, DDD */
otherwise;
end;
end;
otherwise do;
if index(line,'S_number ') = 1
then do;
if S_number = ' '
then S_number = strip(scan(line,2,':'));
end;
/* checked separately: a line cannot start with both keywords */
if index(line,'Subject number') = 1
then Sub_number = strip(scan(line,5,':'));
if index(line,'play [(uV)^2]') = 1 then type_flag = strip(scan(line,2,':'));
if index(line,'Date and time created') = 1
then do;
line = strip(substr(line,indexc(line,':')+1));
substr(line,10,1) = ':';
imp_start = input(line,datetime19.);
substr(line,11,1) = ':';
imp_start = input(line,datetime19.);
substr(line,19,1) = ':';
imp = input(line,datetime19.);
end;
if index(line,'Occasion') = 1 then occ = strip(scan(line,2,':'));
end;
end;
if eof then output;
run;

Kurt_Bremser
Super User

When I wrote

 

"Use the "little running man" icon to post code, so we can safely copy/paste and run it."

 

I didn't do it just because I was incredibly bored. That subwindow preserves all formatting, while the main posting window virtually eliminates it, making the code next to unreadable.


tabraz
Fluorite | Level 6

Hi,

 

I managed to read the file 🙂

 

Thanks for your help.

 

Regards

 
