BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
celdelmas
Calcite | Level 5

Hi All,

 

I am using a R code from someone else for the creation of a data set and I would like to write my own code in SAS but I don't know how to do it.

 

I have N text files (output of a particle counter). They are named as F_1, F_2 to F_N.

I need to extract 300 lines of one column (the number of particules counted) of these excel files; always the same: drop the first 18 lines (file description, no data); then extract lines 1 to 300 (so in fact 19 to 319), column number 4.

 

The final file (called "RES" in the script below) need to be :

each line corresponding to the N files

and in columns the 300 values extracted (from L1 to 300 of each file, column 4)

 

The final excel file should be:

        L1.......................L300

F_1

F_2

.

.

.

F_N

 

Here is the R code:

 

N<-130
l<-300

RASSEMBLE<-function(N){
 
MAT<-matrix(NA,N,l)
 
for (i in 1:N){
            file<-paste("F_",i,".txt", sep="") 
            name_var<-paste("F_",i,".txt", sep="")  
            data<-read.table(file,header=F,skip=18,fill=T)
            VAR<-data[1:300,4]               
            MAT[i,]<-VAR
}
row.names(MAT)<-seq(from=1,to=N,by=1)
return(MAT)
}
 
RES<-RASSEMBLE(N)
RES

 

 

If anyone could help me to write a SAS code doing this, it will help me a lot to start using MACRO and loop in SAS and advance in my skills,

 

Thank you very much 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Are the files text files or excel files?

I do not see any need to use macro for this problem as a SAS data step can read a text file.

Do the files have delimiters? If so then modify the INFILE statement appropriately.

 

%let N = 130 ;
%let lines= 300 ;

data res ;
  do fileno=1 to &n ;
    length file filename $200 ;
    filename=cats('F_',fileno,'.txt');
    file=filename ;
    infile txt filevar=file truncover end=eof;
    do skip=1 to 18 while (not eof);
      input;
    end;
    do obs=1 to &lines while (not eof);
      length dummy $200 ;
      input dummy dummy dummy VAR ;
      output;
    end;
  end;
  drop skip dummy ;
run;

So this should build a table with the variables FILENO, FILENAME, OBS, VAR .  Where FILENO goes from 1 to &N and OBS goes from 1 to &lines within in each file.

View solution in original post

10 REPLIES 10
Tom
Super User Tom
Super User

Are the files text files or excel files?

I do not see any need to use macro for this problem as a SAS data step can read a text file.

Do the files have delimiters? If so then modify the INFILE statement appropriately.

 

%let N = 130 ;
%let lines= 300 ;

data res ;
  do fileno=1 to &n ;
    length file filename $200 ;
    filename=cats('F_',fileno,'.txt');
    file=filename ;
    infile txt filevar=file truncover end=eof;
    do skip=1 to 18 while (not eof);
      input;
    end;
    do obs=1 to &lines while (not eof);
      length dummy $200 ;
      input dummy dummy dummy VAR ;
      output;
    end;
  end;
  drop skip dummy ;
run;

So this should build a table with the variables FILENO, FILENAME, OBS, VAR .  Where FILENO goes from 1 to &N and OBS goes from 1 to &lines within in each file.

celdelmas
Calcite | Level 5

Hi Tom,

 

thank you very much for your quick reply,

 

I have excel files OR text files depending on the projetcs I am working on, I should be able to correct the script as needed.

 

I got stuck at a very simple step: how do you enter the physical location of the N text files?

 

I tried this but it is not working:

filename mydata "/folders/myshortcuts/recherche/test";

Thank you very much for your help!

 

Tom
Super User Tom
Super User

You probably do not want to use a fileref for this problem. I would put that path into a macro variable then include the value of the macro varaible when building the name of the input file.

 

%let path= /folders/myshortcuts/recherche/test ;
...
   filename=cats("&path/F_",fileno,'.txt');
  
celdelmas
Calcite | Level 5

Hi,

 

Thank you very much for your reply. Unfortunaltely, the code is not working (I am using it in SAS Studio; it lasts forever without ending).

I only need the 4th column of each TXT file and I think this statement is not in the code you proposed to me, where can I add this?

Maybe this is the reason it's not working?

 

Thanks in advance for your help!

 

Tom
Super User Tom
Super User

Normally a SAS dataset will stop when it reads past the end of the input file, but since in this program you are both reading many files in one step and also only reading 300 records from the files it will never read past the end of the file.  So for this program you probably need to add a STOP statement just before the RUN statement to prevent it from running forever.

 

The INPUT statement is reading four values from the line.  SInce you said you only wanted the fourth value the first three values are reading in the variable DUMMY (overwriting it twice on each line) and then this variable is dropped so that it is not included in the output dataset.

celdelmas
Calcite | Level 5

Than you again for your help!

With the STOP statement it is working but the VAR column in the output file is empty. I try a few things and I look for answers in the forum but I could not get it works. In case you still have time to help me, I attached  2 txt files so you can try the loop and maybe find a solution...

 

Here is the script (simplified to 2 files and 60 lines to extract; the column of interest is still the 4th)

%let path= /folders/myshortcuts/recherche/test;

%let N = 2 ;
%let lines= 60 ;

data res ;
  do fileno=1 to &n ;
    length file filename $200 ;
    filename=cats("&path/F_",fileno,'.txt');
    file=filename ;
    infile  txt filevar=file truncover end=eof;
    do skip=1 to 18 while (not eof);
      input;
    end;
    do obs=1 to &lines while (not eof);
      length dummy $200 ;
      input dummy dummy dummy VAR ;
      output;
    end;
  end;
  drop skip dummy ;
  STOP;
run;

Thank you very much for your help!

 

Tom
Super User Tom
Super User

Like I said before you need to look at your file and see how it is formatted so that you can read it. The tabular data at the end of your file is delimited with tabs.  So modify the INFILE statement so that SAS knows that.

    infile  txt filevar=file dsd dlm='09'x truncover end=eof;

You probably want to read other than just the fourth column from these files. The header rows your are skipping seem to have information in them. Also I noticed that the first column in the tabular results section seems to be a BIN number with value less than 300. Perhaps this is important to read?

     input BIN dummy dummy VAR ; 

Also I noticed that some of the numbers have embedded commas.  So if you want to read them as numbers you will need to use the COMMA. informat. 

 

celdelmas
Calcite | Level 5

Thank you for your replies and suggestions!! 

 

The only important thing I would need in addition to the 4th column is the Sample ID (1A1 for example), on the 4th line of each file. I thought I could use a different loop to get those or you think it is feasible in the same?

 

Tom
Super User Tom
Super User

You can look for the sample id while you are skipping the headers.

 

    do skip=1 to 18 while (not eof);
      input;
      length sample_id $10;
      if _infile_=:'Sample ID' then sample_id = scan(_infile_,-1,':'||'09'x);
    end;
celdelmas
Calcite | Level 5

This is just working perfectly!! Thank you very much!!!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 3370 views
  • 0 likes
  • 2 in conversation