zzmm
Calcite | Level 5

Hello,

I have hundreds of .txt files with the following data structure:

"date, (mm/dd/yyyy)", "weigh, (lb)", "height, (cm)",

"sys blood pressure", "dia blood pressure", "heart rate", "temperature"

"01/10/2011", "140", "170", "120", "85", "90", "99.2"

"01/12/2011", "139", "170", "126", "90", "96", "98.9"

............................................................................

.............................................................................

The first 2 rows are the variable names (my files actually have 5 such rows, but I listed only 2 here for simplicity). Each .txt file contains one subject's yearly info, and the file name is the subject's id (for example, 001002).

I need to read the hundreds of .txt files into one SAS data set and store the file names (ids) in one variable (id) in that data set. Can anyone help?

Thanks in advance,

16 REPLIES
Reeza
Super User

You can use wildcards in your infile statement, so get the code to read it for one file and then use something like the following:

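data input;
     length source $256;   /* declare the length so the full file name is not truncated */
     infile 'C:\this is my path\*.txt' filename=source;
     input blah blah blah;   /* placeholder: replace with your real INPUT statement */
     ptid=source;
run;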
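One thing to watch with a wildcard infile: as far as I know, firstobs= is applied only once to the combined input, not to each file, so the header rows of the second and later files would be read as data. A minimal sketch of one way around that is to watch for a change in the filename= variable and count records per file. The variable names v1-v3, the lengths, and the count of 5 header rows (taken from the question) are assumptions to adjust.

```sas
data want;
  length source prev $256 ptid $20;     /* ptid length is an assumption */
  retain prev ' ' recno 0;
  infile 'C:\this is my path\*.txt' dsd truncover filename=source;
  input @;                              /* load the next record and hold it; source is updated here */
  if source ne prev then do;            /* the record comes from a new file */
    prev  = source;
    recno = 0;
  end;
  recno + 1;                            /* record number within the current file */
  if recno <= 5 then delete;            /* skip the 5 header rows of every file */
  input v1 :$20. v2 :$20. v3 :$20.;     /* placeholder: replace with your real variables */
  ptid = scan(scan(source, -1, '/\'), 1, '.');   /* strip path and extension */
  drop prev recno;
run;
```

The trailing @ holds the record so the second input statement reads the same line once the header check has passed.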

Haikuo
Onyx | Level 15

This topic has been discussed many times; a search will show you a lot more.

Below is code I wrote a long time ago to read in my downloaded files:

filename fn pipe 'dir h:\temp\*.txt /s /b';

data want;
  infile fn truncover;
  input name:&$100.;
  infile in filevar=name end=last truncover firstobs=6;
  do until(last);
    input x$10.;
    output;
  end;
run;

Notes: 1. Change the folder to wherever your files are located. 2. firstobs= sets the starting row. Since you mentioned you have 5 rows of variable names before the data, I set it to 6. 3. input x$10. needs to be replaced with an input statement for your real variables.

Haikuo

zzzmm
Calcite | Level 5

I am working under Unix, so the pipe '........../s /b' does not seem to work. I will try to find the equivalent Unix command.

Regarding the input statement: as I mentioned earlier, the first 5 rows of each file hold about 50 variable names. Is there a way to avoid typing them in manually?

Thanks

FloydNevseta
Pyrite | Level 9

For the pipe, try 'ls /unix/dir/*.txt'.

zzzmm
Calcite | Level 5

Thanks SAS_Bigot, the code worked.

zzzmm
Calcite | Level 5

Does anyone know how I could turn the .txt file name (subject id) into a variable (ID) in the final SAS data set?

FloydNevseta
Pyrite | Level 9

Reeza's earlier reply uses the filename= option on the infile statement to capture the name of the file that is currently being processed by the infile statement. However, you cannot assign source directly to ptid as in the example because the value of source will contain the file extension (.txt) and may also contain the full path (directories) of the file. So I modified the code to strip the leading path and the trailing extension and leave the subject id:

patid = scan( scan( source, -1, '/' ), 1, '.' );
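If you are using the pipe-based step from earlier in the thread instead, the path is held in the variable read from ls (name in that code) rather than in source, so the same scan() calls can be applied to it. A rough sketch, where /unix/dir, the fileref dummy, the length of patid, and the placeholder variable x are all assumptions for illustration:

```sas
filename fn pipe 'ls /unix/dir/*.txt';   /* placeholder directory */

data want;
  length patid $20;                      /* assumed maximum id length */
  infile fn truncover;
  input name $256.;                      /* full path of the next file */
  /* e.g. /unix/dir/001002.txt -> 001002.txt -> 001002 */
  patid = scan(scan(name, -1, '/'), 1, '.');
  /* firstobs=6 as in the earlier reply; adjust to your number of header rows */
  infile dummy filevar=name end=last truncover firstobs=6;
  do while (not last);
    input x $10.;                        /* placeholder: replace with your real variables */
    output;
  end;
run;
```

The filevar= variable itself is not written to the output data set, which is why the id is copied into a separate variable before the inner loop.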


zzzmm
Calcite | Level 5

Thanks SAS_Bigot, again the code worked.

zzzmm
Calcite | Level 5

Hi Haikuo,

Yes, all my variables have the same informat. After reading the data into a SAS data set, I also realized that the labels (variable names) are in the first row only, not the first 5 rows.

The labels look like this (they are not in English) and are comma-delimited. Do you have any suggestions?

"Naam","Bloeddruk, voor/na, Resultaat (Voor )","Bloeddruk, voor/na, Resultaat (Na )","Gewicht, voor/na (Voor )","Gewicht, voor/na (Na )","Geslacht","Dialysaatflow (Voorgeschreven )","Diabetes","Lengte patiënt","Totaal UF volume (vocht verlies)","Type behandeling (Voorgeschreven )","Kunstnier (Voorgeschreven )","Toegang tot de bloedbaan (Voorgeschreven )","Streefgewicht (Voorgeschreven )","Natrium in dialysaat (Voorgeschreven )","NeoRecormon, Sterkte (Voorgeschreven )","eprex (Levering via AZM), Sterkte (Voorgeschreven )","Mircera, Sterkte (Voorgeschreven )","aranesp, Sterkte (Voorgeschreven )","Behandelingsschema, Shift (Voorgeschreven )","Bloedflow, gemiddeld (Voorgeschreven )"

zzzmm
Calcite | Level 5

Another question: I use an input statement like input v1$ v2$ v3$........... to read in the following lines:

"18.04.08","","","","","","","","","","","","","","","","","","60 mcg","",""

"11.07.08","","","","","","500","Nee","","","Haemodialyse","F8 HPS Hemoflow (thuisdialyse)","","","","","","","","","350"

The values "Haemodialyse" and "F8 HPS Hemoflow (thuisdialyse)" got truncated. Is there an infile option that can take care of this without an informat statement?

Reeza
Super User

My typical suggestion is to use the GUI import procedure in SAS once, copy the code from the log, and play with it.

Usually gets me what I want.
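If the GUI is not available (for example in a Unix batch session), calling PROC IMPORT directly should get you to much the same place, since for delimited files it writes the DATA step it generates to the log. A minimal sketch, assuming a placeholder path and a comma-delimited file with the names in row 1:

```sas
/* Assumption: placeholder path; comma-delimited file with names in the first row */
proc import datafile='/unix/dir/001002.txt'
            out=work.one
            dbms=csv
            replace;
  getnames=yes;          /* take variable names from the first row */
  guessingrows=1000;     /* rows scanned before types and lengths are chosen */
run;
```

The generated informat, format, and input statements in the log can then be copied into your own data step and edited.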

FloydNevseta
Pyrite | Level 9

Without a length specified beforehand or an informat on the input statement, SAS gives a character variable a length of 8. You can modify your input statement for those 2 variables to accommodate longer text strings, for example: v11 :$30. v12 :$30. This tells SAS to read up to 30 characters for variables v11 and v12, and SAS also assigns a length of 30 to those 2 variables.
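Here is a small self-contained sketch of the colon modifier using in-line data. The three-column layout is made up for illustration, the dsd option is my assumption for handling the quoted, comma-delimited values, and the width of 40 is just a value comfortably longer than the longest string.

```sas
data demo;
  infile datalines dsd truncover;
  /* With the colon modifier, each value is read up to the next delimiter
     and the informat width sets the variable's length */
  input v1 :$10. v2 :$40. v3 :$40.;
  datalines;
"11.07.08","Haemodialyse","F8 HPS Hemoflow (thuisdialyse)"
"18.04.08","",""
;
run;

proc contents data=demo;   /* shows the lengths assigned to v1-v3 */
run;
```

With dsd, the surrounding quotes are removed and two consecutive commas are read as a missing value.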

Haikuo
Onyx | Level 15

is there a way i don't need to manually type them in?

Not directly in your case. However, if all of your variables have the same informat, you could first read the header rows into a macro variable (sometimes after other manipulations, such as removing quotes and commas), then use it in downstream data steps.
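A rough sketch of that idea, assuming a single header row, a placeholder file path, generic variable names v1, v2, ... in an existing data set work.want, and labels that contain no & or % characters (those would be treated as macro triggers inside double quotes):

```sas
/* Step 1: read only the header row and split it into one label per column */
data headers;
  infile '/unix/dir/001002.txt' obs=1 lrecl=32767;   /* placeholder path */
  input;                                   /* load the row into _infile_ */
  length varname $32 label $256;
  /* 'q' ignores commas inside quotes, 'm' keeps empty fields */
  do i = 1 to countw(_infile_, ',', 'qm');
    varname = cats('v', i);                /* generic names v1, v2, ... */
    label   = dequote(scan(_infile_, i, ',', 'qm'));
    output;
  end;
  keep varname label;
run;

/* Step 2: build  v1="..." v2="..."  pairs in one macro variable */
proc sql noprint;
  select catx('=', varname, quote(strip(label)))
    into :labels separated by ' '
    from headers;
quit;

/* Step 3: attach the labels to the data set that holds v1, v2, ... */
proc datasets library=work nolist;
  modify want;
    label &labels;
quit;
```

The same header data set could also be used to generate the input statement itself, for example a list of v1-vN with a common informat.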

Haikuo

sschleede
Obsidian | Level 7

This thread was very helpful in solving my problem today. Thank you for sharing your code, Haikuo!

 

