DATA Step, Macro, Functions and more

How to create a dataset from a header file and a data file

Accepted Solution Solved
Reply
Contributor
Posts: 34
Accepted Solution

How to create a dataset from a header file and a data file

Hi,

I have a text file with the header information (all the variable names in one column) and another text file with the data (all the observations).

How do I create a SAS data set from these two files?

Please help!

Thanks,

Krishna.


Accepted Solutions
Solution
‎07-23-2014 08:35 PM
Super User
Super User
Posts: 6,502

Re: How to create a dataset from a header file and a data file

One way is to read the data using generic names like VAR1-VAR100 and then use the names from the other file to generate rename statements.

Another is read in the names and use them to generate the code.

Let's take the simple case of 10 columns of data where every column is a number.

data names;

  infile 'two.txt' truncover ;

  input name $32.;

  oldname = cats('var',_n_);

run;

proc sql noprint ;

  select catx('=',oldname,name)

    into :rename separated by ' '

from names

;

quit;

data numbers ;

  infile 'two.txt' dsd dlm=',' truncover ;

  input var1-var10;

  rename &rename ;

run;

View solution in original post


All Replies
Solution
‎07-23-2014 08:35 PM
Super User
Super User
Posts: 6,502

Re: How to create a dataset from a header file and a data file

One way is to read the data using generic names like VAR1-VAR100 and then use the names from the other file to generate rename statements.

Another is read in the names and use them to generate the code.

Let's take the simple case of 10 columns of data where every column is a number.

data names;

  infile 'two.txt' truncover ;

  input name $32.;

  oldname = cats('var',_n_);

run;

proc sql noprint ;

  select catx('=',oldname,name)

    into :rename separated by ' '

from names

;

quit;

data numbers ;

  infile 'two.txt' dsd dlm=',' truncover ;

  input var1-var10;

  rename &rename ;

run;

Contributor
Posts: 34

Re: How to create a dataset from a header file and a data file

Wow, it worked!

Thank you very much, Tom. You rock!

Regards,

Krishna.

Respected Advisor
Posts: 3,124

Re: How to create a dataset from a header file and a data file

As a more direct approach, you could skip the 'rename' as to (stem code was stolen shamelessly from Tom's code):

data _null_;

infile 'varname.txt' truncover end=last;

length temp $ 2000;

retain temp;

input name $32.;

temp = catx(' ',temp,name);

if last then call symputx('name',temp);

run;

data numbers ;

infile 'data.txt' dsd dlm=',' truncover ;

input &name;

run;

Haikuo

Super Contributor
Posts: 297

Re: How to create a dataset from a header file and a data file

Hi Hai.kuo,

How does this code account for character variables?

I can see how it works for numeric, but not character.

Regards,

Scott

Respected Advisor
Posts: 3,124

Re: How to create a dataset from a header file and a data file

Scott,

I would try this:

data numbers ;

infile 'data.txt' dsd dlm=',' truncover ;

input (&name) (:$200.);

run;

So you want to set a large enough length to cover the maximum length of any possible variables. But again, this would be all Char. if mixed with num and char, then obviously more information will be needed.

Haikuo

Contributor
Posts: 34

Re: How to create a dataset from a header file and a data file

Thank you, Hai.Kuo.

I have a follow-up question: how do I read column input data without a header from a text file into a SAS data set?

Regards,

Krishna.

Super User
Super User
Posts: 6,502

Re: How to create a dataset from a header file and a data file

You can let PROC IMPORT read the file for you.  It has an option to tell not to look for a header line.

proc import datafile="C:\My Documents\myfiles\delimiter.txt"

  dbms=dlm

  out=mydata

  replace;

  delimiter=',';

  getnames=NO;

run;

You could use this instead of the data step in my original answer and then use the trick to generate a RENAME statement to change the variable names.

Watch out though as PROC IMPORT will have to guess at what type of data is in each column.  If you already know the data types then write your own data step so you can control the types yourself.  Personally for your example if the number of columns is small (<30 perhaps) I would just pull the list of names into a program editor and use it to create the data step.  For example you could use the list of names in a LENGTH statement and assign the length (numeric variables should always use length of 8).  If you have any formatted data like DATE or TIME variables then also add an INFORMAT and FORMAT statement for them.  Then in the actual INPUT statement you can either list all the variables again or just use variable list using double hyphen so that you only need to retype the first and last variable name.

data want ;

   infile 'mydata.txt' dsd trucnover;

length

  NAME $20

  AGE 8

  SEX $1

;

input NAME -- AGE ;

run;

Contributor
Posts: 34

Re: How to create a dataset from a header file and a data file

Thank you, Tom, for your quick response.

My datafile.txt has about 60000 columns and does not have any delimiter (column input data). Does PROC IMPORT work on data without a delimiter?

Thanks,

Krishna.

N/A
Posts: 1

Re: How to create a dataset from a header file and a data file

Hi Krishna, you need to declare your delimiter in this line.

"delimiter=',';"  ,in this case the delimiter is ( , ). At the end of each column, you need a comma in your text file.

proc import datafile="C:\My Documents\myfiles\delimiter.txt"

  dbms=dlm

  out=mydata

  replace;

  delimiter=',';

  getnames=NO;

run;

In case you don´t have any delimiter, you need to declare the beginning and the end of each column or your size.

data want ;

   infile 'mydata.txt' dsd trucnover;

length

  NAME $20  <-- This number represents the number of columns that your variable has.

  AGE 8      

  SEX $1

;

input NAME -- AGE ;

run;

Is that correct Tom ?

Thanks,

Wagner.

Contributor
Posts: 34

Re: How to create a dataset from a header file and a data file

Thank you for your response, Wagner. I still have an issue.

All the values in the data set are 1s , 0s or blanks, except for the ID column. I tried the following ways but the data are not getting read correctly:

(1)

LENGTH ID $10

         VAR1 $1

         VAR2 $1;

INPUT ID -- VAR2;

(2)

LENGTH ID $10

         VAR1 $1

         VAR2 $1;

INPUT @1 ID $

           @11 VAR1 $

           @12 VAR2 $;

(3)

LENGTH ID $10

         VAR1 3

         VAR2 3;

INPUT @1 ID $

           @11 VAR1

           @12 VAR2 ;

Could you tell me which length and input statements work best for data with 1s, 0s and blanks?

Thanks!

Krishna.

Contributor
Posts: 34

Re: How to create a dataset from a header file and a data file

Never mind.

I figured this out, finally!

Thank you, all.

Regards,

Krishna.

Super User
Super User
Posts: 6,502

Re: How to create a dataset from a header file and a data file

So if the data is in fixed columns then that is also simple with SAS.

You could tell SAS to read from specific columns (input ID $ 1-10) or tell it to use a specific format (input ID $10.).

I would use formatted mode so that I did not need to type all of the column numbers.

If you input your non-id variables as numeric then it will treat '1' as the number one , '0' as the number zero, and ' ' as a missing value.

So if you have a 10 character ID value and then 20 one character variables then you could write:

data want ;

  infile 'myfile.txt' truncover ;

  input id $10. (var1-var20) (1.) ;

run;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 2762 views
  • 7 likes
  • 5 in conversation