Tasin_Ahmed
Calcite | Level 5

Hi Team,

 

I am Tasin. I have a query related to external raw files.

 

Query: There are two external raw files. The first file contains only the descriptor portion (the heading only) and the second raw file contains only the data portion. How can we combine these two files into a single dataset using a DATA step or SQL?

I will try to explain the query below.
 
There are two external raw files (.txt).
 
File1.txt = contains only the descriptor portion (the column names), e.g.:
EmpId Name
 
File2.txt = contains only the data portion, e.g.:
 
101 Tasin
102 Andy
103 Ted
104 Mikael
105 Senastien
 
Now we have to combine these two external raw files into a single dataset.
 
Output:
EmpId Name
101   Tasin
102   Andy
103   Ted
104   Mikael
105   Senastien
 
-------------------------------------------------------
I tried with a DATA step, as shown below.
 
data Combine;
   infile "<path>\File2.txt";  /* This file contains the data portion only, so no problem here */
   input EmpID Name $;         /* Normally we type the variable names here manually. Instead, they need to come from File1, which contains only the variable names */
run;
 
The problem is in the INPUT statement. So here we only need to think about the INPUT statement: how can we fetch the column names from File1?
 
Please help me. My Email id = <redacted>

5 REPLIES
Kurt_Bremser
Super User

Since File1 contains no meta-information (type, size, format) for the columns, you will still need to write that manually, so there's no sense in trying to automate it.

If you simply concatenate the two files externally and use PROC IMPORT, that would come closest to an automated solution. But be aware that the results of PROC IMPORT are not consistent and therefore not suitable for a long-term solution.
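
A minimal sketch of that approach (the paths, file names, and the space delimiter are assumptions based on the sample data in the question): concatenate the files at the operating-system level, then point PROC IMPORT at the combined file.

/* Concatenate externally first, e.g.                       */
/*   Windows:  copy File1.txt+File2.txt combined.txt        */
/*   UNIX:     cat File1.txt File2.txt > combined.txt       */

proc import datafile="c:\temp\combined.txt" out=want
            dbms=dlm replace;
  delimiter=" ";
  getnames=yes;  /* the first line of the combined file supplies the variable names */
run;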

RW9
Diamond | Level 26

It's not a good idea to post your email address on an open website, so it has been removed. With regard to your question, I would do things in two steps:

1) Read in the first data file, which contains the metadata for the second.

2) Using the data read in from step 1, generate the code needed to read in the second.

 

Now, as I don't have much to go on, this just shows an example:

data meta;
  infile "metadata.txt" dlm=",";   /* assumes one variable per line: name,informat */
  length name format $200;
  input name $ format $;
run;

data _null_;
  set meta end=last;
  /* first record: start generating the DATA step that reads the raw data */
  if _n_=1 then call execute('data want; infile "rawdata.txt" dlm=","; input ');
  /* append each variable name and its informat to the generated INPUT statement */
  /* (CALL EXECUTE takes a single argument, so the pieces are joined with CATX) */
  call execute(catx(" ", name, format) || " ");
  /* last record: close the INPUT statement and the generated step */
  if last then call execute(';run;');
run;

This will create the code for the data want step, based on the information read in by the first DATA step. Note that this is an advanced technique, so if you are only learning, it would be far simpler to program a plain DATA step that reads the second file manually, based on the spec provided, rather than trying to generate code from it.
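
For reference, a minimal sketch of that plain DATA step, assuming the space-delimited layout from the original post (the path, lengths, and types are guesses, since File1.txt supplies only the column names):

data want;
  infile "c:\temp\File2.txt" dlm=" " truncover;  /* data portion only, no header line */
  length EmpId 8 Name $20;                       /* assumed: File1.txt gives names but no types or lengths */
  input EmpId Name;
run;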

BrunoMueller
SAS Super FREQ

Since the file with the column names does not have any information on the type of variable to read, I suggest combining the two files into one and using PROC IMPORT to do the rest.

 

There are several ways to combine the files into one, depending on your operating system and the size of the files.

 

Below is an example that uses a DATA step to combine the files and PROC IMPORT to read the result.

 

filename xdata ("c:\temp\data_meta.txt" "c:\temp\data_data.txt");  /* aggregate fileref: header file first, then data file */
filename xout temp;                                                /* temporary file to hold the combined lines */

data _null_;
  infile xdata;
  input;
  file xout;
  put _infile_;   /* copy each line of both files, unchanged, to the temporary file */
run;

proc import file=xout out=want dbms=csv replace;
  delimiter=" ";
run;
Tom
Super User

@BrunoMueller wrote:

Since the file with the column names does not have any information on the type of variable to read, I suggest combining the two files into one and using PROC IMPORT to do the rest.

There are several ways to combine the files into one, depending on your operating system and the size of the files.

Below is an example that uses a DATA step to combine the files and PROC IMPORT to read the result.

 

filename xdata ("c:\temp\data_meta.txt" "c:\temp\data_data.txt");
filename xout temp;

data _null_;
  infile xdata;
  input;
  file xout;
  put _infile_;
run;

proc import file=xout out=want dbms=csv replace;
  delimiter=" ";
run;

The only reason you needed to copy the lines from the fileref XDATA to the temporary file XOUT is that PROC IMPORT is unable to properly read the header records from the fileref XDATA.

Why does PROC IMPORT not handle the aggregate fileref properly? 
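
For illustration, the direct call in question would look like this (a sketch only, reusing the filerefs from the example above); per the point above, the header record from the aggregate fileref is not read correctly:

filename xdata ("c:\temp\data_meta.txt" "c:\temp\data_data.txt");

proc import file=xdata out=want dbms=csv replace;
  delimiter=" ";
run;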

BrunoMueller
SAS Super FREQ

I have no idea why PROC IMPORT cannot handle the XDATA fileref. That is a question for Technical Support.

