DATA Step, Macro, Functions and more

append many files and add a new variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 199
Accepted Solution

append many files and add a new variable

The code below brings in many excel files from a folder (they all contain the same variables).

All the observations in the variable named "dataset" begin with "allinj" and are followed by numbers (eg allinj82614, allinj23014, allinj51614, etc)

I would like to append the files into one file and keep three variables (ie clm_num, doi, x).

Also, I need to add a variable into this new file that contains the variable dataset (eg allinj82614, allinj23014, allinj51614) so that it's known which excel file the data came from.

Does anyone know how to do this?

filename dir  "C:\Users\Desktop\Test_Import\*.xls ";

data new;

  length filename  fname $ 100;

  infile dir  eof=last filename=fname;

  input ;

  last: filename=fname;

run;

data a;

  set new;

  call scan(filename,6,pos3,len,'\');

  drop len;

  filename1=substr(filename,pos3);

  dataset=compress(filename1,"\-_.");

run;

data _null_;

  set a;

  call execute('proc import datafile="C:\Users\Desktop\Test_Import\'||strip(filename1)||'" out='||strip(dataset)||';

                  run;');

run;


Accepted Solutions
Solution
‎09-12-2014 07:48 PM
Super User
Posts: 19,870

Re: append many files and add a new variable

If all the files contain the same variables then you can use the indsname option.

data all;

set allinj: indsname=src_file;

excel_file=src_file;

keep clm_num doi x;

run;

View solution in original post


All Replies
Solution
‎09-12-2014 07:48 PM
Super User
Posts: 19,870

Re: append many files and add a new variable

If all the files contain the same variables then you can use the indsname option.

data all;

set allinj: indsname=src_file;

excel_file=src_file;

keep clm_num doi x;

run;

Regular Contributor
Posts: 199

Re: append many files and add a new variable

Hi Reeza,

Thank you.

Do you also know how to add a variable into this new file that contains the variable "dataset" (eg allinj82614, allinj23014, allinj51614) so that it's clear which excel file the data came from?

Respected Advisor
Posts: 4,934

Re: append many files and add a new variable

Just add excel_file to the keep statement in Reeza's code.

PG

PG
Frequent Contributor
Posts: 89

Re: append many files and add a new variable

Sir.if possible explain with a detailed example it would help me.I am a beginner.

Super User
Posts: 19,870

Re: append many files and add a new variable

Posted in reply to venkatnaveen

ADD to KEEP STATMENT:

data all;

set allinj: indsname=src_file;

excel_file=src_file;

keep clm_num doi x excel;

run;

Respected Advisor
Posts: 4,934

Re: append many files and add a new variable

Posted in reply to venkatnaveen

The kind of documentation that would always be required.:

/* Create a dataset called all */

data all;

/* Read all datasets with names starting with allinj. Create a variable called src_file that will hold the actual name of the dataset being read. */

set allinj: indsname=src_file;

/* Copy the name of the dataset being read to a variable called excel_file */

excel_file=src_file;

/* List the names of variables which will be part of the all dataset */

keep clm_num doi x excel_file;

run;

PG

PG
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 403 views
  • 3 likes
  • 4 in conversation