I am having trouble combining four data sets that came from .txt files on my computer into one new temporary data set. I have attached the four .txt files. The SAS code I have used is below. The first 4 data steps are for reading the files into SAS; then I sorted the data sets by Customer, and then tried to merge. But when I merged them I only got the 25 observations from the AgentD data set. Any ideas on how to get all four of them into one set (like one on top of the next)?
data AgentA;
infile '/home/jadencarlson0/STAT-725/Homework 3/AgentA.txt' firstobs=2;
length Customer 8 Date1 8 Date2 8 City $24 ZipCode 8;
input Customer Date1 : YYMMDD10. Date2 : YYMMDD10. City $ & ZipCode;
format Date1 Date2 YYMMDD10.;
run;
proc print data=AgentA;
title 'AgentA';
run;
data AgentB;
infile '/home/jadencarlson0/STAT-725/Homework 3/AgentB.txt' firstobs=2;
length Customer 8 Date1 8 Date2 8 City $24 ZipCode 8;
input Customer Date1 : YYMMDD10. Date2 : YYMMDD10. City $ & Zipcode;
format Date1 Date2 YYMMDD10.;
run;
proc print data=AgentB;
title 'AgentB';
run;
data AgentC;
infile '/home/jadencarlson0/STAT-725/Homework 3/AgentC.txt' firstobs=2;
length Customer 8 Date1 8 Date2 8 City $24 ZipCode 8;
input Customer Date1 : YYMMDD10. Date2 : YYMMDD10. City $ & Zipcode;
format Date1 Date2 YYMMDD10.;
run;
proc print data=AgentC;
title 'AgentC';
run;
data AgentD;
infile '/home/jadencarlson0/STAT-725/Homework 3/AgentD.txt' firstobs=2;
length Customer 8 Date1 8 Date2 8 Address $24;
input Customer Date1 : YYMMDD10. Date2 : YYMMDD10. Address $ &;
format Date1 Date2 YYMMDD10.;
City=scan(address,1,',');
ZipCode=input(strip(scan(address,-1,',')),5.);
keep Customer Date1 Date2 City ZipCode;
run;
proc print data=agentd;
title 'AgentD';
run;
proc sort data=AgentA;
by Customer;
run;
proc sort data=AgentB;
by Customer;
run;
proc sort data=AgentC;
by Customer;
run;
proc sort data=AgentD;
by Customer;
run;
data Combined;
merge AgentA AgentB AgentC AgentD;
by Customer;
run;
proc print data=Combined;
run;
Set the 4 data sets together as @novinosrin suggests. The code set Agent: ; with the : says to use all data sets that start with Agent. If you have other sets that start with Agent then list the ones you want to use.
Then sort by customer and possibly your date variables After combining them.
You shouldn't be merging. you should be appending. Try
data combined;
set agent:;
run;
MERGE -> add columns to your data set
APPEND -> add rows to your data set
So you want rows.
Set the 4 data sets together as @novinosrin suggests. The code set Agent: ; with the : says to use all data sets that start with Agent. If you have other sets that start with Agent then list the ones you want to use.
Then sort by customer and possibly your date variables After combining them.
That all worked. Now I need to make a new column titled 'Agent' for which Agent .txt file the data came from in this new combined data set. So the first 25 records would say A, the next B, the next C, and the final 25 would be labeled D.
data Combined;
set Agent:;
run;
proc print data=Combined;
run;
look for indsname= in set statement option. im eatin my lunch now, cant type properlyto help
data Combined;
set Agent: indsname=source;
dsn=source;
run;
You could also modify your initial read in program to read them all into the same file at once but I think this would solve your problem.
What if I just want it to say A, B, C, or D instead of WORK.AGENTA?
data Combined;
title 'Combined DataSet';
set Agent: indsname=source;
Agent=source;
run;
proc print data=Combined;
run;
lol you are being cheeky, aren't you ? hahaha
why did you name
data AgentA;
data AgentB;
in the first place?
@novinosrinHaha! I'm just trying to make my professor happy! The current code makes sense to me, but I know I'm going to get points off. So I changed all my Data to
Data A;
Data B;
etc. like you suggested and it still came up with the same WORK.AGENTA in the columns...At this point, I'll probably just leave it, but I greatly appreciate @novinosrin and @Reeza 's responses today. Literally couldn't have gotten through this without you both! Thanks!
@newbie_grad All the best mate!. This forum is very addictive. Try to get on here more often. I am off for my midterm exam at my college in an hour. I was so stressed but your post let go of my stess. 🙂 Take care
If you haven't used a naming convention you need to explicitly list all the data sets. Try the following to import all at once:
data import_all;
*make sure variables to store file name are long enough;
length filename txt_file_name $256;
length Customer 8 Date1 8 Date2 8 Address $24;
*keep file name from record to record;
retain txt_file_name;
*Use wildcard in input;
infile '/home/jadencarlson0/STAT-725/Homework 3/*.txt' firstobs=2 eov=eov filename=filename truncover;
*Input first record and hold line;
input@;
*Check if this is the first record or the first record in a new file;
*If it is, replace the filename with the new file name and move to next line;
if _n_ eq 1 or eov then do;
txt_file_name = scan(filename, -1, "\");
eov=0;
end;
*Otherwise go to the import step and read the files;
else input Customer Date1 : YYMMDD10. Date2 : YYMMDD10. Address $ &;
City=scan(address,1,',');
ZipCode=input(strip(scan(address,-1,',')),5.);
keep Customer Date1 Date2 City ZipCode;
;
run;
Merge also updates columns in dataset a with column values from dataset b.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.