Merging Multiple Datasets to One Dataset

MellyJ13 · Posted 03-03-2021 08:42 PM

Hi all

I have multiple datasets named after stock tickers: AAPL AMZN MSFT etc. I have about 1000 and I want to merge them all into one data file. Each dataset is set up with the same columns: date&time date hour minute holding_"ticker" (e.g. holding_aapl). I want my new dataset to have the columns: date&time date hour minute holding_aapl holding_amzn holding_msft etc. This is the code I have so far:

proc sql ;
create table WORK.mytables as
select*
from dictionary.tables
where libname = 'WORK'
order by memname ;
quit ;

%macro mergedata;
proc sql ;
select count(memname) into: obs from work.mytables;

%LET OBS=&OBS.;

%let obs=&obs.;
SELECT CATX("test", memname)
INTO :SETNAME1-:SETNAME&OBS.
FROM work.MYTABLES;
QUIT;

%DO i=1 %to &obs.;
data merged;
merge work.aapl (in=ina)
work.&&setname&i (in=inb);
by date hour dummy;
if ina;
run;
proc append data=merged base=work.merged;
run;
%end;
%mend;
%mergedata;

But the new merge keeps replacing the old merge so I am left with a dataset that has date&time date hour minute holding_aapl holding_msft or whichever the last chronological stock is. I tried it without the proc append line and still overwriting it. Any help would be much appreiated!

ballardw · Posted 03-03-2021 09:15 PM

I don't think you want a data step merge, that is side by side and replaces values of like named variables.

It sounds like you want an Append

Simplest sounds like:

data want;
    set aapl amzn msft <continue the list>;
   keep <the list of common variables goes here>;
run;

If this doesn't work then provide some details why not.

MellyJ13 · Posted 03-03-2021 09:39 PM

I do want it to be side by side. I might be using the proc append function incorrectly. In the final dataset I want to have all the tickers as individual columns:

DATE&TIME DATE HOUR MINUTE HOLDING_AAPL HOLDING AMZN ------- HOLDING_LUV --------HOLDING_MSFT etc.

mkeintz · Posted 03-03-2021 10:13 PM

Why do you want it side by side? Is this because each has an identical sequences of data & time? Are you trying to determine some sort of portfolio value over the time series?

If it is really a matter of synchronized time series, you would almost certainly be better off interleaving the series, i.e. N(stocks) records for time point 1 followed by N(stocks) records for time point 2, etc.

More context please.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Tom · Posted 03-03-2021 10:20 PM

You originally said the files all had the same structure. How are you going to create NEW variables?

Your description of the variable names is also very confusing.

date&time date hour minute holding_"ticker"

Do you really have a variable name with an & in it?

Why do you have DATE twice? Why do you have TIME and HOUR and MINUTE? Is the first variable really the just the same information as the next three?

Does that last variable name really have quotes in it? Perhaps you mean to say that the last variable has a different name in each of the source datasets?

Are the original datasets sorted by the DATETIME?

Why is your attempt to merge only using DATE HOUR? Are you trying to collapse multiple minute level observations into a single observation for the hour?

Please post a few observations of at least two of the datasets and show what output dataset you want for the posted example input.

PaigeMiller · Posted 03-04-2021 06:45 AM

As discussed in your other thread, creating a long instead of wide data set is the way to go. Don't make the mistake of insisting on a wide data set, this makes your programming more difficult. A long data set, as shown in your other thread, is much easier to work with.

--
Paige Miller

Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

Re: Merging Multiple Datasets to One Dataset

SAS Innovate 2025: Save the Date