BookmarkSubscribeRSS Feed
MellyJ13
Calcite | Level 5

Hi all

 

I have multiple datasets named after stock tickers: AAPL AMZN MSFT etc. I have about 1000 and I want to merge them all into one data file. Each dataset is set up with the same columns: date&time date hour minute holding_"ticker" (e.g. holding_aapl). I want my new dataset to have the columns: date&time date hour minute holding_aapl holding_amzn holding_msft etc. This is the code I have so far:

 

proc sql ;
create table WORK.mytables as
select*
from dictionary.tables
where libname = 'WORK'
order by memname ;
quit ;

 

%macro mergedata;
proc sql ;
select count(memname) into: obs from work.mytables;

%LET OBS=&OBS.;

%let obs=&obs.;
SELECT CATX("test", memname)
INTO :SETNAME1-:SETNAME&OBS.
FROM work.MYTABLES;
QUIT;

%DO i=1 %to &obs.;
data merged;
merge work.aapl (in=ina)
work.&&setname&i (in=inb);
by date hour dummy;
if ina;
run;
proc append data=merged base=work.merged;
run;
%end;
%mend;
%mergedata;

 

But the new merge keeps replacing the old merge so I am left with a dataset that has date&time date hour minute holding_aapl holding_msft or whichever the last chronological stock is. I tried it without the proc append line and still overwriting it. Any help would be much appreiated!

5 REPLIES 5
ballardw
Super User

I don't think you want a data step merge, that is side by side and replaces values of like named variables.

It sounds like you want an Append

 

Simplest sounds like:

 

data want;
    set aapl amzn msft <continue the list>;
   keep <the list of common variables goes here>;
run;

If this doesn't work then provide some details why not.

MellyJ13
Calcite | Level 5

I do want it to be side by side. I might be using the proc append function incorrectly. In the final dataset I want to have all the tickers as individual columns:

DATE&TIME DATE HOUR MINUTE HOLDING_AAPL HOLDING AMZN ------- HOLDING_LUV --------HOLDING_MSFT etc. 

mkeintz
PROC Star

Why do you want it side by side?   Is this because each has an identical sequences of data & time?  Are you trying to determine some sort of portfolio value over the time series?

 

If it is really a matter of synchronized time series, you would almost certainly be better off interleaving the series, i.e. N(stocks) records for time point 1 followed by N(stocks) records for time point 2, etc.  

 

More context please.

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

You originally said the files all had the same structure. How are you going to create NEW variables?

 

Your description of the variable names is also very confusing. 

date&time date hour minute holding_"ticker"

Do you really have a variable name with an & in it? 

Why do you have DATE twice? Why do you have TIME and HOUR and MINUTE?   Is the first variable really the just the same information as the next three?

Does that last variable name really have quotes in it?  Perhaps you mean to say that the last variable has a different name in each of the source datasets?

 

Are the original datasets sorted by the DATETIME?

 

Why is your attempt to merge only using DATE HOUR?  Are you trying to collapse multiple minute level observations into a single observation for the hour?

 

Please post a few observations of at least two of the datasets and show what output dataset you want for the posted example input.

PaigeMiller
Diamond | Level 26

As discussed in your other thread, creating a long instead of wide data set is the way to go. Don't make the mistake of insisting on a wide data set, this makes your programming more difficult. A long data set, as shown in your other thread, is much easier to work with.

--
Paige Miller

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1699 views
  • 0 likes
  • 5 in conversation