machete
Calcite | Level 5

Dear all,

This question arises partly from a larger problem I am currently working on. For more information, see below:

I have a large data set with 64 million observations containing high-frequency (down to the second) currency data covering a period of 68 days.

I would like to reduce this data set to one-minute intervals, that is, to randomly pick one observation per minute for each currency. This should yield 60 x 24 = 1,440 observations per day per currency, and about 100,000 (1,440 x 68 = 97,920) per currency for the whole period.

Since I have around 10 currencies, the data set will be reduced from 64 million to roughly 10 x 100,000 = 1 million observations.

Do you have any ideas on how to reduce the dataset based on my suggestion?

I will then use this reduced data set to overcome the computational difficulties that appear in the matching question post (see the link above).

Attached is a sample of the data set for only one currency.

Thank you

Best

Neo

Accepted Solutions
PGStats
Opal | Level 21

If your full dataset is sorted by _RIC, date_G_ and time_G_, you could use PROC SURVEYSELECT this way:

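proc sql;
/* View that adds a datetime marking the start of each observation's minute. */
/* datepart() is applied to date_G_, so date_G_ is taken to hold a datetime value. */
create view chfReuters3 as
select
     *,
     intnx("MINUTE", dhms(datepart(date_G_), hour(time_G_), minute(time_G_), second(time_G_)), 0)
          as minute format=datetime13.
from sasforum.chfReuters3;
quit;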

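options nonotes; /* Prevents the printing of a note for every minute with only 1 obs */

/* Simple random sample of 1 observation per currency (_RIC) per minute */
proc surveyselect data=chfReuters3 out=chfReutersMinute method=srs sampsize=1;
strata _RIC minute;
run;

options notes; /* Turn log notes back on */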
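
If the sample needs to be repeatable, one possible variation is sketched below. It is not part of the tested answer above. It reruns the same selection with a fixed seed and then counts how many minutes were kept per currency, which should never exceed the 1440 x 68 = 97,920 ceiling from the question. The seed value is arbitrary, and the column names minutes_kept and max_possible are made up for illustration. NOPRINT only suppresses the procedure's summary table.

```sas
/* Sketch only: same sampling as above, made repeatable with an arbitrary seed */
proc surveyselect data=chfReuters3 out=chfReutersMinute
                  method=srs sampsize=1 seed=20240101 noprint;
    strata _RIC minute;
run;

/* One row per currency: minutes actually sampled vs. the 1440 x 68 ceiling */
proc sql;
    select _RIC,
           count(*) as minutes_kept,
           1440 * 68 as max_possible
    from chfReutersMinute
    group by _RIC;
quit;
```

A currency that shows far fewer than 97,920 minutes simply had no quotes in many minutes (nights, weekends), since only minutes with at least one observation form a stratum.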

PG

PG

View solution in original post

machete
Calcite | Level 5

PG,

Many thanks for this one, it worked perfectly. Apologies for the late reply; I was under the impression that I had already provided feedback.

Cheers

Neo
