I have the below table which gives you prices for a product from a date range. But as you can see there are multiple price is available for the same date.So I wanted to have a new range where price is lowest for that day i.e.
if we talk about 02-jan-2018 for product A there are price 28,26,28,10 but i will consider the price 10 because thats the lowest in that range.
Product | on_sale_date | off_sale_date | Price |
A | 01-Jan-18 | 07-Jan-18 | 28 |
A | 02-Jan-18 | 08-Jan-18 | 26 |
A | 02-Jan-18 | 07-Jan-18 | 28 |
A | 04-Jan-18 | 10-Jan-18 | 21 |
A | 02-Jan-18 | 10-Jan-18 | 10 |
A | 05-Jan-18 | 14-Jan-18 | 14 |
b | 02-Jan-18 | 09-Jan-18 | 27 |
b | 05-Jan-18 | 13-Jan-18 | 30 |
b | 04-Jan-18 | 12-Jan-18 | 16 |
b | 05-Jan-18 | 11-Jan-18 | 19 |
c | 02-Jan-18 | 08-Jan-18 | 27 |
c | 02-Jan-18 | 06-Jan-18 | 24 |
c | 02-Jan-18 | 08-Jan-18 | 30 |
So my table will look like this ( it creates a new range which takes the lowest price in that period.)
Product | new_on_sale_date | new_off_sale_date | Price |
A | 01-Jan-18 | 01-Jan-18 | 28 |
A | 02-Jan-18 | 10-Jan-18 | 10 |
A | 11-Jan-18 | 14-Jan-14 | 14 |
B | 02-Jan-18 | 03-Jan-18 | 27 |
B | 04-Jan-18 | 12-Jan-18 | 16 |
B | 13-Jan-18 | 13-Jan-18 | 30 |
c | 02-Jan-18 | 06-Jan-18 | 24 |
c | 07-Jan-18 | 08-Jan-18 | 24 |
Note:
1) There is no limitation for number of records at any level.
2) Unique level is Product+on_sale_date+off_sale_date+price.
3) I am not considering a solution
Where i need to put all the record on day wise first
And then choose the lowest one
And then create range again.
I wanted to have this data without creating any table on day wise because the size of data is too huge to create on day wise.
Basically I am looking for an optimised query which can hendle 300 Gb data rather than just a simple query .
An SQL query in general would sort the data behind the scenes just the same as you would yourself, it just hides that step from you.
Why do you have a need to use SQL for this, is the data not in SAS? If so then consider an SQL forum. If its in SAS, then use SAS.
As for the data, your going to have to do some sort of processing on it, no getting away from that. I would expand the data so each date is a single element, e.g:
data inter (drop=on_sale_date off_sale_date); set have;
retain group;
by product;
group=ifn(first.product,1,group+1);
do date=on_sale_date to off_sale_date; output; end; run;
Then sort out your data from there, taking start of range/end of range.
Not very specific for SQL, datastep is good for me.Let me check and confirm
Its giving me day wise result and I specifically mentioned i dont want to go in that route. I need range without creating a day wise record.
Once you have the day-wise result, you can convert it into ranges.
But I doubt there's a way to get the result you want without first getting the day-wise results.
Daywise data i have created, but the challenge is to create the data the way i am looking.
Purpose to use less hard-disk, i have limited storage for this operation.
@Srigyan wrote:
Daywise data i have created, but the challenge is to create the data the way i am looking.
Purpose to use less hard-disk, i have limited storage for this operation.
Use less hard disk? That should have been mentioned right at the beginning.
But somehow, you have to have enough space on either the disk or in memory to hold all your data and then sort it. If you don't have enough space, then you can't complete the task, it's not a programming problem any more, it is a hardware limitation.
Well, as I said, your going to have to do some work on the data. Simplest method is daywise I am afraid. You could look at hash lookups, not my style of coding so I tend to avoid that. And you would have to loop over the data to find each match, then keep min/max of start/ends and values, then output again. Not sure if it would save you anything and would be more complex coding.
Daywise data i have created, but the challenge is to create the data the way i am looking.
Purpose to use less hard-disk, i have limited storage for this operation.
So you have lots and lots of RAM available then? You need something in order to process it, be it disk space or ram. If you have lots of ram then look at the hash method.
I can try, please share how can i do it.
Please see the link provided on how to use the hash object:
request you to please have a look on the query.
I don't understand what you're asking me to do.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.