BookmarkSubscribeRSS Feed
Srigyan
Quartz | Level 8

Hi,

I have the following data

 

Start_DateEnd_DatePrice
01-Jan-1805-Jan-1815
03-Jan-1809-Jan-1814
07-Jan-1810-Jan-1812
02-Jan-1807-Jan-1818

 

& I am looking for following output

 

Start_DateEnd_DatePrice
01-Jan-1802-Jan-1815
03-Jan-1806-Jan-1814
07-Jan-1810-Jan-1812

 

How I find the price is based on what is the lowest price available in the first table.

1) 15 is lowest price is for 01-Jan & 2nd Jan.

2) In the same way, 14 is lowest price for 03rdJan to 06-Jan

3) 18 -is never the lowest price so it didn't come in the output data. 

***Our output data start date is 01-Jan-2018 and end date is 10-Jan-2018.

 

This is a very huge data originally so I need the most optimized SAS program which completes this job

4 REPLIES 4
Astounding
PROC Star

Here's one approach:

 

data all;

set have;

do date = start_date to end_date;

   output;

end;

keep date price;

run;

 

proc summary data=all nway;

var price;

class date;

output out=minimums (keep=price date) min=;

run;

  

data want;

set minimums;

by price notsorted;

retain start_date;

if first.price then start_date = date;

if last.price;

end_date = date;

drop date;

run;

 

It's untested code, but looks right.

Srigyan
Quartz | Level 8

This is working great, however As I mentioned this is for huge database, the following program

 

data all;
format date date9.;
set sample;
do date=Start_Date to End_Date;
output;
end;
keep date price;
run;

 

creates one record for each day and each price.I need to calculate this 5 years for more than a billion records. 

Just don't want to create one record for each date, and each price. Can we optimize this process?

To approve a single suggestion, mouse over it and click "✔"
Click the bubble to approve all of its suggestions.
Astounding
PROC Star

 Sorry, I didn't appreciate the size of the data so I tried to use a clearer approach.  Here's another way to get summarized data.  I'm going to assume you know the minimum and maximum dates and can plug them into this step, but if that needs to be automated it can be:

 

data minimums;

set have;

array prices ("01jan2010"d : "31dec2018"d);

do until (done);

   set have end=done;

   do k=start_date to end_date;

      prices{k} = min(prices{k}, price);

   end;

end;

do date="01jan2010"d to "31dec2018"d;

   price = prices{k};

   output;

end;

keep date price;

run;

 

This might still take a while to process 5B records.  But direct access to an array element should even be faster than hashing.

 

Then run the final DATA step as is (by price notsorted).

Ksharp
Super User

Hope your computer have a big memory .

 

 

data have;
infile cards expandtabs truncover;
input Start_Date :date9. End_Date :date9. Price;
format  Start_Date End_Date :date9.;
cards;
01-Jan-18	05-Jan-18	15
03-Jan-18	09-Jan-18	14
07-Jan-18	10-Jan-18	12
02-Jan-18	07-Jan-18	18
;
run;
data _null_;
 if _n_=1 then do;
   declare hash h(hashexp:20,ordered:'y');
   h.definekey('date');
   h.definedata('date','p');
   h.definedone();
 end;
set have end=last;
do date=Start_Date to End_Date ;
  if h.find()=0 then do;p=min(p,price);h.replace(); end;
   else do;p=price;h.add(); end;
end;
if last then h.output(dataset:'want');
format date date9.;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 567 views
  • 0 likes
  • 3 in conversation