I am trying to identify trends that last at least for 10 observations. However, the output I get repeats itself if the trend lasts for more than 10 observations. For example if the trend continues for 12 observations I have observations 0-10, 1-11 and 2-12 back to back. Below is the output file. How can I eliminate the repetition in the output? I need only 0-12. Of course the max observations in the trend can be more than 12. In the output I have the Symbol, Price and the Counter. Thank you.
How can I accomplish this? Thank you!
Data want;
set test;
lagPrice=lag(Price);
lag10Price=lag10(Price);
if Price>lagPrice and not missing(lagPrice) then Increasing ++ 1 ;
else Increasing=0 ;
if Increasing > 9 then TrendUp=1 ;
ISIL 18.7800 0
ISIL 18.7900 1
ISIL 18.8000 2
ISIL 18.8100 3
ISIL 18.8200 4
ISIL 18.8700 5
ISIL 18.8900 6
ISIL 18.9100 7
ISIL 18.9200 8
ISIL 18.9300 9
ISIL 18.9400 10
ISIL 18.7900 1
ISIL 18.8000 2
ISIL 18.8100 3
ISIL 18.8200 4
ISIL 18.8700 5
ISIL 18.8900 6
ISIL 18.9100 7
ISIL 18.9200 8
ISIL 18.9300 9
ISIL 18.9400 10
ISIL 18.9600 11
ISIL 18.8000 2
ISIL 18.8100 3
ISIL 18.8200 4
ISIL 18.8700 5
ISIL 18.8900 6
ISIL 18.9100 7
ISIL 18.9200 8
ISIL 18.9300 9
ISIL 18.9400 10
ISIL 18.9600 11
ISIL 18.9800 12
Message was edited by: Betsy Abbe
And what should happen if your criteria is met for obs 1-10 and then for obs 3-12?
From what you describe it appears you want to do some kind of trend analysis. It's not my area of expertise but I would expect that the problem of overlaying time series is a common one and that some of the people here in this forum could point you into the right direction if you can explain what you're trying to achieve.
Thank you so much. I tried to clarify the question further by adding the below sentence based on your answer. I hope the question is clearer now:
If there is a trend between observations 1-11 and then observations 3-13, I treat them as separate trends. What I'm trying to capture are trends that meet the criteria and continue unbroken for 10 or more observations.
I can't think of any elegant approach for your requirement. It certainly can be done but anything I can think of requires quite a bit of coding and passes through the data.
I still would recommend that you explain what you're actually trying to achieve, what kind of analysis you want to do.
On a side note: Do you have SAS/ETS licensed? And what SAS version are you on?
Perhaps some test data would be helpful? Off the top of my head you could use arrays and reverse do loops (I assume 50 columns with either Up or Down in the below and a section of 11 consecutive - note I haven't tested the below as without data):
data want;
set have;
array mycols{50} $200. col1-col50;
do I=49 to 1 by -1; /* Last occurrence would be max - 11 otherwise you don't get the 11 consecutive
The by -1 indicates to go back from 49 to 1 */
found=1; /* Set a temporary search at yes */
do j=1 to 11;
if mycols{I+(j-1)}="Down" then found=0;
end;
if found=1 then do; /* Only if all elements from I to I+11 are Up */
the_sequence="Start array="||strip(put(I));
output;
end;
end;
run;
I added the output file, thank you!
This program reads a transaction data set with columns: time, price (a transaction_id column would be also useful) and outputs the "maximal windows": time1, price1, time2, price2.
For each transaction you will get 0 or 1 output row.
data haveNumbered;/*if in the same "up-run", runNum is the same, otherwise +1*/
set have;
retain runNum 1;
if price<lag(price) then runNum+1;
run;
data upRunsOnly;
set haveNumbered;
by runNum;
if first.runNum and last.runNum then delete; /*if it is a single obs in a group, then delete*/
run;
data readAhead; /*we will use this dataset to interleave with the original, so we look ahead 60 minutes*/
set upRunsOnly;
time=time-60;
run;
/*
Observe, that when you move the left side of the window to the right (reading an obs from upRunsOnly): obs diff decreases, time diff decreases, price ratio decreases.
When you move the right side of the window (reading an obs from readAhead): the opposite.
You should always look for maximal window sizes. So the right side of the window is always the maximum time within the +60 minutes.
*/
data want;
set readAhead(in=inR) upRunsOnly(in=inU);
by runNum time;
retain n1 n2 price1 price2 time1 time2;
keep n1 n2 price1 price2 time1 time2 /*additional variables: transaction_id, etc. */;
if first.runNum then do;
n1=0; n2=0;
end;
if InR then do;
n2+1;
price2=price;
time2=time-60;
end;
if InU then do;
n1+1;
price1=price;
time1=time;
if n1<=n2+10 and price1*1.1<=price2 and time2<=time1+60 then do; /*time condition is always true, because of by statement*/
output;
end;
end;
run;
Not tested.
I'm sure this can be consolidated into fewer data steps. Or you can use data step views. But this way it is more clear (for me), and could be easier tested, debugged.
Thanks I tried but couldn't get this to work. Unfortunately I'm not that good in SAS - yet!
Sorry Betty,
please ignore my answer.
In my program I am using a column called time, which you don't even have in your input data set. And I am also checking a 1 minute time window constraint, which is also not mention in your question.
Maybe I wanted to post this to a different question.
Below code is eventually doing what you're after.
It would really help if you could post a data step creating representative sample data, explain what you want to achieve and then also post a desired output.
data have;
format date date9. code $8. price comma20.4;
input code $ price :comma. check $1.;
date='01jan2014'd + _n_ -1;
datalines;
ISIL 18.7800 1
ISIL 10.7900 0
ISIL 18.7800 1
ISIL 18.7900 1
ISIL 18.8000 1
ISIL 18.8100 1
ISIL 18.8200 1
ISIL 18.8700 1
ISIL 18.8900 1
ISIL 18.9100 1
ISIL 18.9200 1
ISIL 18.9300 1
ISIL 18.9400 1
ISIL 18.9600 1
ISIL 18.9800 1
ISIL 10.9800 0
ISIL 18.7800 1
ISIL 18.7900 1
ISIL 18.8000 1
ISIL 18.8100 1
ISIL 18.8200 1
ISIL 18.8700 1
ISIL 18.8900 1
ISIL 18.9100 1
ISIL 10.9200 0
ISIL 18.9300 1
ISIL 18.9400 1
ISIL 18.9600 1
ISIL 18.9800 1
XXXX 19.9800 1
;
run;
proc sort data=have;
by code date;
run;
data inter;
set have;
by code;
prev_price=lag(price);
if first.code then
do;
call missing(prev_price);
counter=0;
serie+1;
end;
if price> prev_price then counter+1;
else
do;
counter=1;
serie+1;
end;
run;
/* only wanted series */
proc sql;
create table want_selected as
select *
from inter
group by code, serie
having max(counter)>=10
order by code, date
;
quit;
/* all series but with serie_wanted_flag added */
proc sql;
create table want_all as
select *, max(counter)>=10 as serie_wanted_flag
from inter
group by code, serie
order by code, date
;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.