DATA Step, Macro, Functions and more

Variation between dates

Reply
N/A
Posts: 0

Variation between dates

Hi there,

I need to calculate the variation in "VALOR" between dates (1 day
diference). How can I do that?

Thanks.

CODFONTE CODTIT ULTDATA MOEDA BOLSA CODITEM VALOR DTALT
1 100019900833 20080417 EUR LUXEMBOURG Preco 15H30 96.224 17-04-2008
1 100019900833 20080418 EUR LUXEMBOURG Preco 15H30 96.249 18-04-2008
1 100019900833 20080421 EUR LUXEMBOURG Preco 15H30 95.764 21-04-2008
1 100019900833 20080422 EUR LUXEMBOURG Preco 15H30 96.256 22-04-2008
1 100019900833 20080423 EUR LUXEMBOURG Preco 15H30 96.367 23-04-2008
1 100019900833 20080424 EUR LUXEMBOURG Preco 15H30 96.366 24-04-2008
1 100019900833 20080425 EUR LUXEMBOURG Preco 15H30 95.562 25-04-2008
1 100019900833 20080428 EUR LUXEMBOURG Preco 15H30 96.416 28-04-2008
1 100019900833 20080429 EUR LUXEMBOURG Preco 15H30 96.843 29-04-2008
1 100019900833 20080430 EUR LUXEMBOURG Preco 15H30 96.848 01-05-2008
1 100019900833 20080502 EUR LUXEMBOURG Preco 15H30 96.851 02-05-2008
1 100019900833 20080505 EUR LUXEMBOURG Preco 15H30 96.855 05-05-2008
1 100019900833 20080506 EUR LUXEMBOURG Preco 15H30 96.857 06-05-2008
1 100019900833 20080507 EUR LUXEMBOURG Preco 15H30 96.887 07-05-2008
1 100019900833 20080508 EUR LUXEMBOURG Preco 15H30 96.869 08-05-2008
1 100056520533 20080104 EUR EURONEXT-LISBON Preco 15H30 99.625
04-01-2008
1 100056520533 20080107 EUR EURONEXT-LISBON Preco 15H30 99.616
08-01-2008
1 100056520533 20080109 EUR EURONEXT-LISBON Preco 15H30 99.633
09-01-2008
1 100056520533 20080110 EUR EURONEXT-LISBON Preco 15H30 99.64
11-01-2008
1 100056520533 20080111 EUR EURONEXT-LISBON Preco 15H30 99.634
14-01-2008
1 100056520533 20080115 EUR EURONEXT-LISBON Preco 15H30 99.676
15-01-2008
1 100056520533 20080116 EUR EURONEXT-LISBON Preco 15H30 99.661
16-01-2008
1 100056520533 20080117 EUR EURONEXT-LISBON Preco 15H30 99.671
17-01-2008
1 100056520533 20080118 EUR EURONEXT-LISBON Preco 15H30 99.702
18-01-2008
1 100056520533 20080121 EUR EURONEXT-LISBON Preco 15H30 99.708
21-01-2008
1 100056520533 20080122 EUR EURONEXT-LISBON Preco 15H30 99.719
22-01-2008
1 100056520533 20080123 EUR EURONEXT-LISBON Preco 15H30 99.701
24-01-2008
1 100056520533 20080125 EUR EURONEXT-LISBON Preco 15H30 99.691
25-01-2008

The lag funtion don't work because I have more than one COD_TIT and the data is not sorted by date.

data testt;
set values;
diff=valor-lag(valor);
run;
N/A
Posts: 0

Re: Variation between dates

LAG can work. Assuming DTALT is a numeric value with a european style data format.
[pre]
proc sort data=indata;
by codfonte codtit dtalt;
run;

data outdata;
set indata;
by codfonte codtit;

if first.codtit then lag(valor);

diff = valor - lag(valor);

if not first.codtit then output;
run;
[/pre]
Does this help?

Lag does not mean "use the previous value", like you would hope it does.
The "LAG" documentation is a little confusing and easily mis-interpreted. I had to spend a couple hours/days experimenting until I fully understood what was happening.

LAG simply represents a FIF0 queue of some depth: lag = lag1 = depth of 2; lag10 = depth of 11 (could be 1 and 10 for the actual implementation, but this is easier to explain). When the function is called, it loads the current value at the variable into a queue for that variable and then retrieves the last value in the queue and returns that value. So, the first time LAG(valor) is used/called, it loads the value of VALOR into the queue and then returns a missing value. The next time LAG(valor) is executed, a new value is put into the queue (now holding 2 actual meaningful values) and then retrieves the end/last/previous value and returns that value for use. This is why it is easy to trip over the use of LAG within conditional statements, because it does not necessarily have the value from the previous observation if the conditional was different for that observation. This is also why you have to be careful with the number of times LAG is used with a particular variable in a datastep. If it is desired that the lag variable explicitly mean "previous observation value" then use:
[pre]
prior_value = lag(value);
[/pre]
in the code to hold that prior value for use
[pre]
diff1 = value - lag(value);
value3 = value2*lag(value);
[/pre]
is very different from
[pre]
prior = lag(value);
diff1 = value - prior;
value3 = value2 * prior;
[/pre]
N/A
Posts: 0

Re: Variation between dates

Hello Chuck, thanks for your help.

When I try to do like you say it gives the following error:

463 proc sort data=mylib.TAcc_Obri;
464 by codfonte codtit dtalt;
465 run;

NOTE: Input data set is already sorted, no sorting done.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds


466 data mylib.TAcc_Obriga;
467 set mylib.TAcc_Obri;
468 by codtit;
469
470 if first.codtit then lag(valor);
-
22
76
ERROR: Undeclared array referenced: lag.
ERROR 22-322: Syntax error, expecting one of the following: +, =.

ERROR 76-322: Syntax error, statement will be ignored.

471
472 diff = valor - lag(valor);
473
474 if not first.codtit then output;
475 run;

NOTE: Character values have been converted to numeric values at the places given by: (Line)Smiley SadColumn).
472:10 472:18
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set MYLIB.TACC_OBRIGA may be incomplete. When this step was stopped there were 0
observations and 13 variables.
WARNING: Data set MYLIB.TACC_OBRIGA was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Frequent Contributor
Posts: 139

Re: Variation between dates

Nanita
you sorted by codfonte codtit dtalt, so in your data set you need the same by sequence in the data step

set mylib.TAcc_Obri;
by codfonte codtit;

......


-Darryl
N/A
Posts: 0

Re: Variation between dates

I probably should have used
[pre]
... then dummy=lag(valor);
[/pre]
N/A
Posts: 0

Re: Variation between dates

I've tried this code:

proc sort data=mylib.TAcc_Obri;
by codfonte codtit dtalt;
run;


data mylib.TAcc_Obriga;
set mylib.TAcc_Obri;
by codfonte codtit dtalt;

if first.codtit then diff=0;
if not first.codtit then diff = valor - lag(valor);
run;

It seems to be working but I don't understand why in line 2 the value of diff is '.'.

Where did I fail?


Obs CODTIT VALOR DTALT ISIN diff

1 100019900833 96.224 17APR2008 XS0195487912 0.000
2 100019900833 96.249 18APR2008 XS0195487912 .
3 100019900833 95.764 21APR2008 XS0195487912 -0.485
4 100019900833 96.256 22APR2008 XS0195487912 0.492
5 100019900833 96.367 23APR2008 XS0195487912 0.111
6 100019900833 96.366 24APR2008 XS0195487912 -0.001
7 100019900833 95.562 25APR2008 XS0195487912 -0.804
8 100019900833 96.416 28APR2008 XS0195487912 0.854
9 100019900833 96.843 29APR2008 XS0195487912 0.427
10 100019900833 96.848 01MAY2008 XS0195487912 0.005
11 100019900833 96.851 02MAY2008 XS0195487912 0.003
12 100019900833 96.855 05MAY2008 XS0195487912 0.004
13 100019900833 96.857 06MAY2008 XS0195487912 0.002
14 100019900833 96.887 07MAY2008 XS0195487912 0.030
15 100019900833 96.869 08MAY2008 XS0195487912 -0.018
16 100056520533 99.625 04JAN2008 PTSEMCOE0006 0.000
17 100056520533 99.616 08JAN2008 PTSEMCOE0006 2.747
18 100056520533 99.633 09JAN2008 PTSEMCOE0006 0.017
19 100056520533 99.64 11JAN2008 PTSEMCOE0006 0.007
20 100056520533 99.634 14JAN2008 PTSEMCOE0006 -0.006
21 100056520533 99.676 15JAN2008 PTSEMCOE0006 0.042
22 100056520533 99.661 16JAN2008 PTSEMCOE0006 -0.015
23 100056520533 99.671 17JAN2008 PTSEMCOE0006 0.010
24 100056520533 99.702 18JAN2008 PTSEMCOE0006 0.031
25 100056520533 99.708 21JAN2008 PTSEMCOE0006 0.006
26 100056520533 99.719 22JAN2008 PTSEMCOE0006 0.011
27 100056520533 99.701 24JAN2008 PTSEMCOE0006 -0.018
28 100056520533 99.691 25JAN2008 PTSEMCOE0006 -0.010
29 100056520533 99.692 28JAN2008 PTSEMCOE0006 0.001
30 100056520533 99.692 29JAN2008 PTSEMCOE0006 0.000
31 100056520533 99.686 30JAN2008 PTSEMCOE0006 -0.006
32 100056520533 99.686 06FEB2008 PTSEMCOE0006 0.000
33 100056520533 99.689 07FEB2008 PTSEMCOE0006 0.003
34 100056520533 99.691 08FEB2008 PTSEMCOE0006 0.002
35 100056520533 99.688 11FEB2008 PTSEMCOE0006 -0.003
Frequent Contributor
Posts: 139

Re: Variation between dates

First, You need to re-read Chuck's post on how the lag function works. His explaination is right on target, especially with using lag with conditional logic.

Second, The first time that the lag function is called is during the 2 observation, since this is the first time the lag function is called there is no value to fetch. Recall from Chuck's post that lag is not the previous value, lag effectivilty places values in a queue. So you need to think what value is in the queue at any given record.

Also see
15 100019900833 96.869 08MAY2008 XS0195487912 -0.018
16 100056520533 99.625 04JAN2008 PTSEMCOE0006 0.000
17 100056520533 99.616 08JAN2008 PTSEMCOE0006 2.747

Notice that the valor=99.616 for obs(17) and valor=99.625 for obs(16). I assume that you want the diff to be 99.616-99.625=-0.009. What is being calculated is for obs(17) minus valor for obs(15) 99.616-96.869=2.747. This is because the last value of valor in the lag queue is for obs(15). Your code skips the lag function for the first.codit. My guess is that you don't want to do that.

I would do something like this. The below example has the additional benefit of creating another column (prior_valor) that holds the previous example. You can open up the table and see what values are being used in your calcuations. Don't be afraid to make as many variables as you need during development, you can always cull them down once you are certain all is well.


data mylib.TAcc_Obriga;
set mylib.TAcc_Obri;
by codfonte codtit dtalt;

prior_valor=lag(valor);

if first.codtit then diff=0;
if not first.codtit then diff = valor - prior_valor;
run;

-Darryl
N/A
Posts: 0

Re: Variation between dates

Hi Darryl,

thank you so much for your reply. You just solved my problem.
Ask a Question
Discussion stats
  • 7 replies
  • 218 views
  • 0 likes
  • 2 in conversation