You didn't correct the BY statement in the datastep.
Reeza,
I used these statements:
proc import out=TRI
datafile= "c:\evantus\T-RI-main.csv" dbms = csv replace;
getnames=yes;
datarow= 2;
GUESSINGROWS=20000;
run;
proc sort data=tri;
by caldt dscd;
run;
data rett;
set tri;
by dscd caldt;
ret=dif(RI)/lag(RI);
if first.dscd then ret=.;
keep comnam dscd caldt ret;
run;
Log:
6731 data rett;
6732 set tri;
6733 by dscd caldt;
6734 ret=dif(RI)/lag(RI);
6735 if first.dscd then ret=.;
6736 keep comnam dscd caldt ret;
6737 run;
ERROR: BY variables are not properly sorted on data set WORK.TRI.
comnam=NIPPON TELG. & TEL. - TOT RETURN IND dscd=740847 caldt=12/30/1999 RI=159.69 FIRST.dscd=0
LAST.dscd=0 FIRST.caldt=1 LAST.caldt=1 ret=. _ERROR_=1 _N_=15
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
1 at 6734:12
NOTE: The SAS System stopped processing this step because of errors.
NOTE: There were 16 observations read from the data set WORK.TRI.
WARNING: The data set WORK.RETT may be incomplete. When this step was stopped there were 14
observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
and only 14 rows appear in RETT:
comnam | dscd | caldt | ret |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.005828 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | -0.00579 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | -0.02889 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.011937 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | -0.0118 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | -0.02381 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.02439 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.011937 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | -0.00593 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.023669 |
NIPPON TELG. & TEL. - TOT RETURN IND | 740847 | ######## | 0.011593 |
Thanks,
Niloo
Did you read the error message? What did it mean to you?
Did you read the code you submitted? Didn't it look strange to you that you were sorting by one set of variables and then attempting to read by another set of variables?
If you sorted the data incorrectly, that would explain why you thought the IF statement was "not working". If you process the data by subject and then by date within subject (BY DSCD CALDT;), you will only be setting RET to missing at the first record for each subject. If instead you sort by date and then by subject, you will be making two mistakes. First, you will be comparing values from different subjects to each other in the LAG() function calls. Second, you will be setting many more values to missing, since there are many more dates than subjects.
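For reference, here is a minimal sketch of the fix described above. It assumes the data set and variable names from the posted code (TRI, RETT, DSCD, CALDT, RI, COMNAM); the only change is that PROC SORT now uses the same BY order as the data step.

```sas
/* Sort by subject first, then by date within subject, */
/* so the order matches the BY statement in the data step. */
proc sort data=tri;
  by dscd caldt;
run;

data rett;
  set tri;
  by dscd caldt;
  /* DIF() and LAG() run on every observation so their queues stay in step. */
  ret = dif(RI) / lag(RI);
  /* The first record of each subject has no prior value from the same subject. */
  if first.dscd then ret = .;
  keep comnam dscd caldt ret;
run;
```

With the sort and the data step agreeing on BY DSCD CALDT, the "BY variables are not properly sorted" error should no longer occur, and RET is set to missing only once per DSCD.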
Thank you Tom, I fixed the sort statement and the if statement works perfectly now.
Thanks,
Niloo
Please don't double post questions, it's confusing for all!