The variables INDATE<n> are driven by the number of OPEN_DATE values. In the example below there are 5 OPEN_DATE values, so the variables run from INDATE1 to INDATE5. I want to apply the MIN and MAX functions across INDATE1 - INDATE5.
I can write it manually as min(INDATE1,INDATE2,INDATE3,INDATE4,INDATE5), which serves the purpose, but I want to enhance the code so that it doesn't need to be modified by hand every time the process runs.
Have
OPEN_DATE | COUNT | INDATE1 | INDATE2 | INDATE3 | INDATE4 | INDATE5 |
7-Aug-14 | 266 | 7-Aug-14 | . | . | . | . |
8-Aug-14 | 480 | . | 8-Aug-14 | . | . | . |
11-Aug-14 | 269 | . | . | 11-Aug-14 | . | . |
12-Aug-14 | 435 | . | . | . | 12-Aug-14 | . |
13-Aug-14 | 281 | . | . | . | . | 13-Aug-14 |
Want
OPEN_DATE | COUNT | INDATE1 | INDATE2 | INDATE3 | INDATE4 | INDATE5 | MINPULL | MAXPULL | CNT |
13-Aug-14 | 281 | 7-Aug-14 | 8-Aug-14 | 11-Aug-14 | 12-Aug-14 | 13-Aug-14 | 7-Aug-14 | 13-Aug-14 | 1731 |
CNT is the total of the COUNT variable in the have data set.
Any help is greatly appreciated.
data have;
input OPEN_DATE : date7. COUNT (INDATE1 INDATE2 INDATE3 INDATE4 INDATE5)(: date7.);
format OPEN_DATE INDATE1 INDATE2 INDATE3 INDATE4 INDATE5 date7.;
datalines;
07-Aug-14 266 07-Aug-14 . . . .
08-Aug-14 480 . 08-Aug-14 . . .
11-Aug-14 269 . . 11-Aug-14 . .
12-Aug-14 435 . . . 12-Aug-14 .
13-Aug-14 281 . . . . 13-Aug-14
;
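/* REPONLY only replaces missing values; each INDATE column has a single */
/* non-missing value, so its mean is that value and every row gets filled */
proc stdize data=have reponly method=mean out=want;
run;
data final;
set want end=last;
/* sum statement: running total of COUNT */
cnt+count;
/* keep only the last row, where all INDATE variables are filled */
if last then do;
maxpull=max(of in:);
output;
end;
format maxpull date8.;
run;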
Look up the idea of variable lists. If your variable names share a common stem, the shorthand is to put a colon after the common part of the name:
min(of indate:) for example.
Warning: this will attempt to use all variables whose names start with INDATE, and ALL of them must be of the same type (numeric in the case of min) to get the expected results. If you had a character variable INDATE_comment containing text, the above would generate an error.
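To make the variable-list idea concrete, here is a minimal sketch. The three INDATE values are made up for illustration and are not taken from the data above. It shows the prefix list (indate:) next to the numbered range list (indate1-indate3), which is a stricter alternative when other variables might share the stem.

```sas
data _null_;
  /* Hypothetical values, for illustration only */
  indate1 = '07AUG2014'd;
  indate2 = .;               /* MIN and MAX ignore missing arguments */
  indate3 = '11AUG2014'd;

  /* Prefix list: every variable whose name starts with INDATE */
  minpull = min(of indate:);

  /* Numbered range list: only INDATE1 through INDATE3 */
  maxpull = max(of indate1-indate3);

  /* Write both results to the log as dates */
  put minpull= date9. maxpull= date9.;
run;
```

The numbered range form does not pick up a stray variable such as INDATE_comment, but the upper bound then has to be known (or generated) at run time, which is the trade-off the question is trying to avoid.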
Thank you so much. It is a huge help. Just an FYI: since I also need the minimum value, I added one more line to what you gave me.
PROC STDIZE DATA=IN1.PROC_DT1 REPONLY METHOD=MEAN OUT=IN1.PROC_DT2;
RUN;
DATA IN1.PROC_DT3;
SET IN1.PROC_DT2 END=LAST;
CNT+COUNT;
IF LAST THEN DO;
MAXPULL=MAX(OF IN:);
MINPULL=MIN(OF IN:);
OUTPUT;
END;
FORMAT MAXPULL MINPULL DATE9.;
RUN;
The UPDATE statement makes a more natural "flatten-er" for this scenario. You do need a BY variable but I consider that a minor inconvenience.
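A minimal sketch of what an UPDATE-based flatten could look like, assuming the have data set from earlier in the thread. GRP and HAVE_BY are hypothetical names introduced here only to supply the required BY variable; this is one possible reading of the suggestion, not necessarily the poster's own code.

```sas
/* Add a constant BY variable (GRP is a hypothetical name) */
data have_by;
  set have;
  grp = 1;
run;

data want;
  /* Empty master (obs=0) with every row applied as a transaction: */
  /* missing transaction values do not overwrite values already held */
  update have_by(obs=0) have_by;
  by grp;

  /* running total of COUNT across the transactions */
  cnt + count;

  /* write one row once all transactions have been applied */
  if last.grp then do;
    minpull = min(of indate:);
    maxpull = max(of indate:);
    output;
  end;

  format minpull maxpull date9.;
  drop grp;
run;
```

Because UPDATE carries forward the last non-missing value of each variable, no separate step is needed to fill in the INDATE columns before taking the minimum and maximum.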
data date;
infile cards expandtabs;
input OPEN_DATE:Date11. COUNT (INDATE1-INDATE5)(:date11.);
format o: in: date11.;
cards;
7-Aug-14 266 7-Aug-14 . . . .
8-Aug-14 480 . 8-Aug-14 . . .
11-Aug-14 269 . . 11-Aug-14 . .
12-Aug-14 435 . . . 12-Aug-14 .
13-Aug-14 281 . . . . 13-Aug-14
;;;;
proc summary data=date;
var _numeric_;
output out=sum(drop=_: OPEN_DATE rename=(count=cnt)) sum=;
run;
data want;
set date point=nobs nobs=nobs ;
set sum;
run;
Xia Keshan