Hi ,
I have two datasets have and dates . For PMC end date is 30SEP2024 and need to extend this until end date in dates table that is 29 Dec2028 . First check dates in dates table which are greater than 30SEP2024 and add only those dates as an extension to output dataset Test5 which are
31OCT2024 , 29NOV2024 , 28FEB2025 , 31MAR2027 , 31AUG2028 , 29DEC2028. Same for ERC . I have used below code but that creates an intermediate dataset want which is having all dates in the year after 30SEP2024 . Is there any other code i can use related to performance .
Can anyone please help . This is an extension to original post
data have;
input contype$1-3 TimeStamp CP PP MM;
informat TimeStamp date9.;
format TimeStamp date9.;
cards;
PMC 31May2024 200 200 200
PMC 28JUN2024 235 244 33
PMC 31JUL2024 235 500 654
PMC 30SEP2024 238 234 222
ERC 31JAN2023 109 546 230
ERC 28FEB2023 234 320 777
ERC 31MAR2023 223 111 435
ERC 29SEP2023 187 325 120
;
run;
proc sort data=have;
by contype timestamp;
run;
data dates;
input enddate date9.;
informat enddate date9.;
format enddate date9.;
cards;
30JUL2022
31JAN2023
28FEB2023
31MAR2023
31JUL2023
29SEP2023
31JAN2024
31May2024
28JUN2024
31JUL2024
30SEP2024
31OCT2024
29NOV2024
28FEB2025
31MAR2027
31AUG2028
29DEC2028
;
run;
proc sort data=dates;
by enddate;
run;
proc summary data=dates ;
var enddate;
output out=lastday max=enddate;
run;
data want;
if _n_=1 then set lastday(keep=enddate);
set have;
by contype;
output;
if last.contype then do;
cp=0; pp=0; mm=0;
do timestamp=timestamp+1 to enddate;
output;
end;
end;
run;
proc sql;
create table test as
select a.*
from want as a
inner join dates as b on a.TimeStamp= b.enddate;
quit;
Input Datasets
data have;
input contype$1-3 TimeStamp CP PP MM;
informat TimeStamp date9.;
format TimeStamp date9.;
cards;
PMC 31May2024 200 200 200
PMC 28JUN2024 235 244 33
PMC 31JUL2024 235 500 654
PMC 30SEP2024 238 234 222
ERC 31JAN2023 109 546 230
ERC 28FEB2023 234 320 777
ERC 31MAR2023 223 111 435
ERC 29SEP2023 187 325 120
;
run;
data dates;
input enddate date9.;
informat enddate date9.;
format enddate date9.;
cards;
30JUL2022
31JAN2023
28FEB2023
31MAR2023
31JUL2023
29SEP2023
31JAN2024
31May2024
28JUN2024
31JUL2024
30SEP2024
31OCT2024
29NOV2024
28FEB2025
31MAR2027
31AUG2028
29DEC2028
;
run;
Output like this
Test5
contype | TimeStamp | CP | PP | MM |
ERC | 31-Jan-23 | 109 | 546 | 230 |
ERC | 28-Feb-23 | 234 | 320 | 777 |
ERC | 31-Mar-23 | 223 | 111 | 435 |
ERC | 29-Sep-23 | 187 | 325 | 120 |
ERC | 31-Jan-24 | 0 | 0 | 0 |
ERC | 31-May-24 | 0 | 0 | 0 |
ERC | 28-Jun-24 | 0 | 0 | 0 |
ERC | 31-Jul-24 | 0 | 0 | 0 |
ERC | 30-Sep-24 | 0 | 0 | 0 |
ERC | 31-Oct-24 | 0 | 0 | 0 |
ERC | 29-Nov-24 | 0 | 0 | 0 |
ERC | 28-Feb-25 | 0 | 0 | 0 |
ERC | 31-Mar-27 | 0 | 0 | 0 |
ERC | 31-Aug-28 | 0 | 0 | 0 |
ERC | 29-Dec-28 | 0 | 0 | 0 |
PMC | 31-May-24 | 200 | 200 | 200 |
PMC | 28-Jun-24 | 235 | 244 | 33 |
PMC | 31-Jul-24 | 235 | 500 | 654 |
PMC | 30-Sep-24 | 238 | 234 | 222 |
PMC | 31-Oct-24 | 0 | 0 | 0 |
PMC | 29-Nov-24 | 0 | 0 | 0 |
PMC | 28-Feb-25 | 0 | 0 | 0 |
PMC | 31-Mar-27 | 0 | 0 | 0 |
PMC | 31-Aug-28 | 0 | 0 | 0 |
PMC | 29-Dec-28 | 0 | 0 | 0 |
hi ,
I understood what the error is i have data in the form of timestamp like seconds so its adding +1 to seconds . Error is cleared . Thank you so much .
when i run the code for huge dataset with more than 300 distinct contype i get below error while creating dataset want
ERROR: Insufficient space in file WANT
ERROR: File WANT is damaged. I/O processing did not complete.
Can anyone please help . Thank you
Messages without context are useless. Post the complete log (all code and messages) of the offending step. If previous steps had at least WARNINGs, include them. Post the log by copy/pasting it into a window opened with this button:
@jhh197 wrote:
In have dataset for Contype PMC enddate is 30SEP2024 . After that we have below dates in Dates table so i add these dates to output dataset and for CP,PP and MM columns it is 0 values . Output should be like Test 5 . How do i achieve output like Test5 ?
31OCT2024
29NOV2024
28FEB2025
31MAR2027
31AUG2028
29DEC2028
Sorry, but i don't understand the relation between this post and the earlier one, which is unreadable due to posting unformatted sas code.
@jhh197 wrote:
In have dataset for Contype PMC enddate is 30SEP2024 . After that we have below dates in Dates table so i add these dates to output dataset and for CP,PP and MM columns it is 0 values . Output should be like Test 5 . How do i achieve output like Test5 ?
31OCT2024
29NOV2024
28FEB2025
31MAR2027
31AUG2028
29DEC2028
You were asked to provide the log, and given instructions to do so; you didn't provide the log. Please provide the log and follow the instructions given.
hi ,
I understood what the error is i have data in the form of timestamp like seconds so its adding +1 to seconds . Error is cleared . Thank you so much .
33 data have;
34 input contype$1-3 TimeStamp CP PP MM;
35 informat TimeStamp date9.;
36 format TimeStamp date9.;
37 cards;
NOTE: The data set WORK.HAVE has 8 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
46 ;
47 run;
48
49 proc sort data=have;
50 by contype timestamp;
51 run;
NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.HAVE has 8 observations and 5 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
53 data dates;
54 input enddate date9.;
55 informat enddate date9.;
56 format enddate date9.;
57 cards;
NOTE: The data set WORK.DATES has 17 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
75 ;
76 run;
77
78 proc sort data=dates;
79 by enddate;
80 run;
NOTE: There were 17 observations read from the data set WORK.DATES.
NOTE: The data set WORK.DATES has 17 observations and 1 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
81
82 proc summary data=dates ;
83 var enddate;
84 output out=lastday max=enddate;
85 run;
NOTE: There were 17 observations read from the data set WORK.DATES.
NOTE: The data set WORK.LASTDAY has 1 observations and 3 variables.
NOTE: PROCEDURE SUMMARY used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds
86
87 data want;
88 if _n_=1 then set lastday(keep=enddate);
89 set have;
90 by contype;
91 output;
92 if last.contype then do;
93 cp=0; pp=0; mm=0;
94 do timestamp=timestamp+1 to enddate;
95 output;
96 end;
97 end;
98 run;
NOTE: There were 1 observations read from the data set WORK.LASTDAY.
NOTE: There were 8 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 3477 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
99
100 proc sql;
101 create table test as
102 select a.*
103 from want as a
104 inner join dates as b on a.TimeStamp= b.enddate;
NOTE: Table WORK.TEST created, with 25 rows and 6 columns.
105 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Here in want I have 3477 observations because in this dataset timestamp+1 is extending to all dates until 29dec2028 like for example but i dont want all the dates in oct want only 31OCT2024 which is in dates table . Is there way to do this ? So it occupies less space .
contype | TimeStamp | CP | PP | MM |
ERC | 30-Sep-24 | 0 | 0 | 0 |
ERC | 1-Oct-24 | 0 | 0 | 0 |
ERC | 2-Oct-24 | 0 | 0 | 0 |
ERC | 3-Oct-24 | 0 | 0 | 0 |
ERC | 4-Oct-24 | 0 | 0 | 0 |
ERC | 5-Oct-24 | 0 | 0 | 0 |
ERC | 6-Oct-24 | 0 | 0 | 0 |
ERC | 7-Oct-24 | 0 | 0 | 0 |
ERC | 8-Oct-24 | 0 | 0 | 0 |
ERC | 9-Oct-24 | 0 | 0 | 0 |
ERC | 10-Oct-24 | 0 | 0 | 0 |
ERC | 11-Oct-24 | 0 | 0 | 0 |
ERC | 12-Oct-24 | 0 | 0 | 0 |
ERC | 13-Oct-24 | 0 | 0 | 0 |
ERC | 14-Oct-24 | 0 | 0 | 0 |
ERC | 15-Oct-24 | 0 | 0 | 0 |
ERC | 16-Oct-24 | 0 | 0 | 0 |
ERC | 17-Oct-24 | 0 | 0 | 0 |
ERC | 18-Oct-24 | 0 | 0 | 0 |
ERC | 19-Oct-24 | 0 | 0 | 0 |
ERC | 20-Oct-24 | 0 | 0 | 0 |
ERC | 21-Oct-24 | 0 | 0 | 0 |
ERC | 22-Oct-24 | 0 | 0 | 0 |
ERC | 23-Oct-24 | 0 | 0 | 0 |
How do you determine which value of enddate in dataset dates is the right one?
You said 31oct2024 before, so which is it?
And your test5 has dates up to 29dec2028, which is also the max enddate in dates.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.