Hi, I have a dataset with 3 varibales, ID, Date, Cost in a long format. I want to generate a new variable called "Date_over_100" that shows the date when one's cumulative cost have exceeded 100. I would appreciate all the help there is. Thanks in advance! I have achieved this in R, but need help to do so in SAS. R: 1. I created the dataset first. I have repeated the generated 10 IDs so there are 100 in total, and generated 10 repetitions of 10 dates for each ID, and generated values for cost by random sampling 100 numbers between 30 and 50. And the output would look like the attachment file. set.seed(11) ID <- c(rep(1:10,each=10)) Date <- rep(sample(seq(as.Date('2008/01/01'), as.Date('2008/01/11'), by="day",replace=T), 10),10) Cost <- round(runif(100,30,50),digits = 2) df <- data.frame(ID,Date,Cost) df <- df%>% arrange(Date) 2. I created a variable named "Date_Over_100" to show the date when one's cumulative cost has exceeded 100. Below are codes and output. df_new <- df%>%
group_by(ID) %>%
summarize(Date_over_100 = Date[which.max(cumsum(Cost) > 100)]) 3. I then transformed the original dataset into wide format. df_wide <- df%>%
arrange(Date)%>%
pivot_wider(names_from = Date, values_from = Cost) 4. Lastly, I merged the wide format dataset with the dataset I got in 3. df_final <- inner_join(df_new,df_wide,by="ID") Now in SAS: 1. Creating the dataset. data a;
call streaminit(123);
do ID = 1 to 10;
do Date = '01Jan2008'd to '10Jan2008'd;
Cost = rand("integer",30,50);
output;
end;
end;
format Date yymmddd10.;
run; First 15 rows of the output: And next I need to create the variable "Date_over_100" that shows the date when one's cumulative cost has exceeded 100. I have tried to replicate what I have done in R into SAS but haven't succeeded. I would appreciate any help there is. Thanks in advance!
... View more