Hello!
I have a quick question about collapsing a data set. Currently I have multiple observations for unique identification numbers (I believe this is in long format). I currently have five variables in the data set: id, operation, start, end, and duration. Here is a quick sample of the data set:
ID Operation Start End Duration
1 A # # #
1 B # # #
2 B # # #
3 A # # #
4 A # # #
4 B # # #
etc. there are 317 observations in the data set.
I want to make it so that each ID (1-4) occurs only once, but the start, end and duration need to be unique to each operation. Basically I'm envisioning this:
ID Operation1 Start1 End1 Duration1 Operation2 Start2 End2 Operation2
1
2
3
4
with the data filled in.
I've tried arrays and proc transpose, however neither seem to be working. Not every observation will have values for each variable (ex: ID 3 was only in one 'operation' and thus information for operation 2 will be missing). Is there any proc that can do this easily?
Proc transpose and proc Summary.
Perhaps post the code you've tried that isn't working.
Data Deployment_Wide;
set Deployment_Sort; by id;
keep id operation1-operation2;
retain operation1-operation2;
array aoperation (1:2) operation1-operation2;
if first.id then
do;
do i = 1 to 2;
aoperation :smileyinfo: = .;
end;
end;
aoperation (duration) = operation;
if last.id then output;
run;
here is the code for the array I've been trying
Try the following modification, I added in a row counter, and use that as my index to the array to store my data.
Data Deployment_Wide;
set Deployment_Sort; by id;
keep id operation1-operation2;
retain operation1-operation2 obs_count;
array aoperation (1:2) operation1-operation2;
if first.id then
do;
count=0;
do i = 1 to 2;
aoperation = .;
end;
end;
count+1;
aoperation (count) = operation;
if last.id then output;
run;
I tried that but it said the array was out of range.
Then you have more than two operations per ID?
Yes, some IDs have more than two operations and some only have one.
Thanks so much for all the help! I have one last question. I've gotten the array to work, however I need the second variable 'duration' to match the first variable 'operation' (ie if operation 1 lasted 10 days and operation 2 lasted 14 days duration1=10 and duration2=14). I've written a second array to make the variables duration1-duration5, but the data is only printing the information for duration1. Is there any way to rectify this problem? Here is the code I've been working with:
Data Deployment_Wide;
set Deployment_Sort; by id;
keep id operation1-operation5 duration;
retain operation1-operation5 obs_count;
array aoperation (1:5) operation1-operation5;
if first.id then
do;
count=0;
do i = 1 to 5;
aoperation :smileyinfo: = .;
end;
end;
count+1;
aoperation (count) = operation;
if last.id then output;
run;
data Deployment_wide_2;
set deployment_wide; by id;
keep id operation1 operation2 operation3 operation4 operation5 duration1-duration5;
retain duration1-duration5 obs_count;
array aduration (1:5) duration1-duration5;
if first.id then do;
count=0;
do i = 1 to 5;
aduration (1) = .;
end;
end;
count + 1;
aduration (count) = duration;
if last.id then output;
run;
Thanks!
Why do you have two different data steps?
Here is a simple way to roll duration and location up into multiple columns;
%let max=5;
data wide (drop=_duration _location i) ;
do i=1 to &max until(last.id);
set tall (rename=(duration=_duration location=_location);
by id;
array duration(&max) ;
array location(&max) ;
duration(i)=_duration;
location(i)=_location;
end;
run;
Data Deployment_Wide1;
set Deployment_Sort;
where Operation='A';
run;
Data Deployment_Wide2;
set Deployment_Sort;
where Operation='B';
run;
Data Deployment_wide;
Merge Data Deployment_Wide1 (rename=(start=start1 end=end1 duration=duration1 operation=operation1))
Data Deployment_Wide2 (rename=(start=start2 end=end2 duration=duration2 operation=operation2));
By ID;
run;
How is that?:)
Jojan.
Hello Mdormond,
I have tested and it works exectily as you need:
data Mdormond;
input ID
Operation $
Start
End
Duration;
datalines;
1 A 1 3 2
1 B 4 5 1
2 B 1 3 2
3 A 4 5 1
4 A 1 3 2
4 A 6 9 3
4 A 3 4 1
;
run;
%macro Mdormond(var=);
proc transpose data=Mdormond
out=&var.(drop=_name_)
prefix=&var._;
var &var.;
by id;
run;
%mend Mdormond;
%Mdormond(var=Operation);
%Mdormond(var=Start);
%Mdormond(var=End);
%Mdormond(var=Duration);
data Mdormond_final;
merge Operation
Start
End
Duration;
by id;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.