hi, hope you can help me on this
I have a large dataset that each line is showing image types of a MRI study. Lines 1-4 are from 1 MRI study and lines 5-9 are from another MRI study. So there are 2 MRI studies shown here, but my dataset contains a lot more MRI studies. Note that the fields studydate, studydescription, operatorname contains only a single value specific to the MRI study; whereas the fields seriesnumber and seriesdescription contain data specific to the image types.
would like to shrink the dataset into each line represents a MRI study (so from this collapse into 2 lines), by:
1) creating a new field that contains all the values of series number of an MRI, separated by a "/" (i.e. 1/2/3/4 for the first MRI study and 1/2/3/4/5 for the second study)
2) creating a new field that contains all of the values of series description of that MRI separated by a "/" (i.e. localizer/3pl loc/.....bh chest)
There is an optional parameter in the PUT function that controls the alignment of the text.
That is actually a dash and the letter L. It left aligns the text so there are no extra spaces added in to the text.
There's typically two methods to do this, one is to loop through the data and combine, the second is a transpose method. Because you're doing more than one variable, the data step approach is more efficient.
Here's a worked example with fictional data:
*create sample data for demonstration;
data have;
infile cards truncover;
input OrgID Product $ States $ NumVar;
cards;
1 football DC 54
1 football VA 84
1 football MD 38
2 football CA 45
3 football NV 58
3 football CA 98
;
run;
*Sort - required for both options;
proc sort data=have;
by orgID;
run;
**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
set have;
by orgID;
length combinedStates $100 combinedNumVar $100;
retain combinedStates combinedNumVar;
if first.orgID then do;
combinedStates=states;
combinedNumVar = put(numVar, 8. -l);
end;
else do;
combinedStates=catx('/', combinedStates, states);
combinedNumVar=catx('/', combinedNumVar, put(numVar, 8. -l));
end;
if last.orgID then
output;
run;
@rykwong wrote:
hi, hope you can help me on this
I have a large dataset that each line is showing image types of a MRI study. Lines 1-4 are from 1 MRI study and lines 5-9 are from another MRI study. So there are 2 MRI studies shown here, but my dataset contains a lot more MRI studies. Note that the fields studydate, studydescription, operatorname contains only a single value specific to the MRI study; whereas the fields seriesnumber and seriesdescription contain data specific to the image types.
would like to shrink the dataset into each line represents a MRI study (so from this collapse into 2 lines), by:
1) creating a new field that contains all the values of series number of an MRI, separated by a "/" (i.e. 1/2/3/4 for the first MRI study and 1/2/3/4/5 for the second study)
2) creating a new field that contains all of the values of series description of that MRI separated by a "/" (i.e. localizer/3pl loc/.....bh chest)
There is an optional parameter in the PUT function that controls the alignment of the text.
That is actually a dash and the letter L. It left aligns the text so there are no extra spaces added in to the text.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.