- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Need to transpose an imported data set using only DATA step (if you wonder why it is required for class)
I have the original dataset with patients and visits
it is required to create a new set with the weights separated by visit, while using the retain option it does not program for missing values and maintains the value until a patient with more visits appears.
I have tried the null statement, missing statement, changing the visit value, changing weight per visit to missing if value is missing, and also setting weight value to missing
Current code looks something like this
DATA fat (keep= patient gender group weight1 weight2 weight3 weight4);
set icdb.clinical;
by patient visit;
if visit = '1' then weight1 = weight;
else if visit = '2' then weight2 = weight;
else if visit = '3' then weight3 = weight;
else if visit = '4' then weight4 = weight;
if last.patient then output;
retain weight1 weight2 weight3 weight4;*keeping the new variables until next patient*;
RUN;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Besides of the need to set all the retained variable to missing after the output statement...
if last.patient then do; output; call missing(of weight1-weight4); end;
...consider if you need to transpose your data at all.
Keeping the data in a long and narrow structure is often better and a lot of SAS Proc's are very "happy" with by group processing and categorical variables for grouping (often defined via a CLASS statement in the Proc).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
After you output, you must also set the 4 weight variables to missing.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
retain var1-var4 .;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Besides of the need to set all the retained variable to missing after the output statement...
if last.patient then do; output; call missing(of weight1-weight4); end;
...consider if you need to transpose your data at all.
Keeping the data in a long and narrow structure is often better and a lot of SAS Proc's are very "happy" with by group processing and categorical variables for grouping (often defined via a CLASS statement in the Proc).
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
this is what I end up doing to restart the retained value, homework are small exercises to test what was learned in class during the week. In this case it was output and retain functions, arrays are coming on the next unit so it will simplify doing this long process
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Understand this is requested for class.
Show what the expected result actually should look like for that example as I am not understanding.
I used to get into trouble with teachers by asking "why" when assigned to do something I was pretty sure was, let us say 'sub-optimal' at best, and would ask where the proposed changed structure is better.
If the word "spreadsheet" or "Excel" is in the answer run screaming into the night.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I really hope that your teacher explains, that transposing perfectly formed data is something to be avoided.
Have you worked with arrays? This is a perfect scenario to use them, because the variable "visit" can be used as index.
/* Array version */
data fat(keep= patient gender group weight1 weight2 weight3 weight4);
set icdb.clinical;
by patient; /* you have multiple obs per patient, not per visit */
length weight1-weight4 8;
retain weight1-weight4;
array weights [4] weight1-weight4;
if first.patient then do;
call missing(of weights); /* reset all weights */
end;
weights[visit] = weight;
if last.patient then do;
output;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Basically, your teacher wants you to recreate what PROC TRANSPOSE does in a long-to-wide transformation.
This requires two steps:
- determining the items needed in the horizontal structure
- then set up that structure and populate it
In your case, we first need to determine the maximum number of visits present in the dataset:
proc sql noprint;
select max(visit) into :maxvisit trimmed from have;
quit;
Then you use an array in the data step:
data want;
if 0 then set have; /* retrieves variable metadata */
array weights {&maxvisit.} weight1-weight&maxvisit.;
do until (last.patient);
set have;
by patient;
weights{visit} = weight;
end;
keep patient gender weight1-weight&maxvisit.;
run;
Using a DO loop for each patient makes sure that the variables in the array are set to missing before each patient.
This code is untested; for tested code, please provide source data in a data step with datalines, so we can recreate your dataset with a simple copy/paste and submit. Do not post pictures of data.
Like this:
data have;
input patient visit v_date :date9. weight gender $;
format v_date yymmdd10.;
datalines;
1 1 11aug2006 158 Female
1 2 18sep2006 160 Female
;
My footnotes contain a link to a macro that can convert an existing dataset to such code automatically.
Use the "little running man" button next to the one indicated for posting SAS code:
The indicated button is for logs and other text where the content and formatting must not be changed upon posting.
PS and if your teacher wants you to actually not even use an array, then a step that works without losing data (as you can't know the maximum number of visits beforehand in the real world) will need macro coding to dynamically create the assignment statements for the weightX variables:
%macro transpose;
data want;
if 0 then set have;
do until (last.patient);
set have;
by patient;
select (visit);
%do i = 1 %to &maxvisit.;
when (&i) weight&i. = weight;
%end;
end;
end;
keep patient gender weight1-weight&maxvisit.;
run;
%mend;
%transpose
This code is tested based on the made-up data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Perfectly understandable that I would need to add the counter in the real world.
As it was just a little set I was able to see it before working on it, I have learnt so much about arrays with this community, this lesson was about OUTPUT and RETAIN so unfortunately the array is not an option until the next lesson. Also was required to be done within a single data step. The then-do did the trick for me
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
You can easily un-macro the code to work with a fixed set:
data want;
if 0 then set have;
do until (last.patient);
set have;
by patient;
select (visit);
when (1) weight1 = weight;
when (2) weight2 = weight;
when (3) weight3 = weight;
when (4) weight4 = weight;
end;
end;
keep patient gender weight1-weight4;
run;
The nice thing about reading a group in a DO loop is that you do not have to worry when to reset the variables to missing, as the data step does it for you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I was going to make a comment about how this assignment wants the user to do things the hard way, and then achieve a data structure that is harder to work with than the original data structure. But I'm not going to say that.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try my MERGE skill:
https://support.sas.com/resources/papers/proceedings15/2785-2015.pdf
data have;
input patient visit v_date :date9. weight gender $;
format v_date yymmdd10.;
datalines;
1 1 11aug2006 158 Female
1 2 18sep2006 160 Female
;
proc sql;
select distinct catt('have(where=(visit=',visit,')
rename=(weight=weight_',visit,' v_date=v_date_',visit,'))') into : merge separated by ' '
from have;
quit;
data want;
merge &merge;
by patient gender;
drop visit;
run;