Transposing Multiple Variables

Reply
Regular Contributor
Posts: 180

Transposing Multiple Variables

Hello all,

 

I have a taks to perform, and the only thing I managed to get working, is long and unefficient.

 

I have a datase with clinical laboratory results, along with reference values, the data has the following variables:

 

  • Subject ID
  • Visit (ranging from 1-4, i.e. each subject has up to 4 rows in this data)
  • Treatment group (1/0)
  • Test 1 (Numeric)
  • Test 2 (Numeric)
  • Test 3 (Numeric)
  • Test 1 Low reference value (numeric)
  • Test 1 High reference value (numeric)
  • Test 2 Low reference value (numeric)
  • Test 2 High reference value (numeric)
  • Test 3 Low reference value (numeric)
  • Test 3 High reference value (numeric)

My final goal is to present descriptive statistics for the numeric tests and frequency distributions of the tests within range.

 

I want to do the analysis with the "by" statment, to save code (write it once).

 

My problem is to bring the data to the shape I need. I do not want to calculate the within normal range variable 3 times, I want it only once, but I can't seem to manage that. So I did it "widthly" which means copy+paste. The if statment is long (4 conditions) and copying it 3 times (what if I'll have 13?) is prone to bugs. How can I create a data, with each row corresponding to a subject*visit*treatment*test, with the normality status matching the right test?

 

Thank you in advance !

 

Comment: Clearly I can't add data, but feel free to type any numbers you wish, the principle matters, not the numbers.

Super User
Super User
Posts: 7,720

Re: Transposing Multiple Variables

If you search here, or on the internet for "reshape data" or "long to wide" you will find many posts on the subject, with various solutions from macro code, arrays etc.  Without seeing an example of what your data looks like, or what you want the output to look like I can't provide code though.

Regular Contributor
Posts: 180

Re: Transposing Multiple Variables

[ Edited ]

the data looks like this:

 

data A;

input ID Visit Group Test1 Test2 Test3 Test1_L Test1_H Test2_L Test2_H Test3_L Test3_H;

datalines;

1 1 0 5 3 6.7 1 10 2 7 3 9

1 2 0 5.5 3.8 8.7 1 10 2 7 3 9

1 3 0 4.5 2.8 5.7 1 10 2 7 3 9

2 1 1 .......

2 2 1 ......

2 3 1 ......

;

 

What I want at the end is that each row will represent one subject, in one visit, for one single test, I want the row to include the value of the test, and a character variable: "Normal" or "Not normal" according to the reference values (low and high) of the test. And, i want to keep the group of course. Test1 is normal if it is within Test1_L and Test1_H, and so on.

 

I tried looking, couldn't find what I need.

Super User
Super User
Posts: 7,720

Re: Transposing Multiple Variables

Assuming the prefix is constant:

data A;
  input ID Visit Group Test1 Test2 Test3 Test1_L Test1_H Test2_L Test2_H Test3_L Test3_H;
datalines;
1 1 0 5 3 6.7 1 10 2 7 3 9
1 2 0 5.5 3.8 8.7 1 10 2 7 3 9
1 3 0 4.5 2.8 5.7 1 10 2 7 3 9
;
run;

proc transpose data=a out=want;
  by id visit group;
  var test:;
run;

Will transpose the data down.  

 

For your other question, regarding the ranges, you have really made a rod for your own back there with that layout.  TestX is fine, but because you have not put the incrementor at the end of test_hX for instance, its harder to reference them.

data A;
  input ID Visit Group Test1 Test2 Test3 Test1_L Test1_H Test2_L Test2_H Test3_L Test3_H;
datalines;
1 1 0 5 3 6.7 1 10 2 7 3 9
1 2 0 5.5 3.8 8.7 1 10 2 7 3 9
1 3 0 4.5 2.8 5.7 1 10 2 7 3 9
;
run;

data b;
  set a;
  array test{3};
  array test_h{3} test1_h test2_h test3_h;
  array test_l{3} test1_l test2_l test3_l;
  array test_result{3} $50;
  do i=1 to 3;
    if test_l{i} <= test{i} <= test_h{i} then test_result{i}="Normal";
    else test_result{i}="Not Normal";
  end;
run;


proc transpose data=b out=want;
  by id visit group;
  var test:;
run;
Respected Advisor
Posts: 3,788

Re: Transposing Multiple Variables

[ Edited ]

This method transposes the LAB test and makes no assumption about the name (i.e. TEST1) only that the high and low range values are identified by adding _H or _L to the root name.  I will leave it to you to calculated the range flag.  

 

data A;
   input ID Visit Group Test1 Test2 Test3 Test1_L Test1_H Test2_L Test2_H Test3_L Test3_H;
   datalines;
1 1 0 5 3 6.7 1 10 2 7 3 9
1 2 0 5.5 3.8 8.7 1 10 2 7 3 9
1 3 0 4.5 2.8 5.7 1 10 2 7 3 9
2 1 1 5 3 6.7 1 10 2 7 3 9
2 2 1 5.5 3.8 8.7 1 10 2 7 3 9
2 3 1 4.5 2.8 5.7 1 10 2 7 3 9
;;;;
   run;
proc print;
   run;
proc transpose data=a out=b;
   by id visit group;
   run;
data b;
   set b;
   length paramcd $8 namecd $1;
   call scan(_name_,1,p,l,'_');
   paramcd = substrn(_name_,p,l);
   namecd  = coalesceC(substrn(_name_,p+l+1),'V');
   drop p l _name_;
   run;
proc sort data=b;
   by id visit group paramcd;
   run;
proc format;
   value $namecd 'V'='Value' 'H'='High' 'L'='Low';
   run;
proc transpose data=b out=c(drop=_name_);
   by id visit group paramcd;
   id namecd;
   format namecd $namecd.;
   var col1;
   run;
proc print;
   run;

Capture.PNG

Valued Guide
Posts: 505

Re: Transposing Multiple Variables

see

 

https://goo.gl/05M2EO

 

you can have multiple variables on the id statement

 

 

 

Ask a Question
Discussion stats
  • 5 replies
  • 409 views
  • 0 likes
  • 4 in conversation