BookmarkSubscribeRSS Feed
LarissaW
Obsidian | Level 7

Screenshot 2024-01-28 075852.png

Hi, I would like to know how can I reshape the long data to the expected wide data. I tried the code below, however, it can only move one column at a time and it cannot shape the table to what I want.

 

data want2; set want1; retain ID; by ID; if first.ID then your_id=1; else your_id=your_id+1; run;

 

11 REPLIES 11
Kurt_Bremser
Super User

You do not want a dataset in this layout, as it is a pain in the ass to work with.

If you need this for reporting purposes, PROC REPORT is the tool:

proc report data=want1;
column id type,(dose length);
define id / group;
define type / across;
define dose / analysis;
define length / analysis;
run;
LarissaW
Obsidian | Level 7

Not for the report, I need the wide data for the further Cox proportional analysis. Each type is an individual variable, that's why I have to convert them. I don't think the proc phreg can apply when there are multiple observations under the same person.

Tom
Super User Tom
Super User

That does not make sense with the data you showed.  Why would create variables named DOSE2 that might sometimes have the value of the dose of TYPE=4 and sometimes the value of dose for TYPE= something else?  Why not call the variable DOSE4 for the dose of type 4?

LarissaW
Obsidian | Level 7

Here the type is the smoking type and the dose and years are the corresponding number of cigarettes and smoking length. People change the type of smoking, dose and length over time, so there are several observations under the same person. Because some of the types overlap, so each type of smoking is an individual variable that interacts with each other. 

Tom
Super User Tom
Super User

@LarissaW wrote:

Here the type is the smoking type and the dose and years are the corresponding number of cigarettes and smoking length. People change the type of smoking, dose and length over time, so there are several observations under the same person. Because some of the types overlap, so each type of smoking is an individual variable that interacts with each other. 


So you don't want to get a generic transpose.  Instead you want to create new variables with meaning like NUMBER OF CIGARETTES, NUMBER OF CIGARS, etc.

 

If you type codes are simple 1,2,3 then just use them as an index into an array (or multiple arrays).

data want;
  do until(last.id);
    set have;
    by id;
    array count num_cigarettes num_pipes num_cigars;
    array years year_cigarettes year_pipes year_cigars;
    count[type] = count;
    years[type] = length;
  end;
  drop type count length;
run;
PaigeMiller
Diamond | Level 26

@LarissaW wrote:

Not for the report, I need the wide data for the further Cox proportional analysis. Each type is an individual variable, that's why I have to convert them. I don't think the proc phreg can apply when there are multiple observations under the same person.


I am not an expert on Cox proportional analysis, but it seems to me that you have a problem here that has nothing to do with SAS and has nothing to do with converting the data set to a new arrangement. It is my understanding (and I admit I could be 100% wrong on this) that to fit a Cox proportional hazard model, you can't have multiple time values and multiple doses for each patient/subject, and so I think you need to straighten out what to do when fitting a Cox proportional hazard model when a patient/subject has multiple time values and multiple doses. I don't think stringing them all out in one record in a SAS data set is the right approach.

 

So a better question to ask, which should be entitled "How to Fit a COX proportional hazard model to this data", should ignore the re-arranging the data issue entirely. Just show your original (long) data set and ask how to fit the model to it. (see Maxim 42)

--
Paige Miller
LarissaW
Obsidian | Level 7

I'm pretty sure the Cox model can add a time-varying variable. However, I didn't plan to combine the type, dose and year, because they described the same sample from different aspects. I'd like to try the correlation of each of them with the outcomes. I know how to do the Cox proportional part but for my purpose on the Cox model, I need to transpose the long data to the wide data.

PaigeMiller
Diamond | Level 26

I didn't say Cox proportional hazard model cannot have time varying variable. I said I don't think it can have MULTIPLE time periods and multiple dosages for each subject. Maybe I'm wrong, but that's my limited understanding.

 

Please start a new thread. Ask the question the way I suggested, with the title that I suggested. In your two threads now where you focus on re-arranging the data, you have not gotten close to anything that can be used to fit a Cox proportional model, and have been advised by several people not to re-arrange the data as you want it to be re-arranged.

--
Paige Miller
Tom
Super User Tom
Super User

Don't share data as photographs.  Nobody wants to re-type your data.

Tom
Super User Tom
Super User

If you are just talking about reshaping then PROC SUMMARY can do that using IDGROUP.  You will need to know the maximum count in advance (easy enough to calculate and put into a macro variable).  

proc summary data=original;
  by id ;
  output out=want idgroup(out[5] (type dose length)=) ;
run;

If the data is not already sorted by ID then use CLASS instead of BY and add NWAY option to the PROC SUMMARY statement.

 

Since you didn't provide any example data let's use SASHELP.CLASS to see how it works.

proc summary data=sashelp.class(obs=10) nway;
  class sex;
  output out=want idgroup(out[5] (name age)=);
run;

Tom_0-1706456628597.png

 

But I do not see how that layout is going to help you.

 

 

 

Ksharp
Super User

https://communities.sas.com/t5/SAS-Procedures/proc-transpose-dataset/m-p/912923#M83262

 

data have;
   input id type $ dose $ length :mmddyy10.;
   format length mmddyy10.;
datalines;
1  A   OH 05-23-2023
1  A   MI 11-30-2023
2  C   TX 01-22-2023
2  C   CA 04-14-2023
2  C   MO 07-25-2023
3  F   OH 06-22-2023
3  F   IN 12-12-2023
3  K   WA 08-01-2023
;

data temp;
 set have;
 by id ;
 if first.id then n=0;
 n+1;
run;
proc sql noprint;
select distinct catt('temp(where=(n=',n,') rename=(type=type',n,' dose=dose',n,' length=length',n,'))') 
       into :merge separated by ' '
  from temp;
quit;
data want;
 merge &merge.;
 by id;
 drop n;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 770 views
  • 1 like
  • 5 in conversation