BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
mitrakos
Obsidian | Level 7

Hi Everyone,


This might be a little confusing to explain so I will try my best. I have a survey with around ~1000 records that I'm trying to transpose from long to wide, the problem is that three questions in the survey has two questions that can have multiple options selected, and when I use proc transpose with let option it only captures one of the options. I'm unsure of how to have all the responses shown for each record ID, and doing it manually will take a very long time. I've included some sample data of what I'm looking at, my code thus far, and my desired output. Thank you so much!

 

Sample dataset of what I have:

 

data have;
input record_id labcode $ protocolcode $ question_value response date :MMDDYY10.;
format date date9.;
datalines;
1 Lab1 P1 38912 2 1/30/2020
1 Lab1 P1 38912 9 1/30/2020
1 Lab1 P1 38912 10 1/30/2020
1 Lab1 P1 60388 5 1/30/2020
1 Lab1 P1 60388 1 1/30/2020
2 Lab2 P3 47577 1 2/15/2015
2 Lab2 P3 47577 2 2/15/2015
2 Lab2 P3 47577 3 2/15/2015
2 Lab2 P3 38912 9 2/15/2015
2 Lab2 P3 38912 10 2/15/2015
3 Lab5 P5 60388 1 6/23/2021
3 Lab5 P5 60388 2 6/23/2021
3 Lab5 P5 38912 8 6/23/2021
3 Lab5 P5 38912 10 6/23/2021
3 Lab5 P5 38912 7 6/23/2021
4 Lab4 P6 38912 9 3/12/2014
4 Lab4 P6 38912 3 3/12/2014
4 Lab4 P6 47577 4 3/12/2014
4 Lab4 P6 47577 5 3/12/2014
4 Lab4 P6 47577 1 3/12/2014
5 Lab9 P8 38912 10 7/6/2020
5 Lab9 P8 38912 9 7/6/2020
5 Lab9 P8 38912 8 7/6/2020
5 Lab9 P8 60388 1 7/6/2020
5 Lab9 P8 60388 5 7/6/2020
;
run;

 

 

My transpose code so far: 

 

options validvarname=v7;
proc transpose data=have out = want name=masterquestioncore let;
by record_id labcode protocolcode date; *i need all of these to show up in final dataset so I have them in the sort statement;
var response;
id question_value ;
run;

Here's a table showing my desired output, I am essentially looking for one record ID per row, with the question_values each their own variable and repeats are named as such: 

 

 

record_Id labcode protocolcode 38912_1 38912_2 38912_3 60388_1 60388_2 47577_1 47577_2 47577_3 date
1 Lab1 P1 2 9 10 5 1       1/30/2020
2 Lab2 P3 9 10       1 2 3 2/15/2015
3 Lab5 P5 8 7 10 1 2       6/23/2021
4 Lab4 P6 9 3       4 5 1 3/12/2014
5 Lab9 P8 10 9   1 5       7/6/2020

 

Hopefully I have explained it well enough. There are multiple other question_value variables that are other items on the survey, but they are all only one response per record_id, so those are easy to transpose, it is just these that are complicated. Thank you again!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Then you can use the existing dataset directly.  Just use RESPONSE in both the ID and VAR statement.

proc transpose data=have prefix=Q delim=_ out=wide(drop=_name_);
  by record_id labcode protocolcode date;
  id question_value response;
  var response;
run;

Result

Tom_0-1737757609817.png

 

View solution in original post

6 REPLIES 6
Tom
Super User Tom
Super User

Are these multiple choice questions?  Something like select ALL that apply and the "RESPONSE" variable is the number of the choice selected?

If so then use the question and the response in the ID statement add an extra variable to use the VAR statement.

proc sort;
  by record_id labcode protocolcode date question_value response;
run;

data for_transpose;
  set have;
  picked='X';
run;

proc transpose data=for_transpose prefix=Q delim=_ out=wide(drop=_name_);
  by record_id labcode protocolcode date;
  id question_value response;
  var picked;
run;

Result

                  p
                  r
                  o
     r            t                            Q
     e            o                  Q    Q    3    Q    Q    Q    Q    Q    Q    Q    Q    Q    Q    Q
     c     l      c                  3    3    8    6    6    4    4    4    3    3    6    3    4    4
     o     a      o                  8    8    9    0    0    7    7    7    8    8    0    8    7    7
     r     b      l                  9    9    1    3    3    5    5    5    9    9    3    9    5    5
     d     c      c         d        1    1    2    8    8    7    7    7    1    1    8    1    7    7
O    _     o      o         a        2    2    _    8    8    7    7    7    2    2    8    2    7    7
b    i     d      d         t        _    _    1    _    _    _    _    _    _    _    _    _    _    _
s    d     e      e         e        2    9    0    1    5    1    2    3    7    8    2    3    4    5

1    1    Lab1    P1    30JAN2020    X    X    X    X    X
2    2    Lab2    P3    15FEB2015         X    X              X    X    X
3    3    Lab5    P5    23JUN2021              X    X                        X    X    X
4    4    Lab4    P6    12MAR2014         X                   X                             X    X    X
5    5    Lab9    P8    06JUL2020         X    X    X    X                        X
mitrakos
Obsidian | Level 7
They are multiple choice yes, and these specific questions I'm having trouble with are a "select all that apply" type question. Ideally I would like to have the response value in the cell rather than an X since this dataset will be uploaded elsewhere where the value is needed in the cell.
mitrakos
Obsidian | Level 7
Sorry to further clarify, in response to if response is the number of choice selected then no it is not. For example for record ID 1, the question 38912 was a multiple choice question with response options of 1-10, and record ID 1 chose response option 2, 9, and 10. These numbers correspond to categories that are in a data dictionary elsewhere.
Tom
Super User Tom
Super User

Then you can use the existing dataset directly.  Just use RESPONSE in both the ID and VAR statement.

proc transpose data=have prefix=Q delim=_ out=wide(drop=_name_);
  by record_id labcode protocolcode date;
  id question_value response;
  var response;
run;

Result

Tom_0-1737757609817.png

 

mitrakos
Obsidian | Level 7
Oh my god you are a lifesaver!! Thank you so much for this.
Tom
Super User Tom
Super User

If the values are free text the add an extra variable to indicate which replicate it is and include that new varaible in the ID statement.

data for_transpose;
  do rep=1 by 1 until(last.date);
    set have;
    by record_id labcode protocolcode date;
    output;
  end;
run;

proc transpose data=for_transpose prefix=Q delim=_ out=wide(drop=_name_);
  by record_id labcode protocolcode date;
  id question_value rep;
  var response;
run;

Result:

                  p
                  r
                  o
     r            t
     e            o                   Q    Q     Q    Q    Q    Q    Q    Q     Q     Q    Q    Q    Q    Q
     c     l      c                   3    3     3    6    6    4    4    4     3     3    6    6    4    4
     o     a      o                   8    8     8    0    0    7    7    7     8     8    0    0    7    7
     r     b      l                   9    9     9    3    3    5    5    5     9     9    3    3    5    5
     d     c      c         d         1    1     1    8    8    7    7    7     1     1    8    8    7    7
O    _     o      o         a         2    2     2    8    8    7    7    7     2     2    8    8    7    7
b    i     d      d         t         _    _     _    _    _    _    _    _     _     _    _    _    _    _
s    d     e      e         e         1    2     3    4    5    1    2    3     4     5    1    2    4    5

1    1    Lab1    P1    30JAN2020     2    9    10    5    1    .    .    .     .     .    .    .    .    .
2    2    Lab2    P3    15FEB2015     .    .     .    .    .    1    2    3     9    10    .    .    .    .
3    3    Lab5    P5    23JUN2021     .    .     8    .    .    .    .    .    10     7    1    2    .    .
4    4    Lab4    P6    12MAR2014     9    3     .    .    .    .    .    4     .     .    .    .    5    1
5    5    Lab9    P8    06JUL2020    10    9     8    1    5    .    .    .     .     .    .    .    .    .

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 748 views
  • 0 likes
  • 2 in conversation