BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ThomasGeorge
Calcite | Level 5

Hopefully, this should be the last change to the requirement i have asked on this post. I would like to transpose the data. I tried proc transpose by ID and SP_MD. But the proc transpose did not work for me. I believe DO Loops and arrays are the best bet, but it has not worked well for me. I worked on the last Do loop and tried modifying what art had posted, but i keep getting error messages. I would like do have the data set set up just like i did on the model sheet the first sheet in the Excel file.

Thanks,

Thomas

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Thomas,

One step you could take is to eliminate the DATA steps.  The WHERE statements could be added to PROC TRANSPOSE.

But I'm a little curious about how you intend to put the data sets together at the end.  You can't do what SAS calls a "merge" because you'll have the same variable names in both data sets.  One set of variables will overwrite the other.  So do you want to combine them side by side (but changing the variable names for MD vs. SP), or do you want to keep the same variable names and create twice as many observations (while keeping SP_vs_MD to differentiate)?

View solution in original post

26 REPLIES 26
Reeza
Super User

Use proc tranpsose instead of an array.

See a good example w/ code here

http://www.ats.ucla.edu/stat/sas/modules/ltow_transpose.htm

Astounding
PROC Star

Ahh, much better now that you fixed the output structure.  This utilizes a neat feature that Art used last week:

data before;

   set before;

   _name_ = compress( 'st' || put(response,1.) || question, ' .');

run;

proc transpose data=before out=after;

   by idnum;

   var response;

run;

If you have responses that could be 2 digits long, replace the 1. format with the z2. format.

Good luck.

art297
Opal | Level 21

or, yet another way of doing it:

data have;

  input IDNUM  $     Question $      Response;

  cards;

001               Q.1.                1

001               Q.2.                3

001               Q.5.                4

002               Q.1.                0

002               Q.3.                3

002               Q.4.                2

003               Q.5.                3

003               Q.1.                0

003               Q.2.                5

;

data temp;

  set have;

  question=compress(question,'.');

run;

proc transpose data=temp out=want (drop=_:) prefix=st1;

  by idnum;

  var response;

  id question;

  idlabel question;

run;

Astounding
PROC Star

Art,

Yes, depending on what is really needed as output I probably should have used:

_name_ = 'st1' || compress(question, ' .');

It's not 100% clear to me whether the new variable names should start with "st1" or whether the 3rd character should vary depending on the response.  Given the follow-up information, it should probably remain "st1".

ThomasGeorge
Calcite | Level 5

astounding. I appreciate your help. I missed out a critical step. Some of the IDNUM have 3-4 Q.1. How would i average the response in the code?

Tom
Super User Tom
Super User

It all really depends on what you want to do with the data.

But you can use PROC SUMMARY to create an average of each question.

proc summary data=have nway missing ;

   class idnum question ;

   var response;

   output out=temp mean= ;

run;

Astounding
PROC Star

If you combine our solutions, it should work.

First, apply Tom's summary.

Then use Art's code (or my revised code with the updated assignment for _name_), using the output data set from PROC SUMMARY as the input to PROC TRANSPOSE.

Here's a more complete version:

proc summary data=before nway missing;

   class idnum question;

   var response;

   where question ne 'CanM';

   output out=temp (keep=idnum question response) mean=;

run;

data temp;

   set temp;

   _name_ = 'st1' || compress(question, ' .');

run;

proc transpose data=before out=after;

   by idnum;

   var response;

run;

art297
Opal | Level 21

I don't understand why you got duplicate ids, but Astounding's approach should avoid that.  How about if you try:

data have;

  input IDNUM  $     Question $      Response;

  cards;

001               Q.1.                1

001               Q.1.                2

001               Q.1.                3

001               Q.2.                3

001               Q.5.                4

002               Q.1.                0

002               Q.3.                3

002               Q.4.                2

002               CanM                4

003               Q.5.                3

003               Q.1.                0

003               Q.2.                5

;

proc summary data=have nway;

  class idnum question;

  var response;

  output out=averages (drop=_:) mean=;

run;

data averages;

  set averages;

  _name_ = 'st1' || compress(question, ' .');

run;

proc transpose data=averages out=want (drop=_:);

   by idnum;

   var response;

run;

data want;

  retain idnum st1Q1-st1Q5 st1CanM;

  set want;

run;

Astounding
PROC Star

Art,

Looks like our solutions are converging.

Those duplicate ids must have come from a test that didn't use PROC SUMMARY first.

ThomasGeorge
Calcite | Level 5

astounding i have a question. This program that you and art created only calculates the average, but i need standard deviation too?cos i cannot use the proc means step after this, as it will do an averge over an average.

Astounding
PROC Star

PROC MEANS can generate standard deviations in the output data set, as well as means.

Part of the answer depends on what you want your report to look like.  Should the standard deviations form a second row for each IDNUM, or should they form additional columns on the existing row?

Also note, you will have a lot of missing standard deviations when there is only one response.

Astounding
PROC Star

Thomas,

I'm afraid that work beckons, and I won't be able to give you a complete answer here, just a few notes.

First, it should be possible to run all 7 STs at once.  You might skip this at first, getting it to work for a single ST as a starting point.  But it would certainly be possible to code statements like these at the proper points:

_name_ = 'st' || put(st,1.) || compress(question, ' .');

class st idnum question;

But without attempting that hurdle, first, in the PROC SUMMARY, modify the OUTPUT statement to get both statistics:

output out=temp (keep=idnum question mean std) mean=mean std=std;

Then in the DATA step that is needed anyway, construct the combination variable:

data temp;

   set temp;

   _name_ = 'st1' || compress(question, ' .');

   new_response = put(mean, 3.1) || '+-' || put(std, 3.1);

run;

Then use new_response instead of response within PROC TRANSPOSE.

Good luck. 

art297
Opal | Level 21

Thomas,

There is probably an easier way to accomplish this, but I think that the result at least is what you are looking for:

data have;

  input IDNUM  $     Question $      Response;

  cards;

001               Q.1.                1

001               Q.1.                2

001               Q.1.                3

001               Q.2.                3

001               Q.2.                4

002               Q.1.                0

002               Q.1.                1

002               Q.3.                3

002               Q.3.                2

002               Q.4.                2

002               Q.4.                4

002               CanM                4

002               CanM                5

003               Q.5.                3

003               Q.5.                1

003               Q.1.                0

003               Q.1.                2

003               Q.2.                5

003               Q.2.                7

;

proc summary data=have nway;

  class idnum question;

  var response;

  output out=averages (drop=_:) mean=  std= /autoname;

run;

data averages;

  set averages;

  _name_ = 'sm1' || compress(question, ' .');

run;

proc transpose data=averages (rename=(response_mean=response))

   out=part1 (drop=_:);

   by idnum;

   var response;

run;

data averages;

  set averages;

  _name_ = 'sd1' || compress(question, ' .');

run;

proc transpose data=averages

   out=part2 (drop=_:);

   by idnum;

   var response_stddev;

run;

proc sql noprint;

  select 'st'||trim(substr(name,3))||'='||

          'catt(put('||trim(name)||',4.2),"+",put(sd'||

          trim(substr(name,3))||',4.2));',

         'if strip(st'||trim(substr(name,3))||

         ')=".+ ." then call missing('||

         'st'||trim(substr(name,3))||');',

         'length st'||trim(substr(name,3))||' $10;'

    into :calcs separated by ' ',

               :miss separated by ' ',

               :lengths separated by ' '

      from dictionary.columns

        where libname="WORK" and

          memname="PART1" and

          name ne "IDNUM"

  ;

quit;

data want (drop=sm: sd:);

  &lengths.;

  set part1;

  set part2 (drop=idnum);

  &calcs.;

  &miss.;

run;

art297
Opal | Level 21

Thomas,

If you look at the code I sent, last, the creation of _name_ was in a different place:

proc summary data=have nway;

  class idnum question;

  var response;

  output out=averages (drop=_:) mean=;

run;

data averages;

  set averages;

  _name_ = 'st1' || compress(question, ' .');

run;

proc transpose data=averages out=want (drop=_:);

   by idnum;

   var response;

run;

data want;

  retain idnum st1Q1-st1Q5 st1CanM;

  set want;

run;

I'll take a look at your latest need now.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 26 replies
  • 1440 views
  • 0 likes
  • 5 in conversation