BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ThomasGeorge
Calcite | Level 5

astounding, Art....I appreciate your guys help.

Now the project has changed quite a bit i think, i am not sure how it effects the code. But i am so glad i am learning 😄

I wrote this code so far:

proc sql;

          Create table Station1raw as

                    Select SUBSTR(ID_No_,6,3) as IDNUM, PRG, Question_ , (SUBSTR(Question_,1,4) || Evaluation_Type) as Questionnaire,

                                 SP_vs_MD, Response_ as Response, Out__Of

from Station1

;

quit;

Data station1sp;

          set Station1raw;

          where SP_vs_MD=1;

          Drop PRG;

run;

proc transpose data=station1sp out=station2sp;

  by idnum;

  var response;

run;

But when i do the proc transpose, i get IDNUM, _NAME_, _LABEL_, COL1, COL2, COL3, COL4 till COL16. I am not clear why it does that. Even if you rewrite the code, please explain. That would greatly help.

art297
Opal | Level 21

Thomas,

You will have to provide an example Station1 dataset.  It is different from the original one since it is looking for variables that didn't exist in the original data.

ThomasGeorge
Calcite | Level 5

Art,

I have uploaded the file.

Thank you again,

Thomas

Astounding
PROC Star

Thomas,

One step you could take is to eliminate the DATA steps.  The WHERE statements could be added to PROC TRANSPOSE.

But I'm a little curious about how you intend to put the data sets together at the end.  You can't do what SAS calls a "merge" because you'll have the same variable names in both data sets.  One set of variables will overwrite the other.  So do you want to combine them side by side (but changing the variable names for MD vs. SP), or do you want to keep the same variable names and create twice as many observations (while keeping SP_vs_MD to differentiate)?

art297
Opal | Level 21

I'm not sure how you are planning to use the dataset you are trying to create, but still I don't think you have described it completely.

For example, from what you told us, I'd expect to create a dataset that had the following variables and data:

IDNUM  Q1_.Rotation  SP_MD1 Q2_Med  SP_MD_2 Q3_Dry SP_MD3 Q4_Frame  SP_MD4 Q5_Std  SP_MD5

001              1                    1         3              0                                                                       4               0

002              0                    1         2              1             3            0

003              0                    1                                                                     5                 0        3               0

Is that what you are trying to create?

ThomasGeorge
Calcite | Level 5

astounding, Art

I have created in the second sheet, how the boss wants to see the final data set. Art your code worked, but the variable headings have to be changed according to the boss, cos she wants to be able to read for each ID number what the response was at a particular station for SP and/orMD for each evaluation. I hope i am clear. Art i used this info for Proc Import:

%Macro Import(Sheet);

          PROC IMPORT OUT=STATION&Sheet

                      DATAFILE= "C:\Users\tgeorge\Documents\Rulan_Nephrology\OSCE_NEPHRO_01_27_2012.xls"

                      DBMS=EXCEL REPLACE;

               RANGE="'Station &Sheet.$'";

               GETNAMES=YES;

               MIXED=YES;

               SCANTEXT=YES;

               USEDATE=YES;

               SCANTIME=YES;

          RUN;

%Mend Import;

%Import(1)

%Import(2)

%Import(3)

%Import(4)

%Import(5)

%Import(6)

%Import(7)

Sorry for all the confusion

Thanks,

Thomas

art297
Opal | Level 21

Thomas,

I may or may not understand what you have been asked to do.  The boss, of course, is ALWAYS right!

Before I give you my thoughts, though, I do want you to realize that you are getting programming advice from a psychologist .. NOT a programmer.

Whenever I or any of my staff confront a problem, the first thing I try to teach is to try to understand the problem BEFORE attempting to see how it might be coded.  The simpler you can conceptualize the problem, the more straight forward the programming and the generalizability of the method.

I think you ought to ask your boss if there might be a problem with the way that the outof field has been coded.  It is probably a moot point, as every subject has the exact same codes.  The first nine questions have "Y:1;N:0" values for outof, then 5s for the first five canmed questions, then 5 and 3 for the next two (respectively), a 7 for the first standardized question, 5s for the next four standardized questions and, finally, a  "Y:1;N:0" for the last standardized question.  Somehow I think that last one might be an error.

Here is how I would approach the task:

PROC IMPORT OUT= WORK.have

  DATAFILE= "C:\art\Thomas.xlsx"

  DBMS=excel REPLACE;

  GETNAMES=YES;

run;

data want (keep=rot: can: stan: prg id: comment);

  set have;

  array rotation(9);

  array canmed(7);

  array standardized(6);

  retain rotation: canmed: standardized: comment;

  select (mod(_n_,22));

    when (0) do;

               standardized(6)=response_;

               output;

             end;

    when (1) do;

               call missing(of rotation(*));

               call missing(of canmed(*));

               call missing(of standardized(*));

               rotation(1)=response_;

             end;

    when (2,3,4,5,6,7,8,9) rotation(mod(_n_,22))=response_;

    when (10,11,12,13,14,15,16) do;

                  canmed(mod(_n_,22)-9)=response_;

                        if mod(_n_,22) eq 16 then

                   comment=Comments;

                end;

    otherwise standardized(mod(_n_,22)-16)=response_;

  end;

run;

art297
Opal | Level 21

Thomas,  What your boss want is precisely why there are rename statements and options, as well as a label statement.

I would point out to her that what she wants is definitely possible but, if she wants to do any between station comparisons, that it would be easier if you just added a station field and, that way, could amass all of the data within one file.

The only difference in the variable names would be that they wouldn't be preceded by st1, etc.  However, if she still wants it that way, remember rule #1 (i.e., the boss is always right!).

Renaming the variables, in either case, is easy .. just a little more work to do it the way she wants.

I would use the original question descriptions as labels for the various variables.

ThomasGeorge
Calcite | Level 5

Hey art,

I am doing some data cleaning here now. In station 2 to station 7, instead of rotation specific, it is station specific in the rows. Also, only station 1 has the variable name response_, station 2 to station 7 has just response. So i did this code:

data station1;

set station1;

rename Response_=Response;

run;

But how do i rename station 2 to 7 from Station specific to Rotation specific all at once.

art297
Opal | Level 21

Thomas,

I/we would have to see what stations1 thru 7 currently look like in order to answer your question.

Can you post them?

Astounding
PROC Star

Thomas,

When you got variable names like COL1, COL2 coming out of PROC TRANSPOSE, that's a result of failing to create _NAME_ first.  PROC TRANSPOSE uses _NAME_ to determine the variable names in the output.

I just wanted to confirm, when you have variable names like Q.2.Med1 and Q.2.Med0 ...

SAS will not let you use a dot as part of a variable name.  It will use underscores instead (and PROC TRANSPOSE will do that automatically).  I assume that's OK.

The "1" or "0" at the end of the variable names ... is that the value of SP_vs_MD?  If so, is SP_vs_MD character or numeric?

I think we're in the home stretch here ... not much more to do to wrap this up.

ThomasGeorge
Calcite | Level 5

Art,

I have uploaded the excel file for you Tabachneck_info.xls. Hopefully this gives a better picture of what i am trying to say.

Thanks,

Thomas

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 26 replies
  • 1462 views
  • 0 likes
  • 5 in conversation