BookmarkSubscribeRSS Feed
jbatch
Calcite | Level 5

This is probably simple, but I'm having a hard time knowing the best strategy.  Say I have a data set like this:

ID          Persontype       Med            Weap           

001       Staff                  No              No               

001       Staff                  No              No

001       Staff                  Yes            Yes

002       Staff                  No              No

003       Staff                  No              Yes

003       Other                 No              No

003       Volunteer          Yes             No

003       Staff                  No              No

003       Staff                  Yes            Yes

I'm trying to build a new file that has one record per ID (there can be any number of rows with the same ID) that will put the value of  the rows as the new column with a number

to distinguish, e.g.,

ID       Persontype1   Med1  Weap1  Persontype2   Med2  Weap2   Persontype3   Med3  Weap3    Persontype4   Med4   Weap4     Persontype5      Med5    Weap5

001    Staff                No      No         Staff              No      No         Staff              Yes     Yes

002    Staff                No      No

003    Staff                No      Yes       Other            No       No         Volunteer       Yes     No         Staff               No       No            Staff                 Yes       Yes

Any assistance would be appreciated!

10 REPLIES 10
art297
Opal | Level 21

There are a number of methods to accomplish what you want.  I, personally, like the following one:

data have;

  input (ID Persontype Med Weap) ($);

  cards;

001       Staff                  No              No              

001       Staff                  No              No

001       Staff                  Yes            Yes

002       Staff                  No              No

003       Staff                  No              Yes

003       Other                 No              No

003       Volunteer          Yes             No

003       Staff                  No              No

003       Staff                  Yes            Yes

;

proc sql noprint;

  select max(obs) into :obs

    from ( select count(*) as obs

             from have

              group by id )

  ;

quit;

proc summary nway data=have missing;

  class ID;

  output out = want(drop=_type_ _freq_)

               idgroup(out[&obs](Persontype Med Weap)=);

run;

data_null__
Jade | Level 19

The arrangement you have has better qualities but you asked.

As long as you have 100 or less obs per BY GROUP level you can use PROC SUMMARY to transpose character and numeric variables in one step.

data have;

   input (ID Persontype Med Weap)($);

   cards;

001 Staff No No

001 Staff No No

001 Staff Yes Yes

002 Staff No No

003 Staff No Yes

003 Other No No

003 Volunteer Yes No

003 Staff No No

003 Staff Yes Yes

;;;;

   run;

proc sql noprint;

   select max(obs) into: obs

      from (select count(*) as obs from have group by id);

   quit;

   run;

%put NOTE: OBS=&obs;

proc summary data=have;

   by id;

   output out=transposed(drop=_type_)

      idgroup(out[&obs](Persontype--Weap)=)

      ;

   run;

proc print;

   run;

jbatch
Calcite | Level 5

That worked great, and I never would have thought of it!  I may nix the idea, as the obs ended up being 63, which means way too many null columns, but I'm glad to know how to do it, and can apply it in other places.

MikeZdeb
Rhodochrosite | Level 12

If you would like to read a bit more about the PROC SUMMARY approach, take a look at ...

Transposing Data Using PROC SUMMARY'S IDGROUP Option

http://www.pharmasug.org/download/papers/TT08.pdf

jbatch
Calcite | Level 5

Thank you - I will share with my coworkers!

art297
Opal | Level 21

FWIW: I "stole" the concept from Mike and Data_Null as they were really the originators and I had meant to include that reference but forgot.

Ksharp
Super User

I love call execute:

data have;
  input (ID Persontype Med Weap) ($);
  cards;
001 Staff  No No
001 Staff  No  No
001 Staff  Yes  Yes
002 Staff  No  No
003 Staff  No  Yes
003 Other  No  No
003 Volunteer Yes No
003 Staff No  No
003 Staff Yes  Yes
;
run;
data _null_;
 set have end=last;
 by id notsorted;
 if _n_ eq 1 then call execute('data want;');
 if first.id then do;count=0; call execute('id="'||id||'";');end;
 count+1;
 call execute('persontype'||strip(count)||'="'||persontype||'";');
 call execute('med'||strip(count)||'="'||med||'";');
 call execute('weap'||strip(count)||'="'||weap||'";');
 if last.id then call execute('output;');
 if last then call execute('run;');
run;



Ksharp

MikeZdeb
Rhodochrosite | Level 12

Hi ... though I'd still use PROC SUMMARY to do this, another

approach that writes/executes SAS code is FILE/PUT/%INCLUDE

(no parentheses or concatenations).

Just curious, maybe someone can offer an opinion ...

since CALL EXECUTE is just being used to create a data step for

subsequent execution, is there any advantage to CALL EXECUTE

over FILE/PUT/%INCLUDE in this situation?

data have;

input (ID Persontype Med Weap) (:$3. :$9. 2*:$3.);

cards;

001 Staff  No No

001 Staff  No  No

001 Staff  Yes  Yes

002 Staff  No  No

003 Staff  No  Yes

003 Other  No  No

003 Volunteer Yes No

003 Staff No  No

003 Staff Yes  Yes

;

run;

filename xxx temp;


data _null_;

file xxx;

put 'data want;' ;

do until(last);

do count=1 by 1 until (last.id);

  set have end=last;

  by id;

  if first.id then put 'id="' id $3. '";' ;

  put 'persontype' count '="' persontype $9.  '";' /

      'med'        count '="' med $3.         '";' /

      'weap'       count '="' weap $3.        '";' ;

end;

  put 'output;' ;

end;

put 'run;' ;

stop;

run;

%include xxx;

data_null__
Jade | Level 19

To me the advantages to FILE/PUT/%INC are

  1. Features of PUT make code gen easier
  2. You can make the code look exactly like you wrote it including indentation and the rest.
  3. You can look at it before you execute it. 
  4. Syntax is easier for me.  All those CAT functions get a bit confusing

I see no advantage to CALL EXECUTE over FILE/PUT/INC.  CALL EXECUTE does have it's place but for me after more than a few lines of code it becomes onerous.

Your example program converts all the data to character strings and then "reads" it back in just as the CALL EXECUTE example from ksharp.  That's fine with all character or integer data but when you have floating point data you will have a data loss problem.  My modification of your program address the numeric data

problem in a very simplistic way.  For a complete treatment see this paper http://support.sas.com/resources/papers/proceedings11/064-2011.pdfhttp://support.http://support.sas.com/resources/papers/proceedings11/064-2011.pdfsas.com/resources/papers/proceedings11/064-2011.pdf

data have;

input (ID Persontype Med Weap) (:$3. :$12. 2*:$3.);

y = rannor(10);

z = ranuni(10);

cards;

001 Staff  No No

001 Staff  No  No

001 Staff  Yes  Yes

002 Staff  No  No

003 Staff  No  Yes

003 "Other"  No  No

003 Volunteer's Yes No

003 Staff No  No

003 Staff Yes  Yes

;

run;

proc print;

   run;

filename FT33F001 temp;

data _null_;

   file FT33F001;

   put 'data want;';

   do until(last);

      do count=1 by 1 until (last.id);

         set have end=last;

         by id;

         if first.id then put +3 'do id=' id $quote5. ';';

         array tv

  • persontype med weap;
  •          do i = 1 to dim(tv);

                name = vname(tv);

                value = quote(tv);

                put +6 name $10.-r  count '= ' value ';';

                end;

             array tn

  • y z;
  •          do i = 1 to dim(tn);

                name = vname(tn);

                value = quote(put(tn,binary64.));

                put +6 name $10.-r count '=  input(' value ',binary64.);';

                end;

             end;

          put +6 'output;';

          put +6 'end;';

          end;

       put +3 'run;';

       stop;

       run;

    %include FT33F001 / source2;

    proc print;

       run;

    Ksharp
    Super User

    Hi. Mike.

    there  are no any advantage to CALL EXECUTEover FILE/PUT/%INCLUDE,

    Just be more convenient. Although%include is faster than call execute for execution.

    But I still like call execute which not need to make a file and to execute it by %include,

    because it is very convenient. You just need to run one time and then get the result.

    Ksharp

    SAS Innovate 2025: Save the Date

     SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

    Save the date!

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    SAS Training: Just a Click Away

     Ready to level-up your skills? Choose your own adventure.

    Browse our catalog!

    Discussion stats
    • 10 replies
    • 1843 views
    • 6 likes
    • 5 in conversation