BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Miracle
Barite | Level 11

Hi everyone

Can I please ask your help on how can I improve my SAS code to make it shorter?

I would like to reshape from a long data to wide.

Thank you heaps.

 

Long   
idmemberidtypeperiod
1121
1163
1111
1272
1271
1284
1234

 

Wide        
idmemberidB_1F_3A_1G_2G_1H_4C_4
111110000
120001111

 

 

data wide
set long;
if type=1 and period=1 then output A_1;
if type=1 and period=2 then output A_2;
if type=1 and period=3 then output A_3;
if type=1 and period=4 then output A_4;
.
.
.
if type=26 and period=1 then output Z_1;
if type=26 and period=2 then output Z_2;
if type=26 and period=3 then output Z_3;
if type=26 and period=4 then output Z_4;
run;
data wide2;
merge A_1(in=a) A_2(in=aa) A_3(in=aaa) A_4(in=aaaa)
...
Z_1(in=Z) Z_2(in=ZZ) Z_3(in=ZZZ) Z_4(in=ZZZZ);
by id memberid;
if a then A_1=1; else A_1=0;
.
.
.
if ZZZZ then Z_4=1; else Z_4=0;
run;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

In your long format create a new variable that maps the number to the letter. You'll need to do a pass afterwards to set missing to 0. 

Heres a mock up minus that last step. This is untested but should be pretty close. The byte function should return the correct alphabet but you may need tweak depending on your system. 

 

Data want;
Set have;
Group=catx('_', byte(64+type), period);
Count=1;
Run;

Proc sort data = have;
By ID memberID group count;
run;

 

 

proc transpose data=have out=want;

 

by id memberid;

 

var count;

 

id group;

 

run;

View solution in original post

8 REPLIES 8
Patrick
Opal | Level 21

Something doesn't add up in your sample code.

 

Is the value of Type used to determine if output goes to "A" or "Z"? (type=1 -> A, type=2 -> B,...) ?

Miracle
Barite | Level 11

Hi @Patrick.

Thank you for your reply.

i have just corrected my post.

Yes, type 1 = A....., type 26 = Z. 

 

Patrick
Opal | Level 21

What you ask for can be done BUT I believe the naming convention you're using will make it very cumbersome in any downstream code to address the variables. I'd go for some naming convention which would make it easier to address the variables for an array using wildcard syntax.

 

But o.k. - I'll post some code once I have it "for the fun of it".

Reeza
Super User

In your long format create a new variable that maps the number to the letter. You'll need to do a pass afterwards to set missing to 0. 

Heres a mock up minus that last step. This is untested but should be pretty close. The byte function should return the correct alphabet but you may need tweak depending on your system. 

 

Data want;
Set have;
Group=catx('_', byte(64+type), period);
Count=1;
Run;

Proc sort data = have;
By ID memberID group count;
run;

 

 

proc transpose data=have out=want;

 

by id memberid;

 

var count;

 

id group;

 

run;
Patrick
Opal | Level 21

Not as elegant as @Reeza's approch but should work:

data have;
  input id memberid type period;
  datalines;
1 1 2 1
1 1 6 3
1 1 1 1
1 2 7 2
1 2 7 1
1 2 8 4
1 2 3 4
;
run;

/* generate macro variable containing all possible variable names A_1 to Z_4 */
data _null_;
  length string $32767;
  pos_a=rank('A');
  do i=pos_a to (pos_a+25);
    do period='1','2','3','4';
      string=catx(' ',string,(byte(i)||'_'||period));
    end;
  end;
  call symputx('vars',string);
  call symputx('pos_a',pos_a);
run;

/* create wide data set WANT */
data want(drop= _: type period);
  set have;
  by id memberid;

  array vars {26,4} 8 &vars;
  retain vars;

  if first.memberid then
    do;
      /* initialize vars in array to zero */
      do _i=1 to dim(vars);
        do _j=1 to 4;
          vars[_i,_j]=0;
        end;
      end;
    end;

  vars[type,period]=1;

  if last.memberid then output;
run;
Miracle
Barite | Level 11

Dear @Reeza and @Patrick, thank you for your help.

Patrick
Opal | Level 21

...and just because I believe your naming condition will make coding hard for you below some code which generates variables ind_1 to ind_104 but then creates labels for the vars. This way the variables as such will be easily addressable in code but you still got descriptive labels for printing.

data have;
  input id memberid type period;
  datalines;
1 1 2 1
1 1 6 3
1 1 1 1
1 2 7 2
1 2 7 1
1 2 8 4
1 2 3 4
;
run;

filename codegen temp;
data inter(keep=ind_:);

  /* create variables */;
  array ind_ {26,4} (104*0);
  output;

  /* generate code for attrib statement for variables */
  file codegen;
  put "attrib ";
  do _i=1 to dim1(ind_);
    do _j=1 to dim2(ind_);
      varname=vname(ind_[_i,_j]);
      put varname "label='type: " _i ", period: " _j "'";
    end;
  end;
  put ";";

  stop;
run;

/* execute generated code: assign labels to variables */
data inter;
  set inter;
  %include codegen / source2;
run;
filename codegen;


/* create wide data set WANT */
data want(keep=id memberid ind_:);
  set have;
  by id memberid;

  retain _i 1;
  set inter point=_i;

  array ind_vars {26,4} ind_:;

  ind_vars[type,period]=1;

  if last.memberid then output;
run;


proc print data=want(keep=id memberid ind_1-ind_10) label;
run;
Miracle
Barite | Level 11

Thank you @Patrick. I would also like to accept yours as solution if only possible Smiley Happy

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1881 views
  • 2 likes
  • 3 in conversation