Solved: Re: proc transpose or data step

Moe_Issa · Posted 09-16-2015 05:18 PM

Hello

I have a data set ( want ) multiple record for same ID and I want to transpose the rows to columns , however few conditions needs to be met first .

First of all , Each Logo should have its own Vdata1- 5 , also regionquantity will determind how many logos will appear for a region .

Example is the RegionQuantity =2 then just 2 logos should be selected for that ID even if it has 3 record. ( logos)

Second , region P start from block 4 - 6

region B start from block 7 - 9

region S start from block 10 - 12

region T start from block 13 - 22

Also the %%name%%.

I used proc transpose and created the Vdata1-5 . but i dont think its efficient , also i used alot of If-then after the transpose to determin how many logos each id should get and rename statement to assigned the region 4- 6.. etc .

Is there an easy way to do that ?

data have;

INFILE DATALINES DLM=',';

input Fname $ Logo $ Region $ RegionQuantity $ VDATA1 $ VDATA2 $ VDATA3 $ VDATA4 $ Vdata5 $ Priority ID;

datalines;

Peter,hhLKOM,P,1, ,%%name%%, , , ,1, 1

Peter,TTT1,S,1, , , , , , 2, 1

Peter,BBB2,T,2, , , , , , 3, 1

Peter,HESFD,T,2, , , , , , 4, 1

Peter,RWDFG,T,2, , , , , , 5, 1

John,hhLKOM1,P,1,%%name%%, , , , , 1, 2

John,TTT13,B,1, , , , , ,2, 2

John,BBB23,S,1, , , , , , 3, 2

John,HESFD12,T,2 , , , , , , 4, 2

John,RWDFG11,T,2, , , , , , 5, 2

John,RWDFG15,T,2 , , , , , , 6, 2

Cole,WKOM,P,1, , , , , , 1, 3

Cole,sss1,B,1, , , , , , 2, 3

Cole,BBB232,S,1, , , , , , 3, 3

Cole,H111,T,4 , , , , , ,4, 3

Cole,YWQF1,T,4,%%name%%, , , , , 5, 3

Cole,HEHEJRN,T,4 , , , , , , 6, 3

Cole,FMERETH,T,4 , , , , , , 7, 3

;

run;

data want;

INFILE DATALINES DLM=',';

input ID BLOCK4 $ Block4Data1 $ Block4Data2 $ Block4Data3 $ Block4Data4 $ Block4Data5 $ BLOCK5 $ Block5Data1 $ Block5Data2 $ Block5Data3 $ Block5Data4 $ Block5Data5 $ BLOCK6$

Block6Data1 $ Block6Data2 $ Block6Data3 $ Block6Data4 $ Block6Data5 $ BLOCK7 $ Block7Data1$ Block7Data2 $ Block7Data3 $ Block7Data4 $ Block7Data5 $ BLOCK8 $ Block8Data1 $

Block8Data2 $ Block8Data3 $ Block8Data4 $ Block8Data5 $ BLOCK9 $ BlocK9Data1 $ Block9Data2 $ Block9Data3 $ Block9Data4 $ Block9Data5 $ BLOCK10 $ BlocK10Data1 $ Block10Data2 $

Block10Data3 $ Block10Data4 $ Block10Data5 $ BLOCK11 $ BlocK11Data1 $ Block11Data2 $ Block11Data3 $ Block11Data4 $ Block11Data5 $ BLOCK12 $ BlocK12Data1 $ Block12Data2 $

Block12Data3 $ Block12Data4 $ Block12Data5 $ BLOCK13 $ BlocK13Data1 $ Block13Data2 $ Block13Data3 $ Block13Data4 $ Block13Data5 $ BLOCK14 $ BlocK14Data1 $ Block14Data2 $

Block14Data3 $ Block14Data4 $ Block14Data5 $ BLOCK15 $ BlocK15Data1 $ Block15Data2 $ Block15Data3 $ Block15Data4 $ Block15Data5 $ BLOCK16 $ BlocK16Data1 $ Block16Data2 $ Block16Data3 $

Block16Data4 $ Block16Data5 $ ;

datalines;

1,hhLKOM, ,Peter, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,TTT1, , , , , , , , , , , , , , , , , ,BBB2, , , , , ,HESFD , , , , , , , , , , , , , , , , ,

2,hhLKOM1,John, , , , , , , , , , , , , , , , ,TTT13, , , , , , , , , , , , , , , , , ,BBB23 , , , , , , , , , , , , , , , , , ,HESFD12, , , , , ,RWDFG11, , , , , , , , , , , , , , , , ,

3,WKOM, , , , , , , , , , , , , , , , , ,sss1, , , , , , , , , , , , , , , , , ,BBB232, , , , , , , , , , , , , , , , , ,H111, , , , , ,YWQF1,Cole, , , , ,HEHEJRN, , , , , ,FMERETH, , , , , ,

;

run;

thanks

Moe_Issa · Posted 09-21-2015 11:19 AM

here is what i did

proc sort data=have ; by id Region; run;

/* this step to determine who how many logos for each ID */
data num_of_logos ; set have ;
by id region;
N+1;
If first.region then N=1;
if n <= put(regionquantity,1.)
;
run;

proc sort data=num_of_logos ; by id Priority;run;

/* transpose the rows to columns starting from block4 */
DATA trans_col;
ARRAY Block(24) $ 50 ;
do i = 1 to 24;
SET num_of_logos;
BY id Priority ;
IF region='P' then do;
i=3;
Block(i+N)=logo;
end;
IF region='B' then do;
i=6;
Block(i+N)=logo;
end;
if region='S' then do;
i=9;
Block(i+n)=logo;
end;
if region='T' then do;
i=12;
Block(i+n)=logo;
end;
IF last.id THEN RETURN;
END;
RUN;

proc sort data=have out=drivertable nodupkey; by logo region priority ;run;

%MACRO FORMATING;
data forformat;
set drivertable;

array vars $ vdata1-vdata5;
do over vars;
if vars ='%%name%%' then vars=strip(Fname);
run;
%MEND FORMATING;

%formating;

%MACRO FORMATS;
%do j = 1 %to 5;
data Block&j (keep=fmtname start label type);
set forformat ;
retain fmtname "$vdata&j._" type 'c';
rename logo=start vdata&j=label;
run;
proc format library=work cntlin=block&j;run;
%end;
%MEND FORMATS;
%formats;

%MACRO VDATA;
data vdata;
set trans_col;
/* length block4data1 $100. block4data2 $100. block4data3 $100. block4data4 $100. block4data5 $100. ;*/
%do i = 1 %to 24; /*message priority */
%do j = 1 %to 5;
Length Block&i.data&j $100.;
Block&i.data&j=put(Block&i,$vdata&j._.);
%end;
%end;
drop i n Priority Region RegionQuantity n vdata1 vdata2 vdata3 vdata4 Vdata5 ;
run;
%MEND VDATA;

%vdata;

View solution in original post

evp000 · Posted 09-17-2015 10:52 AM

Hi,

I started looking at this but got stuck.

First your REGIONS don't each have a unique REGIONQUANTITY:

/*
Region RegionQuantity
B 1
P 1
S 1
T 2 <-- !
T 4 <-- !

*/

So I did this reassignment:

if region = 'T' and regionquantity = 4 then region = 'T2';

In order to get the right number of records I did this but I have no idea how you want to select them:

proc sort data = have1; by region logo; run;

* WILL SELECT THE FIRST LOGOS IN ALPHABETICAL ORDER, EG FIRST 2 LOGOS IF 2 ARE REQUIRED. HOW DO WE KNOW WHICH LOGOS TO KEEP? *;
data have2;
set have1;
by region logo;
retain count;
if first.region then count = 1;
else count = count + 1;
if count le regionquantity;
run;

Then I didn't understand about how you want to get them into the blocks...

So, anyway, this may help you get on the right path but it's not a complete solution. Good luck.

m.

Moe_Issa · Posted 09-17-2015 12:48 PM

Thanks for your help , Sorry I think i wasn't clear enough but here is what im trying achive .

for each ID region quantity should be unique by region ( like T for first ID is 2 for all region=T records). so regionquantity means just to logos should be selected for that ID from region T based on the priority .

regarding the Block I want to transpose the rows selected to columns and i named it block . and each logo should have 5 vdata .for example first ID has 1 logo in region = P , and P region start from number 4 ( logo4 = block4).

Note:

region P start from block 4 - 6

region B start from block 7 - 9

region S start from block 10 - 12

region T start from block 13 - 22

. and then create 5 columns for that block block4data1 block4data2 .... etc . and replace the %%name%% with the value from column name .

ballardw · Posted 09-17-2015 05:43 PM

What role do the variable RegionQuantity and Priority Play? Are all of the VDATA variables in your actual data character?

It appears that PRIORITY controls the order of the data within the block. Is that the case?

Moe_Issa · Posted 09-17-2015 06:16 PM

RegionQuantity will determine how many logos the ID should get , example is the same ID have 3 logos that has T as Region ( 3 rows ) and RegionQuantity = 2 that means this ID will just receive 2 records from region T based on priority .

hope this help .

Thanks

Ksharp · Posted 09-18-2015 04:15 AM

The simplest way is using proc means OR althernative way is MERGE skill.

data have;
INFILE DATALINES DLM=',';
input  Logo $ Region $ RegionQuantity $ VDATA1 $ VDATA2 $ VDATA3 $ VDATA4 $ Vdata5 $ Priority ID;
datalines;
hhLKOM,P,1, ,%%name%%, , , ,1,         1
TTT1,S,1, , , , , ,           2,         1
BBB2,T,2, , , , , ,           3,         1
HESFD,T,2, , , , , ,        4,         1
RWDFG,T,2, , , , , ,       5,         1
hhLKOM1,P,1,%%name%%, , , , , 1,      2
TTT13,B,1, , , , , ,2,       2
BBB23,S,1, , , , , ,         3,         2
HESFD12,T,2    , , , , , , 4,         2
RWDFG11,T,2, , , , , ,    5,         2
RWDFG15,T,2   , , , , , , 6,         2
WKOM,P,1, , , , , ,         1,         3
sss1,B,1, , , , , ,            2,         3
BBB232,S,1, , , , , ,       3,         3
H111,T,4 , , , , , ,4,        3
YWQF1,T,4,%%name%%, , , , , 5,         3
HEHEJRN,T,4   , , , , , , 6,         3
FMERETH,T,4   , , , , , , 7,         3
;
run;

proc sql;
 select max(n) into : n
  from (select count(*) as n from have group by id);
quit;

proc summary data=have;
by id;
output out=want idgroup(out[&n] (Logo  Region  RegionQuantity  VDATA1  VDATA2  VDATA3  VDATA4  Vdata5 Priority )=);
run;

Moe_Issa · Posted 09-18-2015 08:40 AM

thanks xia ,

here is what i did , however in last step i dont get the logos in the right block ,

region P start from block 4 - 6

region B start from block 7 - 9

region S start from block 10 - 12

region T start from block 13 - 22

also , i didnt create block&i.data1 block&i.data2 block&i.data3 block&i.data4 block&i.data5 for each block . each row is a logo before transposed and each logo has 5 vdata . so after the logos gets transposed i need to create 5 vdata with value of %%name%% if needed.

proc sort data=have ; by id Region; run;

/* this step to determine who how many logos for each ID */
data num_of_logos ; set have ;
by id region;
N+1;
If first.region then N=1;
if n <= put(regionquantity,1.)
;
run;

proc sort data=num_of_logos ; by id Priority;run;

/* transpose the rows to columns starting from block4 */
DATA trans_col;
ARRAY Block(24) $ 50 ;
do i = 1 to 24;
SET num_of_logos;
BY id Priority ;
IF region='P' then i=4;
Block(i)=logo;

IF region='B' then i=7;
Block(i)=logo;

if region='S' then i=10;
Block(i)=logo;

if region='T' then i=13;
Block(i)=logo;
IF last.id THEN RETURN;

END;
RUN;

Moe_Issa · Posted 09-18-2015 11:53 AM

I was able to get the logos to go in right block the way i need however Im struggling in creating a block4adata1 , block4data2 ... etc

here is updated code . any idea

DATA trans_col;
ARRAY Block(24) $ 50 ;
do i = 1 to 24;
SET num_of_logos;
BY id Priority ;
IF region='P' then do;
i=3;
Block(i+N)=logo;
end;
IF region='B' then do;
i=6;
Block(i+N)=logo;
end;
if region='S' then do;
i=9;
Block(i+n)=logo;
end;
if region='T' then do;
i=12;
Block(i+n)=logo;
end;
IF last.id THEN RETURN;
END;
RUN;

Moe_Issa · Posted 09-18-2015 01:52 PM

Nevermind solved it 🙂

ballardw · Posted 09-18-2015 04:39 PM

Please share your solution so others with a similar question can see the solution.

Moe_Issa · Posted 09-21-2015 11:19 AM

here is what i did

proc sort data=have ; by id Region; run;

/* this step to determine who how many logos for each ID */
data num_of_logos ; set have ;
by id region;
N+1;
If first.region then N=1;
if n <= put(regionquantity,1.)
;
run;

proc sort data=num_of_logos ; by id Priority;run;

/* transpose the rows to columns starting from block4 */
DATA trans_col;
ARRAY Block(24) $ 50 ;
do i = 1 to 24;
SET num_of_logos;
BY id Priority ;
IF region='P' then do;
i=3;
Block(i+N)=logo;
end;
IF region='B' then do;
i=6;
Block(i+N)=logo;
end;
if region='S' then do;
i=9;
Block(i+n)=logo;
end;
if region='T' then do;
i=12;
Block(i+n)=logo;
end;
IF last.id THEN RETURN;
END;
RUN;

proc sort data=have out=drivertable nodupkey; by logo region priority ;run;

%MACRO FORMATING;
data forformat;
set drivertable;

array vars $ vdata1-vdata5;
do over vars;
if vars ='%%name%%' then vars=strip(Fname);
run;
%MEND FORMATING;

%formating;

%MACRO FORMATS;
%do j = 1 %to 5;
data Block&j (keep=fmtname start label type);
set forformat ;
retain fmtname "$vdata&j._" type 'c';
rename logo=start vdata&j=label;
run;
proc format library=work cntlin=block&j;run;
%end;
%MEND FORMATS;
%formats;

%MACRO VDATA;
data vdata;
set trans_col;
/* length block4data1 $100. block4data2 $100. block4data3 $100. block4data4 $100. block4data5 $100. ;*/
%do i = 1 %to 24; /*message priority */
%do j = 1 %to 5;
Length Block&i.data&j $100.;
Block&i.data&j=put(Block&i,$vdata&j._.);
%end;
%end;
drop i n Priority Region RegionQuantity n vdata1 vdata2 vdata3 vdata4 Vdata5 ;
run;
%MEND VDATA;

%vdata;

Registration is open

SAS Training: Just a Click Away