BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
gandikk
Obsidian | Level 7

Hi,

 

I have a requirement on custom transpose. I have duplicate account numbers (Column 1) and it’s corresponding indicators (Column 2). I have to transpose the indicators in the way they are mentioned in Column3.  For Example, rows 1, 8, 15, 22 and 23 have plain indicators. So, in column3, those indicators alone should be transposed. Rows 2, 9 and 16 have M indicators. So, those rows should get all the indicators with commas in between.

Can you please let me know whether there is any method to achieve this in SAS?

 

Row Column1 Column2 Column3
1 52212227 15 15,23,24,27,30
2 52212227      15 -M 15-M,23-M,24-M
3 52212227      15 -TS 15-TS,23-TS,24-TS
4 52212227      15 -PR 15-PR,23-PR,24-PR
5 52212227      15 -HQPAR 15-HQPAR,23-HQPAR,24-HQPAR
6 52212227      15 -DU 15-DU,23-DU,24-DU
7 52212227      15 -GU 15-GU,23-GU,24-GU
8 52212227 23 15,23,24,27,30
9 52212227      23 -M 15-M,23-M,24-M
10 52212227      23 -TS 15-TS,23-TS,24-TS
11 52212227      23 -PR 15-PR,23-PR,24-PR
12 52212227      23 -HQPAR 15-HQPAR,23-HQPAR,24-HQPAR
13 52212227      23 -DU 15-DU,23-DU,24-DU
14 52212227      23 -GU 15-GU,23-GU,24-GU
15 52212227 24 15,23,24,27,30
16 52212227      24 -M 15-M,23-M,24-M
17 52212227      24 -TS 15-TS,23-TS,24-TS
18 52212227      24 -PR 15-PR,23-PR,24-PR
19 52212227      24 -HQPAR 15-HQPAR,23-HQPAR,24-HQPAR
20 52212227      24 -DU 15-DU,23-DU,24-DU
21 52212227      24 -GU 15-GU,23-GU,24-GU
22 52212227 27 15,23,24,27,30
23 52212227 30 15,23,24,27,30
SAS@EMMAUS
1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Very similar to Xia keshan, I had it typed earlier at work, so just pasting here after coming home for the sake of it-

 

 

data have1;

set have;

length grp $10;

if notdigit(strip(scan(column2,-1,'-')))=0 then grp='1';

else grp=strip(scan(column2,-1,'-'));

run;

 

proc sort data=have1;

by column1 grp;

run;

 

data want;

set have1;

by column1 grp;

length column3 $50;

retain column3;

if first.column1 or first.grp then call missing(column3);

column3=catx(',',column3,column2);

if last.grp then output;

drop grp;

run;

 

Regards,

Naveen Srinivasan

View solution in original post

8 REPLIES 8
Reeza
Super User

Please clarify what the input and output data sets should look like, for example include a smaller input sample and for that sample show how you would like the output to be created.

gandikk
Obsidian | Level 7

Reeza,

 

Thanks for your reply!!

 

My input is Column1 and Column2 and expected output is Column1 and Column3. 

 

Input:

 

Row Column1 Column2
1 52212227 15
2 52212227      15 -M
3 52212227      15 -TS
4 52212227      15 -PR
5 52212227      15 -HQPAR
6 52212227      15 -DU
7 52212227      15 -GU
8 52212227 23
9 52212227      23 -M
10 52212227      23 -TS
11 52212227      23 -PR
12 52212227      23 -HQPAR
13 52212227      23 -DU
14 52212227      23 -GU
15 52212227 24
16 52212227      24 -M
17 52212227      24 -TS
18 52212227      24 -PR
19 52212227      24 -HQPAR
20 52212227      24 -DU
21 52212227      24 -GU
22 52212227 27
23 52212227 30

 

Expected Output:

 

Row Column1 Column3
1 52212227 15,23,24,27,30
2 52212227 15-M,23-M,24-M
3 52212227 15-TS,23-TS,24-TS
4 52212227 15-PR,23-PR,24-PR
5 52212227 15-HQPAR,23-HQPAR,24-HQPAR
6 52212227 15-DU,23-DU,24-DU
7 52212227 15-GU,23-GU,24-GU
8 52212227 15,23,24,27,30
9 52212227 15-M,23-M,24-M
10 52212227 15-TS,23-TS,24-TS
11 52212227 15-PR,23-PR,24-PR
12 52212227 15-HQPAR,23-HQPAR,24-HQPAR
13 52212227 15-DU,23-DU,24-DU
14 52212227 15-GU,23-GU,24-GU
15 52212227 15,23,24,27,30
16 52212227 15-M,23-M,24-M
17 52212227 15-TS,23-TS,24-TS
18 52212227 15-PR,23-PR,24-PR
19 52212227 15-HQPAR,23-HQPAR,24-HQPAR
20 52212227 15-DU,23-DU,24-DU
21 52212227 15-GU,23-GU,24-GU
22 52212227 15,23,24,27,30
23 52212227 15,23,24,27,30
SAS@EMMAUS
ballardw
Super User

@gandikk wrote:

Reeza,

 

Thanks for your reply!!

 

My input is Column1 and Column2 and expected output is Column1 and Column3. 

 

Input:

 

Row Column1 Column2
1 52212227 15
2 52212227      15 -M
3 52212227      15 -TS
4 52212227      15 -PR
5 52212227      15 -HQPAR
6 52212227      15 -DU
7 52212227      15 -GU
8 52212227 23
9 52212227      23 -M
10 52212227      23 -TS
11 52212227      23 -PR
12 52212227      23 -HQPAR
13 52212227      23 -DU
14 52212227      23 -GU
15 52212227 24
16 52212227      24 -M
17 52212227      24 -TS
18 52212227      24 -PR
19 52212227      24 -HQPAR
20 52212227      24 -DU
21 52212227      24 -GU
22 52212227 27
23 52212227 30

 

Expected Output:

 

Row Column1 Column3
1 52212227 15,23,24,27,30 where do the 23, 24, 27 and 30 come from???
2 52212227 15-M,23-M,24-M where does 15-M, 23-M and 24-M come from???
3 52212227 15-TS,23-TS,24-TS
4 52212227 15-PR,23-PR,24-PR
5 52212227 15-HQPAR,23-HQPAR,24-HQPAR
6 52212227 15-DU,23-DU,24-DU
7 52212227 15-GU,23-GU,24-GU
8 52212227 15,23,24,27,30
9 52212227 15-M,23-M,24-M
10 52212227 15-TS,23-TS,24-TS
11 52212227 15-PR,23-PR,24-PR
12 52212227 15-HQPAR,23-HQPAR,24-HQPAR
13 52212227 15-DU,23-DU,24-DU
14 52212227 15-GU,23-GU,24-GU
15 52212227 15,23,24,27,30
16 52212227 15-M,23-M,24-M
17 52212227 15-TS,23-TS,24-TS
18 52212227 15-PR,23-PR,24-PR
19 52212227 15-HQPAR,23-HQPAR,24-HQPAR
20 52212227 15-DU,23-DU,24-DU
21 52212227 15-GU,23-GU,24-GU
22 52212227 15,23,24,27,30
23 52212227 15,23,24,27,30

 

mkeintz
PROC Star

You can do this by building and using a hash table.  Each "item" (row) in the hash table is keyed on the the second "word" in column 2, and it keeps track of a list of correspond full column2 values.  The variable word_list keep track of the original order the "word2" values were encountered.  Both the hash table and word_list are cleared for re-use at the end of a column1 group:

 


data want2 (keep=column1 column3);
  if _n_=1 then do;
    length word2 $10  column3 $50;
    declare hash w2();
	  w2.definekey('word2');
	  w2.definedata('column1','column3');
	  w2.definedone();
  end;

  length word_list $100;
  do until (last.column1);
    set have;
	by column1;
	word2=scan(column2,2,'-');
    rc=w2.find();
	if rc^=0 then do;
	  word_list=catx(' ',word_list,word2);
	  column3=column2;
	end;
	else column3=catx(',',column3,column2);
	rc=w2.replace();
  end;
  do w=1 to countw(word_list);
	rc=w2.find(key:scan(word_list,w,' '));
    output;
  end;
  rc=w2.clear();
run;

 

I assume your data is already sorted by column1.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
gandikk
Obsidian | Level 7

Hi mkeintz,

 

Thanks for your reply!!

 

I tried your code. The code is writing the last value of column2 in the duplicate group to all the values in column1. I used the below code. I couldn't get what is going wrong here. Can you please let me know?

 

DATA A;
INFILE INPUT1;
INPUT @01 COLUMN1 $CHAR9.
@64 COLUMN2 $CHAR10.;

DATA WANT2 (KEEP=COLUMN1 COLUMN3);
IF _N_=1 THEN DO;
LENGTH WORD2 $10 COLUMN3 $50;
DECLARE HASH W2(DATASET:'A');
W2.DEFINEKEY('COLUMN2');
W2.DEFINEDATA('COLUMN1','COLUMN2');
W2.DEFINEDONE();
END;
LENGTH WORD_LIST $100;
DO UNTIL (LAST.COLUMN1);
SET A;
BY COLUMN1;
WORD2=SCAN(COLUMN2,2,'-');
RC=W2.FIND();

IF RC^=0 THEN DO;
WORD_LIST=CATX(' ',WORD_LIST,WORD2);
COLUMN3=COLUMN2;
END;
ELSE COLUMN3=CATX(',',COLUMN3,COLUMN2);
RC=W2.REPLACE();
END;
DO W=1 TO COUNTW(WORD_LIST);
RC=W2.FIND(KEY:SCAN(WORD_LIST,W,' '));
OUTPUT;
END;
RC=W2.CLEAR();
RUN;
DATA _NULL_;SET WANT2;
FILE OUTPUT;
PUT
@1 COLUMN1 $CHAR9.
@10 COLUMN3 $CHAR50.;

 

my input:

 

052212227      15
052212227      15 -M
052212227      15 -TS
052212227      15 -PR
052212227      15 -HQPAR
052212227      15 -DU
052212227      15 -GU
052212227      23
052212227      23 -M
052212227      23 -TS
052212227      23 -PR
052212227      23 -HQPAR
052212227      23 -DU
052212227      23 -GU
052212227      24
052212227      24 -M
052212227      24 -TS
052212227      24 -PR
052212227      24 -HQPAR
052212227      24 -DU
052212227      24 -GU
052212227      27
052212227      30

 

Output I got:

 

052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
052212227 30
SAS@EMMAUS
novinosrin
Tourmaline | Level 20

Very similar to Xia keshan, I had it typed earlier at work, so just pasting here after coming home for the sake of it-

 

 

data have1;

set have;

length grp $10;

if notdigit(strip(scan(column2,-1,'-')))=0 then grp='1';

else grp=strip(scan(column2,-1,'-'));

run;

 

proc sort data=have1;

by column1 grp;

run;

 

data want;

set have1;

by column1 grp;

length column3 $50;

retain column3;

if first.column1 or first.grp then call missing(column3);

column3=catx(',',column3,column2);

if last.grp then output;

drop grp;

run;

 

Regards,

Naveen Srinivasan

gandikk
Obsidian | Level 7
Hi Srinivasan,

Your solution worked for me!! Many thanks for taking time to post the solution!!
SAS@EMMAUS
Ksharp
Super User
data have;
infile cards expandtabs truncover;
input Row	Column1	: $20. Column2 $20.;
cards;
1	52212227	15
2	52212227	     15 -M
3	52212227	     15 -TS
4	52212227	     15 -PR
5	52212227	     15 -HQPAR
6	52212227	     15 -DU
7	52212227	     15 -GU
8	52212227	23
9	52212227	     23 -M
10	52212227	     23 -TS
11	52212227	     23 -PR
12	52212227	     23 -HQPAR
13	52212227	     23 -DU
14	52212227	     23 -GU
15	52212227	24
16	52212227	     24 -M
17	52212227	     24 -TS
18	52212227	     24 -PR
19	52212227	     24 -HQPAR
20	52212227	     24 -DU
21	52212227	     24 -GU
22	52212227	27
23	52212227	30
;
run;
data temp;
 set have;
 length group $ 80;
 temp=scan(column2,-1,'-');
 if notdigit(strip(temp))=0 then group='___';
  else group=temp;
run;
proc sort data=temp;
 by group;
run;
data temp1;
length want $ 200;
do until(last.group);
 set temp;
 by group;
 want=catx(',',want,column2);
end;
keep want group;
run;
data want;
 merge temp temp1;
 by group;
run;
proc sort data=want;by row;run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1178 views
  • 0 likes
  • 6 in conversation