BookmarkSubscribeRSS Feed
tapas_16880
Calcite | Level 5

I have a file with the below layout and sample data

PARTY_ID CAP_SORT_CODE CAP_ACC_NO MONTH_NO CCAF_EXTRNL_STAT_CODE
1347736 309911 1038471 1 9
1347736 309911 1038471 2 9
1347736 309911 1038471 3 9
1448934 874415 71778768 1 9
1448934 874415 71778768 2 9
1448934 874415 71778768 3 9
1818985 309366 15679868 1 9
1818985 309366 15679868 2 9
1818985 309366 15679868 3 9
2349634 309576 51215260 1 9
2349634 309576 51215260 2 9
2349634 309576 51215260 3 9
2656042 309633 644240 1 9
2656042 309633 644240 2 9
2656042 309633 644240 3 9
2884319 309576 17374768 1 9
2884319 309576 17374768 2 9
2884319 309576 17374768 3 9
3513240 302582 176317 1 9
3513240 302582 176317 2 9
3513240 302582 176317 3 9
4497094 302586 718161 1 9
4497094 302586 718161 2 9
4497094 302586 718161 3 9
5634459 309141 628518 1 9
5634459 309141 628518 2 9
5634459 309141 628518 3 9
5681443 309367 273287 1 9
5681443 309367 273287 2 9
5681443 309367 273287 3 9
6707092 309018 1096728 1 9

 

It is sorted on

PARTY_ID CAP_SORT_CODE CAP_ACC_NO MONTH_NO

 

Now, what I want is - it will look at the values Month_no, based on the max value of this variable, it will define the length of the array. Then for a specific combination of party_id, cap_sort_code and cap_acc_no, if the value of Month_no is 001, then it will load the value of CCAF_EXTRNL_STAT_CODE to CCAF_EXTRNL_STAT_CODE1, then come to the next record, see if the month_no = 002 for the same party_id, cap_sort_code and cap_acc_no combination, then it will move the value of CCAF_EXTRNL_STAT_CODE to CCAF_EXTRNL_STAT_CODE2 and it will go on doing this until it reaches a record where party_id, cap_sort_code and cap_acc_no changes. Once these values are moved to the array specific variables, it will finally delete all the extra records for this party_id, cap_sort_code and cap_acc_no and keeping only one record with all array variables having values. So, the table is below is what I am looking for.

 

PARTY_ID CAP_SORT_CODE CAP_ACC_NO MONTH_NO CCAF_EXTRNL_STAT_CODE1 CCAF_EXTRNL_STAT_CODE2 CCAF_EXTRNL_STAT_CODE3
1347736 309911 1038471 1 9 9 9
1448934 874415 71778768 1 9 9 9
1818985 309366 15679868 1 9 9 9
2349634 309576 51215260 1 9 9 9
2656042 309633 644240 1 9 9 9
2884319 309576 17374768 1 9 9 9
3513240 302582 176317 1 9 9 9
4497094 302586 718161 1 9 9 9
5634459 309141 628518 1 9 9 9
5681443 309367 273287 1 9 9 9
6707092 309018 1096728 1 9    

 

I have just demonstrated this above example only for sample records where Month_no is 1,2,3 (it can go till 36). Once the final table is created, Month_no to be dropped.

 

Need this help urgently.

10 REPLIES 10
RW9
Diamond | Level 26 RW9
Diamond | Level 26

From what I can tell from your post, this should suffice:

proc transpose data=have out=want prefix=ccaf_extrnl_stat_code;
  by party_id cap_sort cap_acc_no;
  var ccaf_extrnl_stat_code;
  id month_no;
run;
tapas_16880
Calcite | Level 5

I ran this, it is creating the new file but no records in it.

 

P.S: I have given the example for just one variable i.e.

ccaf_extrnl_stat_code

 

A file can have more than one variable in which case, all of those will need to be created as array variables.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Quite simply you need to provide test data, in the form of a dataset, using the code window (its the {i} above the post).  I cannot guess what you are working with or why it does not work as expected. 

Anyways if you have more than one variable, arrays make more sense, although as always I advise against using transposed data for anything other than an output report as it will just make your code long winded and hard to maintain.

Try:

proc sql noprint;
  select max(MONTH) 
  into     :M
  from    HAVE;
quit;

data want;
  set have;
  array var1{&M.};
  array var2{&M.};
  by party_id cap_sort cap_acc_no;
  if first.cap_acc_no then ind=1;
  var1{ind}=ccaf_extml_stat_code;
  var2{ind}=someothervar;
  ind=ind+1;
  if last.cap_acc_no then output;
run;
tapas_16880
Calcite | Level 5
rsubmit;
Libname inflat 'VPMS.D06.CAM.FDI.MAC7.EX' disp=shr;

rsubmit;
proc download data = INFLAT.data out = inflat;run;
endrsubmit;

proc sort data = inflat;
by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
run;
proc sql noprint;
  select max(MONTH_no) 
  into     :Mon
  from    inflat;
quit;

data inflat1;
  set inflat;
  array CCAF_EXTRNL_STAT_CODE{&Mon.};
  by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
  if first.CAP_ACC_NO then ind=1;
  CCAF_EXTRNL_STAT_CODE{ind}=CCAF_EXTRNL_STAT_CODE;
  ind=ind+1;
  if last.CAP_ACC_NO then output;
run;

 

it is giving me error, below is the log:

 

-------------------------------------------------------------------------------------------

87 proc sql noprint;

88 select max(MONTH)

89 into :Mon

90 from inflat;

ERROR: The following columns were not found in the contributing tables: MONTH.

91 quit;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE SQL used (Total process time):

real time 0.27 seconds

cpu time 0.01 seconds

 

92 proc sql noprint;

93 select max(MONTH_no)

94 into :Mon

95 from inflat;

96 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.17 seconds

cpu time 0.03 seconds

 

97 data inflat1;

98 set inflat;

99 array CCAF_EXTRNL_STAT_CODE{&Mon.};

---------------------

124

ERROR 124-185: The variable CCAF_EXTRNL_STAT_CODE has already been defined.

100 by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;

101 if first.CAP_ACC_NO then ind=1;

102 CCAF_EXTRNL_STAT_CODE{ind}=CCAF_EXTRNL_STAT_CODE;

ERROR: Undeclared array referenced: CCAF_EXTRNL_STAT_CODE.

ERROR: Variable CCAF_EXTRNL_STAT_CODE has not been declared as an array.

103 ind=ind+1;

104 if last.CAP_ACC_NO then output;

105 run;

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.INFLAT1 may be incomplete. When this step was stopped there were 0 observations and 44 variables.

WARNING: Data set WORK.INFLAT1 was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

real time 0.17 seconds

cpu time 0.04 seconds

Reeza
Super User
The code posted doesn't match the log posted. 





@tapas_16880 wrote:
rsubmit;
Libname inflat 'VPMS.D06.CAM.FDI.MAC7.EX' disp=shr;

rsubmit;
proc download data = INFLAT.data out = inflat;run;
endrsubmit;

proc sort data = inflat;
by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
run;
proc sql noprint;
  select max(MONTH_no) 
  into     :Mon
  from    inflat;
quit;

data inflat1;
  set inflat;
  array CCAF_EXTRNL_STAT_CODE{&Mon.};
  by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
  if first.CAP_ACC_NO then ind=1;
  CCAF_EXTRNL_STAT_CODE{ind}=CCAF_EXTRNL_STAT_CODE;
  ind=ind+1;
  if last.CAP_ACC_NO then output;
run;

 

it is giving me error, below is the log:

 

-------------------------------------------------------------------------------------------

87 proc sql noprint;

88 select max(MONTH)

89 into :Mon

90 from inflat;

ERROR: The following columns were not found in the contributing tables: MONTH.

91 quit;

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE SQL used (Total process time):

real time 0.27 seconds

cpu time 0.01 seconds

 

92 proc sql noprint;

93 select max(MONTH_no)

94 into :Mon

95 from inflat;

96 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.17 seconds

cpu time 0.03 seconds

 

97 data inflat1;

98 set inflat;

99 array CCAF_EXTRNL_STAT_CODE{&Mon.};

---------------------

124

ERROR 124-185: The variable CCAF_EXTRNL_STAT_CODE has already been defined.

100 by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;

101 if first.CAP_ACC_NO then ind=1;

102 CCAF_EXTRNL_STAT_CODE{ind}=CCAF_EXTRNL_STAT_CODE;

ERROR: Undeclared array referenced: CCAF_EXTRNL_STAT_CODE.

ERROR: Variable CCAF_EXTRNL_STAT_CODE has not been declared as an array.

103 ind=ind+1;

104 if last.CAP_ACC_NO then output;

105 run;

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.INFLAT1 may be incomplete. When this step was stopped there were 0 observations and 44 variables.

WARNING: Data set WORK.INFLAT1 was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

real time 0.17 seconds

cpu time 0.04 seconds




 

 

tapas_16880
Calcite | Level 5
proc sort data = inflat;
by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
run;
proc sql noprint;
  select max(MONTH_no) 
  into     :Mon
  from    inflat;
quit;

data inflat2;
  set inflat;
  array var1{&Mon.};
  by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;
  if first.CAP_ACC_NO then ind=1;
  var1{ind}=CCAF_EXTRNL_STAT_CODE;
  ind=ind+1;
  if last.CAP_ACC_NO then output;
run;

Log is below:

 

----------------------------------------------

133 proc sort data = inflat;

134 by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;

135 run;

NOTE: There were 154507 observations read from the data set WORK.INFLAT.

NOTE: The data set WORK.INFLAT has 154507 observations and 7 variables.

NOTE: Compressing data set WORK.INFLAT decreased size by 2.63 percent.

Compressed is 111 pages; un-compressed would require 114 pages.

NOTE: PROCEDURE SORT used (Total process time):

real time 0.35 seconds

cpu time 0.12 seconds

 

136 proc sql noprint;

137 select max(MONTH_no)

138 into :Mon

139 from inflat;

140 quit;

NOTE: PROCEDURE SQL used (Total process time):

real time 0.03 seconds

cpu time 0.03 seconds

 

141

142 data inflat2;

143 set inflat;

144 array var1{&Mon.};

145 by PARTY_ID CAP_SORT_CODE CAP_ACC_NO;

146 if first.CAP_ACC_NO then ind=1;

147 var1{ind}=CCAF_EXTRNL_STAT_CODE;

148 ind=ind+1;

149 if last.CAP_ACC_NO then output;

150 run;

NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column).

147:3

ERROR: Array subscript out of range at line 147 column 3.

RECTYPE=20 MONTH_NO=29 DATE_CREATED=300615 PARTY_ID=0 CAP_SORT_CODE=111111 CAP_ACC_NO=11111111 CCAF_EXTRNL_STAT_CODE=9 var11=. var12=. var13=. var14=.

var15=. var16=. var17=. var18=. var19=. var110=. var111=. var112=. var113=. var114=. var115=. var116=. var117=. var118=. var119=. var120=. var121=.

var122=. var123=. var124=. var125=. var126=. var127=. var128=. var129=. var130=. var131=. var132=. var133=. var134=. var135=. var136=. FIRST.PARTY_ID=0

LAST.PARTY_ID=0 FIRST.CAP_SORT_CODE=0 LAST.CAP_SORT_CODE=0 FIRST.CAP_ACC_NO=0 LAST.CAP_ACC_NO=0 ind=. _ERROR_=1 _N_=2

NOTE: The SAS System stopped processing this step because of errors.

NOTE: There were 3 observations read from the data set WORK.INFLAT.

WARNING: The data set WORK.INFLAT2 may be incomplete. When this step was stopped there were 0 observations and 44 variables.

WARNING: Data set WORK.INFLAT2 was not replaced because this step was stopped.

NOTE: DATA statement used (Total process time):

real time 0.03 seconds

cpu time 0.01 seconds

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Need to retain the ind and var: variables.  Note, this is why we ask for test data in the form of a dataset so we can test code before posting it!  See how in the first step here I create some test code - this should be provided in the question!

data have;
  input party_id	cap_sort_code	cap_acc_no	month_no	ccaf_extrnl_stat_code;
datalines;
1347736	309911	1038471	1	9
1347736	309911	1038471	2	9
1347736	309911	1038471	3	9
1448934	874415	71778768	1	9
1448934	874415	71778768	2	9
1448934	874415	71778768	3	9
;
run;
options mlogic mprint symbolgen;
proc sql noprint;
  select max(MONTH_NO) 
  into   :M
  from   HAVE;
quit;

data want (drop=ind);
  set have;
  array var{&M.};
  retain ind var:;
  by party_id cap_sort_code cap_acc_no;
  if first.cap_acc_no then ind=1;
  var{ind}=ccaf_extrnl_stat_code;
  ind=ind+1;
  if last.cap_acc_no then output;
run;
HB
Barite | Level 11 HB
Barite | Level 11

Please explain again how your output isn't just your input sorted by month and how these repetitive code1, code2, and code3 columns are being generated.

tapas_16880
Calcite | Level 5

based on the month values in Month_no column, all other variables will be created as array variables, so if the max value of the value of the last value of Month_no = 36, then the variable code will be created as code1, code2, code3....code36 as separate variables and the values from the variable code will be pulled and slotted into specific array instances based on that particular month_no value and also ensuring that these population at a particular iteration is happening for a specific party_id, sort_code and account_number combination

HB
Barite | Level 11 HB
Barite | Level 11

var16, var17, var18, var19 etc is just awful data structure.

 

Your explanation makes no sense to me.  I don't see why you need an array.  We are not clear on what you are actually trying to achieve. 

 

https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 16972 views
  • 2 likes
  • 4 in conversation