Solved: Help with arrays

Havi · Posted 03-07-2017 05:48 AM

Hi Guys

Need some help with this problem.

This is the two tables I have:

data sample;  
   infile datalines missover;  
   input Date $ Type1 Type2 $ Money  P_10 P_108 P_9 O_100 O_98 O_5 P_50 P_3 O_355;  
   datalines;  
20170101 0 A 100  50 10 . 40
20170101 0 A 110 . . . . 110
20170101 1 B 1200 200 . . . . 500 . . 500
20170101 1 B 202 100 . . . . . 102
20170101 5 C 5 . . . . 5
;                             
run;

data reference_id;
   infile datalines missover;  
   input ID Descrip $ Status $ category $; 
   datalines;
10	P	PAID	GOOD
108	P	PAID	GOOD		
9	P	PAID	BAD
100	M	PAID	GOOD
98	M	PAID	BAD
5	M	PAID	GOOD
50	P	PAID	BAD
3	P	PAID	GOOD
355	M	PAID	N/A
;
run;

I need to extract all the ID's from sample into an array (remove the prefix) and join it with the ID properties in the reference_id table.

The ID should be replaced with its respective properties e.g P_10 in sample table equals ID 10 in reference_id table - the output should read P_PAID_GOOD.

This is the output I desire.

data output;
 infile datalines missover;  
 input Date $ Type1 Type2 $ Money P_PAID_GOOD P_PAID_BAD M_PAID_GOOD M_PAID_BAD M_PAID_N_A ;
 datalines;
20170101 0 A 100 60 . 40 
20170101 0 A 110 . . . 110
20170101 1 B 1200 200 . 500 . 500
20170101 1 B 202 100 . . 102 .
20170101 5 C 5 . . . 5 .
 ;
run;

RW9 · Posted 03-07-2017 06:22 AM

I am not sure about your output dataset, this doesn't seem to match what you say in the post.

This is where I got to, so can be a start for you:

data sample;  
   infile datalines missover;  
   input Date $ Type1 Type2 $ Money  P_10 P_108 P_9 O_100 O_98 O_5 P_50 P_3 O_355;  
   datalines;  
20170101 0 A 100  50 10 . 40
20170101 0 A 110 . . . . 110
20170101 1 B 1200 200 . . . . 500 . . 500
20170101 1 B 202 100 . . . . . 102
20170101 5 C 5 . . . . 5
;                             
run;

proc transpose data=sample out=inter;
  by date type1 type2 money notsorted;
  var p_: o_:;
run;
data inter;
  set inter;
  id=input(scan(_name_,2,"_"),best.);
run;

data reference_id;
   infile datalines missover;  
   input ID Descrip $ Status $ category $; 
   datalines;
10	P	PAID	GOOD
108	P	PAID	GOOD		
9	P	PAID	BAD
100	M	PAID	GOOD
98	M	PAID	BAD
5	M	PAID	GOOD
50	P	PAID	BAD
3	P	PAID	GOOD
355	M	PAID	N/A
;
run;

proc sql;
  create table COMB as
  select  distinct
          A.DATE,
          A.TYPE1,
          A.TYPE2,
          A.MONEY,
          catx("_",B.STATUS,B.CATEGORY) as LAB,
          sum(COL1) as VAL
  from    (select * from INTER where COL1 ne .) A
  left join REFERENCE_ID B
  on      A.ID=B.ID
  group by A.DATE,
           A.TYPE1,
           A.TYPE2,
           A.MONEY,
           catx("_",B.STATUS,B.CATEGORY);
quit;
proc transpose data=comb out=want;
  by date type1 type2 money;
  var val;
  id lab;
run;

View solution in original post

RW9 · Posted 03-07-2017 06:22 AM

I am not sure about your output dataset, this doesn't seem to match what you say in the post.

This is where I got to, so can be a start for you:

data sample;  
   infile datalines missover;  
   input Date $ Type1 Type2 $ Money  P_10 P_108 P_9 O_100 O_98 O_5 P_50 P_3 O_355;  
   datalines;  
20170101 0 A 100  50 10 . 40
20170101 0 A 110 . . . . 110
20170101 1 B 1200 200 . . . . 500 . . 500
20170101 1 B 202 100 . . . . . 102
20170101 5 C 5 . . . . 5
;                             
run;

proc transpose data=sample out=inter;
  by date type1 type2 money notsorted;
  var p_: o_:;
run;
data inter;
  set inter;
  id=input(scan(_name_,2,"_"),best.);
run;

data reference_id;
   infile datalines missover;  
   input ID Descrip $ Status $ category $; 
   datalines;
10	P	PAID	GOOD
108	P	PAID	GOOD		
9	P	PAID	BAD
100	M	PAID	GOOD
98	M	PAID	BAD
5	M	PAID	GOOD
50	P	PAID	BAD
3	P	PAID	GOOD
355	M	PAID	N/A
;
run;

proc sql;
  create table COMB as
  select  distinct
          A.DATE,
          A.TYPE1,
          A.TYPE2,
          A.MONEY,
          catx("_",B.STATUS,B.CATEGORY) as LAB,
          sum(COL1) as VAL
  from    (select * from INTER where COL1 ne .) A
  left join REFERENCE_ID B
  on      A.ID=B.ID
  group by A.DATE,
           A.TYPE1,
           A.TYPE2,
           A.MONEY,
           catx("_",B.STATUS,B.CATEGORY);
quit;
proc transpose data=comb out=want;
  by date type1 type2 money;
  var val;
  id lab;
run;

Havi · Posted 03-07-2017 06:47 AM

Thanks a million RW9. Works exactly the way I wanted it to.

Astounding · Posted 03-07-2017 06:37 AM

Here's a different start. I just don't have time to finish it.

Because there are so few entries in the second table, taking on a limited set of numeric ID values, this is a good candidate for a rarely uised technique.

data want;

array st {355} $ 1 _temporary_;

array cat {355} $ 4 _temporary_;

if _n_=1 then do until (done);

set reference_id end=done;

st{id} = status;

cat{id} = category;

end;

This gives you two temporary arrays meaning their elements are automatically retained, and automatically dropped at the end of the DATA step. Most of the array elements are missing, but that won't harm anything. It doesn't take much memory to hold 355 elements.

What's left to do in the same DATA step:

set sample;

array entries {9} P_10 P_108 P_9 O_100 ... O_355;

do i=1 to 9;

length P_vs_O $ 1;

P_vs_O = vname(entries{i});

ID = input(substr (vname(entries{i}), 3), 3.);

* Put pieces in the right buckets;

end;

*keep/drop as you see fit;

run;

There's probably a detail or two that I left out, but this approach will be as straightforward as it gets (unfortunately so, perhaps).

Reeza · Posted 03-07-2017 06:42 AM

In my experience the long format is more useful.

Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Re: Help with arrays

Click image to register for webinar

Classroom Training Available!