<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548258#M152012</link>
    <description>&lt;P&gt;Hello everybody,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have sample code as below,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data Have;
Length ID 8 Variable $ 32;
Infile Datalines Missover; 
Input ID Variable;
Datalines;
1 AAA
1 BBB
1 CCC
2 AAA
2 BBB
2 CCC
;

PROC SORT DATA=HAVE(KEEP=Variable ID) OUT=WANT ;
	BY ID;
RUN;
PROC TRANSPOSE DATA=WANT OUT=WANT(drop=_NAME_)
	PREFIX=Test;
	BY ID;
	VAR Variable;
RUN; QUIT;

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;But in real life, when I run this process on millions of values, it takes a lot of time, so I would like your help to make it faster. As you can see, there is a multiplexing problem in my code, which I currently solve with PROC SORT and PROC TRANSPOSE. Can somebody help me do it better?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Wed, 03 Apr 2019 15:49:33 GMT</pubDate>
    <dc:creator>ertr</dc:creator>
    <dc:date>2019-04-03T15:49:33Z</dc:date>
    <item>
      <title>What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548258#M152012</link>
      <description>&lt;P&gt;Hello everybody,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have sample code as below,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data Have;
Length ID 8 Variable $ 32;
Infile Datalines Missover; 
Input ID Variable;
Datalines;
1 AAA
1 BBB
1 CCC
2 AAA
2 BBB
2 CCC
;

PROC SORT DATA=HAVE(KEEP=Variable ID) OUT=WANT ;
	BY ID;
RUN;
PROC TRANSPOSE DATA=WANT OUT=WANT(drop=_NAME_)
	PREFIX=Test;
	BY ID;
	VAR Variable;
RUN; QUIT;

&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;But in real life, when I run this process on millions of values, it takes a lot of time, so I would like your help to make it faster. As you can see, there is a multiplexing problem in my code, which I currently solve with PROC SORT and PROC TRANSPOSE. Can somebody help me do it better?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 03 Apr 2019 15:49:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548258#M152012</guid>
      <dc:creator>ertr</dc:creator>
      <dc:date>2019-04-03T15:49:33Z</dc:date>
    </item>
    <item>
      <title>Re: What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548262#M152014</link>
      <description>&lt;P&gt;Just &lt;STRONG&gt;a million? Or over a million?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or &lt;STRONG&gt;millions?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Or &lt;STRONG&gt;hundreds of millions, or a billion?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And is it always &lt;STRONG&gt;sets of 3&lt;/STRONG&gt; for each BY group, as your sample suggests?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Apr 2019 16:11:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548262#M152014</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-04-03T16:11:48Z</dc:date>
    </item>
    <item>
      <title>Re: What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548269#M152018</link>
      <description>&lt;P&gt;The only reason you are sorting is that data set HAVE is not sorted. If it is a big data set, a single DATA step using a hash object may be faster than SORT followed by TRANSPOSE:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data template;
  set have (keep=id);
  length test1-test30 $32;
  t=0;
  stop;
run;
data _null_;
  retain max_t;
  if _n_=1 then do;
   if 0 then set template;
   declare hash h (dataset:'template',ordered:'A');
     h.definekey('id');
     h.definedata(all:'Y');
     h.definedone();
  end;
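  array tst {*} $32 test:;
  set have  end=end_of_have;
  if h.find()=0 then t=t+1;   /* ID already in the hash: advance its counter */
  else do;                    /* new ID: start a fresh row */
    t=1;
    call missing(of tst{*});  /* clear values retained from the previous ID */
  end;
  tst{t}=variable;
  h.replace();
  max_t=max(max_t,t);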
  if end_of_have then do;
    want_descriptor=cats('WANT (keep=ID test1-test',max_t,')');
    rc=h.output(dataset:want_descriptor);
  end;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;OL&gt;
&lt;LI&gt;The data set TEMPLATE exists only to avoid a lot of typing when declaring the hash object. Typing all:'Y' is much simpler than typing 'TEST1','TEST2',...,'TEST30'.&lt;/LI&gt;
&lt;LI&gt;Give TEMPLATE enough TEST variables to be sure of exceeding the number of tests for any ID.&lt;/LI&gt;
&lt;LI&gt;Note that the order of test values within an ID is preserved, even if they are initially separated by rows for other IDs.&lt;/LI&gt;
&lt;LI&gt;The want_descriptor is the means of dynamically controlling the number of variables in WANT. It will be the minimum necessary, just as with PROC TRANSPOSE.&lt;/LI&gt;
&lt;/OL&gt;
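&lt;P&gt;As a hedged sanity check (a sketch, not part of the method itself), the hash result can be compared with the SORT-then-TRANSPOSE result on a sample of the data. The data set names _SORTED and WANT_CHECK are placeholders chosen for this illustration; the code assumes HAVE exists and that the DATA _NULL_ step has already written WANT.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
/* Reference result: the original SORT + TRANSPOSE approach.              */
/* _SORTED and WANT_CHECK are illustrative names, not from the thread.    */
proc sort data=have(keep=id variable) out=_sorted;
  by id;
run;
proc transpose data=_sorted out=want_check(drop=_name_) prefix=Test;
  by id;
  var variable;
run;
/* Compare the reference result with the hash-built WANT, matching on ID. */
proc compare base=want_check compare=want;
  id id;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the two approaches agree, PROC COMPARE should report no unequal values. Running this on a small subset first is cheaper than discovering a mismatch after processing the full data set.&lt;/P&gt;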
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The value here is to save on disk input/output, at the expense of memory for the hash object.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Apr 2019 16:27:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548269#M152018</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-04-03T16:27:38Z</dc:date>
    </item>
    <item>
      <title>Re: What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548459#M152087</link>
      <description>&lt;P&gt;Thank you for your invaluable response. It seems to work; however, I am not familiar with hash code. Can you please explain it in more depth (step by step, maybe)?&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;Also, our data set has 50 million rows. Are there any alternative methods other than hash?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 04 Apr 2019 11:43:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548459#M152087</guid>
      <dc:creator>ertr</dc:creator>
      <dc:date>2019-04-04T11:43:51Z</dc:date>
    </item>
    <item>
      <title>Re: What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548470#M152091</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Data Have;
Length ID 8 Variable $ 32;
Infile Datalines Missover; 
Input ID Variable;
Datalines;
1 AAA
1 BBB
1 CCC
2 AAA
2 BBB
2 CCC
;

proc sql noprint;
create index id on have(id);
select max(n) into : n
 from (select count(*) as n from have group by id);
quit;
data want;
 do i=1 by 1 until(last.id);
  set have;
  by id;
  array x{*} $ 80 test1-test%left(&amp;amp;n);
  x{i}=variable;
 end;
 drop i variable;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 04 Apr 2019 12:22:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548470#M152091</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2019-04-04T12:22:35Z</dc:date>
    </item>
    <item>
      <title>Re: What Makes Performance Better  - Alternative Methods for PROC SORT &amp; PROC TRANSPOSE</title>
      <link>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548534#M152101</link>
      <description>
&lt;P&gt;Your example data implies that the values of Variable may repeat for all your ID values (though possibly in a different order). Is that actually the case? If so, selecting the distinct values of the variable, transposing them, and joining the result to the list of IDs might be another approach.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Apr 2019 15:10:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/What-Makes-Performance-Better-Alternative-Methods-for-PROC-SORT/m-p/548534#M152101</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-04-04T15:10:58Z</dc:date>
    </item>
  </channel>
</rss>

