<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc sorting multiple dataset in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872270#M344611</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/253321"&gt;@CathyVI&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A word of advice, from efficiency perspective, you will be better served if the code you use does this&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;While sorting, exclude the records and columns you don't need to make it into the Proc Sort. &lt;EM&gt;&lt;STRONG&gt;(Hint: use Where= &amp;amp; Keep= Data set options on the Input data set)&lt;/STRONG&gt;&lt;/EM&gt;&lt;/LI&gt;
&lt;LI&gt;Use Proc Append instead of SET statement. Proc Append reads records in Blocks, while the SET statement reads one record at a time. (Hint: As you loop through the macro, issue Proc Append if you can)&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Hope this helps&lt;/P&gt;</description>
    <pubDate>Wed, 26 Apr 2023 14:43:47 GMT</pubDate>
    <dc:creator>AhmedAl_Attar</dc:creator>
    <dc:date>2023-04-26T14:43:47Z</dc:date>
    <item>
      <title>proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872239#M344597</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I have multiple years of datasets (2006-2019). I would like to sort all the data by patient id before appending so as to reduce the size and remove unwanted records. I found a macro code but its not doing what I expected. I was able to sort the first year (2006)&amp;nbsp; if i replace ( out=) with &lt;CODE class=""&gt;claims_rx_2006&lt;/CODE&gt; but its not sorting other years.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data _null_;
   input datasets $char50. ;
   call execute 
      ("proc sort data=work."||strip(datasets)||
/*  "(keep=cum_freq cum_date rename=(cum_freq="||strip(datasets)||"))"||  */
      "out="||strip(datasets)||";"||
       "by pat_id; run;");
   cards;
claims_rx_2006
claims_rx_2007
claims_rx_2008
claims_rx_2009
claims_rx_2010
claims_rx_2011
claims_rx_2012
claims_rx_2013
claims_rx_2014
claims_rx_2015
claims_rx_2016
claims_rx_2017
claims_rx_2018
claims_rx_2019
 ;run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Apr 2023 13:23:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872239#M344597</guid>
      <dc:creator>CathyVI</dc:creator>
      <dc:date>2023-04-26T13:23:46Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872248#M344601</link>
      <description>&lt;P&gt;Try this macro:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt; %macro append;
 	%do i=2006 %to 2019;
		proc sort data=claims_rx_&amp;amp;i out= _claims_rx_&amp;amp;i;
			by pat_id;
		run;
	%end; 
	data combined;
		set %do i=2006 %to 2019;
				_claims_rx_&amp;amp;i
			%end;
			;
	run; 
%mend; 
%append; &lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 Apr 2023 13:45:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872248#M344601</guid>
      <dc:creator>A_Kh</dc:creator>
      <dc:date>2023-04-26T13:45:39Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872270#M344611</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/253321"&gt;@CathyVI&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A word of advice, from efficiency perspective, you will be better served if the code you use does this&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;While sorting, exclude the records and columns you don't need to make it into the Proc Sort. &lt;EM&gt;&lt;STRONG&gt;(Hint: use Where= &amp;amp; Keep= Data set options on the Input data set)&lt;/STRONG&gt;&lt;/EM&gt;&lt;/LI&gt;
&lt;LI&gt;Use Proc Append instead of SET statement. Proc Append reads records in Blocks, while the SET statement reads one record at a time. (Hint: As you loop through the macro, issue Proc Append if you can)&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;Hope this helps&lt;/P&gt;</description>
      <pubDate>Wed, 26 Apr 2023 14:43:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872270#M344611</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2023-04-26T14:43:47Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872280#M344614</link>
      <description>&lt;P&gt;Save some code. If you do not want to create an additional data set then don't bother with the OUT= option.&lt;/P&gt;
&lt;P&gt;With Call Execute I would also not pass the commented out section as your code does.&lt;/P&gt;
&lt;P&gt;When I run into issues with Call Execute I will create a character value with the code created to examine the results such as&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data junk;&lt;BR /&gt;input datasets $char50. ;&lt;BR /&gt;length code $ 300;&lt;BR /&gt;code = "proc sort data=work."||strip(datasets)||&lt;BR /&gt;/* "(keep=cum_freq cum_date rename=(cum_freq="||strip(datasets)||"))"|| */&lt;BR /&gt;"out="||strip(datasets)||";"||&lt;BR /&gt;"by pat_id; run;";&lt;BR /&gt;cards;&lt;BR /&gt;claims_rx_2006&lt;BR /&gt;claims_rx_2007&lt;BR /&gt;claims_rx_2008&lt;BR /&gt;claims_rx_2009&lt;BR /&gt;;&lt;/PRE&gt;
&lt;P&gt;When I look that the results for the Code variable above I get:&lt;/P&gt;
&lt;PRE&gt;proc sort data=work.claims_rx_2006out=claims_rx_2006;by pat_id; run;&lt;/PRE&gt;
&lt;P&gt;notice that there is no space before OUT. This would likely throw like this:&lt;/P&gt;
&lt;PRE&gt;34   proc sort data=work.claims_rx_2006out=claims_rx_2006;by pat_id; run;
                                          -
                                          22
                                          76

ERROR 22-322: Syntax error, expecting one of the following: DATA, DUPOUT, IN, OUT, SIZE, SORTSEQ,
              SORTSIZE, T, TECH, TECHNIQUE, UNIOUT, UNIQUEOUT.

ERROR 76-322: Syntax error, statement will be ignored.

&lt;/PRE&gt;
&lt;P&gt;What this error doesn't tell is that the source data set likely doesn't exist because it saw the OUT as part of the dataset name.&lt;/P&gt;
&lt;P&gt;The solution would be include a SPACE before the Out= (or drop the Out entirely as not needed from the code shown)&lt;/P&gt;
&lt;PRE&gt;data junk;
   input datasets $char50. ;
   length code $ 500;
   code = "proc sort data=work."||strip(datasets)||
/*  "(keep=cum_freq cum_date rename=(cum_freq="||strip(datasets)||"))"||  */
      &lt;FONT size="5" color="#0000FF"&gt;&lt;STRONG&gt;" out="|&lt;/STRONG&gt;&lt;/FONT&gt;|strip(datasets)||";"||
       "by pat_id; run;";
   cards;
claims_rx_2006
claims_rx_2007
claims_rx_2008
claims_rx_2009
;
&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What did your LOG show. The log typically shows the submitted code from call execute.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 26 Apr 2023 15:00:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872280#M344614</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-04-26T15:00:43Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872311#M344624</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/253321"&gt;@CathyVI&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is an example of what I was talking about&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   input datasets $char50. ;
   call execute 
      ("proc sort data=work."||strip(datasets)||
	  /* if you can use where= and keep= options, the would come here. Just uncomment the line below and populate it as needed */
	  /* '(KEEP=pat_id  WHERE=(1=1))' || */
      " out="||strip(datasets)||"; by pat_id; run; PROC APPEND BASE=combined DATA="||strip(datasets)||"; run;"
	);
   cards;
claims_rx_2006
claims_rx_2007
claims_rx_2008
claims_rx_2009
claims_rx_2010
claims_rx_2011
claims_rx_2012
claims_rx_2013
claims_rx_2014
claims_rx_2015
claims_rx_2016
claims_rx_2017
claims_rx_2018
claims_rx_2019
 ;run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Hope this helps&lt;/P&gt;</description>
      <pubDate>Wed, 26 Apr 2023 16:33:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872311#M344624</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2023-04-26T16:33:46Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872313#M344626</link>
      <description>&lt;P&gt;You are working way too hard.&amp;nbsp;&lt;STRONG&gt;Copy and paste is your friend.&lt;/STRONG&gt; It would be just as easy to create the proc sort steps directly as it would be to make the data step you showed.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=claims_rx_2006 ;
  by pat_id;
run;
proc sort data=claims_rx_2007 ;
  by pat_id;
run;
proc sort data=claims_rx_2008 ;
  by pat_id;
run;
proc sort data=claims_rx_2009 ;
  by pat_id;
run;
proc sort data=claims_rx_2010 ;
  by pat_id;
run;
proc sort data=claims_rx_2011 ;
  by pat_id;
run;
proc sort data=claims_rx_2012 ;
  by pat_id;
run;
proc sort data=claims_rx_2013 ;
  by pat_id;
run;
proc sort data=claims_rx_2014 ;
  by pat_id;
run;
proc sort data=claims_rx_2015 ;
  by pat_id;
run;
proc sort data=claims_rx_2016 ;
  by pat_id;
run;
proc sort data=claims_rx_2017 ;
  by pat_id;
run;
proc sort data=claims_rx_2018 ;
  by pat_id;
run;
proc sort data=claims_rx_2019 ;
  by pat_id;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you did want to do it with code then just use a do loop from 2006 to 2019 rather than typing in all of the dataset names.&amp;nbsp; Also use the CAT... series of functions to make your code easier to read.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  do year=2006 to 2019;
    call execute(catx(' '
      ,'proc sort data=',cats('claims_rx_',year)
      ,';', 'by pat_id;','run;'
    ));
  end;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;NOTE: CALL EXECUTE generated line.
1    + proc sort data= claims_rx_2006 ; by pat_id; run;
2    + proc sort data= claims_rx_2007 ; by pat_id; run;
3    + proc sort data= claims_rx_2008 ; by pat_id; run;
...&lt;/PRE&gt;
&lt;P&gt;And once you have them sorted there is no need for copy and paste or other code generation methods to combine all of the datasets.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set claims_rx_2006 - claims_rx_2019;
  by pat_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 Apr 2023 16:51:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872313#M344626</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-04-26T16:51:01Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872314#M344627</link>
      <description>&lt;P&gt;But the code did a lot of work to sort the individual datasets.&lt;/P&gt;
&lt;P&gt;So use SET so that the combined dataset is also sorted.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set claims_rx_2006 - claims_rx_2019;
  by pat_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 26 Apr 2023 16:49:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872314#M344627</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-04-26T16:49:40Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872326#M344633</link>
      <description>How big are the data sets? I wonder if an index may be a better approach.</description>
      <pubDate>Wed, 26 Apr 2023 17:17:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872326#M344633</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-04-26T17:17:50Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872613#M344742</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp; &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13868"&gt;@AhmedAl_Attar&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/321371"&gt;@A_Kh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you all. I tested some of the codes and they all worked. However&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;If i want to remove duplicate records using your code here will I insert NODUKEY.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2023 17:18:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872613#M344742</guid>
      <dc:creator>CathyVI</dc:creator>
      <dc:date>2023-04-27T17:18:19Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872618#M344745</link>
      <description>&lt;P&gt;What do you mean by duplicate records?&lt;/P&gt;
&lt;P&gt;If you want to eliminate duplicates from the individual files then include the NODUPKEY option in the generated PROC SORT code.&amp;nbsp; Placement should be obvious, or easy to fix once you try it and get an error message because you put it in the wrong place.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you want to eliminate duplicates caused by the same PAT_ID appearing in multiple datasets then you can use subsetting IF statement in the data step that combines them.&amp;nbsp; For example to keep the earliest observation you could do:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set claims_rx_2006 - claims_rx_2019 ;
  by pat_id;
  if first.pat_id;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Apr 2023 17:27:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872618#M344745</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-04-27T17:27:03Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872622#M344748</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp; The placement seems obvious but i have tried to insert in every possible area with no luck.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data _null_;
do year=2006 to 2019;
call execute(catx(' '
,'proc sort data=', cats('claims_rx_',year)
,';', 'by pat_id;','run;'
));
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Apr 2023 17:44:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872622#M344748</guid>
      <dc:creator>CathyVI</dc:creator>
      <dc:date>2023-04-27T17:44:56Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872625#M344749</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
do year=2006 to 2019;
call execute(catx(' '
,'proc sort data=', cats('claims_rx_',year)
,' nodupkey;', 'by pat_id;','run;'
));
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If it errors out, check if the nodupkey is being added to the end of the data set name. May need to switch from CATS to a different concatenate operator that won't steal your spaces.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2023 18:07:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872625#M344749</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-04-27T18:07:59Z</dc:date>
    </item>
    <item>
      <title>Re: proc sorting multiple dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872631#M344751</link>
      <description>&lt;P&gt;It is part of the PROC statement itself.&lt;/P&gt;
&lt;P&gt;Try any of these locations.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  do year=2006 to 2019;
    call execute(catx(' '
      ,'proc sort nodupkey data=', cats('claims_rx_',year)
      ,';', 'by pat_id;','run;'
    ));
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  do year=2006 to 2019;
    call execute(catx(' '
      ,'proc sort data=', cats('claims_rx_',year)
      ,'nodupkey;', 'by pat_id;','run;'
    ));
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;or&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  do year=2006 to 2019;
    call execute(catx(' '
      ,'proc sort data=', cats('claims_rx_',year)
      ,'nodupkey',';', 'by pat_id;','run;'
    ));
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Apr 2023 18:28:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sorting-multiple-dataset/m-p/872631#M344751</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-04-27T18:28:20Z</dc:date>
    </item>
  </channel>
</rss>

