<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic PROQ SQL NEW VAR  &amp;amp; GROUPING in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413973#M279975</link>
    <description>&lt;P&gt;Good day&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;i have a data set that looks like below :&lt;/P&gt;
&lt;P&gt;data have;&lt;BR /&gt; input UNIQUE $ DATE $ NEW_VAR ;&lt;BR /&gt; NEW_VAR = SUBSTR(DATE,3,7);&lt;BR /&gt;&lt;BR /&gt;1 12JAN13&amp;nbsp; &amp;nbsp; JAN&lt;BR /&gt;2 01MAY15&amp;nbsp; &amp;nbsp; MAY&lt;BR /&gt;3 17APR14&amp;nbsp; &amp;nbsp; APR&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;I want to group &amp;amp; order by unique &amp;amp; new_var&amp;nbsp; but I get an error that the new_var does not exist , I want this to be done in&amp;nbsp; PROC SQL and not a data step as it has over 24million records and space is an issue.&lt;/P&gt;</description>
    <pubDate>Thu, 16 Nov 2017 12:41:51 GMT</pubDate>
    <dc:creator>EMC9</dc:creator>
    <dc:date>2017-11-16T12:41:51Z</dc:date>
    <item>
      <title>PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413973#M279975</link>
      <description>&lt;P&gt;Good day&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;i have a data set that looks like below :&lt;/P&gt;
&lt;P&gt;data have;&lt;BR /&gt; input UNIQUE $ DATE $ NEW_VAR ;&lt;BR /&gt; NEW_VAR = SUBSTR(DATE,3,7);&lt;BR /&gt;&lt;BR /&gt;1 12JAN13&amp;nbsp; &amp;nbsp; JAN&lt;BR /&gt;2 01MAY15&amp;nbsp; &amp;nbsp; MAY&lt;BR /&gt;3 17APR14&amp;nbsp; &amp;nbsp; APR&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;I want to group &amp;amp; order by unique &amp;amp; new_var&amp;nbsp; but I get an error that the new_var does not exist , I want this to be done in&amp;nbsp; PROC SQL and not a data step as it has over 24million records and space is an issue.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 12:41:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413973#M279975</guid>
      <dc:creator>EMC9</dc:creator>
      <dc:date>2017-11-16T12:41:51Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413974#M279976</link>
      <description>&lt;P&gt;Please elaborate. How do you want your grouping done? The data is unique in both variables, so there is not much grouping to be done.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Post a clear description of how your data looks and what your desired outcome looks like.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 12:44:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413974#M279976</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2017-11-16T12:44:32Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413975#M279977</link>
      <description>&lt;P&gt;&lt;SPAN&gt;UNIQUE DATE NEW_VAR&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1 12JAN13&amp;nbsp; &amp;nbsp; JAN&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1 15JAN13&amp;nbsp; &amp;nbsp; JAN&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1 14JAN13&amp;nbsp; &amp;nbsp; JAN&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1 19JAN13&amp;nbsp; &amp;nbsp; JAN&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;1 30JAN13&amp;nbsp; &amp;nbsp; JAN&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2 01MAY15&amp;nbsp; &amp;nbsp; MAY&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;3 17APR14&amp;nbsp; &amp;nbsp; APR&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;so one Unique&amp;nbsp; number could be in multiple dates so I want to group that unique number by unique month , if that makes sense.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 12:50:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413975#M279977</guid>
      <dc:creator>EMC9</dc:creator>
      <dc:date>2017-11-16T12:50:27Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413977#M279978</link>
      <description>&lt;P&gt;Not sure if this is what you are after&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input UNIQUE DATE$;
datalines;
1 12JAN13
1 15JAN13
1 14JAN13
1 19JAN13
1 30JAN13
2 01MAY15
3 17APR14
;

proc sql;
	create table want as
	select *
	      ,input(date, date9.) as NewVar format=monname3.
	from have;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 16 Nov 2017 13:02:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413977#M279978</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2017-11-16T13:02:59Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413978#M279979</link>
      <description>&lt;P&gt;hi&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;yes, this is what I am after with grouping - when I try group by unique , newvar it does not initialize ?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 13:09:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413978#M279979</guid>
      <dc:creator>EMC9</dc:creator>
      <dc:date>2017-11-16T13:09:12Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413979#M279980</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Right, a few issues here.&lt;/P&gt;
&lt;P&gt;First off, DONT WRITE OR CODE ALL IN UPPERCASE!&lt;/P&gt;
&lt;P&gt;Next off your "test" datastep does not work.&amp;nbsp; You input new_var from what I assume to be your datalines (although datalines is missing), as a numeric, which JAN is not for instance.&amp;nbsp; So the variable gets created as numeric with no data.&lt;/P&gt;
&lt;P&gt;Next up is the substr, which is being put into a numeric variable - that will also fail.&lt;/P&gt;
&lt;P&gt;So to correct your test data step to start with looks like this:&lt;/P&gt;
&lt;PRE&gt;data have;
  input unique $ date $;
  new_var=substr(date,3,7);
datalines;
1 12JAN13 JAN
2 01MAY15 MAY
3 17APR14 APR
;
run;&lt;/PRE&gt;
&lt;P&gt;Now onto your question.&amp;nbsp; You say you want this to be done in SQL, there is only one technical reason why you would&amp;nbsp;&lt;STRONG&gt;have&lt;/STRONG&gt; to use proc sql, and that is if you are passing the SQL code through to a database.&amp;nbsp; Otherwise there is no&amp;nbsp;&lt;STRONG&gt;requirement&lt;/STRONG&gt; to use SQL, and in some cases the resource usage will be higher than if you use datastep.&amp;nbsp; So your statement that space is an issue actually works against the use SQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now you say you get an error in the code you are running, please present the code you are running and sufficient test data to be able to check, as running the one I present above will have a new_var variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Finally, if your working with 24 million records, then its likely you need to approach the problem differently, use some big data techniques, filter the data down as much as you can, add flags and groups rather than creating multiple datasets, sort only when absolutely necessary.&amp;nbsp; Just switching to using proc sql will not solve these things for you.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 13:11:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/413979#M279980</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-11-16T13:11:32Z</dc:date>
    </item>
    <item>
      <title>Re: PROQ SQL NEW VAR  &amp; GROUPING</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/414029#M279981</link>
      <description>&lt;P&gt;If your data was actually read with the code you provide you should have a bunch of lines in the log that look something like this:&lt;/P&gt;
&lt;PRE&gt;NOTE: Invalid data for NEW_VAR in line 5 14-16.
NOTE: Invalid third argument to function SUBSTR at line 3 column 12.
NOTE: Invalid numeric data, 'JAN13' , at line 3 column 12.
RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+--
5          1 12JAN13    JAN
UNIQUE=1 DATE=12JAN13 NEW_VAR=. _ERROR_=1 _N_=1
&lt;/PRE&gt;
&lt;P&gt;Since you read new_var on the input statement without a $ it is assumed to be numeric. Then values such as Jan May Apr are invalid numeric values. Also the assignment&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;NEW_VAR = SUBSTR(DATE,&lt;/FONT&gt;&lt;FONT color="#008080" face="SAS Monospace" size="2"&gt;3&lt;/FONT&gt;&lt;FONT face="SAS Monospace" size="2"&gt;,&lt;/FONT&gt;&lt;FONT color="#008080" face="SAS Monospace" size="2"&gt;7&lt;/FONT&gt;&lt;FONT face="SAS Monospace" size="2"&gt;)&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;subsequently fails because New_var is numeric.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;So new_var is always missing.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;Show the actual Proc SQL that your are attempting to see if you are doing something similar in your SQL.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;You may need to use&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;group by calculated new_var &lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;to tell SAS that the variable used for grouping is calculated on the Select clause.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="SAS Monospace" size="2"&gt;However if you use actual SAS Date values instead of character many things can be done with date functions or formats that may not require adding an additional variable at all, depending on why you need the month abbreviation.&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 15:07:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROQ-SQL-NEW-VAR-amp-GROUPING/m-p/414029#M279981</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2017-11-16T15:07:07Z</dc:date>
    </item>
  </channel>
</rss>

