<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to summarize both character &amp;amp; numeric variables by id? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785850#M250850</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/127222"&gt;@acordes&lt;/a&gt;&amp;nbsp;The easiest way to see if I have solved a problem is to use the &lt;STRONG&gt;SITE:&amp;nbsp;&lt;/STRONG&gt;in your favorite search engine. For example, type the following into a search engine:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;concatenate string delimiter site:blogs.sas.com/content/iml&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;You will find &lt;A href="https://blogs.sas.com/content/iml/2014/05/14/concatenating-iml-strings.html" target="_self"&gt;"Tips for concatenating strings in SAS/IML,"&lt;/A&gt;&amp;nbsp;&amp;nbsp;which describes this problem and provides a solution.&lt;/P&gt;</description>
    <pubDate>Mon, 13 Dec 2021 18:58:02 GMT</pubDate>
    <dc:creator>Rick_SAS</dc:creator>
    <dc:date>2021-12-13T18:58:02Z</dc:date>
    <item>
      <title>How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785462#M250669</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;May I ask for your help?&amp;nbsp; &amp;nbsp;Here's the data I have &amp;amp; want. I have difficulty in summarizing both character &amp;amp; numeric variables, as I need to combine the character variable "product_ID" which cannot use proc sql (I thought). Also, I have many numeric variables actually; is there any way that I don't need to list the variable names one by one? Many thanks for your help in advance!&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;data Have;&lt;BR /&gt;input&lt;BR /&gt;ID $ visit_date :mmddyy8. product_ID $ Plan_A Plan_B Plan_C Cost;&lt;BR /&gt;format visit_date mmddyy8.;&lt;BR /&gt;datalines;&lt;BR /&gt;AB 10/01/98 ABCD 50 10 20 30&lt;BR /&gt;AB 10/01/98 EFGH 50 10 20 30&lt;BR /&gt;AB 10/02/98 ABCD 50 10 20 30&lt;BR /&gt;AC 10/01/98 ABCD 50 10 20 30&lt;BR /&gt;AC 10/01/98 EFGH 50 10 20 30&lt;BR /&gt;AD 10/01/98 ABCD 50 10 20 30&lt;BR /&gt;AD 10/02/98 EFGH 50 10 20 30&lt;BR /&gt;AD 10/03/98 HIJK 50 10 20 30&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data Want;&lt;BR /&gt;input&lt;BR /&gt;ID $ visit_date :mmddyy8. product_ID $ Plan_A Plan_B Plan_C Cost;&lt;BR /&gt;format visit_date mmddyy8.;&lt;BR /&gt;datalines;&lt;BR /&gt;AB 10/01/98 ABCD&amp;amp;EFGH 100 20 40 60&lt;BR /&gt;AB 10/02/98 ABCD 50 10 20 30&lt;BR /&gt;AC 10/01/98 ABCD&amp;amp;EFGH 100 20 40 60&lt;BR /&gt;AD 10/01/98 ABCD 50 10 20 30&lt;BR /&gt;AD 10/02/98 EFGH 50 10 20 30&lt;BR /&gt;AD 10/03/98 HIJK 50 10 20 30&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Dec 2021 19:52:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785462#M250669</guid>
      <dc:creator>nop</dc:creator>
      <dc:date>2021-12-10T19:52:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785475#M250678</link>
      <description>&lt;P&gt;Concatenating character values over a group is done in a DATA step with a RETAINed variable.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have;
by id visit_date;
retain all_id;
length all_id $100;
if first.visit_date
then all_id = product_id;
else all_id = catx('&amp;amp;',all_id,product_id);
/* add similar code for summarizing the numeric variables */
if last.visit_date;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 10 Dec 2021 21:36:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785475#M250678</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2021-12-10T21:36:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785515#M250697</link>
      <description>&lt;P&gt;Thanks alot Kurt!&amp;nbsp; It works very good for the first part of character variables.&amp;nbsp; But for numeric variables, sorry as I am not very familiar with the summing in data-set (only know bit about proc sql).&amp;nbsp; Is there any way I can summarize the variable by visit_date e.g. plan_A--Cost without listing each variable name one by one?&amp;nbsp; &amp;nbsp;Thanks so much!&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 09:12:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785515#M250697</guid>
      <dc:creator>nop</dc:creator>
      <dc:date>2021-12-11T09:12:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785520#M250700</link>
      <description>&lt;P&gt;The basic data step for summarizing looks like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* set up the data step, with new and incoming dataset, and defining groups */
data want;
set have;
by id visit_date;
/* define new variables */
length
  all_id $100
  sum_plan_a 8
  sum_plan_b 8
  sum_plan_c 8
  sum_cost 8
;
/* add a FORMAT statement if formats are needed */
/* RETAIN them */
retain
  all_id
  sum_plan_a
  sum_plan_b
  sum_plan_c
  sum_cost
;
/* Initialize at beginning of group */
if first.visit_date
then do;
  all_id = product_id;
  sum_plan_a = plan_a;
  sum_plan_b = plan_b;
  sum_plan_c = plan_c;
  sum_cost = cost;
end;
/* otherwise summarize */
else do;
  all_id = catx('&amp;amp;',all_id,product_id);
  sum_plan_a + plan_a;
  sum_plan_b + plan_b;
  sum_plan_c + plan_c;
  sum_cost + cost;
end;
/* output at end of group only with a Subsetting IF */
if last.visit_date;
/* KEEP only variables of interest */
keep
  id
  visit_date
  all_id
  sum_plan_a
  sum_plan_b
  sum_plan_c
  sum_cost
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This is the code that the data step compiler needs to see; you now need to decide the more efficient approach: write it all out manually (the above code needed about 5 minutes to write), or expend the effort needed to automate it. Which only makes sense if this is to be used repeatedly on datasets with changing structure.&lt;/P&gt;
&lt;P&gt;Automation needs a clear definition of variables to be automatically summarized, so you can pull their names from DICTIONARY.COLUMNS or SASHELP.VCOLUMN and use that to create code dynamically. This definition can be based on name patterns, types, formats, you name it.&lt;/P&gt;
&lt;P&gt;You also need to clearly define what kind of summarization you want (just sums, or min/max/averages, or some kind of flags) for each variable.&lt;/P&gt;
&lt;P&gt;You can also do a multiple-step approach by using PROC SUMMARY for the numeric variables:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want1;
set have;
by id visit_date;
length
  all_id $100
;
length
  all_id $100
;
retain
  all_id
;
if first.visit_date
then do;
  all_id = product_id;
end;
else do;
  all_id = catx('&amp;amp;',all_id,product_id);
end;
if last.visit_date;
keep
  id
  visit_date
  all_id
;
run;

proc summary data=have;
by id visit_date;
var plan_a--cost;
output out=want2 (drop=_type_ _freq_) sum()=;
run;

data want;
merge
  want1
  want2
;
by id visit_date;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that I could not use the keyword _NUMERIC_ in the VAR statement because of the presence of the numeric date.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 10:25:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785520#M250700</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2021-12-11T10:25:03Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785523#M250703</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data Have;
input
ID $ visit_date :mmddyy8. product_ID $ Plan_A Plan_B Plan_C Cost;
format visit_date mmddyy8.;
datalines;
AB 10/01/98 ABCD 50 10 20 30
AB 10/01/98 EFGH 50 10 20 30
AB 10/02/98 ABCD 50 10 20 30
AC 10/01/98 ABCD 50 10 20 30
AC 10/01/98 EFGH 50 10 20 30
AD 10/01/98 ABCD 50 10 20 30
AD 10/02/98 EFGH 50 10 20 30
AD 10/03/98 HIJK 50 10 20 30
;

proc summary data=have nway;
class id visit_date;
var _numeric_;
output out=part1 sum=;
run;

data part2;
 do until(last.visit_date);
  set have;
  by id visit_date;
  length products $ 200;
  products=catx('&amp;amp;',products,product_ID);
 end;
keep id visit_date products;
run;

data want;
 merge part2 part1;
 by id visit_date;
 drop _type_ _freq_;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 11 Dec 2021 10:54:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785523#M250703</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2021-12-11T10:54:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785553#M250711</link>
      <description>&lt;P&gt;Thank you so much Kurt and Ksharp!! Both solutions worked so well and I have used them for my real data.&amp;nbsp; I actually spent 2 days but without proper training, I really couldn't figure that out.&amp;nbsp; &amp;nbsp;Million thanks for the guidance and illustration from Kurt and Ksharp!&amp;nbsp; Have a good day!&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 15:48:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785553#M250711</guid>
      <dc:creator>nop</dc:creator>
      <dc:date>2021-12-11T15:48:18Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785556#M250712</link>
      <description>&lt;P&gt;Use the FINDW() function to tell whether the PRODUCT_ID is duplicated.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Have;
  input ID $ visit_date :mmddyy8. product_ID $ Plan_A Plan_B Plan_C Cost;
  format visit_date yymmdd10.;
datalines;
AB 10/01/98 ABCD 50 10 20 30
AB 10/01/98 EFGH 50 10 20 30
AB 10/02/98 ABCD 50 10 20 30
AC 10/01/98 ABCD 50 10 20 30
AC 10/01/98 EFGH 50 10 20 30
AD 10/01/98 ABCD 50 10 20 30
AD 10/02/98 EFGH 50 10 20 30
AD 10/03/98 HIJK 50 10 20 30
;

data want;
  if 0 then set have;
  length product_list $200;
  array raw plan_a plan_b plan_c cost;
  array total sum_a sum_b sum_c sum_cost ;
  do until (last.visit_date);
    set have;
    by id visit_date ;
    if not findw(product_list,product_id,',','it') then product_list=catx(',',product_list,product_id);
    do index=1 to dim(raw);
      total[index]=sum(total[index],raw[index]);
    end;
  end;
  drop index product_id plan_a plan_b plan_c cost;
  rename
    product_list=product_ID
    sum_a=Plan_A
    sum_b=Plan_B
    sum_c=Plan_C
    sum_cost=Cost
  ; 
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Results&lt;/P&gt;
&lt;PRE&gt;                           product_
Obs    ID    visit_date    ID           Plan_A    Plan_B    Plan_C    Cost

 1     AB    1998-10-01    ABCD,EFGH      100       20        40       60
 2     AB    1998-10-02    ABCD            50       10        20       30
 3     AC    1998-10-01    ABCD,EFGH      100       20        40       60
 4     AD    1998-10-01    ABCD            50       10        20       30
 5     AD    1998-10-02    EFGH            50       10        20       30
 6     AD    1998-10-03    HIJK            50       10        20       30
&lt;/PRE&gt;
&lt;P&gt;PS&amp;nbsp; Do NOT use only two years for dates.&amp;nbsp; Also avoid using MDY or DMY order to display dates as either choice will confuse half of your audience.&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 15:48:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785556#M250712</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-12-11T15:48:59Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785558#M250713</link>
      <description>&lt;P&gt;Do you really want to write your own summary using data step?&lt;/P&gt;
&lt;P&gt;Why not let PROC SUMMARY do it?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
  class id visit_date ;
  var _numeric_;
  output out=want(drop=_type_ ) sum=
         idgroup (out[5] (product_id)=)
  ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If you really want the PRODUCT_ID stuffed into a single variable then add a post-processing step.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  set want ;
  length product_ID $200;
  product_id=catx(',',of product_id_:);
  drop product_id_: ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Results&lt;/P&gt;
&lt;PRE&gt;                                                                           product_
Obs    ID    visit_date    _FREQ_    Plan_A    Plan_B    Plan_C    Cost    ID

 1     AB    1998-10-01       2        100       20        40       60     ABCD,EFGH
 2     AB    1998-10-02       1         50       10        20       30     ABCD
 3     AC    1998-10-01       2        100       20        40       60     ABCD,EFGH
 4     AD    1998-10-01       1         50       10        20       30     ABCD
 5     AD    1998-10-02       1         50       10        20       30     EFGH
 6     AD    1998-10-03       1         50       10        20       30     HIJK
&lt;/PRE&gt;</description>
      <pubDate>Sat, 11 Dec 2021 15:56:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785558#M250713</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-12-11T15:56:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785586#M250724</link>
      <description>&lt;P&gt;Thanks alot Tom!&amp;nbsp; I agree; like you, Kurt and Ksharp mentioned, proc summary is definitely a much better way as I only knew proc sql before.&amp;nbsp; Also, thanks for pointing out about the date format and will sure note to use "yymmdd10." in the future.&amp;nbsp; Great thanks Tom!&lt;/P&gt;</description>
      <pubDate>Sat, 11 Dec 2021 18:34:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785586#M250724</guid>
      <dc:creator>nop</dc:creator>
      <dc:date>2021-12-11T18:34:31Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785802#M250827</link>
      <description>&lt;P&gt;Proc IML solution.&lt;/P&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13684"&gt;@Rick_SAS&lt;/a&gt;&amp;nbsp;how can I concatenate more efficiently a char vector like a={"abcd", "xyz"} into a single string with delimiter like "|" between them? So I want to get "abcd|xyz".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc iml;
use have;
read all var _CHAR_  into y[colname=varny1];
read all var _num_  into x[colname=varny2];
close;

_date_id=catx("_", putn(x[,{"visit_date"}], 'ddmmyy10.'), y[,{"id"}]);

u=unique(_date_id);

res=j(ncol(u), 5,.);
_prod=j(ncol(u),2,"                               ");

do i=1 to ncol(u);
idx=loc(element(_date_id, u[i]));
res[i, ]= x[idx[1], 1] || x[idx, 2:5] [+,];
_prod[i,1]=u[i];
temp=y[idx, {"product_id"}]`;
if ncol(temp)=1 then _prod[i,2]=temp[1];
if ncol(temp) &amp;gt; 1 then do;
do n=1 to ncol(temp);
_prod[i,2]=strip(_prod[i,2]) + "&amp;amp;" + temp[n];
end;
end;
end;

create _x from res  [colname=varny2];
append from res;
close;

create _y from _prod[colname={"date+id", "product_combi"}];
append from _prod;
close;

data want;
merge _x _y;
format visit_date date9.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="ok.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/66706i831D265AA8F713C5/image-size/large?v=v2&amp;amp;px=999" role="button" title="ok.png" alt="ok.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Dec 2021 17:11:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785802#M250827</guid>
      <dc:creator>acordes</dc:creator>
      <dc:date>2021-12-13T17:11:15Z</dc:date>
    </item>
    <item>
      <title>Re: How to summarize both character &amp; numeric variables by id?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785850#M250850</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/127222"&gt;@acordes&lt;/a&gt;&amp;nbsp;The easiest way to see if I have solved a problem is to use the &lt;STRONG&gt;SITE:&amp;nbsp;&lt;/STRONG&gt;in your favorite search engine. For example, type the following into a search engine:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;concatenate string delimiter site:blogs.sas.com/content/iml&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;You will find &lt;A href="https://blogs.sas.com/content/iml/2014/05/14/concatenating-iml-strings.html" target="_self"&gt;"Tips for concatenating strings in SAS/IML,"&lt;/A&gt;&amp;nbsp;&amp;nbsp;which describes this problem and provides a solution.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Dec 2021 18:58:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-summarize-both-character-amp-numeric-variables-by-id/m-p/785850#M250850</guid>
      <dc:creator>Rick_SAS</dc:creator>
      <dc:date>2021-12-13T18:58:02Z</dc:date>
    </item>
  </channel>
</rss>

