<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to replace specific values to a newly generated dataset from an old dataset in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938989#M20973</link>
    <description>&lt;P&gt;Your example code is creating Date_creation as a character variable length 20 (and Repeat_instance). Either assign a length of 8 (Not $ &lt;span class="lia-unicode-emoji" title=":smiling_face_with_sunglasses:"&gt;😎&lt;/span&gt; to numeric variables or don't place them on a length statement. One suspects was the result of a quicky approach to keep variables in specific column order. But this is causing errors because the character values of Creatio_date cannot use the numeric informat YYMMDD10.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See if this handles the Table1 and Table1_returned as desired.&lt;/P&gt;
&lt;PRE&gt;proc sort data=table1;
   by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
run;

proc sort data=table1_returned;
   by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
run;

data example;
  update table1           (in=inbase)
         table1_returned  (in=inreturn drop=Date_creation)
  ;
  by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
  if inreturn and status=:"Closed" then delete;

  /* what date would be assigned???*/
run;
&lt;/PRE&gt;
&lt;P&gt;Really need some rules about what "creation_date" would be changed to. You didn't provide any guideline.&lt;/P&gt;
&lt;P&gt;This update might require dropping the values of Creation_date from the Returned table, otherwise matches not dropped would have the creation_date updated to the value in the returned data set. If your returned data has more variables than shown then on the Update statement only the BY variables plus the Status should be kept as dataset options for the returned data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 12 Aug 2024 16:50:20 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2024-08-12T16:50:20Z</dc:date>
    <item>
      <title>How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938941#M20970</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;This is my first post here, so I'll try to be as clear as possible (I'm currently using SAS OnDemand for Academics).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have to generate a listing of queries monthly, and use the previous generated queries sas file (if it exists) to update the new one and avoid sending a query which has already been answered.&lt;/P&gt;&lt;P&gt;What I want to update in the sas file would be the date and the status of the newly generated queries.&lt;/P&gt;&lt;P&gt;Here is a short example of the sas file structure, and how it should looks like at the end :&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is the first iteration, the generated table always has the Ongoing status.&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data table1;&lt;BR /&gt;infile cards dsd dlm=',' truncover ;&lt;BR /&gt;length record_id $7 Center $2 Event_name $20 Repeat_instance Form $20 Query $50 Date_creation Editcheck_ID $20 Status $50;&lt;BR /&gt;input record_id $ Center $ Event_name $ Repeat_instance Form $ Query $ Date_creation :yymmdd10. Editcheck_ID $ Status $;&lt;BR /&gt;cards;&lt;BR /&gt;01-001, 01, Procedure, ., Procedure, Incorrect date, 2024-08-06, procdt_A, Ongoing&lt;BR /&gt;01-001, 01, Procedure, ., Treatment, Missing data, 2024-08-06, trtname_A, Ongoing&lt;BR /&gt;01-001, 01, Procedure, 1, AE, Discrepancy in AE date, 2024-08-06, aedt_A, Ongoing&lt;BR /&gt;02-015, 02, Visit 1, ., Treatment, Missing data, 2024-08-06, trtname_A, Ongoing&lt;BR /&gt;02-015, 02, Visit 1, 3, AE, AE is not declared, 2024-08-06, aename_A, Ongoing&lt;BR /&gt;02-003, 02, Procedure, ., Treatment, Out of bounds data, 2024-08-06, trtdose_A, Ongoing&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I send the previous table in a Excel file to each center, and I receive it with the status updated a month later.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data table1_returned;&lt;BR /&gt;infile cards dsd dlm=',' truncover ;&lt;BR /&gt;length record_id $7 Center $2 Event_name $20 Repeat_instance Form $20 Query $50 Date_creation Editcheck_ID $20 Status $50;&lt;BR /&gt;input record_id $ Center $ Event_name $ Repeat_instance Form $ Query $ Date_creation :yymmdd10. Editcheck_ID $ Status $;&lt;BR /&gt;format Date_creation yymmdd10.;&lt;BR /&gt;cards;&lt;BR /&gt;01-001, 01, Procedure, ., Procedure, Incorrect date, 2024-08-06, procdt_A, Ongoing&lt;BR /&gt;01-001, 01, Procedure, ., Treatment, Missing data, 2024-08-06, trtname_A, Closed - value corrected&lt;BR /&gt;01-001, 01, Procedure, 1, AE, Discrepancy in AE date, 2024-08-06, aedt_A, Closed - value corrected&lt;BR /&gt;02-015, 02, Visit 1, ., Treatment, Missing data, 2024-08-06, trtname_A, Ongoing&lt;BR /&gt;02-015, 02, Visit 1, 3, AE, AE is not declared, 2024-08-06, aename_A, Closed - value corrected&lt;BR /&gt;02-003, 02, Procedure, ., Treatment, Out of bounds data, 2024-08-06, trtdose_A, Closed - value not corrected&lt;BR /&gt;;&lt;BR /&gt;run;&lt;BR /&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;This is the second iteration, some older queries appear again.&lt;/P&gt;&lt;PRE&gt;data table2;&lt;BR /&gt;infile cards dsd dlm=',' truncover ;&lt;BR /&gt;length record_id $7 Center $2 Event_name $20 Repeat_instance Form $20 Query $50 Date_creation Editcheck_ID $20 Status $50;&lt;BR /&gt;input record_id $ Center $ Event_name $ Repeat_instance Form $ Query $ Date_creation :yymmdd10. Editcheck_ID $ Status $;&lt;BR /&gt;cards;&lt;BR /&gt;05-002, 05, Inclusion, ., Inclusion, Incorrect date, 2024-09-13, incdt_A, Ongoing&lt;BR /&gt;01-001, 01, Procedure, ., Procedure, Incorrect date, 2024-09-13, procdt_A, Ongoing&lt;BR /&gt;05-002, 05, Inclusion, ., Medical history, Out of bounds data, 2024-09-13, mhiketdose_A, Ongoing&lt;BR /&gt;06-007, 06, Inclusion, ., Demography, Discrepancy in Birth date, 2024-09-13, dembddt_A, Ongoing&lt;BR /&gt;07-015, 07, Visit 3, ., Treatment, Missing data, 2024-09-13, trtname_A, Ongoing&lt;BR /&gt;02-015, 02, Visit 1, ., Treatment, Missing data, 2024-09-13, trtname_A, Ongoing&lt;BR /&gt;02-003, 02, Procedure, ., Treatment, Out of bounds data, 2024-09-13, trtdose_A, Ongoing&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;And this is where I'm stuck, on how to update the new dataset by comparing it to the previous one. Since each real dataset can be a thousand lines long, I don't think a proc compare is the best.&lt;/P&gt;&lt;P&gt;If the queries already exists and has been answered (ex : out of bounds data) it should be deleted, if it exists and has not been answered, the Date_creation should be updated.&lt;/P&gt;&lt;PRE&gt;data table2_final;&lt;BR /&gt;infile cards dsd dlm=',' truncover ;&lt;BR /&gt;length record_id $7 Center $2 Event_name $20 Repeat_instance Form $20 Query $50 Date_creation Editcheck_ID $20 Status $50;&lt;BR /&gt;input record_id $ Center $ Event_name $ Repeat_instance Form $ Query $ Date_creation :yymmdd10. Editcheck_ID $ Status $;&lt;BR /&gt;cards;&lt;BR /&gt;05-002, 05, Inclusion, ., Inclusion, Incorrect date, 2024-09-13, incdt_A, Ongoing&lt;BR /&gt;01-001, 01, Procedure, ., Procedure, Incorrect date, 2024-08-06, procdt_A, Ongoing&lt;BR /&gt;05-002, 05, Inclusion, ., Medical history, Out of bounds data, 2024-09-13, mhiketdose_A, Ongoing&lt;BR /&gt;06-007, 06, Inclusion, ., Demography, Discrepancy in Inclusion date, 2024-09-13, dembddt_A, Ongoing&lt;BR /&gt;07-015, 07, Visit 3, ., Treatment, Missing data, 2024-09-13, trtname_A, Ongoing&lt;BR /&gt;02-015, 02, Visit 1, ., Treatment, Missing data, 2024-08-06, trtname_A, Ongoing&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your answers !&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 15:33:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938941#M20970</guid>
      <dc:creator>Dmrise</dc:creator>
      <dc:date>2024-08-12T15:33:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938953#M20971</link>
      <description>&lt;P&gt;What combination of fields (if any) form a unique identifier for your records?&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 12:58:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938953#M20971</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2024-08-12T12:58:03Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938980#M20972</link>
      <description>&lt;P&gt;Your answer made me realize my example was incomplete, so I updated the tables I used with the&amp;nbsp;Editcheck_ID variable (which is the variable name associated to the discrepancy).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The unique identifier would be a combination of record_id+Event_name+Repeat_instance+Form+Query+Editcheck_ID&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 15:36:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938980#M20972</guid>
      <dc:creator>Dmrise</dc:creator>
      <dc:date>2024-08-12T15:36:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938989#M20973</link>
      <description>&lt;P&gt;Your example code is creating Date_creation as a character variable length 20 (and Repeat_instance). Either assign a length of 8 (Not $ &lt;span class="lia-unicode-emoji" title=":smiling_face_with_sunglasses:"&gt;😎&lt;/span&gt; to numeric variables or don't place them on a length statement. One suspects was the result of a quicky approach to keep variables in specific column order. But this is causing errors because the character values of Creatio_date cannot use the numeric informat YYMMDD10.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;See if this handles the Table1 and Table1_returned as desired.&lt;/P&gt;
&lt;PRE&gt;proc sort data=table1;
   by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
run;

proc sort data=table1_returned;
   by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
run;

data example;
  update table1           (in=inbase)
         table1_returned  (in=inreturn drop=Date_creation)
  ;
  by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
  if inreturn and status=:"Closed" then delete;

  /* what date would be assigned???*/
run;
&lt;/PRE&gt;
&lt;P&gt;Really need some rules about what "creation_date" would be changed to. You didn't provide any guideline.&lt;/P&gt;
&lt;P&gt;This update might require dropping the values of Creation_date from the Returned table, otherwise matches not dropped would have the creation_date updated to the value in the returned data set. If your returned data has more variables than shown then on the Update statement only the BY variables plus the Status should be kept as dataset options for the returned data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 12 Aug 2024 16:50:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/938989#M20973</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-08-12T16:50:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/939063#M20974</link>
      <description>&lt;P&gt;Thanks for the input, adding 8. in the length statement and a format statement for the date variable changed it to the proper format.&lt;/P&gt;&lt;P&gt;About the "rules" of this date variable, it is automatically created when running the SAS program and its value is the day the program has been ran, through the &lt;STRONG&gt;&lt;EM&gt;date()&amp;nbsp;&lt;/EM&gt;&lt;/STRONG&gt;function.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I wasn't aware of the update statement, but by changing the tables used in your example (I wanted to compare table1_return with the newly generated table2) and not dropping the date variable (I wanted to replace the new date by the old date if the query already exists), the outputted &lt;EM&gt;example&lt;/EM&gt; dataset perfectly match the&amp;nbsp; desired result from &lt;EM&gt;table2_final&lt;/EM&gt;.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data example;
  update table2		  (in=inbase)
         table1_returned  (in=inreturn)
  ;
  by  record_id Event_name Repeat_instance Form Query Editcheck_ID;
  if inreturn and status=:"Closed" then delete;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Your solution solved my issue &lt;span class="lia-unicode-emoji" title=":ok_hand:"&gt;👌&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Tue, 13 Aug 2024 13:50:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/939063#M20974</guid>
      <dc:creator>Dmrise</dc:creator>
      <dc:date>2024-08-13T13:50:48Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/939065#M20975</link>
      <description>&lt;P&gt;Glad to help.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Some things to be aware of with UPDATE that you may not catch in reading the documentation the first time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First is the base data set, the first one on the UPDATE statement can only have ONE observation with the values of the by statement.&lt;/P&gt;
&lt;P&gt;Second is the transaction data set, the second on the Update state, may have more than one observation and will apply the all the values from all the observations in sequence.&lt;/P&gt;
&lt;P&gt;Any combination of BY variables in the transaction data set not appearing in the base data will be added to the result. From your problem description I couldn't tell if this might be an issue or not. If so you may want to have another IN= variable for the base data set and only write the output data set when the observation is from that set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 13 Aug 2024 14:01:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/939065#M20975</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-08-13T14:01:10Z</dc:date>
    </item>
    <item>
      <title>Re: How to replace specific values to a newly generated dataset from an old dataset</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/942124#M20978</link>
      <description>&lt;P&gt;To achieve the desired outcome, you can approach this by using a combination of &lt;CODE&gt;MERGE&lt;/CODE&gt; and conditional processing in SAS. Here’s how you can proceed:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Step 1: Sort the datasets&lt;/STRONG&gt;&lt;BR /&gt;To ensure that the &lt;CODE&gt;MERGE&lt;/CODE&gt; step aligns the datasets correctly, you need to sort both datasets by the common keys.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Step 2: Merge the datasets&lt;/STRONG&gt;&lt;BR /&gt;Use the &lt;CODE&gt;MERGE&lt;/CODE&gt; statement to join the previous dataset (&lt;CODE&gt;table1_returned&lt;/CODE&gt;) with the new dataset (&lt;CODE&gt;table2&lt;/CODE&gt;). You can then conditionally update the &lt;CODE&gt;Date_creation&lt;/CODE&gt; and &lt;CODE&gt;Status&lt;/CODE&gt; based on the logic provided.&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P&gt;&lt;STRONG&gt;Step 3: Filter the results&lt;/STRONG&gt;&lt;BR /&gt;After the merge, keep only the records that meet your criteria: either update the date or remove answered queries.&lt;/P&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Here is a sample SAS code to implement this logic:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* Step 1: Sort the datasets */&lt;BR /&gt;proc sort data=table1_returned;&lt;BR /&gt;by record_id Center Event_name Repeat_instance Form Editcheck_ID;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc sort data=table2;&lt;BR /&gt;by record_id Center Event_name Repeat_instance Form Editcheck_ID;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/* Step 2: Merge datasets */&lt;BR /&gt;data merged_data;&lt;BR /&gt;merge table2(in=new) table1_returned(in=old);&lt;BR /&gt;by record_id Center Event_name Repeat_instance Form Editcheck_ID;&lt;/P&gt;&lt;P&gt;/* Step 3: Apply logic */&lt;BR /&gt;if new and old then do;&lt;BR /&gt;if Status in ('Ongoing') then Date_creation = Date_creation; /* Retain old date if ongoing */&lt;BR /&gt;else delete; /* Remove if already closed */&lt;BR /&gt;end;&lt;/P&gt;&lt;P&gt;if new and not old then do;&lt;BR /&gt;/* New queries remain unchanged */&lt;BR /&gt;Status = 'Ongoing';&lt;BR /&gt;end;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/* Step 4: Filter final output */&lt;BR /&gt;data table2_final;&lt;BR /&gt;set merged_data;&lt;BR /&gt;if not missing(Status);&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Sep 2024 01:08:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/How-to-replace-specific-values-to-a-newly-generated-dataset-from/m-p/942124#M20978</guid>
      <dc:creator>Sarath_A_SAS</dc:creator>
      <dc:date>2024-09-02T01:08:04Z</dc:date>
    </item>
  </channel>
</rss>

