<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Check uniqueness of a variable in a complicated dataset in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835882#M35987</link>
    <description>Hello Ballardw, thank you so much for your response! Apparently I did not think deeply enough, I apologize. After confirming with my teammate: the dataset is structured so that when there are multiple rows for the same subject, at least the city differs. The state can repeat or change back and forth, but city is ALWAYS unique (so for the same id, there will not be two rows with both the same state and the same city). So if there are 4 rows for a subject, the 4 rows have 4 different cities.&lt;BR /&gt;&lt;BR /&gt;Using the example in my question and considering "state" as the variable of interest, I am hoping the output would in general be something like:&lt;BR /&gt;id: 1&lt;BR /&gt;unique state: A, B; changed 1 time&lt;BR /&gt;unique cities: A, B, C, D; changed 3 times&lt;BR /&gt;unique salary: 100, 101, 102; changed 2 times&lt;BR /&gt;...&lt;BR /&gt;and similarly for the other ids. I realize the example is not perfect. Considering your example with id=4 and 6 rows, the output would be:&lt;BR /&gt;id: 4&lt;BR /&gt;unique state: C, D; changed 5 times&lt;BR /&gt;salary: 120, 110, 99, 100, 88, 90; changed 5 times (hopefully the difference for each change can also be computed).&lt;BR /&gt;&lt;BR /&gt;There is also a date variable, which is what the data is originally sorted by; we will stick with this order.&lt;BR /&gt;&lt;BR /&gt;I am really stuck on how to generalize the logic, since the dataset contains about 200 subjects. Any advice is welcome, thank you in advance!&lt;BR /&gt;</description>
    <pubDate>Thu, 29 Sep 2022 15:59:23 GMT</pubDate>
    <dc:creator>hellorc</dc:creator>
    <dc:date>2022-09-29T15:59:23Z</dc:date>
    <item>
      <title>Check uniqueness of a variable in a complicated dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835868#M35984</link>
      <description>&lt;P&gt;Hello SAS community, I have a dataset which looks like:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data have;
input ID state $ city $ salary @@;
datalines;
1 A A 100
1 A B 100
1 A C 101
1 B D 102
2 B E 99
2 B F 99
2 B G 99
3 A C 88
4 C H 120
4 D J 110
;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;For each id, I would like to compute all unique values of a variable and how many times the variable changes across the rows; if possible, I would also like to compute the size of each change when the variable is numeric.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let me elaborate using id=1 as an example. From the 4 observations for id=1, the unique salary values are 100, 101, and 102, so salary changed 2 times across the rows for id=1, and the changes are 1 and 2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Using id=4 and city as another example, the output would be H and J, and city changed 1 time.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am still new to SAS and am having real difficulty working out the logic to produce those outputs for each id. I tried using first.id and last.id, but that doesn't include the 'middle' observations. Can this be done in a data step, or is SQL required? Might someone be willing to provide some assistance?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2022 15:14:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835868#M35984</guid>
      <dc:creator>hellorc</dc:creator>
      <dc:date>2022-09-29T15:14:35Z</dc:date>
    </item>
    <item>
      <title>Re: Check uniqueness of a variable in a complicated dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835872#M35986</link>
      <description>&lt;P&gt;You should include an example of how you expect the final output to appear.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There may be some other considerations on how that output would be generated/reported.&lt;/P&gt;
&lt;P&gt;For example, your discussion of City did not include State at all. If the State changes, would you expect there to be no change in City? Should that actually be considered a change?&lt;/P&gt;
&lt;P&gt;What if the value changes back, as in the following? Does the city actually change 5 times? That depends on the ORDER of the data, and it may be that the order was changed at some point from one in which all the city H values were in sequence.&lt;/P&gt;
&lt;PRE&gt;4 C H 120
4 D J 110
4 C H 99
4 D J 100
4 C H 88
4 D J 90&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Sep 2022 15:24:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835872#M35986</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-09-29T15:24:22Z</dc:date>
    </item>
    <item>
      <title>Re: Check uniqueness of a variable in a complicated dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835882#M35987</link>
      <description>Hello Ballardw, thank you so much for your response! Apparently I did not think deeply enough, I apologize. After confirming with my teammate: the dataset is structured so that when there are multiple rows for the same subject, at least the city differs. The state can repeat or change back and forth, but city is ALWAYS unique (so for the same id, there will not be two rows with both the same state and the same city). So if there are 4 rows for a subject, the 4 rows have 4 different cities.&lt;BR /&gt;&lt;BR /&gt;Using the example in my question and considering "state" as the variable of interest, I am hoping the output would in general be something like:&lt;BR /&gt;id: 1&lt;BR /&gt;unique state: A, B; changed 1 time&lt;BR /&gt;unique cities: A, B, C, D; changed 3 times&lt;BR /&gt;unique salary: 100, 101, 102; changed 2 times&lt;BR /&gt;...&lt;BR /&gt;and similarly for the other ids. I realize the example is not perfect. Considering your example with id=4 and 6 rows, the output would be:&lt;BR /&gt;id: 4&lt;BR /&gt;unique state: C, D; changed 5 times&lt;BR /&gt;salary: 120, 110, 99, 100, 88, 90; changed 5 times (hopefully the difference for each change can also be computed).&lt;BR /&gt;&lt;BR /&gt;There is also a date variable, which is what the data is originally sorted by; we will stick with this order.&lt;BR /&gt;&lt;BR /&gt;I am really stuck on how to generalize the logic, since the dataset contains about 200 subjects. Any advice is welcome, thank you in advance!&lt;BR /&gt;</description>
      <pubDate>Thu, 29 Sep 2022 15:59:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835882#M35987</guid>
      <dc:creator>hellorc</dc:creator>
      <dc:date>2022-09-29T15:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: Check uniqueness of a variable in a complicated dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835883#M35988</link>
      <description>&lt;P&gt;If you're familiar with SQL, I would think of using COUNT of SELECT DISTINCT as a starting point. Your concept 'changed X times' is just the number of distinct values - 1.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Sep 2022 16:05:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835883#M35988</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2022-09-29T16:05:49Z</dc:date>
    </item>
    <item>
      <title>Re: Check uniqueness of a variable in a complicated dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835906#M35990</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;If you're familiar with SQL, I would think of using COUNT of SELECT DISTINCT as a starting point. Your concept 'changed X times' is just the number of distinct values - 1.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Computing the row-to-row differences of the Salary variable will still require a data step that processes the data in order.&lt;/P&gt;
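&lt;P&gt;A minimal sketch of such a data step, assuming the data set is still named HAVE as in the original question and that the rows for each ID are contiguous and already in the original date order. The names WANT, salary_diff and n_changes are made up for illustration. This counts row-to-row changes, so a value that flips back and forth (as in the six-row id=4 example) is counted every time it changes, which can differ from the number of distinct values minus 1.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=""&gt;/* Sketch only: assumes HAVE is grouped by ID and kept in the original (date) order */
data want;
  set have;
  by id notsorted;               /* NOTSORTED: IDs are grouped, not necessarily ascending */
  salary_diff = dif(salary);     /* call DIF on every row so its queue stays in step */
  if first.id then do;
    salary_diff = .;             /* no previous row within this ID */
    n_changes = 0;               /* restart the running count for each ID */
  end;
  else if salary_diff ne 0 then n_changes + 1;   /* sum statement: value is retained */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;On the last row of each ID, n_changes holds the total number of salary changes for that ID. For a character variable such as state, the same pattern should work by comparing the current value with LAG(state) instead of using DIF, again calling LAG on every row.&lt;/P&gt;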
</description>
      <pubDate>Thu, 29 Sep 2022 18:08:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Check-uniqueness-of-a-variable-in-a-complicated-dataset/m-p/835906#M35990</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-09-29T18:08:53Z</dc:date>
    </item>
  </channel>
</rss>

