<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Trouble with nlevels in ODS and Base Reporting</title>
    <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552522#M22649</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/154458"&gt;@righcoastmike&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Take a step back and do this test&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table test as
select *
from have
group by studyid
having count(distinct bi_var)&amp;gt;1;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You should get zero records. If not , we got take one step back further to &lt;STRONG&gt;test the have.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One step at a time.&lt;/P&gt;</description>
    <pubDate>Fri, 19 Apr 2019 18:44:16 GMT</pubDate>
    <dc:creator>novinosrin</dc:creator>
    <dc:date>2019-04-19T18:44:16Z</dc:date>
    <item>
      <title>Trouble with nlevels</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552511#M22647</link>
      <description>&lt;P&gt;Hi Everyone,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset that looks roughly like this (except there are about 30 variables and 5.5 million records, with each study ID having between 1 and 60 records each.) The binary variable is the only one pertinent to this question so it's the only one I've included here. Bi_var will never vary between records for an individual studyID - so if person #2 is assigned a 0, they will get a zero for every record in the dataset :&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input studyID Bi_var ;
cards;
1 1
1 1
1 1
2 0
2 0
2 0
2 0
2 0
3 1
3 1
4 0 
5 0
5 0
5 0
5 0
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;And I have it split into 2 separate datasets using this code:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want1;
set have;
where Bi_var = 1;
run: 

data want2;
set have;
where Bi_var = 0;
run: &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So far so good. Now, I need to know how many unique study ID's are in each dataset. To do this I have tried both of the following methods:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;ods select nlevels;

proc freq data=have nlevels;
   tables studyid;
run; 

ods select nlevels;

proc freq data=want1 nlevels;
   tables studyid;
run; 

ods select nlevels;

proc freq data=want2 nlevels;
   tables studyid;
run; &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;and&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
   create table new as 
     select count(distinct(studyid)) as IDcount
            from have;
quit;

proc sql;
   create table new as 
     select count(distinct(studyid)) as IDcount
            from want1;
quit;

proc sql;
   create table new as 
     select count(distinct(studyid)) as IDcount
            from want2;
quit;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Here's the issue, for some reason in both methods,&amp;nbsp; 'want2' (which should be just the records where bi_var=0) has the same number of distinct values as the 'have' dataset. This should be impossible because want1 and want2 are just subsets of the main 'have' dataset. The code seems to have worked fine for the want1 dataset, but I don't know if I can trust it because of this other weirdness.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Other things I have checked:&amp;nbsp;&lt;/P&gt;&lt;P&gt;1. 'Want2' has a different (smaller) number of records than 'have', which makes sense.&amp;nbsp;&lt;/P&gt;&lt;P&gt;2. just to check if the study id's were the same, i summed up the studyid columns in 'want2' and 'have' - if they were the same then their total should have been identical. it is not.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If anyone can give me an idea of what could be happening that results in 'want2' having the same number of unique values as 'have' it would be much appreciated. In essence I have multiple records per person in my dataset, and I'm just trying to figure out how many unique people (studyid) I have.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any thoughts would be really helpful, I'm at a total loss.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks so much.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2019 18:08:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552511#M22647</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2019-04-19T18:08:26Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with nlevels</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552518#M22648</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want1;
set sashelp.class;
where sex = 'F';
run;

data want2;
set sashelp.class;
where sex = 'M';
run;
proc sql;
	create table want1_count as
	select count(distinct(age) ) as age_count
	from want1;
quit;
proc sql;
	create table want2_count as
	select count(distinct(age) ) as age_count
	from want2;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;or&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want1;
set sashelp.class;
where sex = 'F';
run;

data want2;
set sashelp.class;
where sex = 'M';
run;
proc sql;
	create table have_count as
	select count(age) as total_age_records
	from sashelp.class;
quit;
proc sql;
	create table want1_count as
	select count(distinct(age) ) as age_count,
	count(age) as total_age_records
	from want1;
quit;
proc sql;
	create table want2_count as
	select count(distinct(age) ) as age_count,
		count(age) as total_age_records
	from want2;
quit;



&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;or if we want to check our work&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want1;
set sashelp.class;
where sex = 'F';
run;

data want2;
set sashelp.class;
where sex = 'M';
run;
proc sql;
	create table have_count as
	select count(age) as total_age_records
	from sashelp.class
	order by total_age_records;
quit;
proc sql;
	create table want1_count as
	select count(distinct(age) ) as age_count,
	count(age) as total_age_records
	from want1
	order by total_age_records;
quit;
proc sql;
	create table want2_count as
	select count(distinct(age) ) as age_count,
		count(age) as total_age_records
	from want2
	order by total_age_records;
quit;
data counts(drop=age_count);
	merge have_count(rename=(total_age_records=have_records))
		want1_count(rename=(total_age_records=female_records))
		want2_count(rename=(total_age_records=male_records));
	misscount = have_records - (female_records+male_records);
	
run;



&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 19 Apr 2019 19:05:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552518#M22648</guid>
      <dc:creator>VDD</dc:creator>
      <dc:date>2019-04-19T19:05:50Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with nlevels</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552522#M22649</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/154458"&gt;@righcoastmike&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Take a step back and do this test&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table test as
select *
from have
group by studyid
having count(distinct bi_var)&amp;gt;1;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You should get zero records. If not , we got take one step back further to &lt;STRONG&gt;test the have.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One step at a time.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2019 18:44:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552522#M22649</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-04-19T18:44:16Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with nlevels</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552540#M22650</link>
      <description>&lt;P&gt;Having the same counts doesn't seem that strange. Perhaps every STUDYID has both types of records.&amp;nbsp; Or perhaps the data was designed to have the same number of STUDYID values in each group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Why not just count the original group and see?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  create table full_count as 
    select Bi_var,count(distinct studyid) as IDcount
    from want2
    group by Bi_var
  ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note: The DISTINCT keyword is not a function, so no need for ( ) after it.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2019 20:00:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552540#M22650</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-04-19T20:00:16Z</dc:date>
    </item>
    <item>
      <title>Re: Trouble with nlevels</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552572#M22651</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/122002"&gt;@VDD&lt;/a&gt;&amp;nbsp;,&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp; and&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;First off, thanks so much for your help. I went through and did the tests that you recommended and found out that the problem was actually in the original dataset.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I had been told that although there were multiple records for each studyid, the binary variable would stay the same for every record of a given studyid. For instance, if studyid #1 had ten records, then all ten of those records would be a 1, or all ten would be a zero.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It turns out that wasn't the case. When I went back and looked at the data more closely, It turns out that the same study ID would have either all zero's, or one "1" and then the rest zeros.&amp;nbsp; This is the reason why my nlevels for the Bi-Var=0 dataset was the same as the original "have" dataset, every studyid had at least one 0 in it.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Once I figured that out (and wrote a strongly worded email to the person who provided me with the data) it was easy to sort out.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to give "solve" credit to all of you for pointing me in the right direction but unfortunately I don't think the forum will let me do that.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As always, I am humbled at the amazing support I get on here. Thank you all so much.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Mike&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2019 23:39:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Trouble-with-nlevels/m-p/552572#M22651</guid>
      <dc:creator>righcoastmike</dc:creator>
      <dc:date>2019-04-19T23:39:45Z</dc:date>
    </item>
  </channel>
</rss>

