<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Subsetting Issue in ODS and Base Reporting</title>
    <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35985#M5181</link>
    <description>Okay Cynthia i will do just that but the worrying thing for me is that even when i used an "IF" statement in the same context i observed the same result.&lt;BR /&gt;
&lt;BR /&gt;
Plus i am of the understanding that SAS isn't case sensitive, one of the reasons we decided to use SAS was because of its ability to filter similar data into smaller data sets where we could apply whatever technique we wanted to without causing any major impact on our infrastructure. &lt;BR /&gt;
&lt;BR /&gt;
What difference will the PROC FREQ make as the data will still remain the same.</description>
    <pubDate>Wed, 16 Jun 2010 14:43:19 GMT</pubDate>
    <dc:creator>chuckdee4</dc:creator>
    <dc:date>2010-06-16T14:43:19Z</dc:date>
    <item>
      <title>Subsetting Issue</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35983#M5179</link>
      <description>Hello guys, i ran this little piece of code below in an attempt to subset some data from a csv file.&lt;BR /&gt;
&lt;BR /&gt;
&lt;I&gt;LIBNAME Catalog 'E:\Source Dataset';&lt;BR /&gt;
LIBNAME Autopay 'E:\Breakdown\Autopay';&lt;BR /&gt;
PROC IMPORT OUT= Catalog.June08&lt;BR /&gt;
            DATAFILE= "E:\Source data\B06 - Switch Summary_csv_2010_0608.csv" &lt;BR /&gt;
            DBMS=CSV REPLACE;&lt;BR /&gt;
     GETNAMES=NO;&lt;BR /&gt;
     DATAROW=2; &lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
DATA Autopay.GTB;&lt;BR /&gt;
	SET Catalog.june08;&lt;BR /&gt;
	WHERE var14 = 'Autopay Transactions';&lt;BR /&gt;
RUN;&lt;BR /&gt;
&lt;BR /&gt;
PROC PRINT DATA = Autopay.GTB;&lt;BR /&gt;
	TITLE 'GTB AUTOPAY TRXNS';&lt;BR /&gt;
RUN;&lt;/I&gt;&lt;BR /&gt;
&lt;BR /&gt;
My results below show that 28929 records were read from the Source data (B06 - Switch Summary_csv_2010_0608.csv) but no data was subset into the data set Autopay.GTB.&lt;BR /&gt;
I have checked all possible errors&lt;BR /&gt;
All data sets were initialized&lt;BR /&gt;
Field names are correct&lt;BR /&gt;
File paths are correct&lt;BR /&gt;
&lt;BR /&gt;
Can anyone help me out here, the log code is shown below:&lt;BR /&gt;
&lt;BR /&gt;
&lt;I&gt;443      data CATALOG.JUNE08                              ;&lt;BR /&gt;
444     %let _EFIERR_ = 0; /* set the ERROR detection macro variable */&lt;BR /&gt;
445      infile 'E:\Source data\B06 - Switch Summary_csv_2010_0608.csv' delimiter = ',' MISSOVER DSD&lt;BR /&gt;
445! lrecl=32767 firstobs=2 ;&lt;BR /&gt;
446         informat VAR1 $24. ;&lt;BR /&gt;
447         informat VAR2 $10. ;&lt;BR /&gt;
448         informat VAR3 $6. ;&lt;BR /&gt;
449         informat VAR4 $16. ;&lt;BR /&gt;
450         informat VAR5 $11. ;&lt;BR /&gt;
451         informat VAR6 $4. ;&lt;BR /&gt;
452         informat VAR7 $18. ;&lt;BR /&gt;
453         informat VAR8 $19. ;&lt;BR /&gt;
454         informat VAR9 $29. ;&lt;BR /&gt;
455         informat VAR10 $22. ;&lt;BR /&gt;
456         informat VAR11 $16. ;&lt;BR /&gt;
457         informat VAR12 best32. ;&lt;BR /&gt;
458         informat VAR13 $12. ;&lt;BR /&gt;
459         informat VAR14 $50. ;&lt;BR /&gt;
460         format VAR1 $24. ;&lt;BR /&gt;
461         format VAR2 $10. ;&lt;BR /&gt;
462         format VAR3 $6. ;&lt;BR /&gt;
463         format VAR4 $16. ;&lt;BR /&gt;
464         format VAR5 $11. ;&lt;BR /&gt;
465         format VAR6 $4. ;&lt;BR /&gt;
466         format VAR7 $18. ;&lt;BR /&gt;
467         format VAR8 $19. ;&lt;BR /&gt;
468         format VAR9 $29. ;&lt;BR /&gt;
469         format VAR10 $22. ;&lt;BR /&gt;
470         format VAR11 $16. ;&lt;BR /&gt;
471         format VAR12 best12. ;&lt;BR /&gt;
472         format VAR13 $12. ;&lt;BR /&gt;
473         format VAR14 $50. ;&lt;BR /&gt;
474      input&lt;BR /&gt;
475                  VAR1 $&lt;BR /&gt;
476                  VAR2 $&lt;BR /&gt;
477                  VAR3 $&lt;BR /&gt;
478                  VAR4 $&lt;BR /&gt;
479                  VAR5 $&lt;BR /&gt;
480                  VAR6 $&lt;BR /&gt;
481                  VAR7 $&lt;BR /&gt;
482                  VAR8 $&lt;BR /&gt;
483                  VAR9 $&lt;BR /&gt;
484                  VAR10 $&lt;BR /&gt;
485                  VAR11 $&lt;BR /&gt;
486                  VAR12&lt;BR /&gt;
487                  VAR13 $&lt;BR /&gt;
488                  VAR14 $&lt;BR /&gt;
489      ;&lt;BR /&gt;
490      if _ERROR_ then call symput('_EFIERR_',1);  /* set ERROR detection macro variable */&lt;BR /&gt;
491      run;&lt;BR /&gt;
&lt;BR /&gt;
NOTE: Numeric values have been converted to character values at the places given by: (Line):(Column).&lt;BR /&gt;
      490:44&lt;BR /&gt;
NOTE: The infile 'E:\Source data\B06 - Switch Summary_csv_2010_0608.csv' is:&lt;BR /&gt;
      File Name=E:\Source data\B06 - Switch Summary_csv_2010_0608.csv,&lt;BR /&gt;
      RECFM=V,LRECL=32767&lt;BR /&gt;
&lt;BR /&gt;
NOTE: 28929 records were read from the infile 'E:\Source data\B06 - Switch Summary_csv_2010_0608.csv'.&lt;BR /&gt;
      The minimum record length was 129.&lt;BR /&gt;
      The maximum record length was 203.&lt;BR /&gt;
NOTE: The data set CATALOG.JUNE08 has 28929 observations and 14 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           0.34 seconds&lt;BR /&gt;
      cpu time            0.32 seconds&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
28929 rows created in CATALOG.JUNE08                            from E:\Source data\B06 - Switch&lt;BR /&gt;
Summary_csv_2010_0608.csv.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
NOTE: CATALOG.JUNE08 was successfully created.&lt;BR /&gt;
NOTE: PROCEDURE IMPORT used (Total process time):&lt;BR /&gt;
      real time           0.68 seconds&lt;BR /&gt;
      cpu time            0.62 seconds&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
492&lt;BR /&gt;
493  DATA Autopay.GTB;&lt;BR /&gt;
494      SET Catalog.june08;&lt;BR /&gt;
495      WHERE var14 = 'Autopay Transactions';&lt;BR /&gt;
496  RUN;&lt;BR /&gt;
&lt;BR /&gt;
NOTE: There were 0 observations read from the data set CATALOG.JUNE08.&lt;BR /&gt;
      WHERE var14='Autopay Transactions';&lt;BR /&gt;
NOTE: The data set AUTOPAY.GTB has 0 observations and 14 variables.&lt;BR /&gt;
NOTE: DATA statement used (Total process time):&lt;BR /&gt;
      real time           0.06 seconds&lt;BR /&gt;
      cpu time            0.01 seconds&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
497&lt;BR /&gt;
498  PROC PRINT DATA = Autopay.GTB;&lt;BR /&gt;
499      TITLE 'GTB AUTOPAY TRXNS';&lt;BR /&gt;
500  RUN;&lt;BR /&gt;
&lt;BR /&gt;
NOTE: No observations in data set AUTOPAY.GTB.&lt;BR /&gt;
NOTE: PROCEDURE PRINT used (Total process time):&lt;BR /&gt;
      real time           0.00 seconds&lt;BR /&gt;
      cpu time            0.00 seconds&lt;/I&gt;</description>
      <pubDate>Wed, 16 Jun 2010 14:13:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35983#M5179</guid>
      <dc:creator>chuckdee4</dc:creator>
      <dc:date>2010-06-16T14:13:12Z</dc:date>
    </item>
    <item>
      <title>Re: Subsetting Issue</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35984#M5180</link>
      <description>Hi:&lt;BR /&gt;
  I'd suggest, for a start, that you run a PROC FREQ on the variable of interest to make sure that the way you are specifying your string for the WHERE clause is correct (that everything is mixed case, instead of all uppercase or all lowercase, etc). For example, for purposes of the WHERE statement, there would be a difference between 'AUTOPAY TRANSACTIONS' and "autopay transactions" and "Autopay Transactions". &lt;BR /&gt;
 &lt;BR /&gt;
[pre]&lt;BR /&gt;
PROC FREQ data=Catalog.June08;&lt;BR /&gt;
  tables VAR14;&lt;BR /&gt;
  title 'What is in Var14';&lt;BR /&gt;
run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
                               &lt;BR /&gt;
This PROC FREQ would be run AFTER your PROC IMPORT, but before your DATA step program. Examining the results of the PROC FREQ should show you what values are in the VAR14 variable for selection purposes.&lt;BR /&gt;
 &lt;BR /&gt;
cynthia</description>
      <pubDate>Wed, 16 Jun 2010 14:20:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35984#M5180</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2010-06-16T14:20:35Z</dc:date>
    </item>
    <item>
      <title>Re: Subsetting Issue</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35985#M5181</link>
      <description>Okay Cynthia i will do just that but the worrying thing for me is that even when i used an "IF" statement in the same context i observed the same result.&lt;BR /&gt;
&lt;BR /&gt;
Plus i am of the understanding that SAS isn't case sensitive, one of the reasons we decided to use SAS was because of its ability to filter similar data into smaller data sets where we could apply whatever technique we wanted to without causing any major impact on our infrastructure. &lt;BR /&gt;
&lt;BR /&gt;
What difference will the PROC FREQ make as the data will still remain the same.</description>
      <pubDate>Wed, 16 Jun 2010 14:43:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35985#M5181</guid>
      <dc:creator>chuckdee4</dc:creator>
      <dc:date>2010-06-16T14:43:19Z</dc:date>
    </item>
    <item>
      <title>Re: Subsetting Issue</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35986#M5182</link>
      <description>Hi:&lt;BR /&gt;
  SAS variable names are NOT case sensitive. For example, you could refer to your variable as var14, VaR14, VAR14, vAR14, or any combination thereof and the variable reference would be recognized. &lt;BR /&gt;
&lt;BR /&gt;
  However, for comparison purposes, such as IF and WHERE statements, case is an issue. There are however, programming techniques you can use to make case irrelevant. So, for example, this:&lt;BR /&gt;
&lt;BR /&gt;
WHERE upcase(var14) = 'AUTOPAY TRANSACTIONS';&lt;BR /&gt;
&lt;BR /&gt;
would cause the UPCASE function to be used for the purpose of the comparison, so that no matter what case was -inside- the variable value, your comparion would be performed on an uppercase value. Also, what if for the first 100 rows or so in your worksheet, folks were really good and typed in "Autopay Transactions" but then for about another 500 rows, they typed only "Autopay Transaction" (without the S) and then for another 1000 rows, they typed just "Autopay" and then for the rest of the rows, they entered "AT"??? In order to select all of these possible values, you will have to code the WHERE or IF statement accordingly. PROC FREQ will show you exactly what is in the VAR14 variable.&lt;BR /&gt;
&lt;BR /&gt;
Do you know, unequivocally, that "Autopay Transactions" occurs -exactly- like that on every row that you want to select??? Proc Freq will show you -exactly- what is inside VAR14. What if you miscounted the columns and VAR14 does not contain 'Autopay Transactions' -- maybe it's VAR13 or VAR15.  It is possible that the columns in Excel did not get read in as you expected or that the variable numbers are not as you expect.&lt;BR /&gt;
&lt;BR /&gt;
What if you had this in the Excel sheet:&lt;BR /&gt;
[pre]&lt;BR /&gt;
      A                       B     C     D     E&amp;lt;--Excel col nums&lt;BR /&gt;
 1   VAR1                    VAR2  VAR3  VAR4 VAR5&lt;BR /&gt;
 2   Regular Transactions&lt;BR /&gt;
 3                          alan  111    111   111&lt;BR /&gt;
 4                          bob   222    222   222&lt;BR /&gt;
 5                          carl  333    333   333&lt;BR /&gt;
 6   Autopay Transactions&lt;BR /&gt;
 7                          dave  444    444   444&lt;BR /&gt;
 8                          edwin 555    555   555&lt;BR /&gt;
 ^&lt;BR /&gt;
 |Excel row nums&lt;BR /&gt;
[/pre]&lt;BR /&gt;
                &lt;BR /&gt;
In the above "sheet simulation" VAR1 shows a category of either Regular Transactions or Autopay Transactions. Rows 3, 4 and 5 are the Regular Transactions and rows 7 and 8 are the Autopay Transactions. However, when SAS reads in data that looks like this, SAS thinks that VAR1 is missing or blank for rows 3, 4, 5, 7 and 8.&lt;BR /&gt;
&lt;BR /&gt;
Without knowing what was originally in your worksheet, it is impossible to tell why your WHERE statement is NOT working. However, using PROC FREQ will allow you to explore the imported data to make sure that the data got imported into the variables correctly and WHAT is in each field. &lt;BR /&gt;
&lt;BR /&gt;
Consider the following PROC PRINT:&lt;BR /&gt;
[pre]&lt;BR /&gt;
  proc print data=sashelp.class;&lt;BR /&gt;
    title 'Eighteen year old students';&lt;BR /&gt;
    where age = 18;&lt;BR /&gt;
  run;&lt;BR /&gt;
[/pre]&lt;BR /&gt;
 &lt;BR /&gt;
When I run that PROC PRINT, the WHERE statement will result in this message in the log:&lt;BR /&gt;
[pre]&lt;BR /&gt;
207  proc print data=sashelp.class;&lt;BR /&gt;
208    title 'Eighteen year old students';&lt;BR /&gt;
209    where age = 18;&lt;BR /&gt;
210  run;&lt;BR /&gt;
              &lt;BR /&gt;
NOTE: No observations were selected from data set SASHELP.CLASS.&lt;BR /&gt;
NOTE: There were 0 observations read from the data set SASHELP.CLASS.&lt;BR /&gt;
      WHERE age=18;&lt;BR /&gt;
                &lt;BR /&gt;
[/pre]&lt;BR /&gt;
&lt;BR /&gt;
But, if I had run a PROC FREQ before hand, I would have already known that nobody was age 18 in the data:&lt;BR /&gt;
[pre]&lt;BR /&gt;
LOG:&lt;BR /&gt;
212  proc freq data=sashelp.class;&lt;BR /&gt;
213    tables age;&lt;BR /&gt;
214  run;&lt;BR /&gt;
           &lt;BR /&gt;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.&lt;BR /&gt;
NOTE: PROCEDURE FREQ used (Total process time):&lt;BR /&gt;
      real time           0.59 seconds&lt;BR /&gt;
      cpu time            0.00 seconds&lt;BR /&gt;
              &lt;BR /&gt;
OUTPUT:&lt;BR /&gt;
                                                    Cumulative    Cumulative&lt;BR /&gt;
                    Age    Frequency     Percent     Frequency      Percent&lt;BR /&gt;
                    --------------------------------------------------------&lt;BR /&gt;
                     11           2       10.53             2        10.53&lt;BR /&gt;
                     12           5       26.32             7        36.84&lt;BR /&gt;
                     13           3       15.79            10        52.63&lt;BR /&gt;
                     14           4       21.05            14        73.68&lt;BR /&gt;
                     15           4       21.05            18        94.74&lt;BR /&gt;
                     16           1        5.26            19       100.00&lt;BR /&gt;
[/pre]&lt;BR /&gt;
                         &lt;BR /&gt;
cynthia</description>
      <pubDate>Wed, 16 Jun 2010 15:51:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35986#M5182</guid>
      <dc:creator>Cynthia_sas</dc:creator>
      <dc:date>2010-06-16T15:51:05Z</dc:date>
    </item>
    <item>
      <title>Re: Subsetting Issue</title>
      <link>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35987#M5183</link>
      <description>Many thanks Cynthia, i will apply these techniques and let you know how it goes.</description>
      <pubDate>Wed, 16 Jun 2010 16:16:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/ODS-and-Base-Reporting/Subsetting-Issue/m-p/35987#M5183</guid>
      <dc:creator>chuckdee4</dc:creator>
      <dc:date>2010-06-16T16:16:18Z</dc:date>
    </item>
  </channel>
</rss>

