BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tnmahmoo
Fluorite | Level 6

Hello, 

 

I would like 5-10 min of someone's time to help me with the Dietary Screener Questionnaire SAS code. I do not know how to use SAS, some help from my advisor, youtube videos and this community have helped me with the first few errors but now I am at a dead end.

 

Below are both the instructions and coding that led to the error: variable cereal2fc has been defined as both character and numeric. The screenshot is what the columns look like in the database relevant to this code. 

 

Please let me know what code(s) do I need to add/edit to resolve this error. 

 

Your help will be greatly appreciated, thank you for your time!

 

DSQ_020 and DSQ_xx3 MUST be converted to an 8 digit foodcode by the user
1592
1593
1594
1595 The current variables (DSQ_020 and DSQ_xx3) are character variables that represent
1596
1597 assorted cereals. Using the 'calib.DSQ.cereal.ntile.xlsx' file, the user
1598
1599 must create new variables, which will be called:
1600
1601
1602
1603 food_code_1 and
1604
1605 food_code_2
1606
1607
1608
1609 and contain an eight digit FNDDS foodcode which represents the particular cereal
1610
1611 that was reported. If only one cereal is reported, food_code_2 should remain blank.
1612

 

Below is the actual coding, error message is located towards the end.

 

1821 data ntile;
1822 set ntile.'CALIB.DSQ.CEREAL.NTILE'n; (this is a separate excel file used that contained all of the possible food codes and cereal brands for this program.)
1823 run;

NOTE: The import data set has 392 observations and 7 variables.
NOTE: There were 392 observations read from the data set NTILE.'CALIB.DSQ.CEREAL.NTILE'n.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds


1824
1825 proc sort data=ntile;
1826 by food_code; 
1827 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1828
1829 data ntile (keep=cereal1fc cereal2fc whgnt sugnt calcnt fibnt);
1830 set ntile;
1831 by food_code;
1832 if first.food_code;
1833 cereal1fc=food_code;
1834 cereal2fc=food_code;
1835 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1836
1837 proc sort data=dtq;
1838 by cereal1fc;
1839 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 63 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


1840
1841 proc sort data=ntile;
1842 by cereal1fc;
1843 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.06 seconds
cpu time 0.00 seconds


1844
1845 data dtq (drop= whgnt sugnt calcnt fibnt);
1846 merge dtq (in=d) ntile (drop=cereal2fc);
1847 by cereal1fc;
1848 if d;
1849
1850 c1whgnt=whgnt;
1851 c1sugnt=sugnt;
1852 c1calcnt=calcnt;
1853 c1fibnt=fibnt;
1854 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


1855
1856 proc sort data=dtq;
1857 by cereal2fc;
1858 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds


1859
1860 proc sort data=ntile;
1861 by cereal2fc;
1862 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1863
1864 data dtq (drop= whgnt sugnt calcnt fibnt);
1865 merge dtq (in=d) ntile (drop=cereal1fc);
ERROR: Variable cereal2fc has been defined as both character and numeric.
1866 by cereal2fc;
1867 if d;
1868
1869 c2whgnt=whgnt;
1870 c2sugnt=sugnt;
1871 c2calcnt=calcnt;
1872 c2fibnt=fibnt;
1873 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DTQ may be incomplete. When this step was stopped there were 0
observations and 71 variables.
WARNING: Data set WORK.DTQ was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds

 

DSQ database.JPG 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You have two choices to fix this, and I would recommend option #2.

 

1. Convert the variables in question type to the type that it should be - refer to the source documentation to see if it's a character or numeric variable. 

2. Redo your import procedures to correctly import the data with the same types and structures. 

 

 

PROC IMPORT is a guessing procedure. It guesses at types. Excel does not enforce any concept of types on the fields, you can have characters and numeric values in a  single column. This makes it a bad way to store data. Databases, and SAS, enforce types when storing data and SAS has only two types, numeric and character.

 

When SAS imports from Excel it guesses at your variable types, sometimes incorrectly as it did in this case. Files considered identical to our eyes can have differences that causes the same variable to be imported as character in one file and numeric in another. Then when you try and combine the files you get an error about mismatch of types. The optimal solution is to fix the import. The reason I'm recommending this is because this type of survey data is usually shared as either a SAS data set or CSV, so you should have another option to import. The other reason, is if it read the data incorrectly in the first time you can't be sure it is correct for your research/paper and you most definitely want to make sure it is correct. If you convert, you'll only convert the data already read in as correct and you don't know what SAS ignored the first time.  From your screenshot it looks like periods in the file. SAS will interpret that to be a character field. You really do want to use a data step to import the file from a CSV ideally, or a XPORT/CPORT file. 

 


@tnmahmoo wrote:

Hello, 

 

I would like 5-10 min of someone's time to help me with the Dietary Screener Questionnaire SAS code. I do not know how to use SAS, some help from my advisor, youtube videos and this community have helped me with the first few errors but now I am at a dead end.

 

Below are both the instructions and coding that led to the error: variable cereal2fc has been defined as both character and numeric. The screenshot is what the columns look like in the database relevant to this code. 

 

Please let me know what code(s) do I need to add/edit to resolve this error. 

 

Your help will be greatly appreciated, thank you for your time!

 

DSQ_020 and DSQ_xx3 MUST be converted to an 8 digit foodcode by the user
1592
1593
1594
1595 The current variables (DSQ_020 and DSQ_xx3) are character variables that represent
1596
1597 assorted cereals. Using the 'calib.DSQ.cereal.ntile.xlsx' file, the user
1598
1599 must create new variables, which will be called:
1600
1601
1602
1603 food_code_1 and
1604
1605 food_code_2
1606
1607
1608
1609 and contain an eight digit FNDDS foodcode which represents the particular cereal
1610
1611 that was reported. If only one cereal is reported, food_code_2 should remain blank.
1612

 

Below is the actual coding, error message is located towards the end.

 

1821 data ntile;
1822 set ntile.'CALIB.DSQ.CEREAL.NTILE'n; (this is a separate excel file used that contained all of the possible food codes and cereal brands for this program.)
1823 run;

NOTE: The import data set has 392 observations and 7 variables.
NOTE: There were 392 observations read from the data set NTILE.'CALIB.DSQ.CEREAL.NTILE'n.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds


1824
1825 proc sort data=ntile;
1826 by food_code; 
1827 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1828
1829 data ntile (keep=cereal1fc cereal2fc whgnt sugnt calcnt fibnt);
1830 set ntile;
1831 by food_code;
1832 if first.food_code;
1833 cereal1fc=food_code;
1834 cereal2fc=food_code;
1835 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1836
1837 proc sort data=dtq;
1838 by cereal1fc;
1839 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 63 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


1840
1841 proc sort data=ntile;
1842 by cereal1fc;
1843 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.06 seconds
cpu time 0.00 seconds


1844
1845 data dtq (drop= whgnt sugnt calcnt fibnt);
1846 merge dtq (in=d) ntile (drop=cereal2fc);
1847 by cereal1fc;
1848 if d;
1849
1850 c1whgnt=whgnt;
1851 c1sugnt=sugnt;
1852 c1calcnt=calcnt;
1853 c1fibnt=fibnt;
1854 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


1855
1856 proc sort data=dtq;
1857 by cereal2fc;
1858 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds


1859
1860 proc sort data=ntile;
1861 by cereal2fc;
1862 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1863
1864 data dtq (drop= whgnt sugnt calcnt fibnt);
1865 merge dtq (in=d) ntile (drop=cereal1fc);
ERROR: Variable cereal2fc has been defined as both character and numeric.
1866 by cereal2fc;
1867 if d;
1868
1869 c2whgnt=whgnt;
1870 c2sugnt=sugnt;
1871 c2calcnt=calcnt;
1872 c2fibnt=fibnt;
1873 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DTQ may be incomplete. When this step was stopped there were 0
observations and 71 variables.
WARNING: Data set WORK.DTQ was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds

 

DSQ database.JPG 


 

View solution in original post

5 REPLIES 5
PGStats
Opal | Level 21

"I would like 5-10 min of someone's time to help me"

 

How did you estimate the time required?

PG
tnmahmoo
Fluorite | Level 6
I estimated it would take 5-10 min for someone who knows this error/SAS coding to see the info and give a response. If you know the error and have a suggestion, please let me know, I would greatly appreciate your time in helping me. Thanks!
Kurt_Bremser
Super User

The error is a consequence of using the Excel file format. SAS has to guess the types of variables, and sometimes it guesses wrong and causes inconsistencies. In this case different types for your key variable.

Save your data to csv files from Excel, and read those with data steps, so you have control over column attributes.

tnmahmoo
Fluorite | Level 6

Thank you so much for @Kurt_Bremser and @Reeza! I re-ran my SAS code with CSV files and got through that error by adjusting my codes to data steps. I did not know what data steps meant so I googled it and came across this: https://libguides.library.kent.edu/SAS/DataStep which helped me a lot with adjusting my codes.

I do have other errors in my log that appeared but I will post another message if I cannot resolve them. 

 

I really appreciate your time and efforts! 

 

To those who need the code I ended up with, here it is...

 

filename yourdata"C:\Users\tnmah\Desktop\Tara Stuff\DSQ files\SBMDdata.csv";  

 

proc import datafile ="C:\Users\tnmah\Desktop\Tara Stuff\DSQ files\SBMDdata.csv"
dbms=CSV
out=work.SBMDdata
replace;
Getnames=YES;
run;

PROC PRINT DATA=WORK.SBMDdata; RUN;


libname ntile xlsx "C:\Users\tnmah\Desktop\Tara Stuff\DSQ files\calib.DSQ.cereal.ntile.csv";

libname psize xlsx "C:\Users\tnmah\Desktop\Tara Stuff\DSQ files\calib.portion.size.csv";

libname rcoeff xlsx "C:\Users\tnmah\Desktop\Tara Stuff\DSQ files\calib.equation.coeff.csv";


proc format;

value gender 1='male'

2='female';
run;


data dtq;

set SBMDdata;

ageinyr=DSQ_xx1;

gender=DSQ_xx2;

if DSQ_xx2='A' then gender=1;
if DSQ_xx2='B' then gender=2;

cereal1fc=food_code_1;

cereal2fc=food_code_2;

if cereal1fc=. and cereal2fc=. then numcer=0;
if cereal1fc > 0 and cereal2fc > 0 then numcer=2;
if cereal1fc > 0 then numcer=1;
run;

 

I hope this helps anyone else who is using the dietary screener questionnaire SAS package provided online.

Reeza
Super User

You have two choices to fix this, and I would recommend option #2.

 

1. Convert the variables in question type to the type that it should be - refer to the source documentation to see if it's a character or numeric variable. 

2. Redo your import procedures to correctly import the data with the same types and structures. 

 

 

PROC IMPORT is a guessing procedure. It guesses at types. Excel does not enforce any concept of types on the fields, you can have characters and numeric values in a  single column. This makes it a bad way to store data. Databases, and SAS, enforce types when storing data and SAS has only two types, numeric and character.

 

When SAS imports from Excel it guesses at your variable types, sometimes incorrectly as it did in this case. Files considered identical to our eyes can have differences that causes the same variable to be imported as character in one file and numeric in another. Then when you try and combine the files you get an error about mismatch of types. The optimal solution is to fix the import. The reason I'm recommending this is because this type of survey data is usually shared as either a SAS data set or CSV, so you should have another option to import. The other reason, is if it read the data incorrectly in the first time you can't be sure it is correct for your research/paper and you most definitely want to make sure it is correct. If you convert, you'll only convert the data already read in as correct and you don't know what SAS ignored the first time.  From your screenshot it looks like periods in the file. SAS will interpret that to be a character field. You really do want to use a data step to import the file from a CSV ideally, or a XPORT/CPORT file. 

 


@tnmahmoo wrote:

Hello, 

 

I would like 5-10 min of someone's time to help me with the Dietary Screener Questionnaire SAS code. I do not know how to use SAS, some help from my advisor, youtube videos and this community have helped me with the first few errors but now I am at a dead end.

 

Below are both the instructions and coding that led to the error: variable cereal2fc has been defined as both character and numeric. The screenshot is what the columns look like in the database relevant to this code. 

 

Please let me know what code(s) do I need to add/edit to resolve this error. 

 

Your help will be greatly appreciated, thank you for your time!

 

DSQ_020 and DSQ_xx3 MUST be converted to an 8 digit foodcode by the user
1592
1593
1594
1595 The current variables (DSQ_020 and DSQ_xx3) are character variables that represent
1596
1597 assorted cereals. Using the 'calib.DSQ.cereal.ntile.xlsx' file, the user
1598
1599 must create new variables, which will be called:
1600
1601
1602
1603 food_code_1 and
1604
1605 food_code_2
1606
1607
1608
1609 and contain an eight digit FNDDS foodcode which represents the particular cereal
1610
1611 that was reported. If only one cereal is reported, food_code_2 should remain blank.
1612

 

Below is the actual coding, error message is located towards the end.

 

1821 data ntile;
1822 set ntile.'CALIB.DSQ.CEREAL.NTILE'n; (this is a separate excel file used that contained all of the possible food codes and cereal brands for this program.)
1823 run;

NOTE: The import data set has 392 observations and 7 variables.
NOTE: There were 392 observations read from the data set NTILE.'CALIB.DSQ.CEREAL.NTILE'n.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds


1824
1825 proc sort data=ntile;
1826 by food_code; 
1827 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 392 observations and 7 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds


1828
1829 data ntile (keep=cereal1fc cereal2fc whgnt sugnt calcnt fibnt);
1830 set ntile;
1831 by food_code;
1832 if first.food_code;
1833 cereal1fc=food_code;
1834 cereal2fc=food_code;
1835 run;

NOTE: There were 392 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1836
1837 proc sort data=dtq;
1838 by cereal1fc;
1839 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 63 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


1840
1841 proc sort data=ntile;
1842 by cereal1fc;
1843 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.06 seconds
cpu time 0.00 seconds


1844
1845 data dtq (drop= whgnt sugnt calcnt fibnt);
1846 merge dtq (in=d) ntile (drop=cereal2fc);
1847 by cereal1fc;
1848 if d;
1849
1850 c1whgnt=whgnt;
1851 c1sugnt=sugnt;
1852 c1calcnt=calcnt;
1853 c1fibnt=fibnt;
1854 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds


1855
1856 proc sort data=dtq;
1857 by cereal2fc;
1858 run;

NOTE: There were 61 observations read from the data set WORK.DTQ.
NOTE: The data set WORK.DTQ has 61 observations and 67 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds


1859
1860 proc sort data=ntile;
1861 by cereal2fc;
1862 run;

NOTE: There were 283 observations read from the data set WORK.NTILE.
NOTE: The data set WORK.NTILE has 283 observations and 6 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds


1863
1864 data dtq (drop= whgnt sugnt calcnt fibnt);
1865 merge dtq (in=d) ntile (drop=cereal1fc);
ERROR: Variable cereal2fc has been defined as both character and numeric.
1866 by cereal2fc;
1867 if d;
1868
1869 c2whgnt=whgnt;
1870 c2sugnt=sugnt;
1871 c2calcnt=calcnt;
1872 c2fibnt=fibnt;
1873 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.DTQ may be incomplete. When this step was stopped there were 0
observations and 71 variables.
WARNING: Data set WORK.DTQ was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds

 

DSQ database.JPG 


 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 1201 views
  • 0 likes
  • 4 in conversation