I am using SAS Studio. On my very first SAS homework, I have to manually enter the data. The first column is a non-numeric indicator ($); the second column contains both positive and negative values. After manually entering the code below, I ran the code to inspect the data and the negative values are showing up as *
Code:
DATA a;
INPUT var1 $ 1 var2 3;
DATALINES;
1 0
1 -3
1 -2
1 -1
2 1
2 2
2 1
2 0
3 10
3 12
3 8
3 5
4 3
4 2
4 3
4 4
;
RUN;
QUIT;
Screen shot of data view attached. Many thanks.
Accepted Solutions
@eileencant wrote:
Thanks, everyone. I don't know enough SAS to interpret your messages, but I very much appreciate your time, and I eventually got it to work by adding more columns. Here is what worked, and I truly appreciate your help as a SAS novice. (I don't even know if the "version" I am using is "SAS Studio" or "SAS UE", but I should have mentioned this.) New code:
DATA a;
INPUT var1 $ 1 var2 3-5;
When you put digits after a variable name on an INPUT statement, you are telling SAS exactly which column(s) to read from. So when you wrote
input var1 $ 1 var2 3;
you said that var2 exists only in column 3. Without the $, the variable is numeric. So when the - sign was encountered in column 3, that was all SAS was allowed to read, and a negative sign needs at least one digit after it to form a valid number.
This style of reading values is called "Fixed Column" input, because you read only from the indicated column(s) for each variable. When you changed the specification to 3-5, you were reading from columns 3 through 5. The + signs and leading zeros weren't needed.
Another form of input is called "List" input; it assumes that values are separated by one or more delimiter characters, defaulting to spaces.
So
input var1 $ var2;
says the first variable is a character value of 1 to 8 characters (the default for $; you need to provide more information to read longer values), and then, after one or more spaces, SAS looks for a numeric value (its width is basically irrelevant, though precision issues arise beyond about 15 digits).
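The two input styles described above can be compared side by side. This is a minimal sketch using a few rows of the original data without the + signs or leading zeros; the data set names col_input and list_input are made up for illustration. Note that the data lines start in column 1, which matters for column input.

```sas
/* Column input: var1 comes from column 1, var2 from columns 3 through 5 */
DATA col_input;
  INPUT var1 $ 1 var2 3-5;
  DATALINES;
1 0
1 -3
3 10
;
RUN;

/* List input: values are simply separated by one or more blanks */
DATA list_input;
  INPUT var1 $ var2;
  DATALINES;
1 0
1 -3
3 10
;
RUN;

/* Print both data sets to compare the values that were read */
PROC PRINT DATA=col_input; RUN;
PROC PRINT DATA=list_input; RUN;
```

Both steps should read the negative value without an "Invalid data" note, because in each case SAS is allowed to see the digit that follows the minus sign.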
Look at your log before you look at your data display. When I run your code, the log shows:
1378 DATA a;
1379 INPUT var1 $ 1 var2 3;
1380 DATALINES;
NOTE: Invalid data for var2 in line 1382 3-3.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1382 1 -3
var1=1 var2=. _ERROR_=1 _N_=2
NOTE: Invalid data for var2 in line 1383 3-3.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
1383 1 -2
var1=1 var2=. _ERROR_=1 _N_=3
NOTE: Invalid data for var2 in line 1384 3-3.
1384 1 -1
var1=1 var2=. _ERROR_=1 _N_=4
NOTE: The data set WORK.A has 16 observations and 2 variables.
Interpret the NOTE on the log, which says invalid data for VAR2 in line xxx 3-3 (columns 3 through 3). Look at column 3 in the RULE line and DATA line that follow.
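One way to see what the NOTE is pointing at is to read the same columns as character data and write them to the log. This is only an illustrative sketch; the data set name peek and the variables col3 and col3to5 are hypothetical names introduced here.

```sas
DATA peek;
  /* Read column 3 alone, then columns 3-5, as raw character text */
  INPUT @3 col3 $CHAR1. @3 col3to5 $CHAR3.;
  /* Write both pieces to the log for each data line */
  PUT col3= col3to5=;
  DATALINES;
1 0
1 -3
3 10
;
RUN;
```

For the line containing -3, column 3 by itself holds only the minus sign, which is why a numeric variable restricted to that column cannot be read.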
Your INPUT statement is incorrect - check your SAS log for the messages regarding invalid data. Try this:
INPUT var1 $ var2;
Thanks, everyone. I don't know enough SAS to interpret your messages, but I very much appreciate your time, and I eventually got it to work by adding more columns. Here is what worked, and I truly appreciate your help as a SAS novice. (I don't even know if the "version" I am using is "SAS Studio" or "SAS UE", but I should have mentioned this.) New code:
DATA a;
INPUT var1 $ 1 var2 3-5;
DATALINES;
1 +00
1 -03
1 -02
1 -01
2 +01
2 +02
2 +01
2 +00
3 +10
3 +12
3 +08
3 +05
4 +03
4 +02
4 +03
4 +04
;
RUN;
QUIT;
Thank you so much. This is the explanation I was lacking. I appreciate your time and clarity.
Read your LOG.
You would have seen something like:
NOTE: Invalid data for var2 in line 1040 3-3.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+-
1040 1 -3
var1=1 var2=. _ERROR_=1 _N_=2
NOTE: Invalid data for var2 in line 1041 3-3.
1041 1 -2
var1=1 var2=. _ERROR_=1 _N_=3
NOTE: Invalid data for var2 in line 1042 3-3.
1042 1 -1
var1=1 var2=. _ERROR_=1 _N_=4
NOTE: The data set USER.A has 16 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
What was your purpose for having the 3 at the end of the INPUT statement? It is really not a good idea to attempt to limit the length of numeric variables, as the storage for numeric values is different than for character values.
Try:
DATA a;
INPUT var1 $1. var2 ;
DATALINES;
1 0
1 -3
1 -2
1 -1
2 1
2 2
2 1
2 0
3 10
3 12
3 8
3 5
4 3
4 2
4 3
4 4
;
RUN;
and see if that behaves better.
You attempted to put a value into less space than needed to store it.
Length in Bytes | Largest Integer Represented Exactly | Exponential Notation | Significant Digits Retained |
---|---|---|---|
3 | 8,192 | 2^13 | 3 |
4 | 2,097,152 | 2^21 | 6 |
5 | 536,870,912 | 2^29 | 8 |
6 | 137,438,953,472 | 2^37 | 11 |
7 | 35,184,372,088,832 | 2^45 | 13 |
8 | 9,007,199,254,740,992 | 2^53 | 15 |
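The storage lengths in the table are set with a LENGTH statement, which is a separate thing from the column number on an INPUT statement. The sketch below is a hedged illustration of the first table row, assuming a platform where 3 bytes is a valid numeric length; the names length_demo, short_num, full_num, and value are hypothetical.

```sas
DATA length_demo;
  /* Storage lengths in bytes: 3 is the smallest row in the table, 8 is the default */
  LENGTH short_num 3 full_num 8;
  /* Try integers just below, at, and just above the 3-byte limit of 8,192 */
  DO value = 8191, 8192, 8193;
    short_num = value;
    full_num  = value;
    OUTPUT;
  END;
RUN;

/* Compare the stored values of short_num and full_num */
PROC PRINT DATA=length_demo; RUN;
```

If the table applies to your system, short_num may no longer match full_num once the value goes past 8,192, which is the risk of shortening numeric variables.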