02-14-2014 06:23 AM
When I run a JOB in DI the following warning appears:
- WARNING: Variable FLAG has different lengths on BASE and DATA files (BASE 3 DATA 8).
Anyone know how to resolve this warning?
This field is declared in the table as numeric (3).
I have physically deleted the table and the warning continues.
02-14-2014 10:52 AM
Well, I don't know much about the SCD Type 1 Loader, but the warning seems to be produced by a Proc Append with Force.
The following small SAS program generates the same Warning message:
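data base;
   length flag 3;                /* target table: short numeric */
   do flag=1 to 5; output; end;
run;
data data;
   length flag 8;                /* incoming table: default numeric length */
   do flag=6 to 8; output; end;
run;
proc append base=base data=data force;
run;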
17 data base;
18 length flag 3;
19 do flag=1 to 5;
NOTE: The data set WORK.BASE has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
23 data data;
24 length flag 8;
25 do flag=6 to 8;
NOTE: The data set WORK.DATA has 3 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
29 proc append base=base data=data force;
NOTE: Appending WORK.DATA to WORK.BASE.
WARNING: Variable flag has different lengths on BASE and DATA files (BASE 3 DATA 8).
NOTE: FORCE is specified, so dropping/truncating will occur.
NOTE: There were 3 observations read from the data set WORK.DATA.
NOTE: 3 observations added.
NOTE: The data set WORK.BASE has 8 observations and 1 variables.
NOTE: PROCEDURE APPEND used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Look at the code generated by the transformation, and if this is the case then you should follow the suggestion given by data_null_;
02-15-2014 03:32 AM
Even though it's possible, I wouldn't use anything other than 8 bytes for a numeric SAS variable. If it's a flag then you could instead define the column as character with a length of '1'.
You shouldn't get the warning if:
- The length of the flag variable on metadata level is '3' from source to target
- You delete the underlying physical target table so that it gets recreated with the variable attributes as defined in the metadata target table.
If the warning doesn't go away: Investigate the code generated by the transformation, especially the bit which creates the target variable. If there is no length statement for the flag variable, then it gets created with the default length of '8'.
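A minimal sketch of that check outside DI Studio, assuming the WORK.BASE and WORK.DATA tables from the small example earlier in the thread (the table name DATA_FIXED is hypothetical, and the code your job generates will look different): first compare the stored lengths, then give the incoming column the same length as the target before appending.

```sas
/* Compare the stored length of FLAG in both tables */
proc sql;
   select memname, name, type, length
      from dictionary.columns
      where libname = 'WORK'
        and memname in ('BASE', 'DATA')
        and upcase(name) = 'FLAG';
quit;

/* Rebuild the incoming table with the target's length.          */
/* The LENGTH statement comes before SET so it takes precedence. */
data data_fixed;                 /* hypothetical table name */
   length flag 3;
   set data;
run;

/* The lengths now agree, so FORCE is no longer needed */
proc append base=base data=data_fixed;
run;
```

This only makes sense if every incoming value really fits in 3 bytes; otherwise it is safer to widen the target column instead.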
02-15-2014 11:24 AM
I disagree. I have had excellent experience using shorter SAS numeric variables for storing large numbers of small integers. However, you MUST know what you're doing.
02-15-2014 01:37 PM
@Tom I disagree with you on the freedom of length for numeric variables. See: SAS Note 41214 - Observation length, alignment, and padding of a SAS data set
The other reference is SAS(R) 9.4 Language Reference: Concepts, Second Edition (numeric precision), as most people do not even understand the basic mathematics behind that (partially using 2's complement). Not understanding this sometimes leads to failed expectations.
The worst thing I have seen is moving data between mainframe and Windows and losing precision with 4 digits.
As those were region numbers, not even meant to be calculated with, they would have been better defined as character.
There is already a common mistake here: storing account numbers or gender (F/M coded as 0-1) as numbers is a misperception coming from the Hollerith card, and it is still common practice. But the IBAN SEPA bank account number is a joke as a number, as it contains letters (International Bank Account Number - Wikipedia, the free encyclopedia). Calculating mod-97 requires handling precision with 30 digits (decimal, not binary). Proc DS2 to the rescue. No floating point or whatever.
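For what it's worth, here is a sketch of a different technique than the DS2 decimal approach mentioned above: a plain DATA step can get the mod-97 remainder by carrying a running remainder character by character, so the 30-digit number never has to be stored. The IBAN used is the example from the Wikipedia page; the variable names are my own, and the code assumes the input contains only letters, digits and blanks.

```sas
data _null_;
   length iban rearranged $34 ch $1;
   iban = compress('GB82 WEST 1234 5698 7654 32');   /* remove blanks */

   /* Move the country code and check digits to the end */
   rearranged = cats(substr(iban, 5), substr(iban, 1, 4));

   remainder = 0;
   do i = 1 to length(rearranged);
      ch = upcase(char(rearranged, i));
      if anydigit(ch) then do;
         /* a digit extends the number by one decimal position */
         remainder = mod(remainder * 10 + input(ch, 1.), 97);
      end;
      else do;
         /* letters map to A=10 ... Z=35, i.e. two decimal positions.   */
         /* INDEX on a literal alphabet avoids relying on the collating */
         /* sequence, which differs between ASCII and EBCDIC.           */
         value = index('ABCDEFGHIJKLMNOPQRSTUVWXYZ', ch) + 9;
         remainder = mod(remainder * 100 + value, 97);
      end;
   end;

   /* A valid IBAN leaves a remainder of 1 */
   put iban= remainder=;
run;
```

The intermediate values never exceed four digits, so ordinary 8-byte numerics are exact here. It also shows why the IBAN itself belongs in a character column.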
02-15-2014 04:46 PM
Thank you for the reference to the SAS Note...very interesting!
And yes, you're pretty much correct on everything else in your note. I too had the "joy" of transferring large numbers of numeric variables between mainframe, Unix, and Windows. What fun!
How about if I amend my comment to the following:
I believe that there are circumstances in which using shorter SAS numeric variables can be very beneficial, such as instances of storing large numbers of small integers. However, before doing so ensure that you are VERY familiar with the underlying technologies, and the implications of doing this. If you're uncertain, use full length numerics.
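To make "the implications" concrete, a small sketch (table and variable names are illustrative only). As far as I know, the documented largest integer that a length 3 numeric stores exactly on Windows and Unix is 8,192, so values just above that are a good test. Truncation happens when the observation is written to the data set, which is why the values are read back in a second step.

```sas
/* Store the same values in a 3-byte and an 8-byte numeric */
data short;
   length x3 3 x8 8;
   do value = 8191, 8192, 8193, 8194;
      x3 = value;
      x8 = value;
      output;          /* x3 is shortened to 3 bytes here */
   end;
run;

/* Read back and show what was lost, if anything */
data _null_;
   set short;
   lost = x8 - x3;
   put value= x3= x8= lost=;
run;
```

On a mainframe the limits are different, which is one reason moving short numerics between platforms goes wrong.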
02-16-2014 03:44 AM
Agree with that one.
And to add: do not mix up numbers (floating point) with characters that are limited to a dedicated range, as those are constraints.
02-16-2014 01:49 PM
A topic to have some fun with!
Starting with the black and white ends of the spectrum:
Both numbers, with potentially infinite numbers of digits and infinite numbers of decimal places, to the point of measurement error.
Classifications with values being limited to two; the fact that they are integers is only coincidental, they could also be alphabetic, funny characters, Greek letters, etc.
In the first cases, they can clearly be used as the subject of mathematical operations and analysis, and in the second they can't.
Moving into gray:
These are both frequently used as classification variables, and as subjects of analysis (in some cases both in the same statistical result).
One horrible misuse: using floating point numbers to store keys that have decimal places, like library book numbers, thereby ensuring that they will NEVER match (yes, I got to live the adventure once. I just about died when I found out what they'd done).
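A tiny sketch of why such keys fail to match, with made-up key values: the "same" decimal key arrives once as typed and once as the result of a calculation, and on IEEE platforms the two need not be bit-for-bit equal. The character version of a key has no such problem.

```sas
data _null_;
   /* Numeric keys: the same decimal value reached in two ways */
   num_key_typed    = 0.3;
   num_key_computed = 3 * 0.1;
   if num_key_typed = num_key_computed then put 'numeric keys match';
   else put 'numeric keys do NOT match';

   /* Character keys: compared byte for byte, no representation error */
   length char_key_a char_key_b $8;
   char_key_a = '025.431';                 /* illustrative book number */
   char_key_b = cats('025', '.', '431');
   if char_key_a = char_key_b then put 'character keys match';
   else put 'character keys do NOT match';
run;
```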
I've never tripped over a theoretical treatment of this topic, but I think that there are subtleties that go beyond having only two types of data under consideration.
I'd be interested in everyone's opinion on whether there is a conceptual treatment of this issue, or if we're doomed always to fly by the seat of our pants.
02-16-2014 02:46 PM
Assuming SAS data types:
02-16-2014 03:56 PM
Nice getting reactions.......
The first black/white points should be clear: they can fail terribly. The number(8) just does not support that number of digits.
Assume you were asked to print the first 100 digits of pi. Yes, for normal human environments 5 digits would be sufficient, but in some areas you want more: measurements with a high degree of accuracy (History of the metre - Wikipedia, the free encyclopedia).
The old Greeks did not have numerals but used the letters of the alphabet for that. We have only been using them for a couple of ages (French Revolution).
The Greeks and Romans did not use numbers as we know them. That is an Arabic-Indian inheritance, including the masterpiece of the number 0.
It is the Hollerith card (Punched card - Wikipedia, the free encyclopedia) that caused the misperception among many people that only coding with 0-9 is allowed. Indeed it is better to use a char type and then use letters, so you are not trapped by the temptation of doing calculations. Think of specifying the expected value of sex as 0.5, and then wondering why this value is not in the dataset (....this question was asked in real life).
Age: numbers of length 4 would be sufficient, but which unit are you using, years or days? In healthcare, days would be more applicable for measuring effects. Years, on the other hand, would do for binning the age of a person. Going to seconds and beyond, keep it at 8 bytes.
I have seen people introducing dates like 32 May as an indicator of a product status alteration, not aware that they could also use letters in the product status field.
Geography, that is a nice area with measures. I would go for it on my holidays, navigating on a sailing boat. The Earth's great circle is about 40,000 km, as defined under Napoleon with 100 degrees for a right angle. But we use 90 degrees for that, with minutes (divided by 60) and seconds (divided by 60), giving 1 NM (1852 m), leaving out feet, fathoms, land miles, etc. Use the N-S lines for that, not the E-W ones, as those do not follow a great circle. In mathematics angles are measured in radians. Terrible calculations on that. SAS/GRAPH map datasets are given in radians. Projections transform everything. In a Mercator projection you will be off by about 10 m at the edges of a map representing 300 km.
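The arithmetic behind those units, as a rough sketch. The 40,000 km figure is the rounded value from the post; the variable names are mine.

```sas
data _null_;
   circumference_km = 40000;      /* rounded great-circle length */

   /* 360 degrees of 60 minutes each: one arc minute is one nautical mile */
   km_per_arcminute = circumference_km / (360 * 60);

   /* Decimal system: 400 grades (100 per right angle) of 100 minutes each */
   km_per_centigrade = circumference_km / (400 * 100);

   /* Degrees to radians */
   right_angle_deg = 90;
   right_angle_rad = right_angle_deg * constant('pi') / 180;

   put km_per_arcminute= 8.4 km_per_centigrade= 8.4 right_angle_rad= 8.6;
run;
```

The first result comes out close to the 1.852 km of the nautical mile, and the second at 1 km, which is the whole idea behind the metric definition.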
Ah, financial, also nice. Amounts are given with two digits as cents. But what happens when calculating interest? It should be calculated as continuous, not chopped into discrete intervals. But working with continuous interest means you are actually processing floating point values. Compound interest - Wikipedia, the free encyclopedia. What Every Computer Scientist Should Know About Floating-Point Arithmetic. There are some anecdotes on working with those roundings. snopes.com: The Salami Embezzlement Technique. In real life it is hard to prove that.
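A small sketch of discrete versus continuous compounding, using the standard formulas P(1 + r/n)^(nt) and P·e^(rt). The principal, rate and term are made-up values.

```sas
data _null_;
   principal = 1000;    /* illustrative values only */
   rate      = 0.05;    /* yearly interest rate     */
   years     = 10;

   /* Discrete compounding: yearly, monthly, daily */
   do periods = 1, 12, 365;
      amount = principal * (1 + rate / periods) ** (periods * years);
      put periods= amount= 12.4;
   end;

   /* Continuous compounding: the limit as periods grows without bound */
   continuous = principal * exp(rate * years);
   put continuous= 12.4;
run;
```

None of these results is a whole number of cents, so somewhere a rounding rule has to be chosen, and that is where the anecdotes come from.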
I remember an example of a conversion that was rejected because there was a difference of 3,67 after working on amounts running into billions. After careful examination it really was wrong.
Another example was a complaint that the amount of 50,00 was not the same as 50,00 after some calculations had been done (the values were not rounded before the comparison).
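The same effect in a few lines, with a smaller amount than the 50,00 of the complaint: ten times 0.10 should be 1.00, but on IEEE platforms the exact comparison can fail until both sides are rounded to cents. Just a sketch with made-up values.

```sas
data _null_;
   total = 0;
   do i = 1 to 10;
      total = total + 0.1;       /* ten payments of 0.10 */
   end;

   /* Exact comparison of the raw floating point result */
   if total = 1 then put 'exact comparison: equal';
   else put 'exact comparison: NOT equal';

   /* Round to cents before comparing */
   if round(total, 0.01) = 1 then put 'rounded comparison: equal';
   else put 'rounded comparison: NOT equal';

   /* The size of the difference, if any */
   diff = total - 1;
   put diff= e12.;
run;
```

Printed with a 2-decimal format both values look identical, which is exactly why the complaint was so confusing.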
Oh yes, the obvious integer amounts still have their challenges. The same kind of calculations failed to meet human expectations on the first calculators (1970). These days they mask that, so you are not aware of it anymore.