@Patrick wrote:
In principle the results should be the same. There could be environment differences, such as your friend running the code with UTF-8 session encoding and you with ANSI (a single-byte encoding environment). If that's the case, do you see any "funny" characters in your data? Such issues could cause character-based expressions to return undesired results.

Thanks Patrick. You're right that there are some encoding issues in the data. I import the data from a CSV into SAS 9.4 (Unicode), clean it, and send my colleague the cleaned SAS dataset file. She uses SAS OnDemand for Academics. She takes my input dataset and writes code to do some work on it, then sends me the final code that produces her completed dataset, along with screenshots of her final dataset showing the required results.

I ran her code locally in three different environments, SAS 9.4 (Unicode), SAS 9.4, and SAS Studio, and finally online in SAS OnDemand.

When I import the data on my local machine in SAS 9.4 (Unicode) and run her code, it completes without any errors in the log but gives incorrect results. The logic is completely garbled.

When I run the code in SAS 9.4 or SAS Studio locally:

NOTE: Data file TEMP.EXPORTS.DATA is in a format that is native to another host, or the file
encoding does not match the session encoding. Cross Environment Data Access will be used, which
might require additional CPU resources and might reduce performance.
79 run;
ERROR: Some character data was lost during transcoding in the dataset TEMP.EXPORTS. Either the
data contains characters that are not representable in the new encoding or truncation occurred
during transcoding.

My friend is using SAS OnDemand for Academics, so I finally gave in, registered an account there, uploaded my dataset, and ran her code. The results were perfect, matching what she had described.

Interestingly, the final dataset from my SAS 9.4 (Unicode) session has the correct number of observations and variables, but the values of certain variables in certain observations are wrong, which indicates that the logic steps did not complete properly. Since I am using multiple DATA/SET steps and PROC SQL, and I read that the problem could be due to sorting, I tried sorting the data before and after every DATA step/PROC SQL batch, but still no luck.

"some of our observations are slightly different" What are the differences? Some rounding? Please be very specific and share a few examples.

I have 8 variables whose values are chosen by conditional logic applied to the initial dataset; the logic is iterated over 20 times to end up with the final dataset. Some flag variables are created along the way, but the final dataset has the same number of variables we started with, just fewer observations. The logic moves values between observations and deletes the observations that are no longer useful. The final value of a particular variable in each observation depends on the logic applied to the prior dataset. For example, if I have the variables Timestamp, Ticker1, Ticker2, and Ticker3 for each observation, and I compare my final dataset with my colleague's after the logic steps, I see GOOG under Ticker1, whereas when my friend's code is run on her side it shows, correctly, AAPL. So somewhere in the logic we're getting different results.

Also try to identify the exact row where the differences occur and ideally find the expression in your code/data step for this row and data used that causes this difference.

This is what I'm trying to do today. I don't want to rely on SAS OnDemand for extended periods if I don't have to.
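A minimal sketch of how the two checks discussed above could be run, not a definitive recipe. TEMP.EXPORTS is the dataset named in the log. The libraries MINE and HERS, their paths, and the dataset name FINAL are hypothetical placeholders. The sketch also assumes that Timestamp identifies a row, that both datasets are sorted by it, and that the TEMP library is already assigned.

```sas
/* Hypothetical library locations: replace with the real paths. */
libname mine "/path/to/my/output";
libname hers "/path/to/her/output";

/* 1. Show the encoding of the current SAS session (for example UTF-8 or WLATIN1). */
proc options option=encoding;
run;

/* 2. Show the encoding recorded in the dataset header, to see whether it
      matches the session encoding reported in step 1. */
proc contents data=temp.exports;
run;

/* 3. Compare the two final datasets and keep only the rows that differ.
      OUTBASE and OUTCOMP write both versions of each differing row,
      so the GOOG vs. AAPL type of mismatch can be read side by side. */
proc compare base=hers.final compare=mine.final
             out=work.diffs outnoequal outbase outcomp noprint;
   id Timestamp;
   var Ticker1 Ticker2 Ticker3;
run;

proc print data=work.diffs;
run;
```

If steps 1 and 2 report different encodings, the character values are being transcoded on read, which is the situation the NOTE and ERROR messages in the log describe. Running step 3 against the intermediate datasets as well, not only the final one, should help narrow down the first step at which the two runs diverge.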
I didn't even know there was a free version of SAS like this until now. It's important for me to find out what is happening so I can trust my future results.

"NOTE: The query requires remerging summary statistics back with the original data." That's only a SAS note you get for certain SQL syntax. It's nothing to worry about and only worth looking into if it's about improving performance.

Thank you.
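For anyone else who runs into the remerging note, here is a small illustration of the kind of query that triggers it. It uses the SASHELP.CLASS sample table rather than my own data, so the table and column names are only an illustration and not the query from my code.

```sas
/* NAME and HEIGHT are selected but are neither aggregated nor listed in
   GROUP BY, so PROC SQL computes the mean per AGE group and then merges
   it back onto every original row. That is what the NOTE reports. */
proc sql;
   select name, age, height, mean(height) as avg_height
   from sashelp.class
   group by age;
quit;
```

The result has one row per student with the group average repeated on each row, which is often exactly what is wanted.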