<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Python and SAS returning different values in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750404#M236041</link>
    <description>&lt;P&gt;I don't know how Python handles division by zero, but returning a missing value, as SAS does, is correct.&lt;/P&gt;</description>
    <pubDate>Fri, 25 Jun 2021 10:05:21 GMT</pubDate>
    <dc:creator>andreas_lds</dc:creator>
    <dc:date>2021-06-25T10:05:21Z</dc:date>
    <item>
      <title>Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750394#M236039</link>
      <description>&lt;P&gt;I am currently converting a Python script to SAS / PROC SQL. Everything has gone smoothly so far, but I've been stumped by an odd case: calculating Z-scores with the following scripts gives me different answers.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Python:
HO_HOME_PROVINCE_MEAN = df_A10[["HOME_PROVINCE", "EUCLIDEAN_HOME_OFFICE_DIST_M"]].groupby("HOME_PROVINCE").mean().rename(columns = {"EUCLIDEAN_HOME_OFFICE_DIST_M" : "HO_HOME_PROVINCE_MEAN"}).reset_index()
HO_HOME_PROVINCE_STD = df_A10[["HOME_PROVINCE", "EUCLIDEAN_HOME_OFFICE_DIST_M"]].groupby("HOME_PROVINCE").std().rename(columns = {"EUCLIDEAN_HOME_OFFICE_DIST_M" : "HO_HOME_PROVINCE_STD"}).reset_index()

df_A11 = pd.merge(df_A10, HO_HOME_PROVINCE_MEAN, on = "HOME_PROVINCE", how = "left")
# Merge the standard deviation onto df_A11 so the mean column from the previous merge is kept
df_A11 = pd.merge(df_A11, HO_HOME_PROVINCE_STD, on = "HOME_PROVINCE", how = "left")

# Use the PROVINCE columns created above, matching the SAS version
df_A11["HO_HOME_PROVINCE_ZSCORE"] = (df_A11["EUCLIDEAN_HOME_OFFICE_DIST_M"] - df_A11["HO_HOME_PROVINCE_MEAN"]) / df_A11["HO_HOME_PROVINCE_STD"]&lt;/CODE&gt;&lt;/PRE&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;SAS:

PROC SQL; CREATE TABLE DEVDAT11 AS
SELECT * FROM DEVDAT10 AS A
LEFT JOIN HOME_PROVINCE_MEAN AS B
ON A.HOME_PROVINCE = B.HOME_PROVINCE;

PROC SQL; CREATE TABLE DEVDAT12 AS
SELECT * FROM DEVDAT11 AS A
LEFT JOIN HOME_PROVINCE_STD AS B
ON A.HOME_PROVINCE = B.HOME_PROVINCE;

DATA WORK.DEVDAT13;
SET DEVDAT12;
HO_HOME_PROVINCE_ZSCORE = (EUCLIDEAN_HOME_OFFICE_DIST_M - HO_HOME_PROVINCE_MEAN) / HO_HOME_PROVINCE_STD;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I checked for miscalculations using .sum() in Python and PROC MEANS with the SUM statistic in SAS, and so far I haven't found any mistakes on that front. I suspect the difference has something to do with the way Python handles null values or zeroes in rows compared with SAS, because I keep getting this message in the log:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;NOTE: Division by zero detected at line 2202 column 90.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The file that I've attached to this post contains a zero and a null value in column "HO_HOME_PROVINCE_STD". Any advice would be very much appreciated.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 09:18:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750394#M236039</guid>
      <dc:creator>danielchoi626</dc:creator>
      <dc:date>2021-06-25T09:18:21Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750400#M236040</link>
      <description>&lt;P&gt;Have you actually looked at your datasets with your own eyes to identify where the differences are, and what might be causing them? This is investigative work that you can (and should) do, and it will most likely resolve the problem. If it doesn't, then show us the specific observations in the data that are not giving the same answers, for the SAS variables named EUCLIDEAN_HOME_OFFICE_DIST_M, HO_HOME_PROVINCE_MEAN and HO_HOME_PROVINCE_STD.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 09:50:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750400#M236040</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-06-25T09:50:14Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750404#M236041</link>
      <description>&lt;P&gt;I don't know how Python handles division by zero, but returning a missing value, as SAS does, is correct.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 10:05:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750404#M236041</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2021-06-25T10:05:21Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750456#M236054</link>
      <description>You shouldn't be computing z-scores manually in either of these tools. PROC STDIZE in SAS will ensure that your calculations are correct.</description>
      <pubDate>Fri, 25 Jun 2021 15:21:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750456#M236054</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-06-25T15:21:38Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750459#M236056</link>
      <description>&lt;P&gt;But the Z-score code shown in a DATA step ought to produce exactly the same results as PROC STDIZE, assuming the mean and standard deviation were calculated correctly, which is something the original poster ought to have checked already.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 15:37:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750459#M236056</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-06-25T15:37:10Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750461#M236058</link>
      <description>Yes, but doing things manually instead of using a proc leaves you more open to mistakes, as this post shows: someone computing the same quantity by hand in two different applications. &lt;BR /&gt;&lt;BR /&gt;And since OP has said they're not very familiar with SAS, it is worth mentioning that PROC STDIZE can do this more efficiently.</description>
      <pubDate>Fri, 25 Jun 2021 15:32:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750461#M236058</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-06-25T15:32:58Z</dc:date>
    </item>
    <item>
      <title>Re: Python and SAS returning different values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750467#M236064</link>
      <description>&lt;P&gt;And I completely agree with you that people should use PROC STDIZE to calculate z-scores (and similar quantities).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
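&lt;P&gt;On the zero-divisor case raised earlier in the thread, here is a minimal sketch in plain Python of the two conventions involved. It assumes the Python columns are ordinary floats, in which case pandas follows IEEE 754 arithmetic and a zero standard deviation quietly yields inf or NaN, whereas SAS returns a missing value and writes the "Division by zero detected" note. The names ieee_divide, sas_style_divide and z_score are hypothetical helpers for illustration only; they are not part of pandas or SAS.&lt;/P&gt;

```python
import math


def ieee_divide(num, den):
    """Divide two floats following IEEE 754 rules: never raises, returns inf or nan."""
    if math.isnan(num) or math.isnan(den):
        return math.nan
    if den == 0:
        if num == 0:
            return math.nan  # 0/0 is undefined
        # nonzero/0 is an infinity whose sign follows both operands
        return math.copysign(math.inf, num) * math.copysign(1.0, den)
    return num / den


def sas_style_divide(num, den):
    """Return None (standing in for a SAS missing value) for a missing or zero divisor."""
    if num is None or den is None or den == 0:
        return None
    return num / den


def z_score(value, mean, std, divide):
    """Compute (value - mean) / std using the supplied division convention."""
    return divide(value - mean, std)


# Hypothetical rows: (value, group mean, group standard deviation)
rows = [(12.0, 10.0, 4.0), (12.0, 10.0, 0.0), (10.0, 10.0, 0.0)]
for value, mean, std in rows:
    print(std, z_score(value, mean, std, ieee_divide), z_score(value, mean, std, sas_style_divide))
```

&lt;P&gt;Under these assumptions, the two conventions agree on every row with a nonzero standard deviation and differ only in how they label an undefined result. A comparison of the Python and SAS outputs should therefore treat inf, NaN and missing as the same outcome before looking for real differences.&lt;/P&gt;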
&lt;P&gt;I was trying to point out that the line of code which computes the z-scores in the original message could not be the cause of the problems that the original poster is experiencing. The error must be something else.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jun 2021 15:43:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Python-and-SAS-returning-different-values/m-p/750467#M236064</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2021-06-25T15:43:14Z</dc:date>
    </item>
  </channel>
</rss>

