SAS Hacker's Hub

Where your Curiosity Leads to Innovation

TheItalianJob - Ethical Data Analysis | Graduate School Admission (Student track option 2)

Started ‎08-29-2024 by
Modified ‎10-19-2024 by
Views 2,112
Team Name TheItalianJob
Track Student Track 2
Use Case Ethical Data Analysis
Technology SAS Viya, Python
Region EMEA
Team lead Pasquale Maritato
Team members @AAlessandrelli 
Is your team interested in participating in an interview? N
Pitch Video

TheItalianJob-Pitch.mp4

       

      Jury Video

      TheItalianJob-Jury.mp4

           

          Team photo

          theitalianjob_pic.jpg

          Comments

          Great work, @TheItalianJob + @PasqualeM!

           

          Your Team Profile is complete and looks great.  Thank you for putting the correct tag – “Student Track 2” so it’ll be easier to find and judge, when it’s time. 

           

          If you’re excited to learn more about the Hack before September 16th, including a sneak peek of the use case, please see my post here: https://communities.sas.com/t5/SAS-Hacker-s-Hub/SAS-Hackathon-2024-Student-Track-Details/ba-p/941054

           

          Good luck!

          Wonderful job @TheItalianJob!!!  You did an excellent job of creating an easy-to-follow analysis using both SAS Visual Analytics and SAS Model Studio.  In particular, I like how you carefully walked through the descriptive statistics - complete with maps - in SAS Visual Analytics.  You then used the Fairness + Bias Assessment tools in SAS Model Studio to refine your models to make the offers of admissions more equitable.  Finally - I love the recommendations for improving data collection at iLink University in the future...

          Yay!

          One question: did you notice any issues with the Legacy Admissions variable?  I purposely made it biased, and highly predictive, and was wondering what you saw in your model.  Regardless, great work!

          @LGroves

          Hey Lincoln, thank you very much for your feedback.

          I'm glad you asked: we were in a hurry and are no video-editing wizards, so we left a lot of our analysis out of the video.

          We agree that Legacy Admissions was indeed highly predictive. For basically each feature, we built a page in SAS VA showing how admission rates differed across Legacy Admission values.

          PasqualeM_1-1730200067413.png
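For readers who want to reproduce this kind of breakdown outside SAS VA, here is a minimal Python sketch of an admission-rate table by feature value and Legacy Admission flag. The column names (`gender`, `legacy`, `admitted`) and the toy rows are assumptions made up for illustration; they are not the hackathon data or the team's actual report logic.

```python
from collections import defaultdict

def admission_rates(rows, feature, legacy_key="legacy", target_key="admitted"):
    """Return {(feature_value, legacy_value): admission rate} for the given rows."""
    counts = defaultdict(lambda: [0, 0])  # [number admitted, total applicants]
    for row in rows:
        key = (row[feature], row[legacy_key])
        counts[key][0] += row[target_key]
        counts[key][1] += 1
    return {key: admitted / total for key, (admitted, total) in counts.items()}

# Toy rows, invented for illustration only (hypothetical column names).
rows = [
    {"gender": "F", "legacy": 1, "admitted": 1},
    {"gender": "F", "legacy": 0, "admitted": 0},
    {"gender": "F", "legacy": 0, "admitted": 1},
    {"gender": "M", "legacy": 1, "admitted": 1},
    {"gender": "M", "legacy": 1, "admitted": 1},
    {"gender": "M", "legacy": 0, "admitted": 0},
]

rates = admission_rates(rows, "gender")
for key in sorted(rates):
    print(key, round(rates[key], 2))
```

A large gap between the legacy and non-legacy rates within the same feature value is the kind of pattern such a page would surface.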

          Once we found out there was bias against some applicants, we dropped the sensitive features (gender, cultural identity, and country region).

          At that point, there was still some bias, mainly for cultural identity. We had two options (actually, they are not mutually exclusive): dropping other features or mitigating the bias with exponentiated gradient reduction through the mitigate bias action. We tried both and also combinations of both.

          Our results showed that dropping the Legacy Admission feature had little impact, either before or after the mitigation.

          Example:

          • Before mitigation, keeping Legacy Admission

          PasqualeM_2-1730200629719.png

          • Before mitigation, dropping Legacy Admission

          PasqualeM_3-1730200688872.png

           

          As you can see, there was a 3% drop in prediction parity bias for Cultural Identity, but an increase for Country Region and Gender.
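To make the bias figures above more concrete, here is a minimal Python sketch of two group-fairness gaps. It assumes the common textbook definitions: demographic parity compares the share of positive predictions per group, and predictive parity compares precision per group. SAS Model Studio may define and report its parity metrics differently, so this is an illustration of the idea and not a reimplementation of its output. The labels and groups are toy values.

```python
from collections import defaultdict

def parity_gaps(y_true, y_pred, groups):
    """Return (demographic parity gap, predictive parity gap) across groups.

    Demographic parity gap: max minus min of the per-group share of
    positive predictions.
    Predictive parity gap: max minus min of the per-group precision
    (share of predicted positives that are truly positive).
    """
    stats = defaultdict(lambda: {"n": 0, "pred_pos": 0, "true_pos": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        s = stats[group]
        s["n"] += 1
        s["pred_pos"] += pred
        s["true_pos"] += pred * truth
    selection = [s["pred_pos"] / s["n"] for s in stats.values()]
    # Precision is undefined for groups with no positive predictions; skip them.
    precision = [s["true_pos"] / s["pred_pos"] for s in stats.values() if s["pred_pos"]]
    dp = max(selection) - min(selection)
    pp = max(precision) - min(precision) if precision else 0.0
    return dp, pp

# Toy labels, predictions and group membership, invented for illustration.
y_true = [1, 0, 1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

dp_gap, pp_gap = parity_gaps(y_true, y_pred, groups)
print(f"demographic parity gap: {dp_gap:.2f}")
print(f"predictive parity gap:  {pp_gap:.2f}")
```

Comparing these gaps for a model trained with and without a given feature is one way to see a trade-off like the one described above, where one sensitive attribute improves while others get worse.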

          The features we engineered from the Mission Statement probably offset the effect of Legacy Admission in terms of predictive power.

          After mitigating the bias, dropping Legacy Admission only slightly increased the misclassification rate but had basically no effect on bias metrics (actually the model mitigated for demographic parity had a lower prediction bias when keeping Legacy Admission).

          The dataset was really small, so the results might vary with a different split (we used a 60-20-20 split stratified on Admissions and Cultural Identity). At the same time, since it was so small, we preferred to mitigate the bias rather than dropping too many features.
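As an illustration of the split described above, here is a minimal Python sketch of a 60-20-20 partition stratified on two columns. The column names (`admitted`, `cultural_identity`), the seed, and the toy rows are assumptions for the example; the team's actual partitioning was done in their SAS pipelines and may differ in detail.

```python
import random
from collections import defaultdict

def stratified_split(rows, strata_keys, fractions=(0.6, 0.2, 0.2), seed=42):
    """Split rows into train/validation/test, preserving the mix of strata."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for row in rows:
        buckets[tuple(row[k] for k in strata_keys)].append(row)

    train, valid, test = [], [], []
    for stratum in sorted(buckets):
        members = buckets[stratum]
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_valid = round(fractions[1] * len(members))
        train += members[:n_train]
        valid += members[n_train:n_train + n_valid]
        test += members[n_train + n_valid:]  # remainder goes to test
    return train, valid, test

# Toy rows, invented for illustration: 4 strata with 10 rows each.
rows = [
    {"id": i, "admitted": i % 2, "cultural_identity": "X" if i % 4 < 2 else "Y"}
    for i in range(40)
]

train, valid, test = stratified_split(rows, ["admitted", "cultural_identity"])
print(len(train), len(valid), len(test))
```

With a very small dataset, some strata may hold only a handful of rows, so rounding can leave a partition with no members from a given stratum. That is one reason results can shift noticeably with a different seed.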

          As for our suggestion to predict future student performance instead of predicting admission from historical data, Legacy Admission could be a good predictor (studies in econometrics show how parents' education affects student performance). Still, it would be important to check its effect on bias metrics.

           

          We still have access to our pipelines, so if you have any other questions, feel free to ask.

           

          Andrea & Pasquale

           

           

           

           

           

          Maybe you can exclude gender, cultural identity, and country region. Other variables may be less biased. Do you have other variables for your sample model? Maybe the econometric analysis would be better. I worked in the banking industry, where gender, cultural identity, and country region would have to be excluded from the models to prevent bias. Are you running logistic regression? How? Are you using stepwise methods?

           

          Jonas V. Bilenas

          jonas.bilenas@gmail.com

          BLOG: https://jonasbilenascom.com/

           

          Great response, @andrea & Pasquale!  Recapping a month's worth of work in 10 minutes is never easy... and I appreciate the added detail provided above.  Moreover, I appreciate how critically both of you have thought about the challenges at hand.

           

          Again, great work!

          Version history
          Last update:
          ‎10-19-2024 10:30 AM