In this article, we work with a mock company dataset available on Kaggle. We discuss how effectively useful insight can be drawn from the data and walk through the process of analyzing it. This demonstration gives you a practical understanding of working with employee data and illustrates the potential of data analytics in HR and organizational management.
Our objective in this demo is to give you a glimpse of the kinds of insights and decisions that can be derived from employee data. By the end of this post, you will have a better understanding of how data can be used to optimize human resources, improve organizational efficiency, and make informed decisions.
We begin by introducing the mock employee dataset, including the key variables and fields it contains, to give you an idea of the type of information we are working with. First, we load the CSV file into the SAS Explorer. Once the file is in the proper location, we can use macro variables to establish the path to the data file. With the path established, we need to know which separator the file uses; in this demo, the data is comma separated.
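/* Folder that holds the CSV file, and a subfolder for any output */
%let path=/cisviya-export/cisviya/homes/Dee.Mckoy@sas.com/Blgdata/data;
%let pathout=&path/output;
/* Import the comma-delimited file, reading column names from the first row */
proc import datafile="&path/Employee_Dataset.csv"
    out=work.Education
    dbms=dlm
    replace;
    delimiter=',';
    getnames=YES;
run;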
The image above shows the first 10 observations of the dataset. The attributes include Education, JoiningYear, PaymentTier, Age, Gender, EverBenched, ExperienceinCurrentDomain, and LeaveOrNot. The variable EverBenched indicates whether an employee has ever been temporarily without assigned work. The variable ExperienceinCurrentDomain is the number of years of experience an employee has in their current field.
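The code that produced the listing is not shown in the post. As a minimal sketch, assuming the import step created work.Education as above, a listing of the first 10 observations and a summary of the variable types could be produced like this (the title text is illustrative):

```sas
/* Print the first 10 observations of the imported table */
proc print data=work.Education(obs=10);
    title "First 10 Observations of the Employee Data";
run;
title; /* clear the title so it does not carry over */

/* List each variable in column order with its type and length */
proc contents data=work.Education varnum;
run;
```

Checking the PROC CONTENTS output is a quick way to confirm which columns PROC IMPORT read as numeric and which as character before running any statistics.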
proc means data=work.Education;
run;
In this step, we run PROC MEANS to get an overview of the dataset. Descriptive statistics help us learn more about the distribution of observations in each variable, which is useful for analysis, for transforming variables, and for reporting.
In this first data visualization example, we will look at the payment tier of employees by age and gender to see if there is any insight to gain from the visualization.
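To narrow the output, the statistics and variables can be requested explicitly. The sketch below assumes that Age, JoiningYear, PaymentTier, and ExperienceinCurrentDomain were imported as numeric columns; the choice of statistics is only an example.

```sas
/* Request specific statistics for selected numeric variables */
/* (assumes these four columns were imported as numeric)      */
proc means data=work.Education n nmiss mean std min median max maxdec=2;
    var Age JoiningYear PaymentTier ExperienceinCurrentDomain;
run;
```

The NMISS column shows at a glance whether any of these variables contain missing values that would need attention before further analysis.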
proc sgplot data=work.Education;
    title "PaymentTier by Age and Sex";
    /* Mean payment tier at each age, one line per gender */
    vline Age / response=PaymentTier stat=mean markers
        group=Gender lineattrs=(thickness=5px);
    /* Distinct marker symbols and line patterns for the two groups */
    styleattrs datasymbols=(TriangleFilled CircleFilled)
        datalinepatterns=(ShortDash LongDash);
run;
The image above is a line plot of the mean payment tier of employees by age and gender. This type of visualization can help in understanding how the pay scale varies with age, which may warrant further investigation.
Now that we have seen the PaymentTier by age for employees at the company, let's look at the levels of Education versus Gender and get a count of the number of employees with each level of education.
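For that further investigation, it can help to see the numbers behind the lines. A minimal sketch, assuming the same work.Education table, tabulates the mean payment tier and the number of employees for each combination of gender and age:

```sas
/* Mean PaymentTier and employee count for each Gender/Age combination */
proc means data=work.Education n mean maxdec=2;
    class Gender Age;
    var PaymentTier;
run;
```

The N column is worth checking alongside the mean, because ages with only a few employees can produce sharp swings in the plotted line.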
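The post does not show code for this count. One way to get it, offered as a sketch rather than the author's method, is a two-way frequency table with PROC FREQ (the title text is illustrative):

```sas
/* Count employees for each combination of Education and Gender */
proc freq data=work.Education;
    tables Education*Gender / norow nocol nopercent;
    title "Employee Count by Education and Gender";
run;
title; /* clear the title so it does not carry over */
```

The NOROW, NOCOL, and NOPERCENT options suppress the percentages so that each cell shows only the count.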
proc sgplot data=work.Education;
    title "Employees Joining By Year";
    /* No response variable is given, so plot the number of employees per year */
    vline JoiningYear / group=Education stat=freq markers datalabel;
run;
In this step, let's look at the education level of the employees who joined in each year from 2012 until 2019. This visualization gives insight into how many employees are hired each year and what level of education they have when they enter the company. From the data we can see that, year over year, the number of employees who joined with only a bachelor's degree increased. This is useful insight for comparing recruiting numbers for recent university graduates.
From this demonstration, we can conclude that these techniques help in gaining further knowledge and insight from metrics used in HR and other analytical services. Data similar to the demo data could support better decisions in the future regarding recruiting, pay tiers per department, and the skill level of employees.
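To check whether the share of bachelor's-level hires changed, and not only the raw count, the plot can be paired with a table. The following is a sketch under the same assumptions about work.Education:

```sas
/* Hires per year by education level; row percentages show each level's */
/* share of that year's hires                                           */
proc freq data=work.Education;
    tables JoiningYear*Education / nocol nopercent;
run;
```

Keeping the row percentages makes it possible to tell apart a year in which overall hiring grew from a year in which the mix of education levels shifted.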