Because the Visual Analytics Alphabet Series was inspired at the screening of a horror movie (see A is for Aggregated Data), most of the data used in this series is horror-based.
HORROR_MOVIES from Kaggle, a data set of horror films from 1950 to 2022, was extracted from The Movie Database (TMDB) using the TMDB API. The Movie Database is a community-built movie and TV database that contains information about movies, TV shows, cast, and community reviews.
If you would like to use the TMDB API for your own data adventures, check out the documentation.
| Name | Label | Description | Unique Count | Range |
|---|---|---|---|---|
| id | Movie ID | Unique ID for TMDB, used to construct link to movie page | 32,540 | |
| original_title | Original Title | Original movie title | 30,294 | |
| title | Movie Title | Movie title | 29,563 | |
| original_language | Original Language | Language in which the movie was made (for example, en for English, no for Norwegian, de for German) | 97 | |
| overview | Description | Description of movie | 31,021 | |
| tagline | Tagline | Tagline of movie | 12,514 | |
| release_date | Release Date | Release date (mm/dd/yyyy) | 10,999 | |
| poster_path | Poster Image | Unique name of the movie poster. This can be used to generate a link to the poster image. | 28,049 | |
| popularity | Popularity Score | Lifetime popularity score generated by the community. For movies, this is based on daily metrics (like number of votes, number of views, times favorited, times watchlisted), release date, total votes, and the previous day’s score. | | 0 - 5,088.584 |
| vote_count | # User Ratings | Number of user ratings for movie | | 0 - 16,900 |
| vote_average | User Score (1-10) | Average rating for movie. Ratings range from 1 to 10 stars. | | 0.5 - 10 |
| budget | Budget (in $) | Budget of movie in US dollars | | $1 - $200,000,000 |
| revenue | Revenue (in $) | Revenue made from movie in US dollars | | $1 - $701,842,551 |
| runtime | Movie Runtime (min) | Official runtime of movie (in minutes) | | 1 - 683 |
| status | Movie Status | Status of movie (Released, In Production, Post Production, Planned) at time of extract (late 2022) | 4 | |
| adult | <not used> | <not used> | <not used> | |
| backdrop_path | Backdrop Image | Unique name of movie backdrop image. This can be used to generate a link to the backdrop image. | 13,537 | |
| genre_name | Genre(s) | List of movie genres | 772 | |
| collection | Collection ID | ID of the collection, used to construct link to collection page. | 816 | |
| collection_name | Collection Name | Name of collection | 816 | |
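
The id, collection, poster_path, and backdrop_path columns are described above as building blocks for links. Here is a minimal Python sketch of how those links might be assembled. The base URLs and the `w500` image size reflect TMDB's commonly used public URL patterns and are assumptions here, not something taken from the data set. The helper names (`movie_url`, `collection_url`, `image_url`) are hypothetical. Check the TMDB documentation before relying on any of this.

```python
from typing import Optional

# Assumed base URLs (not part of the data set; verify against TMDB docs).
TMDB_SITE = "https://www.themoviedb.org"
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p"


def movie_url(movie_id: int) -> str:
    """Build a link to a movie page from the `id` column."""
    return f"{TMDB_SITE}/movie/{movie_id}"


def collection_url(collection_id: int) -> str:
    """Build a link to a collection page from the `collection` column."""
    return f"{TMDB_SITE}/collection/{collection_id}"


def image_url(path: Optional[str], size: str = "w500") -> Optional[str]:
    """Build an image link from `poster_path` or `backdrop_path`.

    Returns None when the path is missing, since many rows have no image.
    The `size` segment (for example "w500" or "original") is an assumption.
    """
    if not path:
        return None
    # Tolerate paths with or without a leading slash.
    return f"{TMDB_IMAGE_BASE}/{size}/{path.lstrip('/')}"
```

For example, `image_url("/example.jpg")` would produce a link under the assumed image base with the `w500` size segment, where `/example.jpg` is a placeholder rather than a real poster path.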
A few data cleansing techniques needed to be applied to the HORROR_MOVIES table to get the data report ready:
KILLCOUNTS from GitHub, a data set of horror films from 1922 to 2025, was sourced from community projects (like Dead Meat, MovieBodyCounts, List of Deaths Wiki, and work done by Randal Olson).
| Name | Label | Description | Unique Count | Range |
|---|---|---|---|---|
| title | Movie Title | Movie Title | 469 | |
| year | Release Year | Release year | 63 | |
| count | Kill Count | Total confirmed kills | | 1 - 4,295 |
| tmdb_id | TMDB ID | The Movie Database (TMDB) unique ID | 482 | |
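
Because tmdb_id in KILLCOUNTS and id in HORROR_MOVIES both hold the TMDB unique ID, the two tables can presumably be matched on that key. The sketch below shows one way to do that in plain Python, assuming each table has been loaded as a list of dictionaries keyed by the column names above. The function name `join_kill_counts` and the output key `kill_count` are hypothetical, and this is not necessarily how the series joins the data.

```python
def join_kill_counts(movies, kill_counts):
    """Inner-join HORROR_MOVIES rows to KILLCOUNTS rows on the TMDB ID.

    `movies` rows are expected to carry an `id` key and `kill_counts` rows
    a `tmdb_id` and `count` key (column names from the tables above).
    If a tmdb_id appears more than once, the last row seen wins.
    """
    # Index kill counts by TMDB ID for constant-time lookup.
    by_id = {row["tmdb_id"]: row for row in kill_counts}

    joined = []
    for movie in movies:
        match = by_id.get(movie["id"])
        if match is not None:
            # Copy the movie row and attach the kill count under a new key.
            joined.append({**movie, "kill_count": match["count"]})
    return joined
```

Movies without a matching kill count are dropped here. Keeping them with a missing value instead would be a reasonable alternative, depending on the report.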
A few data cleansing techniques needed to be applied to the KILLCOUNTS table to get the data report ready:
HAUNTED_PLACES from Kaggle, a data set of haunted places in the United States, was compiled by Tim Renner using The Shadowlands Haunted Places Index.
| Name | Label | Description | Unique Count | Range |
|---|---|---|---|---|
| city | City | City where the haunted place is located | 4,285 | |
| country | Country | Country where the haunted place is located (all United States) | 1 | |
| description | Description | Description of the haunted place | 10,979 | |
| location | Location | Name of the haunted place | 9,691 | |
| state | State | US state where the haunted place is located | 51 | |
| state_abbrev | state_abbrev | US two-letter state abbreviation where the haunted place is located | 51 | |
| longitude | Location Longitude | Longitude of the haunted place | | -164.7224104 - -66.6667528 |
| latitude | Location Latitude | Latitude of the haunted place | | 19.632069 - 66.8925886 |
| city_longitude | City Longitude | Longitude of the city center | | -164.7238888 - -67.8402316 |
| city_latitude | City Latitude | Latitude of the city center | | 19.5756191 - 66.8983333 |
A few data cleansing techniques needed to be applied to the HAUNTED_PLACES table to get the data report ready:
This product uses the TMDB API but is not endorsed or certified by TMDB.