An analysis of Mixed Martial Arts fights that have taken place under the promotion of the Ultimate Fighting Championship (UFC).

a9s2w5/Analysis_of_UFC_Fights

An Analysis of UFC Fights

An analysis of Mixed Martial Arts fights that have taken place under the promotion of the Ultimate Fighting Championship (UFC).

Data Collection

The data for this analysis was collected from http://www.ufcstats.com

Initial Data

In total, 10,658 unique URLs were processed for the initial datasets. Running this program for the first
time to collect the data can take several hours, depending on computer hardware and internet bandwidth.

With my own hardware and internet bandwidth, it took approximately 4 hours to retrieve all the data.

Updating Data

If the program has already been run, only new information is collected and appended to the
csv files. This saves time when updating the data and avoids rerunning the entire program every time.

To do this, a list of visited URLs is kept and checked before the scraping proceeds.
The caveats are that a) a flat file of all URLs must be maintained, and b) when updating data, the scraper
must still scrape the initial URLs to see if anything has changed.
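The visited-URL check described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names and the `visited_urls.txt` file name used in the usage note are hypothetical, and the flat file is assumed to hold one URL per line.

```python
from pathlib import Path


def load_visited(path):
    """Return the set of URLs already recorded in the flat file (empty if absent)."""
    file = Path(path)
    if not file.exists():
        return set()
    return {line.strip() for line in file.read_text().splitlines() if line.strip()}


def filter_new_urls(candidate_urls, visited):
    """Keep only URLs not seen before, preserving order and dropping duplicates."""
    seen = set(visited)
    new_urls = []
    for url in candidate_urls:
        if url not in seen:
            seen.add(url)
            new_urls.append(url)
    return new_urls


def record_visited(path, urls):
    """Append newly scraped URLs to the flat file, one per line."""
    with open(path, "a") as handle:
        for url in urls:
            handle.write(url + "\n")
```

A run would typically call `load_visited('visited_urls.txt')`, pass the URLs found on the index pages through `filter_new_urls`, scrape only what remains, and then call `record_visited` with the URLs that were scraped successfully.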

Collecting the Data

To collect the data, run Get_Data.py, which updates the data contained in the csv files.
The website is updated after each new fight event. Be careful not to run the program DURING a live event,
or your data will be skewed.

Running the Get_Data.py script will initiate these methods:

1. Getting the fighter info & normalizing

# `scrp` is the project's scraper module; its import is not shown in this excerpt
def get_fighters():

    # Get the fighter basic information
    fighters_df = scrp.get_fighters()

    # Get additional fighter details
    fighter_details_df = scrp.get_further_fighter_details()

    # No new fighter details were scraped, so there is nothing to add
    if len(fighter_details_df) < 1:
        return None

    # Add a blank DOB column to the end of the DataFrame
    columns = len(fighters_df.columns)
    fighters_df.insert(columns, 'DOB', '')

    # Copy the DOB info into the fighters DataFrame, row by row
    for i in range(len(fighters_df)):
        fighters_df.loc[i, 'DOB'] = fighter_details_df['DOB'].loc[i]

    # More code below here for normalization......

    return fighters_df

fighters_df = get_fighters()

# If there are no new fighters, print a message and do nothing
# (`is None` is required: `== None` on a DataFrame raises a ValueError in an `if`)
if fighters_df is None:
    print('Done checking Fighters.')
# Otherwise append the new rows to the existing csv
else:
    fighters_df.to_csv('../data/fighters.csv', mode='a', index=False, header=False)
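The append step above writes rows without a header, so it relies on the csv file already existing with its header row in place. One possible way to cover a first run as well is sketched below. The helper name `append_rows` is hypothetical and not part of the project, and the `Name` column in the usage note is only an illustrative column name.

```python
import os

import pandas as pd


def append_rows(df, csv_path):
    """Append df to csv_path, writing the header row only when the file is new."""
    # Treat a missing or empty file as new so that it receives a header row
    is_new_file = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    df.to_csv(csv_path, mode="a", index=False, header=is_new_file)
```

For example, calling `append_rows(pd.DataFrame({'Name': ['A'], 'DOB': ['Jan 01, 1990']}), 'fighters.csv')` twice with different rows leaves a file with a single header line followed by both rows.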

Data Info

There are 3 Datasets contained in the Data folder.

  • Fighter Information (fighters.csv)
  • Events Information (events.csv)
  • Fights Information (fight_data.csv)
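As a starting point for working with these files, the three datasets could be loaded with pandas along the lines below. This is a sketch under assumptions: the function name `load_datasets` and the short dictionary keys are hypothetical, and each csv is assumed to begin with a header row.

```python
from pathlib import Path

import pandas as pd

# File names as listed above; the dictionary keys are illustrative short names
DATASET_FILES = {
    "fighters": "fighters.csv",
    "events": "events.csv",
    "fights": "fight_data.csv",
}


def load_datasets(data_dir):
    """Read each csv in data_dir into a DataFrame, keyed by its short name."""
    data_dir = Path(data_dir)
    return {
        name: pd.read_csv(data_dir / filename)
        for name, filename in DATASET_FILES.items()
    }
```

Calling `load_datasets('../data')` would then return a dictionary such that `datasets['fights']` holds the contents of fight_data.csv.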

The Fighter information

The information below describes the data in the fighters.csv file
(after normalization and export, but before cleaning and handling null/missing values).

[Image: Fighter Data Info]

Fighter table snippet:

[Image: Fighter Table Snippet]

Tools Used

All tools and languages used, including the packages used within each language.

Prerequisites are:

  • Python3 is installed
  • R and R Studio are installed
  • Anaconda3 is installed

Instructions for installing Anaconda:

For Windows: https://docs.anaconda.com/anaconda/install/windows/
For Mac: https://docs.anaconda.com/anaconda/install/mac-os/
For Linux: https://docs.anaconda.com/anaconda/install/linux/

Installing R & R Studio:

Instructions here: https://rstudio-education.github.io/hopr/starting.html

Python

Spyder IDE from within the Anaconda3 framework

Packages Imported:

  • Pandas
  • Numpy
  • string
  • tqdm
  • BeautifulSoup
  • requests

R

R Studio
