The Science of Magic: Building an MtG Land Recommender from Scratch

Peter Papuli
23 min read · Apr 12, 2023


Jace, the Mind Sculptor (mtgnexus.com)

Framing

You just finished a draft in the popular card game Magic: The Gathering. You've made all the necessary cuts, you think your curve looks acceptable, and now you just need to solidify your mana base. What now?! Luckily, there are some tried and true methods for figuring this out. My favorite is to start with 17 lands, look at the mana curve, and adjust up or down a couple of lands from there. Sounds simple! However, there are so many things beyond the mana curve to keep track of: how many cards can produce other colors? What about cards with dual faces that can be cast for two different costs? Do some of your cards have cost reduction?

Magic is a complex game; that's what draws so many people to it. But it's hard to feel like you're optimizing your draft or constructed deck with the perfect mana base. So, to try to discover the perfect balance of lands and spells, I decided to create a land recommender system. It uses data to take some of the guesswork out of deck building, enabling newcomers and veterans alike to craft the ultimate deck (until they draw 5 lands in a row and wonder why they play this game in the first place).

Bringing the idea to life

Like anyone who has tackled a rather complicated solo-dev project before, I knew it's best to start with some healthy outlining. We know what we want our end product to be: some sort of machine learning model that can take a deck as a .txt file input and output the mana base. Let's break this down into a flowchart and then add some more detail on each component.

The simplified process I used

Technicals

Data source: We will need to train the model on something. The main decision here was whether to base it on decks which already exist (easy) or on some advanced statistical model with heavy mathematics to simulate hand-drawing without replacement (hard, but probably much more accurate). Well, as a solo developer with limited time and experience, I think you know which I chose.

Narrator: “It was the easy one”

Card Data: Luckily, Scryfall provides an easy-to-use API (An API is a set of rules that allow different software applications to communicate with each other) or a downloadable JSON (think: standardized data format) with all of the card data. My model was trained on cards before the Phyrexia: All Will Be One set.

Model Choice: This one was a bit more complicated, but the choice came down to a neural network versus a random forest. I tried both, compared their performance, and ultimately settled on an unboosted random forest. "Unboosted" means that each tree in the forest is built independently, without adaptive boosting or similar techniques; this contrasts with boosted models, which combine multiple weak learners in sequence to create a strong learner. The model I used was a multi-output regressor (more on this later).

Output: I wanted something interactive where you can either browse your local files for a .txt file or copy and paste a deck list into a GUI that spits out the recommendations.

Hosting: Haha, just kidding. The model is quite simple. For now, I’m hosting it on GitHub, here. If there is high demand for this, I will be happy to host it somewhere more user friendly and expand my interface beyond what it is now, even considering making it into an app with computer vision so you can upload a deck picture directly from your phone and have it spit out the magic numbers to you. This all sounds exciting to work on, so let’s see where the future takes us!

My setup

Editor: I coded primarily in Visual Studio Code using .ipynb files with VSCode's Jupyter extension. I just love how reproducible notebooks are and how easy it is to test and play around with data while building. This choice has become a no-brainer for me, but it was an evolution in itself as I used to use only .py files.

I also customized my VSCode with a great pets extension. Just look how cute:

seriously, look at the little foxes :3

Programming Language: Python was my weapon of choice as it’s what I’m most comfortable with and has a bunch of nifty libraries that I used to make things easier. Here are the big ones which I’ll go into more detail on later:

  • Pandas — useful for data wrangling
  • Beautiful Soup — helped me scrape deck data from the web
  • Scikit-Learn — has a lot of useful machine learning transformers and makes model training and testing a breeze

There are some other ancillary libraries used, but I’ll go over those along with some code-specific examples in a bit.

Putting it all together

Step 1: Feature Extraction

Like I mentioned earlier, first we need a lexicon of cards to even process what is going on in a deck. I downloaded a JSON file of raw card data from Scryfall, and then used pandas to read the json into a dataframe.

import pandas as pd
import os

# Get the absolute path of the current working directory
current_dir = os.path.abspath(os.getcwd())

# Get the absolute path of the parent directory
parent_dir = os.path.abspath(os.path.join(current_dir, os.pardir))

# Set file path to the json file with card data downloaded directly from scryfall
file_path = os.path.join(parent_dir, 'data', 'default-cards-20230308220851.json')

print(file_path) # Confirm the correct file path

# Note that orient='records' is just saying that we expect the data
# to look like a list of dictionaries
df_source = pd.read_json(file_path, orient='records')

Now that we have some raw data, we can start to think about what we need the card data to look like if we want to feed it into a model. First, let’s break down the anatomy of a single card and try to theorize what’s important.

Anatomy of a card

Colossal Dreadmaw, the ever fearsome

Here we have an example of a pretty standard creature card. When deckbuilding, the only thing we really care about for this particular card is how much mana it costs to play.

— Mana cost —

Colossal Dreadmaw costs 4 generic mana and 2 green mana

We have two parts to this: the generic cost and the color cost. Generic costs can be paid with any mana source, while colored costs must come from a source of the corresponding color. Let's also define a "casting pip" or "pip" as any colored mana cost across Magic's 5 colors. This affects our output in two ways: the higher the casting costs across our deck (i.e. the mana curve), the more lands we need overall, and the more pips of a given color, the more sources of that color we need to reliably pay those costs on average.

Magic’s 5 basic lands—Island, Forest, Mountain, Swamp, Plains

This means that we're always somewhere on a curve of how many pips of each color we have and how many sources of that color we need. Great! So, we've just identified the structure of our target variables: we want to predict the number of "color sources" for each of the five colors, along with the total number of lands to put in our deck.

We'll make five columns to count the number of pips of each color in each card of our deck, but first we have to remember two specific edge cases: hybrid mana and Phyrexian mana. Hybrid mana lets you pay either of two colors for one pip, while Phyrexian mana lets you choose between paying that color or paying 2 life.

Ajani, Sleeper Agent has a special mana cost consisting of both hybrid and phyrexian mana

This mana cost is represented as {1}{G}{G/W/P}{W} in our card database. This was quite the edge case for me, but I concluded that hybrid mana should be counted as half of a regular mana. Whenever we encounter a / character in the mana cost, we look at the previous and next symbols and subtract 1/2 from each of those colors' totals. This only works on valid colors, so if it sees Phyrexian mana then nothing happens. We do want to keep track of how much Phyrexian mana is in the casting cost, as that also affects gameplay decisions around our mana base. It may not be the most elegant, but it worked for this case.

from collections import Counter

def count_pips(mana_cost):
    """Applied to the mana_cost column in lexicon to create columns for the number of pips of each color

    Args:
        mana_cost (string): the mana cost as a str

    Returns:
        Counter: histogram of mana cost pips
    """
    pips = Counter(mana_cost)

    # Count hybrid mana as 1/2 a regular cost
    if '/' in pips:
        for i in range(len(mana_cost)):
            if mana_cost[i] == '/':
                try:
                    pips[mana_cost[i-1]] -= 0.5
                    pips[mana_cost[i+1]] -= 0.5
                except IndexError:
                    pass

    return pips

lexicon = lexicon.copy()  # Perform operations on a copy of the original df

# colors is a list of the five Magic color symbols ('W', 'U', 'B', 'R', 'G')
for color in colors:
    row_name = f'cast_cost_{color}'
    lexicon.loc[:, row_name] = lexicon['mana_cost'].apply(lambda row: count_pips(row)[color])

Both fortunately and unfortunately, Magic isn't filled with only big, stupid creatures. There are a multitude of different card effects which increase the game's complexity. This means that we can't look only at the cost to cast a card.

— Rules Text —

Llanowar Elves has a special kind of rule text, he produces a green mana by tapping

That's right, some creatures can produce colored mana. How does that change things? Well, the more creatures, artifacts, or really anything else that can produce mana we have, the more potential we have to play fewer lands and still hit our mana requirements.

We can handle this in our code by using the Counter() class on the produced_mana column in our lexicon dataframe, which contains all of our card data.

def count_produced_mana(produced_mana):
    """Applied to the produced_mana column, returns the colors produced by a card

    Args:
        produced_mana (List[str]): the mana symbols the card can produce

    Returns:
        Counter: a histogram of the produced mana symbols in the card
    """
    return Counter(produced_mana)

lexicon = lexicon.copy()  # Perform operations on a copy of the original df

# colors is a list of the five magic colors
for color in colors:
    row_name = f'produces_{color}'
    lexicon.loc[:, row_name] = lexicon['produced_mana'].apply(lambda row: count_produced_mana(row)[color])

There are quite a few other considerations we can make among different rules texts. Here I’ll list them all out and give an explanation why they’re important instead of going through all of the code (if there’s interest I can write another article with more detail).

  • Looting / Card Draw: Card selection is a very important part of making sure you hit your land drops or can filter out unneeded land draws. Higher looting and card draw counts in a deck correlate with a lower mana base requirement.
  • Treasure Tokens: Treasures are artifact tokens that can be sacrificed to produce one mana. Lots of treasure token creation also correlates with a lighter mana base requirement.
  • Reduced / Free Spells: Certain cards can reduce their own mana cost or the cost of other cards, or even let you play themselves or other cards for free. Again, this leads to a slightly lighter mana base requirement.
  • X-costed cards: Cards with one or more {X} in their mana cost can greatly increase the load on our mana base.

The previous six features required some fancy regex parsing of the rules text. Here's a taste of one of the crazier regex strings, used in part to find cards which reduce their casting cost:

r"this spell costs (\{[\dx]\})?((\{[\/wubrgc]\})+)? less to cast"

Here’s the explanation for anyone interested
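
And here's a quick, hypothetical example of the pattern in action, run against lowercased rules text. The sample_text and the reduce_cost_* names are purely illustrative; the actual parsing in my notebook is wrapped up in a larger feature-extraction function.

import re

# The cost-reduction pattern shown above
reduce_cost_pattern = re.compile(r"this spell costs (\{[\dx]\})?((\{[\/wubrgc]\})+)? less to cast")

# Hypothetical lowercased rules text (Affinity reminder text)
sample_text = "affinity for artifacts (this spell costs {1} less to cast for each artifact you control.)"

reduce_cost_match = reduce_cost_pattern.search(sample_text)
print(bool(reduce_cost_match))     # True
print(reduce_cost_match.group(1))  # '{1}'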

Nice, we were able to extract some very important features just by analyzing the card text and handling various edge cases which might affect deckbuilding decisions. However, we're missing one very important edge case. Some cards have two faces.

— Dual-Faced Cards (DFC) —

“Esika, God of the Tree” and its alternate side “The Prismatic Bridge”

Talk about annoying! Not only are these casting costs completely different, but they are also at different points in the mana curve and do completely different things. Esika can produce mana and lets other legendary creatures produce mana, while the Bridge can cheat some cards into play. How can we possibly create a single casting cost while accounting for all of the rules text on both faces, while also remembering that in reality these faces are not played at an equal 50/50 ratio?

The only way to fairly understand the number of times you'll cast the first side versus the second is to have a lot of data on how many times each card is played and, well, Magic: The Gathering Arena, the official digital platform for playing Magic, does NOT provide any API for this.

In order to get to that point, we first have to split the raw data, which is stored as a list of two dictionaries that looks like:

[{k: v, ..., k_n: v_n}, {k2: v2, ..., k2_n: v2_n}]

First, we split the list of dictionaries into separate columns for each face, labeling them as card_1 and card_2 columns, and merge them back together. Then we can run all of the logic we built earlier and count everything for both sides of the card. So, in the case of Esika, the card would both produce mana and play free spells. Once we understand what each individual face does, we can aggregate the features we care about and merge everything back into the main card dataframe.
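
Here's a minimal sketch of what that splitting step could look like with pandas. The _1/_2 suffixes match the columns used in the averaging function below, but the exact column names and implementation in my notebook differ slightly (my merge later uses name2, for example).

# Split the Scryfall 'card_faces' list into per-face columns (sketch)
dfc_mask = lexicon['card_faces'].notna()

face_1 = lexicon.loc[dfc_mask, 'card_faces'].apply(lambda faces: pd.Series(faces[0])).add_suffix('_1')
face_2 = lexicon.loc[dfc_mask, 'card_faces'].apply(lambda faces: pd.Series(faces[1])).add_suffix('_2')

# Keep only the face fields we care about and attach them back to the lexicon
lexicon = lexicon.join(face_1[['name_1', 'mana_cost_1', 'oracle_text_1']])
lexicon = lexicon.join(face_2[['name_2', 'mana_cost_2', 'oracle_text_2']])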

As for the mana cost, we can't simply sum the two faces together because that would leave us with an inflated cost. We're never paying {1}{G}{G}{W}{U}{B}{R}{G} since we're only ever casting one side. Instead, we'll average the mana cost while favoring the color pips, since they are more likely to tax your mana base: we halve everything, rounding up for colored pips and down for generic costs. The converted mana cost we can simply average, which in a perfect world would be weighted toward the side that sees more play.

import numpy as np
import math as m

def dfc_avg_mana_cost_and_cmc(row):
    """Averages the mana cost and the cmc for dual-faced cards

    Args:
        row (pd.Series): one row of the lexicon df

    Returns:
        pd.Series: the row with averaged mana_cost and cmc
    """
    # Only for dual-faced cards (card_faces is stored as a numpy array)
    if isinstance(row.card_faces, np.ndarray):
        # Initialize return value
        mana_cost = ''
        # Only for DFCs with mana costs
        try:
            # Count the values of each mana symbol and add them together
            s1 = Counter(row.mana_cost_1)
            s2 = Counter(row.mana_cost_2)
            s3 = s1 + s2
            # If either side is a land then just return the mana cost of the nonland face
            if s3 == s1:
                row.mana_cost = row.mana_cost_1
                return row
            elif s3 == s2:
                row.mana_cost = row.mana_cost_2
                return row
            s4 = dict(s3)
            # Divide all values by 2, rounding up for colored mana symbols and down for generic costs
            for k in s4.keys():
                try:
                    # If this is a generic casting cost
                    int(k)
                    s4[k] = max(m.floor(s4[k] / 2), 1)
                except ValueError:
                    # Otherwise it is a colored mana symbol
                    s4[k] = m.ceil(s4[k] / 2)
                # Concatenate the dict back into a string
                mana_cost += (k * s4[k])
            # Average the cmc, which is part of the return value
            avg_cmc = sum(s3.values()) / 2
            row.cmc = avg_cmc
            row.mana_cost = mana_cost
            # Return the row with the averaged mana cost
            return row
        except Exception:
            # Faces without usable mana costs (e.g. NaN) fall through to the default return
            pass
    # If it's not a DFC then just return the mana_cost already present
    return row

lexicon = lexicon.apply(dfc_avg_mana_cost_and_cmc, axis=1)

It is important to note that there are some dual-faced cards which have a land on the back and a spell on the front. These are still going to be counted as lands for the purpose of our target variable.

— Lands —

All that’s left is to count up the number of lands to use as part of our target variables.

def identify_land(row):
    # Encodes whether a card is a land (1) or not (0)
    try:
        return 1 if 'Land' in row else 0
    except TypeError:
        return 0

lexicon['is_land'] = lexicon['type_line'].apply(identify_land)

Our target variables are the total number of lands in the deck and the number of mana producers (lands included) for each of the five colors plus colorless, for a total of seven outputs.
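
To make that concrete, here's a rough sketch of how those seven targets could be assembled once we have per-deck data in Step 2. The produces_* and is_land columns come from the feature extraction above (including a colorless produces_C column handled the same way); the exact aggregation inside my pipeline differs slightly.

def build_targets(deck_df):
    """Rough sketch: assemble the seven targets for one deck.

    Assumes each row is a unique card with an 'amount' column plus the
    produces_* / is_land features built earlier (names are illustrative).
    """
    targets = {}
    for color in ['W', 'U', 'B', 'R', 'G', 'C']:
        produces_color = deck_df[f'produces_{color}'] > 0
        # Every copy of a card that can make this color counts as a source (lands included)
        targets[f'color_sources_{color}'] = int(deck_df.loc[produces_color, 'amount'].astype(int).sum())
    targets['number_of_lands'] = int((deck_df['is_land'] * deck_df['amount'].astype(int)).sum())
    return pd.DataFrame([targets])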

Hooray! We’re done with the preprocessing and have a usable card database! Now that we can identify each individual card in a deck, we will need a lot of decks for our training data.

Step 2: Scraping for Decks

We'll need decks which are highly representative of thoughtful deckbuilding practices, so my initial thought was to use the competitive deck-sharing website mtgtop8, which has tournament data for multiple constructed formats. I found it difficult to find public datasets with deck data, so I accepted the challenge of scraping the data from this website, which does not provide any APIs or downloads. Note that scraping is not against the website's terms of service. I'll be going over my scraping process in broad strokes and will likely revisit it in more detail in a later article.

That’s a good looking soup

BeautifulSoup (BS) is a Python library built for scraping by parsing HTML or XML documents. This will be very useful, even though I didn't know much about HTML going into this. Browsers expose a page's structure through the Document Object Model (DOM), which makes interacting with it programmatically much easier. HTML is the common front-end markup language, while JavaScript is used as a client-side scripting language for finer control. This matters because we will have to inspect the website for specific elements that we can then target through BS, chaining several of them together in order to get to a deck. Let's look at the front page of mtgtop8.com to see what I mean.

MTG Decks Database (mtgtop8.com)

We see a page full of interactable elements. What we care about most are the formats listed at the top. If we right-click on a format and press Inspect, we can look at the devtools.

Here, we can learn how the page is structured and make use of that knowledge in BS.

This element has an <a> tag

We discovered something — This is an HTML anchor element <a>, with the attribute class="menu_item", which defines a hyperlink to another web page. The href attribute specifies the URL that the hyperlink points to (in this case, "/format?f=ST"), which is usually a different page on the same website or on a different website. The text content between the opening and closing <a> tags ("STANDARD") is the visible text of the hyperlink, i.e. the thing we right-clicked on.

What we care about here is the “/format?f=ST” string. Clicking on any of the formats changes the URL in our browser to include that format string, which takes us to a new page. We will have to collect all of these unique format strings so that we can navigate to each format in our program. Here’s what the Standard page looks like after navigating there.

Highlighted in yellow is an “archetype”

As you can see, there are a ton of different deck archetypes, ranging from aggressive (aggro) to control to midrange to combo. Each of these archetypes has its own list of decks we can access. To access each archetype, we again have to do a little investigative inspection. Inspecting an archetype link yields the following string:

"archetype?a=289&meta=58&f=ST"

I discovered that each archetype has a distinct code which loads a different version of the same page. Here, "Abzan Aggro" has an archetype code of a=289.

We’ll get a list of all of the archetypes on this page with the following function:

import requests
from bs4 import BeautifulSoup as bs

def get_archetypes(url):
    """
    Extracts the archetype URLs from a web page using BeautifulSoup.

    Args:
        url (str): The URL of the web page to scrape.

    Returns:
        A list of strings, where each string is a URL representing an archetype.
    """
    archetypes = []

    response = requests.get(url)

    # Parse the HTML content of the response and create a BeautifulSoup object
    soup = bs(response.content, 'html.parser')

    # Find all hyperlinks in the parsed HTML that contain an 'href' attribute
    for link in soup.find_all('a', href=True):

        # Retrieve the value of the 'href' attribute for each hyperlink
        href = link.get('href')

        # Check if the string 'archetype?' is present in the 'href' value
        if 'archetype?' in href:
            archetypes.append(href)

    print(f'ARCHETYPES: \n{archetypes}')
    return archetypes

First, we're using the popular third-party requests library. requests is used to send HTTP requests to a web server and receive response data, making it easier to work with web APIs and other web-based data sources in Python code.

This function takes a single argument url, which is the URL of a web page to scrape for archetype URLs. It uses the requests library to send an HTTP request to the given URL and retrieve the HTML content of the response.

It then creates a BeautifulSoup object from the HTML content using the 'html.parser' parser. It searches the parsed HTML content for all hyperlinks (<a> tags) that have an href attribute, and retrieves the value of the href attribute for each hyperlink.

If the retrieved href value contains the string "archetype?", it is considered to be an archetype URL and is added to a list of archetypes. Collecting all of the archetypes in a list this way helps us loop over them and get a list of urls that contain a deck in them.
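
The format links from the front page ("/format?f=ST" and friends) can be collected in exactly the same way. Here's a minimal sketch; get_formats is my own name for it, not something from the original notebook.

# Collect the per-format href strings (e.g. '/format?f=ST') from the front page
# (reuses the requests / bs imports from get_archetypes above)
def get_formats(url='https://www.mtgtop8.com/'):
    response = requests.get(url)
    soup = bs(response.content, 'html.parser')
    return [link.get('href') for link in soup.find_all('a', href=True)
            if 'format?' in link.get('href')]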

We can see a pattern developing here: inspect the elements you care about and define a key characteristic that our program can target in order to get to the next piece of the information puzzle. Next, we would like to click on each deck. Let's inspect and see what we find:

"event?e=40808&d=503529&f=ST"

The first deck in the list has a deck code of d=503529. We use a similar function to get_archetypes(url) called get_decks(url) to grab a list of deck urls that we can loop over. Great, now let’s see what happens after we click on the deck.
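
Since I don't show get_decks in full here, a minimal sketch of it would look something like this, mirroring get_archetypes but filtering for the event?...d=... links shown above (the exact filter in my notebook may differ).

def get_decks(url):
    """Collect deck hrefs (e.g. 'event?e=40808&d=503529&f=ST') from an archetype page."""
    response = requests.get(url)
    soup = bs(response.content, 'html.parser')

    decks = []
    for link in soup.find_all('a', href=True):
        href = link.get('href')
        # Deck links live on event pages and carry a d= deck id
        if 'event?' in href and 'd=' in href:
            decks.append(href)
    return decks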

Finally we are able to see a deck

From this page we're finally able to export the deck by clicking on either the MTGA or the MTGO button. Since not all formats are supported by MTGA, let's go with MTGO for now. I used a nifty library called Selenium, which uses a web driver to simulate a browser. This allows us to interact with the page and click on the MTGO button to download the deck to our computer. Then we can use the os module to delete the file once we have saved its contents.
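
I won't reproduce the full scraping loop, but the Selenium portion looked roughly like this. The XPath for the MTGO export button is an assumption on my part, since mtgtop8's markup can change, and deck_href is a value returned by get_decks above.

from selenium import webdriver
from selenium.webdriver.common.by import By

# Selenium drives a real browser, so clicking the export link triggers a file download
driver = webdriver.Chrome()
driver.get('https://www.mtgtop8.com/' + deck_href)  # deck_href comes from get_decks()

# Locate and click the MTGO export button (this locator is an assumption, not mtgtop8's actual markup)
mtgo_button = driver.find_element(By.XPATH, "//a[contains(@href, 'mtgo')]")
mtgo_button.click()

driver.quit()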

Finalizing our Training Data

We've come a long way, and we have just a bit more work before we can start training our model. So far, we've been able to capture our card data and extract features from the cards which we think will be relevant for our model to learn from. We've also identified a representative deck dataset, primarily for constructed formats, which can provide our model with a good path to learning what a good mana base should look like.

We must merge our card and deck data together, joining on the card name.

merged_df = pd.merge(left=_deck, right=card_df, left_on='card', right_on='name2', how='left')

Right now, we have a big list of all of our decks, but each deck spans around 20 rows, and we really need to consolidate that, as every deck should reside in a single row before we feed it into the model for training. This is simple enough. Each deck is formatted as n card_name\n, where n is an amount from 1–4, card_name is a unique card, and \n denotes a new line, with this format repeated for each unique card. The amount is always the first character of the line, so we can just slice the string and return the card and the amount in separate columns in a copy of our dataframe.

# Extract the card name and amount from each line of the deck list
def extract_card_amount(_deck):
    def extract_card(row):
        try:
            card = row[1:].strip()
            return card
        except Exception:
            return row

    def extract_card_amt(row):
        try:
            amount_of_card = row[0]
            return amount_of_card
        except Exception:
            return row

    _deck = _deck.copy()
    _deck['card'] = _deck['card+amt'].apply(extract_card)
    _deck['amount'] = _deck['card+amt'].apply(extract_card_amt)

    _deck = _deck.drop(columns='card+amt')

    return _deck

Then, we can multiply all of our columns by the amount, since all of our columns besides the converted mana cost are simple per-card counts. We can do this with the following:

# Multiply relevant columns by the amount col
def multiply_amounts(row):
    try:
        return row * int(row.amount)
    except Exception:
        return row
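
Presumably this is then applied row-wise across the merged deck dataframe, something like the following (a sketch of the call site, which isn't shown above):

# Apply the per-row multiplication across the merged deck dataframe
_deck = _deck.apply(multiply_amounts, axis=1)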

Then, we can collapse this deck into a single row by summing up the values and returning a new pd.DataFrame object.

def sum_columns_to_single_row(df):
    # Sum every column of the deck dataframe down to one value per column
    summed_values = {column: int(df[column].sum()) for column in df.columns}
    return pd.DataFrame([summed_values])

Just keep in mind that we're performing these operations on both our input variables and our target variables. Finally, we can loop over our big list of decks and run each one through a function named create_io, which houses the previous functions we discussed along with some other handling like dropping columns and other miscellaneous cleanup. We can use a library called tqdm for a progress bar that fills us in on how far along the process is.

from tqdm import tqdm

for i in tqdm(range(num_decks), desc="Processing decks"):
    deck_i = deck_df.iloc[i][0]
    try:
        x, y = create_io(deck_i)
    except Exception as e:
        print(f'Failed to append deck: \n{deck_i}\nError Message: \n\t{e}')
        continue
    # Note: DataFrame.append is deprecated in newer pandas; pd.concat is the modern equivalent
    input = input.append(x)
    output = output.append(y)

Now we’re done with getting our usable training data! We have around 9600 decks to play with, and this is almost the final form of what we’ll use to train, although we will do some final transformations.

Step 3: Building a Model

Remove Outliers

One thing I noticed after I had trained a model and was evaluating it was that there was a subset of decks with 0 lands which were complete outliers, or just very creative deck design choices that our model didn't really handle well. I decided to remove these outliers and excluded decks with fewer than 10 or more than 35 lands. This improved the model significantly.
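
In code, that filter is just a boolean mask on the land-count target. Here's a sketch; number_of_lands matches the target column listed later, and the exact bounds handling in my notebook may differ slightly.

# Keep only decks with a sane land count (10 to 35 inclusive)
sane_lands = output['number_of_lands'].between(10, 35)

input_no_outlier = input[sane_lands].reset_index(drop=True)
output_no_outlier = output[sane_lands].reset_index(drop=True)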

Stratify

Another thing I decided after training was to stratify my dataset. Imagine if, when splitting my training data, for some reason my model only saw decks with a very high cmc value. It goes without saying that the model would not generalize well to decks with low cmc, as it just wouldn't have the available knowledge to do so. Stratifying means you divide your data into subgroups based on a certain category, in this case the cmc, and then ensure that the training and test splits match this category's distribution. We can use a pandas function called pd.qcut to achieve this.

input_no_outlier['cmc_cat'] = pd.qcut(input_no_outlier['avg_cmc'], q=5, labels=False)

This will create 5 roughly equal-sized subgroups based on the deck's average cmc.

Then we can use the StratifiedShuffleSplit class from scikit-learn to split our data. The code looks like this:

from sklearn.model_selection import StratifiedShuffleSplit as sss

# Stratify the input data to preserve the cmc distribution in training and testing set
split = sss(n_splits=1, test_size=0.2, random_state=69)

for train_index, test_index in split.split(input_no_outlier, input_no_outlier['cmc_cat']):
    X_train = input_no_outlier.iloc[train_index]
    X_test = input_no_outlier.iloc[test_index]
    y_train = output_no_outlier.iloc[train_index]
    y_test = output_no_outlier.iloc[test_index]

Then we just have to drop our categorical column cmc_cat.
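
That's a one-liner on each split (a quick sketch):

# Remove the stratification helper so it isn't used as a training feature
X_train = X_train.drop(columns='cmc_cat')
X_test = X_test.drop(columns='cmc_cat')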

Model Choice

When deciding between a neural network and a random forest model for a land recommender system, there are several considerations and trade-offs to keep in mind. A neural network can learn more complex patterns in the data and can potentially achieve better accuracy, but it requires more data and computational power to train. On the other hand, a random forest model is more interpretable and less prone to overfitting and can be trained with smaller datasets. Random forest models are also better suited for handling categorical variables and feature interactions, which can be important factors in determining the appropriate lands for a Magic: The Gathering deck.

I decided on the random forest model, as my dataset was nothing to write home about just yet and it seemed like a better "out of the box" fit.

Our target matrix has 7 outputs:

  • color_sources_W
  • color_sources_U
  • color_sources_B
  • color_sources_R
  • color_sources_G
  • color_sources_C
  • number_of_lands

This means that we’ll need a regressor for multiple outputs. Scikit-Learn has just the thing.

from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor

Hyperparameter choices

We'll use another useful scikit-learn tool, GridSearchCV, to search a hyperparameter grid, training a model for each combination in order to choose the parameters which perform best.

First, we’ll define our grid:

# Define the hyperparameter grid to search
param_grid = {
    'estimator__n_estimators': [10, 50, 100, 200],
    'estimator__max_depth': [None, 10, 20, 30],
    'estimator__min_samples_split': [2, 5, 10, 15],
    'estimator__min_samples_leaf': [1, 2, 5, 10]
}

We'll have to wrap our random forest regressor within the multi-output wrapper,

# Instantiate the RandomForestRegressor
rf_gridsearch = RandomForestRegressor(random_state=42)

# Wrap the RandomForestRegressor with MultiOutputRegressor
multi_output_rf = MultiOutputRegressor(rf_gridsearch)

and then perform the grid search which will tune for the best hyperparameters.

from sklearn.model_selection import GridSearchCV

# Instantiate the GridSearchCV with cross-validation
grid_search = GridSearchCV(estimator=multi_output_rf, param_grid=param_grid,
                           cv=5, scoring='neg_mean_squared_error', verbose=1, n_jobs=-1)

# Fit the GridSearchCV on the training data
grid_search.fit(X_train, y_train)

This code first instantiates a GridSearchCV object, which performs an exhaustive search over a specified hyperparameter grid for a given estimator — in this case, a multi-output RandomForestRegressor. It uses 5-fold cross-validation, optimizes based on the negative mean squared error, and parallelizes the search with the n_jobs=-1 parameter.

Next, it fits the GridSearchCV on the training data, effectively finding the optimal hyperparameter combination for the RandomForestRegressor model according to the chosen scoring metric.

Prediction

Now that we have the optimal hyperparameter combination for our multi-output regression model and have fit that model to our training variables, we can predict on our test set.

from sklearn.metrics import mean_squared_error, r2_score, mean_absolute_error

best_params = grid_search.best_params_
print("Best hyperparameters found by grid search:")
print(best_params)

best_model = grid_search.best_estimator_

# Make predictions using the best model
y_pred = best_model.predict(X_test)
# Optionally convert the predictions back to a DataFrame with the original column names:
# y_pred = pd.DataFrame(y_pred, columns=y_test.columns)

# Calculate the root mean squared error
# (note the 0.5 exponent: `** 1/2` would be parsed as (x ** 1) / 2, not a square root)
rmse_per_target = mean_squared_error(y_test, y_pred, multioutput='raw_values') ** 0.5
print("Root Mean Squared Error for each target variable:", rmse_per_target)

# Calculate the R-squared score
r2_scores_per_target = r2_score(y_test, y_pred, multioutput='raw_values')
print("R-squared scores for each target variable:", r2_scores_per_target)

# Calculate the mean absolute error
mae_per_target = mean_absolute_error(y_test, y_pred, multioutput='raw_values')
print("Mean Absolute Error for each target variable:", mae_per_target)

This code retrieves the best hyperparameters found by the grid search and prints them. It then extracts the best estimator (the model with the optimal hyperparameters) and uses it to make predictions on the test data. The code also calculates the Root Mean Squared Error (RMSE), R-squared score, and Mean Absolute Error (MAE) for each target variable, which are performance metrics for the model. These metrics provide insight into how well the model with the best hyperparameters performs on unseen data for each of the target variables.

Evaluation

Let's check out our evaluation metrics to see how our model performed!

Not great, but not bad. We achieved a Root Mean Squared Error of ~2.3 for our land-count target, which means our predictions for that variable are typically off by a couple of lands. Furthermore, an R-squared value of 0.77 for the number of lands indicates that we may have some uncaptured relationships in our features. Let's look at the residual plots and see if this is true.

Residual plots display the residual values on the y-axis and the fitted values, or another variable, on the x-axis. After you fit a regression model, it is crucial to check the residual plots. If your plots display unwanted patterns, you can't trust the regression coefficients and other numeric results. The residuals should be randomly scattered around 0, with no predictability between the expected and observed values. Here, we can see some weird downward lines forming a "stitching" pattern.

This confirms the suspicion that there are probably some uncaptured variables and our model could be improved.
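
For reference, here's roughly how such a residual plot can be produced with matplotlib. This is a sketch; it assumes the last column of y_pred corresponds to number_of_lands, matching the target order listed earlier.

import matplotlib.pyplot as plt

# Residuals for the number_of_lands target (assumed to be the last output column)
fitted = y_pred[:, -1]
residuals = y_test['number_of_lands'].to_numpy() - fitted

plt.scatter(fitted, residuals, alpha=0.3)
plt.axhline(0, color='red', linestyle='--')
plt.xlabel('Fitted number of lands')
plt.ylabel('Residual')
plt.title('Residuals vs. fitted values (number_of_lands)')
plt.show()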

Improvements and Other Considerations

There are a few avenues I could have gone down to improve the accuracy of the model, and although it would be useful to follow through on these, for the purpose I originally set out with, this kind of model doesn't need to be perfect as it's not being productionized.

Here are some things I could have done to improve:

  • Feature Extraction: Spending more time here could help us create a better correlation between our input and target variables. Based on our residual plots and R-squared score, we cannot be confident that our model was able to generalize as well as it could have
  • Polynomial Features: One thing I explored was raising the degree of our model to increase its complexity and determine whether there were any non-linear relationships present. I explored up to degree 6 using sklearn's PolynomialFeatures; however, increasing the degree of complexity did not produce a clear increase in the accuracy of the model
  • More Data: Another great way to introduce some more representative data would be to generate some new samples. I found a website called 17lands.com which publicly hosts draft data for highly successful decks. They provide the raw data on their website, which would eliminate the need to scrape. We could then use this data to augment the generalization capabilities of the model
  • Other Models: While I did train a few different algorithms including neural networks, random forest, and extreme gradient boosted forest, I decided to stick with random forest for its ease of use. It's unclear whether fine-tuning another model would yield better results.

Final Thoughts

Thanks for sticking it out with me. This was certainly a fun and challenging project, and although the results are not perfect or at a standard I would feel comfortable productionizing, I'm happy to shelve it for now and perhaps work on improving it later. If you have any suggestions or comments, feel free to share them, and I'll be happy to take a look or respond.

You can find the project hosted on my GitHub here.
