Introducing itscalledsoccer

By Brian Greenwood and Tyler Richardett

Throughout the latter half of 2021, American Soccer Analysis has been working on an R and Python library to make it easier to interact with our data programmatically. Today, we are happy to announce the release of the library, which we've dubbed itscalledsoccer. If you want to get started right away, the library is available for download from CRAN and PyPI, and the source code is available on GitHub.

In this article, we'll talk a bit about how we built the library and walk through a basic example or two.

Writing A Library

At a high level, itscalledsoccer is a wrapper around the American Soccer Analysis API that powers our interactive tables. Many of the functions the library provides are simply API calls with some sensible defaults. When initially designing the library, we considered writing the core functionality in C++ and having both versions call the shared C++ code directly, much like how xgboost is written. But given that we didn't have a lot of C++ expertise, we decided to maintain separate codebases for the R and Python versions. In essence, the two versions are separate entities that we manually keep in sync. Both versions have caching enabled, so the results of repeated calls are stored locally, which speeds up those calls and reduces load on the API. We also have rate limiting enabled on the API, and the library handles this gracefully by executing large queries in batches.
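
To make the caching and batching ideas concrete, here is a minimal Python sketch. It is not the library's actual implementation: the class CachedBatchClient, the helper chunked, and the fake_fetch stand-in are hypothetical names invented for illustration, and no real API request is made.

```python
from typing import Callable, Dict, Iterable, List, Tuple

CacheKey = Tuple[str, Tuple[str, ...]]


def chunked(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield successive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CachedBatchClient:
    """Hypothetical wrapper that caches responses and batches large queries."""

    def __init__(self, fetch: Callable[[str, Tuple[str, ...]], List[dict]],
                 batch_size: int = 2) -> None:
        self._fetch = fetch            # stand-in for the real HTTP request
        self._batch_size = batch_size  # assumed limit on IDs per request
        self._cache: Dict[CacheKey, List[dict]] = {}

    def get(self, endpoint: str, ids: List[str]) -> List[dict]:
        rows: List[dict] = []
        for batch in chunked(ids, self._batch_size):
            key = (endpoint, tuple(batch))
            if key not in self._cache:  # only call the API on a cache miss
                self._cache[key] = self._fetch(endpoint, key[1])
            rows.extend(self._cache[key])
        return rows


calls = []  # records every simulated API request


def fake_fetch(endpoint: str, ids: Tuple[str, ...]) -> List[dict]:
    """Simulated API: logs the request and echoes the IDs back as rows."""
    calls.append((endpoint, ids))
    return [{"player_id": i} for i in ids]


client = CachedBatchClient(fake_fetch, batch_size=2)
first = client.get("players", ["a", "b", "c"])   # two batched requests
second = client.get("players", ["a", "b", "c"])  # served from the cache
print(len(calls), first == second)
```

In this sketch, three IDs with a batch size of two produce two requests on the first call, and the identical second call produces none, which is the behavior the paragraph above describes in spirit.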

How To Use itscalledsoccer

To get started, install the library from PyPI (Python) or CRAN (R).

# Python

pip install itscalledsoccer

# R

install.packages("itscalledsoccer")

Once the library is installed, you'll want to import it and create a client.

# Python

from itscalledsoccer.client import AmericanSoccerAnalysis

asa_client = AmericanSoccerAnalysis()

# R

library(itscalledsoccer)

asa_client <- AmericanSoccerAnalysis$new()

Once you have a client, querying data can be done by calling any of the get_* functions. For instance, if I wanted to get goals added for a particular Philadelphia Union goalkeeper, I would use the get_goalkeeper_goals_added function. All the library functions return a data frame.

# Python

df = asa_client.get_goalkeeper_goals_added(leagues="mls", player_names="Andre Blake")

# R

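# unnest() expands the nested data column; it is called directly so no pipe operator needs to be loaded
df <- tidyr::unnest(asa_client$get_goalkeeper_goals_added(leagues = "mls", player_names = "Andre Blake"), data)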

player_id   team_id     minutes_played  action_type   goals_added_raw  goals_added_above_avg  count_actions  competition
7VqGmob3Mv  9z5k7Yg5A3  17805           Claiming      -0.2881          -0.2653                279            mls
7VqGmob3Mv  9z5k7Yg5A3  17805           Fielding      -1.1517          -0.1366                2279           mls
7VqGmob3Mv  9z5k7Yg5A3  17805           Handling      -0.8955          -0.9808                566            mls
7VqGmob3Mv  9z5k7Yg5A3  17805           Passing       11.3168          -0.205                 5203           mls
7VqGmob3Mv  9z5k7Yg5A3  17805           Shotstopping  -1.4373          3.0576                 805            mls
7VqGmob3Mv  9z5k7Yg5A3  17805           Sweeping      -0.4462          -0.1893                373            mls
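
As a small worked example of what you might do with this output, the Python sketch below totals the goals added above average across action types and picks out the strongest one. To stay runnable without calling the API, it assumes the rows shown above have been copied into a string (identifier columns omitted); in practice you would run the same kind of aggregation on the returned data frame. The variable names are ours, chosen for illustration.

```python
# Rows copied from the output above; this is an offline stand-in for the
# data frame the library returns, not a call to the API.
TABLE = """\
action_type goals_added_raw goals_added_above_avg count_actions
Claiming -0.2881 -0.2653 279
Fielding -1.1517 -0.1366 2279
Handling -0.8955 -0.9808 566
Passing 11.3168 -0.205 5203
Shotstopping -1.4373 3.0576 805
Sweeping -0.4462 -0.1893 373
"""

lines = TABLE.strip().splitlines()
header = lines[0].split()
# One dict per row, keyed by column name
records = [dict(zip(header, line.split())) for line in lines[1:]]

# Goals added above average, per action type
above_avg = {r["action_type"]: float(r["goals_added_above_avg"]) for r in records}

total_above_avg = round(sum(above_avg.values()), 4)  # total across all action types
best_action = max(above_avg, key=above_avg.get)      # action type with the highest value
total_actions = sum(int(r["count_actions"]) for r in records)

print(total_above_avg, best_action, total_actions)
```

Shotstopping is the only action type in these rows with a positive above-average value, so it accounts for the entire positive total.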

If I wanted to get demographic data of all past and present NWSL managers, I would use the get_managers function.

# Python

df = asa_client.get_managers(leagues="nwsl")

# R

df <- asa_client$get_managers(leagues = "nwsl")

manager_id  manager_name      nationality
e7MzExPQr0  Amy LePeilbet     USA
NPqxODg59d  Becky Burleigh    USA
wvq9mKNqWn  Christy Holly     Northern Ireland
gOMnE7XQwN  Craig Harrington  England
lgpMOr7Qzy  David Hodgson
NWMWR7eQlz  Denise Reddy      USA

Looking Ahead

Again, in the interest of openness (and of getting others to do our work for us), we’d like to highlight that all of the library’s source code is publicly available on GitHub. If you’d like to report a bug or request a new feature, please open an issue. Or, if you’d like to contribute code, we’ve left some instructions there as well. Our application and API run on modest compute resources, so we ask that you be mindful when using the library.

We plan to add new features over time, hopefully starting with a CLI. We hope that itscalledsoccer provides another useful way for folks to interact with our data. If you do make something with the library, we just ask that you give us proper attribution. And we would love for you to share it with us!