Alright, so yesterday I was messing around trying to pull player stats from that 49ers vs. Arizona Cardinals game. Here’s the lowdown on how I tackled it.
First things first, the data source. I wasn’t about to manually copy stuff, so I started hunting for a decent API. Found one that seemed promising – it spat out JSON, which is usually pretty easy to work with. Bookmarked that bad boy.
Next up, the code. I decided to keep it simple and used Python. I kicked things off with the usual suspects:
requests – for hitting the API.
json – for parsing the JSON response.
Pretty basic stuff, I know.
Then I wrote a quick function to actually grab the data. Something like this:
import requests
import json

def get_game_stats(game_id):
    url = f"API_ENDPOINT_HERE/{game_id}"  # I put the actual endpoint here but removed it for the example
    response = requests.get(url)
    response.raise_for_status()  # Raise HTTPError for bad responses (4xx or 5xx)
    return response.json()
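Important bit: the response.raise_for_status() line. Saved me a bunch of headaches later when the API was being flaky, because a 4xx or 5xx response blows up right there instead of handing me garbage to parse. Seriously, don’t skip error handling!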
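Since the API was flaky for me, here’s a rough sketch of one way to retry a failing call a few times before giving up. To be clear, retry_call, its parameters, and the fake flaky_fetch function are all made up for this example; none of it comes from the API or from requests itself.

```python
import time

def retry_call(func, attempts=3, delay=0.5, retry_on=(Exception,)):
    """Call func() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == attempts:
                raise  # out of tries: let the caller see the last error
            time.sleep(delay)

# Stand-in for a flaky API call (hypothetical): fails twice, then succeeds.
calls = {"count": 0}

def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network hiccup")
    return {"teams": []}

result = retry_call(flaky_fetch, attempts=3, delay=0)
print(result, "after", calls["count"], "calls")
```

In the real script I’d probably wrap the fetch like retry_call(lambda: get_game_stats(game_id), retry_on=(requests.exceptions.RequestException,)), so only network and HTTP errors get retried and genuine bugs still surface right away.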
Diving into the JSON
Okay, so now I had the JSON data. Time to figure out the structure. I printed it out (or rather, pretty-printed it using print(json.dumps(data, indent=4)), which makes life so much easier) and started poking around.
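For anyone who hasn’t done the pretty-print thing before, here’s a small sketch. The sample dict is a cut-down stand-in for the real response, and the outline() helper is something I’m making up here for illustration: it just lists keys and types a few levels deep, which can be easier to scan than the full dump.

```python
import json

# Cut-down stand-in for the API response (not the real payload).
sample = {
    "gameInfo": {},
    "teams": [
        {
            "teamName": "49ers",
            "players": [
                {"name": "Brock Purdy", "stats": {"passingYards": 283, "touchdowns": 1}}
            ],
        }
    ],
}

# Indented dump: one key per line, nesting shown by indentation.
pretty = json.dumps(sample, indent=4)
print(pretty)

def outline(node, depth=0, max_depth=3):
    """Return lines naming each key and its value type, down to max_depth."""
    lines = []
    if depth > max_depth:
        return lines
    pad = "  " * depth
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(f"{pad}{key}: {type(value).__name__}")
            lines.extend(outline(value, depth + 1, max_depth))
    elif isinstance(node, list) and node:
        # Only look at the first element; the rest usually share its shape.
        lines.append(f"{pad}[0]: {type(node[0]).__name__}")
        lines.extend(outline(node[0], depth + 1, max_depth))
    return lines

print("\n".join(outline(sample)))
```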
The JSON was nested like crazy. Player stats were buried deep inside, under keys like "teams", "players", and then finally, individual player info. Looked something roughly like this:
{
    "gameInfo": { ... },
    "teams": [
        {
            "teamName": "49ers",
            "players": [
                {
                    "name": "Brock Purdy",
                    "stats": {"passingYards": 283, "touchdowns": 1}
                },
                ...
            ]
        },
        {
            "teamName": "Cardinals",
            "players": [ ... ]
        }
    ]
}
Ugh. Nested loops were definitely in my future.
Extracting the Data
I wrote some code to loop through the teams, then through each team’s players, and finally, grab the stats I wanted. I was mainly interested in passing yards, rushing yards, and touchdowns.
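def extract_player_stats(data):
    player_stats = []
    for team in data["teams"]:
        for player in team["players"]:
            stats = player["stats"]
            player_stats.append({
                "name": player["name"],
                "passing_yards": stats.get("passingYards", 0),  # Using .get() to avoid errors if the stat is missing
                "rushing_yards": stats.get("rushingYards", 0),  # "rushingYards" key name is assumed
                "touchdowns": stats.get("touchdowns", 0),
            })
    return player_stats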
See that .get() thing? That’s there because not all players have all stats. For example, a defensive player won’t have passing yards. Using .get() with a default value (like 0) prevents the code from crashing if a stat is missing. Trust me, it happens.
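Here’s a tiny sketch of the difference, using made-up players (the names and the "tackles" key are just for illustration). It also shows chaining .get() for the case where a player has no "stats" dict at all, which I’m only assuming could happen.

```python
# Made-up defensive player: has stats, but no passing yards.
player = {"name": "Some Linebacker", "stats": {"tackles": 7}}

# Direct indexing raises KeyError when the stat is absent.
try:
    player["stats"]["passingYards"]
except KeyError as err:
    print("missing key:", err)

# .get() hands back the default instead of raising.
passing_yards = player["stats"].get("passingYards", 0)

# Chained .get() also covers a player with no "stats" dict at all.
benched = {"name": "Practice Squad Guy"}
benched_yards = benched.get("stats", {}).get("passingYards", 0)

print(passing_yards, benched_yards)
```

One thing to keep in mind: a default of 0 makes "didn’t throw a pass" and "stat missing from the feed" look the same. If that distinction matters, None might be a better default.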
Putting it All Together
I then called these functions and printed out the results:
game_data = get_game_stats("the_actual_game_id")  # Replace with the actual game ID
print(*extract_player_stats(game_data), sep="\n")  # One player per line
And boom! I had a list of player stats printed neatly in my console. Not the prettiest output, but it worked.
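If you want the output a bit prettier, f-string alignment gets you a passable table. This is only a sketch: it assumes each player ends up as a dict with name, passing_yards, rushing_yards, and touchdowns keys, and the second row is entirely made up (the 0 rushing yards for Purdy is just the missing-stat default, not a real number).

```python
# Sample rows in the assumed shape; "Example Runner" is fictional.
player_stats = [
    {"name": "Brock Purdy", "passing_yards": 283, "rushing_yards": 0, "touchdowns": 1},
    {"name": "Example Runner", "passing_yards": 0, "rushing_yards": 57, "touchdowns": 0},
]

def format_table(rows):
    """Build a fixed-width text table: names left-aligned, numbers right-aligned."""
    header = f"{'Player':<20}{'Pass':>6}{'Rush':>6}{'TD':>4}"
    lines = [header, "-" * len(header)]
    for row in rows:
        lines.append(
            f"{row['name']:<20}{row['passing_yards']:>6}"
            f"{row['rushing_yards']:>6}{row['touchdowns']:>4}"
        )
    return "\n".join(lines)

print(format_table(player_stats))
```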
What I Learned
This whole thing took me about an hour, mostly because I spent time figuring out the JSON structure and handling potential errors. Key takeaways:
Always handle errors when hitting APIs. A flaky endpoint will crash your script (or feed it junk) otherwise.
json.dumps(data, indent=4) is your friend. Use it to understand the JSON.
.get() is a lifesaver when dealing with potentially missing data.
Next steps? I’m thinking of dumping this data into a CSV file or maybe even throwing it into a simple web app. But for now, I’m happy with getting the raw stats. That’s all folks!
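P.S. For the CSV idea, here’s roughly what I have in mind, using the standard library’s csv.DictWriter. It’s a sketch, not something I’ve run against the real data: the row shape (name, passing_yards, rushing_yards, touchdowns) is my assumption, write_stats_csv is a made-up helper name, and it writes to an in-memory buffer so there’s no file to clean up.

```python
import csv
import io

# Sample row in the assumed shape (0 rushing yards is the missing-stat default).
player_stats = [
    {"name": "Brock Purdy", "passing_yards": 283, "rushing_yards": 0, "touchdowns": 1},
]

def write_stats_csv(rows, fileobj):
    """Write one CSV row per player, with a header row first."""
    fieldnames = ["name", "passing_yards", "rushing_yards", "touchdowns"]
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

buffer = io.StringIO()
write_stats_csv(player_stats, buffer)
print(buffer.getvalue())
```

For an actual file, swap the buffer for open("stats.csv", "w", newline="") so the csv module controls the line endings.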