Okay, so today I wanna talk about something I was messing around with – the Andreeva vs. Gracheva tennis match data. It’s not exactly rocket science, but I learned a few things, and figured I’d share.

First off, I grabbed the data. I just searched for “Andreeva vs Gracheva tennis match stats” and found a few sites with the info. I ended up copy-pasting it into a spreadsheet. Messy, I know, but it worked. No fancy APIs or anything this time.

Then I started cleaning it up. The raw data was all over the place. Some columns were missing, some were mislabeled, and the formatting was a nightmare. So I spent a good hour just fixing that stuff. You know, standard data cleaning chores. I used Excel for this, mostly because it’s what I had open.
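
For anyone who'd rather do this kind of cleanup in code than by hand, here's a rough Python sketch of the same chores: tidying up header names, trimming stray whitespace, and turning blank or junk cells into proper missing values. To be clear, this is not what I actually ran (I did it in Excel). The column names, the `HEADER_MAP` lookup, and the "Player A / Player B" numbers are all made-up placeholders, not the real match data.

```python
import csv
import io

# Hypothetical pasted export. Placeholder values only, not real match stats.
RAW = """Player , 1st Serve In,1st serve total,ACES
 Player A ,30, 50 ,3
Player B,28,n/a,
"""

# Hypothetical mapping from messy header text to clean column names.
HEADER_MAP = {
    "player": "player",
    "1st serve in": "first_serves_in",
    "1st serve total": "first_serves_total",
    "aces": "aces",
}


def to_int(value):
    """Return an int, or None for blank or non-numeric cells."""
    value = value.strip()
    return int(value) if value.isdigit() else None


def clean_rows(text):
    """Normalize headers, trim whitespace, and convert stat cells to ints."""
    reader = csv.reader(io.StringIO(text))
    headers = [HEADER_MAP[h.strip().lower()] for h in next(reader)]
    rows = []
    for raw in reader:
        if not any(cell.strip() for cell in raw):
            continue  # skip completely blank lines
        # Pad short rows so every header has a cell.
        raw = raw + [""] * (len(headers) - len(raw))
        row = dict(zip(headers, raw))
        cleaned = {"player": row["player"].strip()}
        for key in headers:
            if key != "player":
                cleaned[key] = to_int(row[key])
        rows.append(cleaned)
    return rows


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Missing or unparseable cells come back as `None` rather than `0`, so a gap in the source data doesn't quietly turn into a fake stat later on.
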

After cleaning, I tried to extract some insights. Like, what were the key differences in their serving stats? How about their return game? I calculated some percentages, like first serve percentage, break point conversion rate, etc. Nothing too complex, just some basic arithmetic.
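
The arithmetic really is basic: first serve percentage is first serves in divided by first serves attempted, and break point conversion is break points won divided by break point chances. Here's a small Python sketch of that. The `pct` helper, the field names, and all the numbers are placeholders I made up for illustration. They are not the figures from the match.

```python
def pct(part, whole):
    """Percentage of part out of whole, or None if either side is unusable."""
    if part is None or not whole:
        return None  # avoid dividing by zero or by a missing value
    return round(100 * part / whole, 1)


# Placeholder numbers for illustration only, not the real match figures.
stats = {
    "Player A": {"first_in": 30, "first_total": 50, "bp_won": 2, "bp_chances": 5},
    "Player B": {"first_in": 28, "first_total": 40, "bp_won": 1, "bp_chances": 4},
}

summary = {
    name: {
        "first_serve_pct": pct(s["first_in"], s["first_total"]),
        "bp_conversion_pct": pct(s["bp_won"], s["bp_chances"]),
    }
    for name, s in stats.items()
}

for name, values in summary.items():
    print(name, values)
```

Returning `None` when there were no break point chances keeps a "0 out of 0" case from showing up as a misleading 0%.
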
I also looked at their head-to-head record, where it was available, to see if there were any patterns. Did one player consistently beat the other? Did home court advantage play a role? That kind of thing.

Here’s a breakdown of what I looked at:

- Serve stats: First serve percentage, ace count, double faults.
- Return stats: Break points won, return points won.
- Overall stats: Total points won, winners, unforced errors.

I also tried visualizing the data. I used a simple bar chart to compare their serve stats side by side, which made it easier to see who had the edge in each area. I did this in Google Sheets. Pretty basic stuff, but it got the job done.
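
If you don't have a spreadsheet handy, you can get a similar side-by-side view with a plain-text bar chart. This is just a sketch using the Python standard library, not what I actually used. The `bar` and `side_by_side` helpers are my own invented names, and the numbers are placeholders rather than the real serve stats.

```python
def bar(value, max_value, width=30):
    """Return a run of '#' scaled so that max_value fills the full width."""
    filled = round(width * value / max_value) if max_value else 0
    return "#" * filled


def side_by_side(stats, width=30):
    """Build one text block per metric, with one bar per player."""
    lines = []
    for metric, by_player in stats.items():
        top = max(by_player.values())  # scale bars within each metric
        lines.append(metric)
        for player, value in by_player.items():
            lines.append(f"  {player:<10} {bar(value, top, width):<{width}} {value}")
    return "\n".join(lines)


# Placeholder numbers for illustration only, not the real match figures.
serve_stats = {
    "First serve %": {"Player A": 60.0, "Player B": 70.0},
    "Aces": {"Player A": 3, "Player B": 1},
    "Double faults": {"Player A": 2, "Player B": 4},
}

chart = side_by_side(serve_stats)
print(chart)
```

Each metric is scaled on its own, so a stat measured in single digits (like aces) doesn't get flattened next to one measured in percentages.
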
Here are some of the tools I used:
- Excel (for data cleaning)
- Google Sheets (for calculations and visualization)

In the end, it wasn’t anything groundbreaking, but it was a fun little project. I got to practice my data cleaning skills and learn a bit about tennis stats along the way.