Alright, let’s dive into my little experiment with predicting Tomas Martin Etcheverry’s performance. It was a bit of a wild ride, but hey, that’s what makes it fun, right?

So, it all started because I was bored, plain and simple. I had seen Etcheverry play a few times and thought, “Why not try and build something to predict his matches?” I figured I’d learn a thing or two along the way. Plus, bragging rights if I got it right, obviously!
First things first, data. I spent ages scraping it from various tennis websites. Match stats, player rankings, court surfaces: the whole shebang. It was messy, let me tell you. Dates in different formats, missing values, websites changing their layouts… It was a pain in the butt. I ended up cleaning it all up with Python and pandas. That alone took a solid week.
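For a rough idea of what that kind of cleanup looks like, here’s a minimal sketch, not the actual script. The rows, the column names, and the list of date formats are all made up for illustration, and the "values at or below 1 are fractions" rule is just an assumed heuristic.

```python
from datetime import datetime

import pandas as pd

# Made-up rows standing in for scraped data from different sites.
raw = pd.DataFrame({
    "date": ["2023-05-29", "29/05/2023", "May 31, 2023", "not a date"],
    "surface": ["Clay", "clay ", None, "HARD"],
    "first_serve_pct": ["61", "0.58", None, "64"],
})

# Assumed set of date formats; a real scrape would likely need more.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def parse_date(text):
    """Try each known format in turn; return NaT if none of them fits."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except (ValueError, TypeError):
            continue
    return pd.NaT


clean = raw.copy()
clean["date"] = pd.to_datetime(clean["date"].map(parse_date))

# Normalize surface labels so "Clay", "clay " and "CLAY" all match.
clean["surface"] = clean["surface"].str.strip().str.lower()

# Non-numeric entries become NaN instead of raising.
pct = pd.to_numeric(clean["first_serve_pct"], errors="coerce")
# Assumed heuristic: values <= 1 are fractions, so rescale them to percentages.
clean["first_serve_pct"] = pct.where(pct > 1, pct * 100)

# Rows without a usable date can't be ordered in time, so drop them.
clean = clean.dropna(subset=["date"]).reset_index(drop=True)
print(clean)
```

Missing surfaces and serve percentages are left as NaN here on purpose; whether to fill or drop them is a judgment call that depends on the model.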
Then came the fun part, at least I thought so at the time: the modeling. I tried a bunch of different things. Started with a simple logistic regression because, well, it’s simple! But the results were… meh. Then I dabbled with some more complex stuff, like Random Forests and Gradient Boosting. Those seemed promising, but still not quite there. My quick scorecard:
- Logistic Regression: Easy to implement, but not great performance.
- Random Forest: Better, but prone to overfitting.
- Gradient Boosting: Promising, but needed lots of tweaking.
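If you want to run that kind of three-way comparison yourself, a minimal sketch with scikit-learn could look like the one below. The data is synthetic (`make_classification`), so the scores say nothing about tennis, and the hyperparameters are untuned defaults I picked for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a match dataset: 300 "matches", 8 numeric features.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

models = {
    # Scaling helps logistic regression converge; the tree models don't need it.
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    # 5-fold cross-validated accuracy, averaged over the folds.
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: {scores[name]:.3f} (+/- {fold_scores.std():.3f})")
```

One caveat: plain k-fold shuffles past and future together. For real match data, a time-ordered split is probably the safer way to compare models.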
The biggest challenge? Feature engineering. What stats actually MATTER when predicting Etcheverry’s matches? Was it his serve percentage? His unforced errors on clay? His head-to-head record against left-handed players? I spent hours trying different combinations, tweaking the models, and pulling my hair out.
I even tried to factor in things like the weather, the crowd support, and whether Etcheverry had just eaten a good empanada (okay, maybe not that last one, but you get the idea!).

One thing I realized early on was that court surface was HUGE. Etcheverry is clearly better on clay than on hard courts, so I had to make sure the model took that into account. I also found that his performance against top-ranked players was a good indicator of his overall form.
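Here’s a hedged sketch of how features like these could be built with pandas. The match rows are invented, the column names are hypothetical, and the top-20 cutoff for "top-ranked" is an assumption, not a number from my project.

```python
import pandas as pd

# Invented match log, oldest first. Not real results.
matches = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-01-10", "2023-02-14", "2023-02-20", "2023-04-18", "2023-05-29"]
    ),
    "surface": ["hard", "clay", "clay", "clay", "hard"],
    "opponent_rank": [12, 85, 7, 40, 5],
    "won": [0, 1, 1, 1, 0],
}).sort_values("date").reset_index(drop=True)

TOP_RANK_CUTOFF = 20  # assumed definition of "top-ranked"

# One column per surface, so the model can learn a clay vs. hard difference.
features = pd.get_dummies(matches["surface"], prefix="surface", dtype=int)

# Flag matches against top-ranked opponents.
features["opp_top_ranked"] = (
    matches["opponent_rank"] <= TOP_RANK_CUTOFF
).astype(int)

# Recent form: win rate over the previous 3 matches.
# shift(1) keeps the current match's own result out of its features.
features["recent_win_rate"] = (
    matches["won"].shift(1).rolling(window=3, min_periods=1).mean()
)

# Win rate against top-ranked opponents, using earlier matches only.
wins_vs_top = matches["won"].where(features["opp_top_ranked"] == 1)
features["prior_win_rate_vs_top"] = (
    wins_vs_top.shift(1).expanding(min_periods=1).mean()
)

print(features)
```

The `shift(1)` calls are the important part: without them, each row would see the result it is supposed to predict, and the model would look far better than it is.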
After weeks of tinkering, I finally settled on a Gradient Boosting model with a bunch of custom features. I trained it on all the historical data I could find and then… the moment of truth! I started using it to predict Etcheverry’s upcoming matches.
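To make that train-then-predict step concrete, here’s a minimal sketch using scikit-learn’s `GradientBoostingClassifier`. Everything in it is assumed for illustration: the three features, the synthetic labels, the 80/20 chronological split, and the hyperparameters. It is not the actual model.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, log_loss

rng = np.random.default_rng(0)
n_matches = 200

# Synthetic features, one row per match in chronological order:
# is_clay (0/1), opp_top_ranked (0/1), recent_win_rate (0..1).
X = np.column_stack([
    rng.integers(0, 2, n_matches),
    rng.integers(0, 2, n_matches),
    rng.random(n_matches),
])

# Synthetic outcomes: clay and good form help, top-ranked opponents hurt.
logit = 1.0 * X[:, 0] - 1.2 * X[:, 1] + 1.5 * (X[:, 2] - 0.5)
y = (rng.random(n_matches) < 1 / (1 + np.exp(-logit))).astype(int)

# Chronological split: train on the oldest 80%, test on the newest 20%.
split = int(n_matches * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

model = GradientBoostingClassifier(
    n_estimators=100, learning_rate=0.05, max_depth=2, random_state=0
)
model.fit(X_train, y_train)

# Column 1 of predict_proba is the probability of class 1 (a win).
test_proba = model.predict_proba(X_test)[:, 1]
test_pred = (test_proba >= 0.5).astype(int)
print("holdout accuracy:", accuracy_score(y_test, test_pred))
print("holdout log loss:", log_loss(y_test, test_proba, labels=[0, 1]))

# Hypothetical next match: on clay, opponent not top-ranked, decent form.
next_match = np.array([[1, 0, 0.67]])
win_probability = model.predict_proba(next_match)[0, 1]
print(f"predicted win probability: {win_probability:.2f}")
```

Looking at the probability rather than just the win/loss call is worth the extra line: a 55% pick that loses is a very different kind of miss from a 95% pick that loses.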
The result? Let’s just say it wasn’t perfect. I got some predictions right, and some hilariously wrong. But that’s the name of the game, right? It was a good learning experience. I learned a ton about data science, machine learning, and the surprisingly complex world of professional tennis.
Would I do it again? Probably. Maybe with more data, better features, and a bit more luck. But hey, it was a fun project, and I got to tell you all about it! So, yeah, that was my wild ride trying to predict Tomas Martin Etcheverry’s matches. What a blast!