Alright, so yesterday I was messing around trying to predict the North Carolina vs. Virginia baseball game. Thought I’d jot down how it went down, more for myself than anything, but hey, maybe someone finds it useful.

First things first: Data Gathering.
- Started by scraping some basic team stats. Stuff like batting averages, ERAs, win-loss records for both UNC and UVA. Used Beautiful Soup in Python for this – super handy.
- Then, I dug a little deeper and grabbed individual player stats. Who’s hot right now? Who’s in a slump? That kinda stuff.
- Also looked at recent game results and head-to-head records between the two teams. You know, the usual background check.
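
Here's a minimal sketch of what the scraping step can look like with Beautiful Soup. The HTML snippet, the `team-stats` table id, the column names, and the "Team A"/"Team B" rows are all made up for illustration. A real stats page will have different markup, so the selectors would need adjusting.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a team stats page; real markup will differ.
SAMPLE_HTML = """
<table id="team-stats">
  <tr><th>Team</th><th>AVG</th><th>ERA</th><th>W</th><th>L</th></tr>
  <tr><td>Team A</td><td>.301</td><td>3.85</td><td>30</td><td>12</td></tr>
  <tr><td>Team B</td><td>.289</td><td>3.42</td><td>28</td><td>14</td></tr>
</table>
"""


def parse_team_stats(html):
    """Return one dict per team row, keyed by the table's header cells."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="team-stats")  # hypothetical table id
    rows = table.find_all("tr")
    headers = [th.get_text(strip=True) for th in rows[0].find_all("th")]
    stats = []
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        stats.append(dict(zip(headers, cells)))
    return stats


team_stats = parse_team_stats(SAMPLE_HTML)
```

Every value comes back as a string, so numbers like ERA need a `float(...)` conversion before they go anywhere near a model.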

Next up: Feature Engineering (aka Trying to Make Sense of the Mess).
- Calculated some moving averages for key stats, like runs scored per game, strikeouts per game, etc. Figured that would give me a sense of momentum.
- Created a “home advantage” feature. Is the game at UNC’s stadium? If so, give them a little bump.
- Tried to factor in pitching matchups. Who’s starting for each team? What are their stats like against similar opponents? This was tricky and involved a lot of guesswork.
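
The moving-average and home-advantage features can be sketched with pandas roughly like this. The game log, the column names, and the three-game window are assumptions for illustration, not real data. The `shift(1)` is a design choice worth calling out: it makes each row's average use only earlier games, so the game being predicted doesn't leak into its own features.

```python
import pandas as pd

# Made-up game log for one team, oldest game first.
games = pd.DataFrame({
    "runs_scored": [4, 7, 2, 5, 9, 3],
    "strikeouts": [8, 6, 11, 7, 5, 9],
    "is_home": [True, False, True, True, False, True],
})

WINDOW = 3  # assumed window size; worth tuning

# Rolling mean over the previous WINDOW games only (shift(1) excludes the current game).
for col in ["runs_scored", "strikeouts"]:
    games[f"{col}_avg{WINDOW}"] = games[col].shift(1).rolling(WINDOW).mean()

# Home advantage as a simple 0/1 flag.
games["home_advantage"] = games["is_home"].astype(int)
```

The first `WINDOW` rows end up as NaN because there isn't enough history yet. Those rows have to be dropped or filled before training.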

Then came the Model Building (The Fun Part… Supposedly).
- Kept it simple and went with a logistic regression model using scikit-learn. Didn’t want to overcomplicate things.
- Split the data into training and testing sets. Used about 80% for training, 20% for testing.
- Trained the model on the training data. Watched the accuracy slowly creep up.
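
A minimal sketch of the 80/20 split and logistic regression fit in scikit-learn. The feature matrix here is random synthetic data standing in for the engineered features, and the labels are generated from it, so the numbers mean nothing about real baseball. The seeds and the three-feature layout are assumptions too.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 200 "games", 3 features each (purely illustrative).
rng = np.random.default_rng(0)
n_games = 200
X = rng.normal(size=(n_games, 3))
# Synthetic win/loss labels loosely tied to two of the features plus noise.
y = (X[:, 0] + 0.5 * X[:, 2] + rng.normal(size=n_games) > 0).astype(int)

# 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression()
model.fit(X_train, y_train)

# Accuracy on the data the model was trained on (optimistic by construction).
train_accuracy = model.score(X_train, y_train)
```

One caveat with game data: a random split mixes early and late games together. Splitting by date instead (train on earlier games, test on later ones) is closer to how the model would actually be used.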

Time for Predictions! (The Moment of Truth).
- Fed the model the testing data and got my predictions.
- Calculated the accuracy of the model on the testing data. It was… okay. Around 65%, if I remember correctly. Not great, not terrible.
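
To put an accuracy number in context, it helps to compare it with a dumb baseline, such as always picking the more common outcome. A small sketch with made-up labels follows. In practice `y_pred` would come from `model.predict(X_test)`, and the ten outcomes below are invented purely to show the arithmetic.

```python
from collections import Counter

from sklearn.metrics import accuracy_score

# Made-up test labels (1 = win, 0 = loss) and made-up model predictions.
y_test = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

# Fraction of test games the predictions got right.
accuracy = accuracy_score(y_test, y_pred)

# Baseline: always predict whichever outcome is most common in the test labels.
majority_label = Counter(y_test).most_common(1)[0][0]
baseline_accuracy = accuracy_score(y_test, [majority_label] * len(y_test))
```

If the model's accuracy is only a little above the baseline, most of that number is coming from the class balance rather than from the features.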

So, what did the model say?
The model predicted that North Carolina would win by a slim margin.

The Actual Result:
Virginia won. Sigh.

What I learned:
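- Baseball is unpredictable. At around 65% accuracy, the model should miss roughly one game in three anyway, so a single wrong call doesn't prove much.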
- My model was too simplistic. Probably needed to factor in more variables.
- Maybe I should just stick to watching the game and enjoying it, instead of trying to play Nostradamus.
Anyway, that was my attempt at predicting the baseball game. It was a fun little project, even if it didn’t pan out. Maybe I’ll try again next time, armed with a slightly better model and a healthy dose of skepticism.