OOTP Developments Forums

FHM9 - General Discussion: Toby's FHM Advanced Analytics Tool is here! (https://forums.ootpdevelopments.com//showthread.php?t=342429)

Gazkyo 12-06-2022 12:57 AM

Quote:

Originally Posted by TobyInHR (Post 4955194)
https://youtu.be/K_HyhVTZv2U

Very excited to finally be able to share this project with the community. For the last 3 and a half months, I have been developing a "companion" spreadsheet to our favorite spreadsheet hockey video game, and I feel like it is finally at a point where it's ready for everyone to access. This video will go over how to set up the Spreadsheet for your own save, and the basics of using it to analyze players in your league.

The heavy lifting is done by a series of linear regression models that provide projected stats for teams and players, including Expected Points for teams, expected goals and goals against for players, and a replacement-level calculation, allowing for stats like WAR, GAR, and SPAR, similar to Evolving Hockey.

All of this is done solely with in-game data, meaning you only need to import a few CSVs, and the stats that are created will be accurate to your save.

Please let me know if you run into any issues. Currently, the Sheet is only designed to work for standard NHL leagues. I do plan to have a version compatible with custom leagues eventually, but the foundation needs to be laid first in order to get to that point. There is also, unfortunately, a limitation of Google Sheets that may present an issue: a spreadsheet cannot exceed 10 million cells. This sounds like a ton, but the reality is that FHM's CSVs export a ton of information for each league you select when starting your save. As a result, if you have every league selected, you will likely exceed the cell limit while importing your data. To fix this, open the CSVs in Excel, and delete information unrelated to your league. This can be a little daunting, so feel free to reach out to me if you need assistance. In the meantime, an awesome community member is working on a lightweight tool to clean up your CSVs, reducing them to just the necessary data. When that is ready, I'll be sure to include a link to it in the spreadsheet.
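
For anyone who would rather script the cleanup than do it by hand in Excel, the idea of keeping only your own league's rows could be sketched in Python roughly like this. It is only an illustration: the column name `League Id`, the league value `0`, and the comma delimiter are all assumptions, so check the header row of your own export before relying on any of them.

```python
import csv
import io


def keep_league_rows(src, dst, league_column, league_value, delimiter=","):
    """Copy only the rows whose league column equals league_value.

    The header row is always kept. Returns the number of data rows written.
    """
    reader = csv.DictReader(src, delimiter=delimiter)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, delimiter=delimiter)
    writer.writeheader()
    kept = 0
    for row in reader:
        if row[league_column] == league_value:
            writer.writerow(row)
            kept += 1
    return kept


# Tiny in-memory stand-in for an export; the column names here are hypothetical.
sample = io.StringIO("PlayerId,League Id,G\n1,0,30\n2,7,12\n3,0,5\n")
cleaned = io.StringIO()
rows_kept = keep_league_rows(sample, cleaned, "League Id", "0")
print(rows_kept)
print(cleaned.getvalue())
```

With real files, the two `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`, one for reading and one for writing.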

If you want to skip the video, go straight to the link in the description to access the Sheet, along with the text instructions on how to set it up.

Enjoy!

EDIT: 12/4/2022 - new version of Spreadsheet (v1.22) available at this link: https://docs.google.com/spreadsheets...it?usp=sharing

This is excellent. Thank you so much for sharing. I have only recently become interested in hockey analytics, so this will be a great resource to play around with.

Colorado 12-06-2022 06:50 PM

Toby, I think your brilliance has secured you a place in the FHM community Hall of Fame.

Love your enthusiasm and look forward to delving into this. Keep up the great work!

zoo05 12-10-2022 04:41 PM

Thank you for your work and sharing!

I'd be very interested in how you use this data in your game.

TobyInHR 12-10-2022 11:50 PM

https://docs.google.com/spreadsheets...it?usp=sharing

Super excited to provide another update to the Advanced Stats tool!

Biggest changes are substantial reworks to GAR and WAR. GAR previously suffered from what I'd call a hard cap on how high it could go. This was because it was calculated by multiplying a player's production per minute by the RL player's TOI. This limited how much better the top players in the league could be than a replacement-level player for a number of reasons. Now, it is the other way around: the RL player's production per minute is multiplied by the actual player's TOI, allowing for much broader ranges on GAR.
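
To make the difference concrete, here is a minimal sketch of the two approaches in Python. The post does not give the exact formulas, so treating "production" as goals and subtracting the replacement-level (RL) total are assumptions made only for illustration, and the numbers are made up.

```python
def gar_old(player_goals, player_toi, rl_goals, rl_toi):
    """Old approach: scale the player's per-minute rate to the RL player's TOI.

    Assumed form: (player rate * RL TOI) - RL goals.
    """
    player_rate = player_goals / player_toi
    return player_rate * rl_toi - rl_goals


def gar_new(player_goals, player_toi, rl_goals, rl_toi):
    """New approach: scale the RL per-minute rate to the player's own TOI.

    Assumed form: player goals - (RL rate * player TOI).
    """
    rl_rate = rl_goals / rl_toi
    return player_goals - rl_rate * player_toi


# Made-up star player with twice the ice time of the replacement-level player.
star = dict(player_goals=40, player_toi=1600, rl_goals=8, rl_toi=800)
print(gar_old(**star), gar_new(**star))
```

Under these assumptions the old calculation gives the star roughly 12 goals above replacement and the new one roughly 24, because his extra ice time is no longer thrown away.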

This leads into the improvements to WAR, as well -- previously, replacement-level wins were calculated on their own, due to how I was factoring in defensive metrics. Now, defensive production has been converted to what amounts to "goals allowed above replacement." This restructure translates offense and defense into goals, a single variable, which allows me to use a simple formula for WAR, which is (GAR-GAAR)/[Goals to Wins coefficient]. The result is a much broader range for WAR, from roughly -3.0 to +7.0, with the concentration of players mostly around 0.0, as it should be.
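
The WAR formula above is simple enough to write out directly. In this sketch the goals-to-wins coefficient of 6.0 is a placeholder, not the value used in the spreadsheet, which the post does not state.

```python
def war(gar, gaar, goals_per_win):
    """WAR = (GAR - GAAR) / goals-to-wins coefficient."""
    if goals_per_win <= 0:
        raise ValueError("goals_per_win must be positive")
    return (gar - gaar) / goals_per_win


GOALS_PER_WIN = 6.0  # placeholder coefficient, an assumption for illustration

print(war(gar=24.0, gaar=6.0, goals_per_win=GOALS_PER_WIN))   # 3.0
print(war(gar=2.0, gaar=5.0, goals_per_win=GOALS_PER_WIN))    # -0.5
```

Because GAAR counts goals allowed, a higher GAAR pulls WAR down, which is why it is subtracted.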

You'll notice that the leaders/trailers dropdowns are a little mismatched at the moment -- I plan to update those so that they can be used to view all the advanced stats in the sheet, but it's a pain in the butt to filter that information when I'm changing columns around, so they are a little messy right now. They'll be cleaned up soon.

TobyInHR 12-10-2022 11:55 PM

Quote:

Originally Posted by zoo05 (Post 4957724)
Thank you for your work and sharing!

I'd be very interested in how you use this data in your game.

Glad you like it!

I plan to do a new video soon, going over a few big changes, but also taking the time to walk through my process of using the sheet in my game. Going over things like how to assess your own players, whether things point to needing to shuffle your lines or trade someone away, how to evaluate players on other teams, and how I start looking for replacements when it's time to move on from someone.

Ideally, I'd like to get to a point in my current save where my team needs some changes. Currently, I'm coming off a Cup appearance, and my lines have stayed pretty hot through the deadline, so I'm not really in a position to make moves that would show off how helpful the sheet can be, but as soon as that time comes, I'll be sure to get a video posted.

zoo05 12-11-2022 02:20 PM

Quote:

Originally Posted by TobyInHR (Post 4957809)
Glad you like it!

I plan to do a new video soon, going over a few big changes, but also taking the time to walk through my process of using the sheet in my game. Going over things like how to assess your own players, whether things point to needing to shuffle your lines or trade someone away, how to evaluate players on other teams, and how I start looking for replacements when it's time to move on from someone.

Ideally, I'd like to get to a point in my current save where my team needs some changes. Currently, I'm coming off a Cup appearance, and my lines have stayed pretty hot through the deadline, so I'm not really in a position to make moves that would show off how helpful the sheet can be, but as soon as that time comes, I'll be sure to get a video posted.

This sounds great, exactly what I need! I'm new to hockey management and to advanced stats as well, so it would be a great help! I'm waiting patiently till that comes out!

I have one question though: I'm still playing FHM 8. Does this affect the numbers in any way?

TobyInHR 12-13-2022 03:48 PM

Quote:

Originally Posted by zoo05 (Post 4957929)
This sounds great, exactly what I need! I'm new to hockey management and to advanced stats as well, so it would be a great help! I'm waiting patiently till that comes out!

I have one question though: I'm still playing FHM 8. Does this affect the numbers in any way?

https://youtu.be/1MM2KIk1XE0

Here ya go, boss! This video covers my process of analyzing my team in the offseason. It should give you some idea of which metrics I'm relying on most, what they mean, and how they should influence how you value players on your team. It also goes over how I find players that can fill certain spots where I need to improve.

Advanced stats are super cool, but never forget that stats don't win games -- they just help to provide additional context for what makes good players good, and bad players bad.

Regarding your question -- you're totally fine using this with FHM 8. The CSVs that are required should be identical to what is produced in FHM 8. I actually began working on this spreadsheet before 9 came out, so it was built with 8's resources in mind!

tomkmb 12-13-2022 04:02 PM

Quote:

Originally Posted by TobyInHR (Post 4958438)
https://youtu.be/1MM2KIk1XE0

Here ya go, boss! This video covers my process of analyzing my team in the offseason. It should give you some idea of which metrics I'm relying on most, what they mean, and how they should influence how you value players on your team. It also goes over how I find players that can fill certain spots where I need to improve.

Advanced stats are super cool, but never forget that stats don't win games -- they just help to provide additional context for what makes good players good, and bad players bad.

Regarding your question -- you're totally fine using this with FHM 8. The CSVs that are required should be identical to what is produced in FHM 8. I actually began working on this spreadsheet before 9 came out, so it was built with 8's resources in mind!

These contributions are so valuable to the community, thanks for taking the time to make these resources and explaining them all to us!

artlan 12-14-2022 09:31 PM

Amazing stuff, thank you so much!

JeffR 12-15-2022 12:07 PM

Next week's update should add the following to the game-by-game player stats csv (pending testing):

Team Shots on
Team Shots Against on
Team Shots Missed on
Team Shots Missed Against on
Team Shots Blocked on
Team Shots Blocked Against on
Team Goals on
Team Goal Against on
Team Shots off
Team Shots Against off
Team Shots Missed off
Team Shots Missed Against off
Team Shots Blocked off
Team Shots Blocked Against off
Team Goals off
Team Goal Against off

And:

OZ Starts
NZ Starts
DZ Starts
Team OZ Starts
Team NZ Starts
Team DZ Starts

All will be tracked for both classic and 2D engines.

I'll PM you as soon as we've got all of those in a working build (probably today or tomorrow) and give you the code to access the Steam beta builds so you can get a head start on adding the new stuff.

zoo05 12-15-2022 03:08 PM

Quote:

Originally Posted by TobyInHR (Post 4958438)
https://youtu.be/1MM2KIk1XE0

Here ya go, boss! This video covers my process of analyzing my team in the offseason. It should give you some idea of which metrics I'm relying on most, what they mean, and how they should influence how you value players on your team. It also goes over how I find players that can fill certain spots where I need to improve.

Advanced stats are super cool, but never forget that stats don't win games -- they just help to provide additional context for what makes good players good, and bad players bad.

Regarding your question -- you're totally fine using this with FHM 8. The CSVs that are required should be identical to what is produced in FHM 8. I actually began working on this spreadsheet before 9 came out, so it was built with 8's resources in mind!

I'm speechless, thank you for your effort! You shed light on a lot of things for me, esp. with the defensive forwards. I'm in a situation similar to yours, replacing a third-line checking forward, and I wasn't sure what to look for. Thank you again!

TobyInHR 12-16-2022 12:25 AM

Quote:

Originally Posted by JeffR (Post 4958784)
Next week's update should add the following to the game-by-game player stats csv (pending testing):

Team Shots on
Team Shots Against on
Team Shots Missed on
Team Shots Missed Against on
Team Shots Blocked on
Team Shots Blocked Against on
Team Goals on
Team Goal Against on
Team Shots off
Team Shots Against off
Team Shots Missed off
Team Shots Missed Against off
Team Shots Blocked off
Team Shots Blocked Against off
Team Goals off
Team Goal Against off

And:

OZ Starts
NZ Starts
DZ Starts
Team OZ Starts
Team NZ Starts
Team DZ Starts

All will be tracked for both classic and 2D engines.

I'll PM you as soon as we've got all of those in a working build (probably today or tomorrow) and give you the code to access the Steam beta builds so you can get a head start on adding the new stuff.

Amazing, this is absolutely incredible stuff Jeff! Thank you!

Just in time, as well, as I’ve been experimenting with R for my data analysis, and pumping out CSVs that can be easily imported to Sheets. End goal is to have a web-based app, similar to the FHM CSV reader, to clean the sheets and make them small and easily accessible for Sheets, hopefully minimizing the lag that is currently becoming an issue 10 years into my save haha.

zoo05 12-18-2022 03:45 PM

Quote:

Originally Posted by TobyInHR (Post 4958438)
https://youtu.be/1MM2KIk1XE0

Regarding your question -- you're totally fine using this with FHM 8. The CSVs that are required should be identical to what is produced in FHM 8. I actually began working on this spreadsheet before 9 came out, so it was built with 8's resources in mind!

I think there are some differences between the CSV files of FHM 8 and 9, because for me the numbers of the table "Team trailers GAX" were missing, and the "Team leaders WAR" table showed the xSPAR data, the "Team leaders WAX" showed the SPAX data. It was in all cases one column difference on the ActiveRosterStats sheet. But after seeing your second video I was able to assign the correct columns to the tables, so everything is working now!
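
A one-column shift like this is the usual failure mode when a table points at a fixed column position. Looking the stat up by its header name avoids it. The Python sketch below only illustrates that idea; the headers `Player`, `WAR`, `xSPAR`, and `Extra` are stand-ins, not the real ActiveRosterStats layout.

```python
import csv
import io


def read_stat(csv_text, stat_name, key_column="Player"):
    """Return {player: value} for one stat, located by header name, not position."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if stat_name not in (reader.fieldnames or []):
        raise KeyError(f"column {stat_name!r} not found in header row")
    return {row[key_column]: float(row[stat_name]) for row in reader}


# Two hypothetical layouts: the second has one extra column before WAR.
layout_a = "Player,WAR,xSPAR\nSmith,2.5,1.0\n"
layout_b = "Player,Extra,WAR,xSPAR\nSmith,x,2.5,1.0\n"

print(read_stat(layout_a, "WAR"))
print(read_stat(layout_b, "WAR"))
```

Both calls return Smith's WAR even though the column sits in a different position in each layout.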

yorkie_17 12-24-2022 09:52 AM

Question
 
This is fantastic work Toby! I just have a couple of questions. If expected goals are based on shot type etc., and the game only records shots in general, can it be misleading? A player with 3 shots from the blue line would have higher expected goals than one with 2 shots from the crease/slot. And how would this affect different types of players? Playmakers would have fewer shots and so fewer expected goals, but in real life individual impacts on expected goals are accounted for. Is this just a current limitation of the CSV files? This is a heck of a project and extremely well done!

TobyInHR 12-28-2022 09:59 AM

Quote:

Originally Posted by yorkie_17 (Post 4960834)
This is fantastic work Toby! I just have a couple of questions. If expected goals are based on shot type etc., and the game only records shots in general, can it be misleading? A player with 3 shots from the blue line would have higher expected goals than one with 2 shots from the crease/slot. And how would this affect different types of players? Playmakers would have fewer shots and so fewer expected goals, but in real life individual impacts on expected goals are accounted for. Is this just a current limitation of the CSV files? This is a heck of a project and extremely well done!

This is an awesome question, and I'm glad someone asked it because now I can geek out over some stats quick lol.

You're correct, real-life NHL stats that quantify xG are influenced by RTSS data available during every game of the season, which provides the on-ice location for every player when a shot occurs, the shot angle, speed, etc. Unfortunately, that information is not available in FHM's current form. However, we *are* given a number of other metrics, some of which are more helpful than others.

The current xG model in my spreadsheet uses linear regression to find a linear relationship between a player's even-strength goals and his TOI, Corsi for, and shots on goal. Simply put, the more time a player spends on the ice, the more shot attempts his team generates, and the more shots he personally puts on net, the higher his expected even-strength goals.

In its current form, my model has an R-squared value of around 79% for forwards and 65% for defensemen, meaning the model explains roughly 79% and 65% of the variation in even-strength goals for those positions (in other words, there is a clear linear relationship between these things), and the standard error is usually around 4 goals.
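
For readers who want to see what a fit like this involves, here is a small self-contained sketch of ordinary least squares with three predictors plus an R-squared calculation. It is not the spreadsheet's actual implementation, and the six rows of player data are made-up toy numbers chosen only so the code runs.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Model output for one player: intercept plus slope * predictor for each stat."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def r_squared(y, y_hat):
    """Share of the variance in y that the predictions account for."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Made-up toy data: (TOI in minutes, Corsi for, shots on goal) per player.
rows = [(1200, 900, 180), (1000, 700, 150), (800, 650, 90),
        (600, 400, 70), (400, 300, 30), (1100, 950, 200)]
evg = [22, 18, 9, 7, 3, 27]  # made-up even-strength goals

coef = fit_ols(rows, evg)
xg = [predict(coef, r) for r in rows]
r2 = r_squared(evg, xg)
print(coef)
print(r2)
```

An R-squared of 0.79 from `r_squared` would correspond to the 79% figure quoted above: the fraction of the spread in actual goals that the fitted line accounts for.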

This, on its own, provides us with results that are surprisingly accurate, but I know it can be better. Which is why I'm currently working on a new model that uses the awesome new data that was added in the last update. The new model actually runs two regressions for each position (FWD and DEF). The first regression finds the linear relationship between a player's EVG and his individual corsi for (now that we have missed shots included in the stats, we can find a player's own shot attempts). That simple linear regression usually outputs an R-squared value between 80% and 90% for both positions.

The second regression finds the relationship between a player's iCF and their hits, plus their giveaways, plus their time on ice, plus their offensive and defensive zone start percentage. This model also results in an R-sq value between 80% and 90%. This essentially gives us an expected corsi for each player, which is then used in the first regression to provide us with an expected goals value. The end result is a much more accurate xG on both ends of the spectrum: players who don't have much TOI aren't as heavily penalized, while players who score a ton of goals receive much better predictions.
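
The chaining of the two regressions can be sketched as follows. Every coefficient below is a placeholder invented for illustration (the real values come from fitting each model to your save's data), and the feature names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    """A fitted linear model: intercept plus one slope per named feature."""
    intercept: float
    coefs: dict

    def predict(self, features):
        return self.intercept + sum(
            slope * features[name] for name, slope in self.coefs.items()
        )


# Second regression: expected iCF from usage stats (placeholder coefficients).
icf_model = LinearModel(
    intercept=5.0,
    coefs={"hits": 0.125, "giveaways": 0.5, "toi": 0.125,
           "oz_start_pct": 0.5, "dz_start_pct": -0.25},
)

# First regression: even-strength goals from iCF (placeholder coefficient).
goal_model = LinearModel(intercept=0.0, coefs={"icf": 0.0625})


def expected_goals(player):
    """Predict iCF from usage, then feed that expected iCF into the goals model."""
    expected_icf = icf_model.predict(player)
    return goal_model.predict({"icf": expected_icf})


# Made-up player line, only to show the call.
player = {"hits": 50, "giveaways": 40, "toi": 1000,
          "oz_start_pct": 40, "dz_start_pct": 20}
print(expected_goals(player))
```

The point of the design is that xG is driven by the shot attempts a player would be expected to generate given his usage, rather than by his raw shot totals.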

For example, Connor McDavid is always someone in my saves who ends up 15-25 goals above expected. With this new model, he is consistently falling somewhere within 5-10 goals above expected. Still an outlier, as he should be, but one that we can still quantify to a certain degree.

The goal is to have this new model available in early 2023. It comes with a host of other benefits as well, since I have switched to using R for my data analysis. Namely, it cleans the spreadsheets for you, resulting in a much faster experience. I just need to figure out the best way to share it with everyone, without requiring every person to download and install RStudio lol.

TobyInHR 12-28-2022 10:24 AM

https://ibb.co/0mBZSCB

Here is a plot of the new model in action. The blue line marks where xG equals EVG, so players close to the line are performing as expected. You can see there is still plenty of variance, which is to be expected -- we can't be perfectly accurate, otherwise we're cheating. But notice the heavy concentration of players near the line from 0 to 10 xG and EVG -- this indicates that we have improved accuracy for players who are not prolific goal scorers. You can also see that a large number of players stick close to the line as they pass 10 xG/EVG. This again shows that we are properly accounting for stats that are important for elite scorers when predicting their xG.

artlan 12-28-2022 11:04 AM

Quote:

Originally Posted by TobyInHR (Post 4961767)
This is an awesome question, and I'm glad someone asked it because now I can geek out over some stats quick lol.

You're correct, real-life NHL stats that quantify xG are influenced by RTSS data available during every game of the season, which provides the on-ice location for every player when a shot occurs, the shot angle, speed, etc. Unfortunately, that information is not available in FHM's current form. However, we *are* given a number of other metrics, some of which are more helpful than others.

The current xG model in my spreadsheet uses linear regression to find a linear relationship between a player's even-strength goals and his TOI, Corsi for, and shots on goal. Simply put, the more time a player spends on the ice, the more shot attempts his team generates, and the more shots he personally puts on net, the higher his expected even-strength goals.

In its current form, my model has an R-squared value of around 79% for forwards and 65% for defensemen, meaning the model explains roughly 79% and 65% of the variation in even-strength goals for those positions (in other words, there is a clear linear relationship between these things), and the standard error is usually around 4 goals.

This, on its own, provides us with results that are surprisingly accurate, but I know it can be better. Which is why I'm currently working on a new model that uses the awesome new data that was added in the last update. The new model actually runs two regressions for each position (FWD and DEF). The first regression finds the linear relationship between a player's EVG and his individual corsi for (now that we have missed shots included in the stats, we can find a player's own shot attempts). That simple linear regression usually outputs an R-squared value between 80% and 90% for both positions.

The second regression finds the relationship between a player's iCF and their hits, plus their giveaways, plus their time on ice, plus their offensive and defensive zone start percentage. This model also results in an R-sq value between 80% and 90%. This essentially gives us an expected corsi for each player, which is then used in the first regression to provide us with an expected goals value. The end result is a much more accurate xG on both ends of the spectrum: players who don't have much TOI aren't as heavily penalized, while players who score a ton of goals receive much better predictions.

For example, Connor McDavid is always someone in my saves who ends up 15-25 goals above expected. With this new model, he is consistently falling somewhere within 5-10 goals above expected. Still an outlier, as he should be, but one that we can still quantify to a certain degree.

The goal is to have this new model available in early 2023. It comes with a host of other benefits as well, since I have switched to using R for my data analysis. Namely, it cleans the spreadsheets for you, resulting in a much faster experience. I just need to figure out the best way to share it with everyone, without requiring every person to download and install RStudio lol.

Looking forward to it, thank you!

TobyInHR 12-28-2022 02:24 PM

Quote:

Originally Posted by yorkie_17 (Post 4960834)
This is fantastic work Toby! I just have a couple of questions. If expected goals are based on shot type etc., and the game only records shots in general, can it be misleading? A player with 3 shots from the blue line would have higher expected goals than one with 2 shots from the crease/slot. And how would this affect different types of players? Playmakers would have fewer shots and so fewer expected goals, but in real life individual impacts on expected goals are accounted for. Is this just a current limitation of the CSV files? This is a heck of a project and extremely well done!

I have been thinking about your question at work and wanted to add one more thing: it’s important to keep in mind that xG is not a prediction (at this time). It’s the number of shots that are expected to be goals, based on league-wide numbers. So you are correct, a playmaker will have fewer shots, and thus, fewer xG. But he will also score fewer goals, so his xG measured against his actual goals is still accurate. However, the WAR model in its current form definitely punishes players who don’t score goals because it is a combination of goals scored above replacement and goals against above replacement, converted to wins.

This is a difficult thing to overcome. In real life advanced stats, it’s not necessarily clear how we value a player who gets a ton of assists because whether a pass results in a goal is almost entirely dependent on the goal scorer, not the passer. And, in the end, goals create wins, not assists, so whether a pass led to a goal is almost never considered in most metrics. But that doesn’t mean players who set up goal scorers have zero offensive value!

Evolving Hockey has made some awesome strides in this area with their RAPM and SPM models (regularized adjusted plus minus and standardized plus minus). These metrics seek to quantify each player’s impact on the shot attempts taken while he’s on the ice, convert those to a general “offense” stat, translate the result to a value in goals, and convert *that* to wins.

The end result ensures that players who are prolific passers are not punished as heavily because their presence on the ice increases the shot attempts for their team, but again, whether they passed the puck into a shot attempt is not part of the analysis.

I would love to find a way to do something similar, especially with the new xG model being based on Corsi. I currently have a system that I’m working on that seems promising — essentially looking at a player’s Corsi above replacement differential, and finding the xG of that number. However, I’m finding that the biggest problem with this is that nobody ends up with a WAR that is less than 1.0. This doesn’t make sense, as there should be roughly 23% of players at or below 0 WAR. I haven’t worked out the math yet, but when I do, there should be some improvements to how playmakers are valued by the advanced stats.

vign5440 01-01-2023 07:27 AM

This is amazing, thanks so much for all your hard work.

I know the current setup isn't really compatible with custom games, but what about standard games with a custom team in them? I.e., I added an expansion team to my franchise mode and I don't see it come up in the Advanced Metrics dropdown. Is there a way I could make that work within the current sheet?

reverist 01-01-2023 08:44 AM

Quote:

Originally Posted by vign5440 (Post 4962768)
This is amazing, thanks so much for all your hard work.

I know the current setup isn't really compatible with custom games, but what about standard games with a custom team in them? I.e., I added an expansion team to my franchise mode and I don't see it come up in the Advanced Metrics dropdown. Is there a way I could make that work within the current sheet?

I've played with this some, and I don't want to mess you up. I do know one thing, though: under the Team Info tab (a hidden tab you should be able to access from the left), you would want to add your team ID, name, and logo. Then under the "Utility Page," you would want to do the same thing. Finally, under "Advanced Metrics," I believe when you select your expansion team the logo will be missing. Clicking where it should be (I think?) reveals the formula. You will want to amend the "search" the formula uses to include the cell under "Team Info" where your team is found (presumably one more cell; e.g., instead of F2:F33, it would be F2:F34). This should cause it to show. I think. I am not an expert.


All times are GMT -4.
