Toby's FHM Advanced Analytics Tool is here!
https://youtu.be/K_HyhVTZv2U
Very excited to finally be able to share this project with the community. For the last 3 and a half months, I have been developing a "companion" spreadsheet to our favorite spreadsheet hockey video game, and I feel like it is finally at a point where it's ready for everyone to access. This video will go over how to set up the Spreadsheet for your own save, and the basics of using it to analyze players in your league.
The heavy lifting is done by a series of linear regression models to provide projected stats for teams and players, including Expected Points for teams, expected goals and goals against for players, and a replacement-level calculation, allowing for stats like WAR, GAR, and SPAR, similar to Evolving Hockey. All of this is done solely with in-game data, meaning you only need to import a few CSVs, and the stats that are created will be accurate to your save. Please let me know if you run into any issues.
Currently, the Sheet is only designed to work for standard NHL leagues. I do plan to have a version compatible with custom leagues eventually, but the foundation needs to be laid first in order to get to that point.
There is also, unfortunately, a limitation to Google Sheets that may present an issue: projects cannot exceed 10 million cells. This sounds like a ton, but the reality is that FHM CSVs export a ton of information for each league you select when starting your save. As a result, if you have every league selected, you will likely exceed the cell limit while importing your data. To fix this, open the CSVs in Excel, and delete information unrelated to your league. This can be a little daunting, so feel free to reach out to me if you need assistance. In the meantime, an awesome community member is working on a lightweight tool to clean up your CSVs, reducing them to just the necessary data. When that is ready, I'll be sure to include a link to it in the spreadsheet.
If you want to skip the video, go straight to the link in the description to access the Sheet, along with the text instructions on how to set it up. Enjoy!
EDIT: 12/11/2022 - new version of Spreadsheet (v1.23) available at this link: https://docs.google.com/spreadsheets...it?usp=sharing
Features in v1.23: rebalancing of GAR and WAR models to provide much more accurate statistics for replacement-level comparisons, and ensure there is a higher concentration of players around 0.0 GAR/WAR, rather than 0.0 being a floor for league players.
EDIT: 12/13/2022 - https://youtu.be/1MM2KIk1XE0
A few people have requested this video, and I think it's a great idea. This covers my own personal process for looking at how my players did based on their advanced stats in the offseason, and how I go about narrowing the search for potential trade targets to improve my roster. This specific video will show my thought process heading into the draft, when the free agency pool and trade block have the most options available. In the future, I'll do a similar video when my team is having a difficult time being competitive in the regular season, and how I assess the problem and look to solve it.
EDIT: 1/4/2023 - Major updates to my advanced stat tracker! A new webapp is available to help clean your CSVs prior to loading them, which entirely eliminates the slow loading and occasional freezing of the current Sheet. This is thanks to my switch to using R as my data manipulation program, rather than sheets, which has also provided me with a collection of tools for much more effective modeling for xG, GAR, and WAR! Now, a script will do all the calculations, and export a few CSVs that can be put into Sheets, taking the labor off of Google and improving performance.
Second, I have made the necessary changes to allow for expansion teams to be added! Instructions are in the video. Thanks to the FHM team for the most recent update, which added in a bunch of crazy useful stats!
Please use this video to see how to use the new tools, as well as how to add expansion teams. For those interested, I also cover my modeling process later in the video. https://youtu.be/1TSe41OWWwI |
Tremendous work, generous undertaking. I was just thinking today how lacking the game is in stats that give you actionable feedback, particularly as it pertains to roster and lineup construction decisions and the actual impact of the game engine systems. Out of curiosity, do you know if the data that can be pulled is tagged by either (1) team/unit/player tactic or (2) player tactical role? (To be clear, I am NOT requesting that you add anything to your work; I'm just wondering if that data is available for export.)
Once more, a sincere Thank You. |
Unfortunately, I don't think there's a way to see tactics in the exported data. That is a great idea though, and could definitely be helpful.
My biggest hope is that this project gains enough traction to warrant an update to the CSVs -- ideally, we could eventually get stats for zone starts (which are in the game, but are not in the spreadsheets), shot location (even just high-danger, low danger), individual Corsi numbers, hits against, and nearly everything else that the real NHL tracks with RTSS data. If that is possible, then the advanced metrics could 100% mirror real life. |
Quote:
|
Thanks for doing this - when I get back from vacation I'll take a closer look and we'll see what extra stuff we can add to the csv's - adding more things to them is on the to-do list, but there's not much definition of exactly what we will/can add yet.
|
Something I'd like to see added to the csv is a column containing the team a player has played for in that particular game in the BoxScore csv.
I've made a rudimentary BoxScore creator and if a player moves to another team at some point during the season the BoxScore will show his current team instead of the team he played for that game. |
This is really fantastic, thanks for taking the time to create and share it.
|
Enjoy your vacation Jeff!
The biggest additions to what the CSV exports would be:
- Simple shot location data: to make it simple, just seeing stats for high-danger shot attempts and low-danger shot attempts would work (that is, shot attempts from the slot, and from outside the slot).
- Types of goals: not sure if this is tracked in the game data, but whether a goal was scored off a rebound, deflection, or clean.
- Individual shot attempts/shot attempts against: this one is major. We get Corsi For, which is awesome, but having the data for each player would allow for Regularized Adjusted Plus Minus calculations, which is my *dream* lol.
- Zone Starts: this data is in the game, so I'm not sure why it doesn't show up in the CSVs. But it's an important metric when weighting CF and CA, since players with more DZ starts will likely have an inflated CA, and vice versa.
- Hits against: how many times a player gets hit provides insight into his possession skills, and possession is fundamental to WAR.
I won't pretend to know how easy or difficult it is to add each of these, but it at least provides an idea of what might be out there that isn't currently included in the CSVs.
Also, an option to limit the CSV data to the league that contains a player-controlled team would be massive, assuming that would be easier than creating a new interface to select which leagues are and are not exported; instead, there are just two options: "Export World" and "Export League." Especially if the league option does not export retired players haha.
Please hit me up if you have any questions, suggestions, or information that may be of use! I have no plans to stop working on this any time soon, and the more big brains working on it together, the better. :) |
Quote:
|
Sounds interesting. I'm doing a custom game right now, so this won't work.
But if I decide to do a standard NHL one in the near future, is it best to use it in Google Sheets, or to export and download it as an Excel file? I have always had Excel, so I don't know much about Google Sheets. |
Quote:
I think most of those requests will be doable, I'll ask Sebastian to have a look and see what issues there might be with recording the ones we don't do now. I know we'll have to check a couple of things to make sure they can be handled by both the 2D and classic engines, e.g. deflected shots are explicitly handled in 2D, but I think there may be a level of abstraction in the old engine that might make them difficult to record. Could be wrong, though. |
Quote:
|
Quote:
You are a savior. OOTPB is barebone stats for what is supposed to be a stats-based game. I hope they recognize your talent and incorporate your product into the game. |
Quote:
OMG... this is better than expected and MY WISH LIST!!! FHM takes its fan base seriously, while OOTPB is still stuck with 20-year-old reports and no intent on improving. Show them how it is done, FHM! |
Quote:
Regarding custom leagues: currently, the sheet is not set up for custom leagues; however, it can certainly be done, sometimes with very minimal editing -- it all just depends on how your TeamIDs are set up. TeamID is used to isolate the relevant league to the NHLStats tab. By default, NHL teams are IDs 0-35, 4838 (SEA), and 4182 (VGK). This means I can use a filter formula to pull stats where TeamID<36, then have a separate tab where I pull stats where TeamID = 4838 or 4182. These tabs are necessary because they allow me to find replacement level players for a specific league without relying on hundreds of thousands of "If-then" functions, which result in *significant* lag. The Sheet can be adapted to custom leagues as long as you are able to easily isolate your teams by ID. Once things are stable with the base sheet, I'd like to work with some folks who do custom leagues so that I can see how the CSVs tend to look, then make a version of the Sheet that is easy to set up for the non-standard saves. |
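For anyone who would rather trim a CSV outside of Sheets, the TeamID filter described above can be sketched in a few lines of Python. This is only an illustration, not the Sheet's actual formula: the column name `TeamId` and the sample rows are made up, so check the header of your own export before using anything like it.

```python
import csv
import io

# Default NHL team IDs as described in the post: 0-35, plus 4838 (SEA) and 4182 (VGK).
NHL_TEAM_IDS = frozenset(range(36)) | {4838, 4182}


def filter_league_rows(csv_text, team_ids=NHL_TEAM_IDS, id_column="TeamId"):
    """Return only the rows whose team ID belongs to the chosen league.

    `id_column` is a hypothetical header name; real FHM exports may differ.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        try:
            team_id = int(row[id_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with a missing or non-numeric team ID
        if team_id in team_ids:
            kept.append(row)
    return kept


# Made-up sample data: player 3 belongs to a non-NHL team (ID 900).
sample = """PlayerId,TeamId,G
1,12,20
2,4838,15
3,900,30
4,4182,8
"""

nhl_rows = filter_league_rows(sample)
print([row["PlayerId"] for row in nhl_rows])
```

For a custom league, the same function can be called with a different `team_ids` set, which is the "isolate your teams by ID" idea from the post.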
Quote:
But I'd be more than willing to be a guinea pig when it comes to custom leagues. |
I have a quick question: I know you said wait until you have played 20 games, but I went ahead and did it after my team had done six games. I figured the top line guys would have 60 minutes by now. But on the expected goals and similar, I get the "VALUE" message. Is it because it needs to be 60 even strength minutes? Scratch that. I see their TOI is literally 58-59 mins. My bad. Awesome tool, I love it so far!
|
Quote:
|
Quote:
|
This is so cool! Thanks so much for your hard work!
|
Quote:
|
Toby, I think your brilliance has secured you a place in the FHM community Hall of Fame.
Love your enthusiasm and look forward to delving into this. Keep up the great work! |
Thank you for your work and sharing!
I'd be very interested in how you use these data in your game. |
https://docs.google.com/spreadsheets...it?usp=sharing
Super excited to provide another update to the Advanced Stats tool! Biggest changes are substantial reworks to GAR and WAR.
GAR previously suffered from what I'd call a hard cap on how high it could go. This was because it was calculated by multiplying a player's production per minute by the RL player's TOI. This limited how much better the top players in the league could be than a replacement-level player for a number of reasons. Now, it is the other way around: the RL player's production per minute is multiplied by the actual player's TOI, allowing for much broader ranges on GAR.
This leads into the improvements to WAR, as well -- previously, replacement-level wins were calculated on their own, due to how I was factoring in defensive metrics. Now, defensive production has been converted to what amounts to "goals allowed above replacement" (GAAR). This restructure translates offense and defense into goals, a single variable, which allows me to use a simple formula for WAR: (GAR - GAAR) / [Goals-to-Wins coefficient]. The result is a much broader range for WAR, from roughly -3.0 to +7.0, with the concentration of players mostly around 0.0, as it should be.
You'll notice that the leaders/trailers dropdowns are a little mismatched at the moment -- I plan to update those so that they can be used to view all the advanced stats in the sheet, but it's a pain in the butt to filter that information when I'm changing columns around, so they are a little messy right now. They'll be cleaned up soon. |
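To make the arithmetic in the update above concrete, here is a rough sketch in Python. The WAR formula is the one stated in the post; the exact form of the GAR and GAAR calculations is an assumption (player total minus the replacement-level rate scaled to the player's TOI), and the goals-to-wins coefficient of 6.0 and all player numbers are placeholders, not values from the Sheet.

```python
def above_replacement(player_total, player_toi, rl_rate_per_min):
    """Player total minus what a replacement-level (RL) player would produce
    in the same ice time. Assumed form: the RL per-minute rate is multiplied
    by the actual player's TOI, as the post describes."""
    return player_total - rl_rate_per_min * player_toi


def wins_above_replacement(gar, gaar, goals_per_win):
    """WAR = (GAR - GAAR) / goals-to-wins coefficient, per the post."""
    if goals_per_win <= 0:
        raise ValueError("goals_per_win must be positive")
    return (gar - gaar) / goals_per_win


# Placeholder inputs for one skater (all numbers invented for illustration).
toi_minutes = 1200.0
gar = above_replacement(player_total=30.0, player_toi=toi_minutes,
                        rl_rate_per_min=0.01)    # goals scored vs. RL
gaar = above_replacement(player_total=45.0, player_toi=toi_minutes,
                         rl_rate_per_min=0.04)   # goals allowed vs. RL
war = wins_above_replacement(gar, gaar, goals_per_win=6.0)
print(round(gar, 2), round(gaar, 2), round(war, 2))
```

A negative GAAR means the player allowed fewer goals than a replacement-level player would have in the same minutes, so subtracting it raises WAR.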
Quote:
I plan to do a new video soon, going over a few big changes, but also taking the time to walk through my process of using the sheet in my game. Going over things like how to assess your own players, whether things point to needing to shuffle your lines or trade someone away, how to evaluate players on other teams, and how I start looking for replacements when it's time to move on from someone. Ideally, I'd like to get to a point in my current save where my team needs some changes. Currently, I'm coming off a Cup appearance, and my lines have stayed pretty hot through the deadline, so I'm not really in a position to make moves that would show off how helpful the sheet can be, but as soon as that time comes, I'll be sure to get a video posted. |
Quote:
I have one question though: I'm still playing FHM 8. Does this affect the numbers in any way? |
Quote:
Here ya go, boss! This video covers my process of analyzing my team in the offseason. It should give you some idea of which metrics I'm relying on most, what they mean, and how they should influence how you value players on your team. It also goes over how I find players that can fill certain spots where I need to improve. Advanced stats are super cool, but never forget that stats don't win games -- they just help to provide additional context for what makes good players good, and bad players bad. Regarding your question -- you're totally fine using this with FHM 8. The CSVs that are required should be identical to what is produced in FHM 8. I actually began working on this spreadsheet before 9 came out, so it was built with 8's resources in mind! |
Quote:
|
Amazing stuff, thank you so much!
|
Next week's update should add the following to the game-by-game player stats csv (pending testing):
- Team Shots on
- Team Shots Against on
- Team Shots Missed on
- Team Shots Missed Against on
- Team Shots Blocked on
- Team Shots Blocked Against on
- Team Goals on
- Team Goal Against on
- Team Shots off
- Team Shots Against off
- Team Shots Missed off
- Team Shots Missed Against off
- Team Shots Blocked off
- Team Shots Blocked Against off
- Team Goals off
- Team Goal Against off
And:
- OZ Starts
- NZ Starts
- DZ Starts
- Team OZ Starts
- Team NZ Starts
- Team DZ Starts
All will be tracked for both classic and 2D engines. I'll PM you as soon as we've got all of those in a working build (probably today or tomorrow) and give you the code to access the Steam beta builds so you can get a head start on adding the new stuff. |
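As a rough idea of what the new columns make possible, the sketch below derives on-ice Corsi and zone-start percentages in Python. Everything here is an assumption for illustration: the dictionary keys are invented stand-ins for the stats listed above, "Team Shots Blocked on" is read as the team's own attempts that were blocked, and zone-start percentages are taken over all three zones.

```python
def on_ice_corsi(stats):
    """Return (CF, CA, CF%) from on-ice shot attempt counts.

    Corsi counts every shot attempt: shots on goal + missed + blocked.
    """
    cf = stats["shots_on"] + stats["missed_on"] + stats["blocked_on"]
    ca = (stats["shots_against_on"] + stats["missed_against_on"]
          + stats["blocked_against_on"])
    total = cf + ca
    cf_pct = cf / total if total else 0.0
    return cf, ca, cf_pct


def zone_start_pcts(oz, nz, dz):
    """Return (OZ%, DZ%) as shares of all zone starts (assumed definition)."""
    total = oz + nz + dz
    if total == 0:
        return 0.0, 0.0
    return oz / total, dz / total


# Invented season totals for one player.
player = {
    "shots_on": 400, "missed_on": 150, "blocked_on": 120,
    "shots_against_on": 350, "missed_against_on": 140,
    "blocked_against_on": 110,
}
cf, ca, cf_pct = on_ice_corsi(player)
oz_pct, dz_pct = zone_start_pcts(oz=300, nz=200, dz=500)
print(cf, ca, round(cf_pct, 3), oz_pct, dz_pct)
```

A player like this one, with half of his starts in the defensive zone, is the case mentioned earlier in the thread where CA is likely inflated by deployment rather than by poor play.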
Quote:
|
Quote:
Just in time, as well, as I’ve been experimenting with R for my data analysis, and pumping out CSVs that can be easily imported to Sheets. End goal is to have a web-based app, similar to the FHM CSV reader, to clean the sheets and make them small and easily accessible for Sheets, hopefully minimizing the lag that is currently becoming an issue 10 years into my save haha. |
Quote:
|
Question
This is fantastic work, Toby! I just have a couple of questions. If expected goals are based on shot type etc., and the game only records shots in general, can it be misleading? A player with 3 shots from the blue line would have a higher expected goals than a player with 2 shots from the crease/slot. And how would this affect different types of players? Playmakers would have fewer shots and therefore fewer expected goals, but in real life individual impacts on expected goals are accounted for. Is this just a current limitation of the CSV files? This is a heck of a project and extremely well done!
|
Quote:
You're correct: real-life NHL stats that quantify xG are influenced by RTSS data available during every game of the season, which provides the on-ice location for every player when a shot occurs, the shot angle, speed, etc. Unfortunately, that information is not available in FHM's current form. However, we *are* given a number of other metrics, some of which are more helpful than others.
The current xG model in my spreadsheet uses linear regression to find a linear relationship between a player's even-strength goals and his TOI, Corsi For, and shots on goal. Simply put, the more time a player spends on the ice, plus the more shot attempts his team generates, plus the number of shots he personally puts on net, all contribute to an expected increase in his even-strength goals. In its current form, my model has an R-squared value of around 79% for forwards and 65% for defensemen, meaning the model explains roughly 79% and 65% of the variation in even-strength goals (in other words, there is a clear linear relationship between these things), and the standard error is usually around 4 goals. This, on its own, provides us with results that are surprisingly accurate, but I know it can be better.
Which is why I'm currently working on a new model that uses the awesome new data that was added in the last update. The new model actually runs two regressions for each position (FWD and DEF). The first regression finds the linear relationship between a player's EVG and his individual Corsi For, or iCF (now that we have missed shots included in the stats, we can find a player's own shot attempts). That simple linear regression usually outputs an R-squared value between 80% and 90% for both positions. The second regression finds the relationship between a player's iCF and his hits, plus his giveaways, plus his time on ice, plus his offensive and defensive zone start percentages. This model also results in an R-squared value between 80% and 90%.
The second regression essentially gives us an expected Corsi for each player, which is then used in the first regression to provide us with an expected goals value. The end result is a much more accurate xG on both ends of the spectrum: players who don't have much TOI aren't as heavily penalized, while players who score a ton of goals receive much better predictions. For example, Connor McDavid is always someone in my saves who ends up 15-25 goals above expected. With this new model, he is consistently falling somewhere within 5-10 goals above expected. Still an outlier, as he should be, but one that we can still quantify to a certain degree.
The goal is to have this new model available early 2023. It comes with a host of other benefits as well, since I have switched to using R for my data analysis. Namely, it cleans the spreadsheets for you, resulting in a much faster experience. I just need to figure out the best way to share it with everyone, without requiring every person to download and install RStudio lol. |
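The two-regression idea described above can be sketched as follows. This is not the author's R script: it is a small pure-Python illustration with a hand-rolled least-squares solver, and the eight "players" are invented, noise-free numbers chosen so the fit is easy to check. Real data would be noisy and would be fitted separately for forwards and defensemen.

```python
def fit_ols(X, y):
    """Ordinary least squares via the normal equations.

    Returns [intercept, b1, b2, ...] for y ~ b0 + b1*x1 + b2*x2 + ...
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefs, x):
    """Apply fitted coefficients to one row of predictors."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))


# Invented players: (TOI minutes, hits, giveaways, OZ start %, DZ start %).
features = [
    (1400, 40, 55, 45, 25), (1250, 120, 30, 33, 30),
    (1100, 60, 45, 38, 36), (950, 150, 20, 28, 40),
    (800, 90, 25, 31, 29), (700, 30, 15, 42, 33),
    (600, 110, 12, 26, 37), (500, 70, 10, 36, 27),
]
# Synthetic targets built from made-up linear rules (no noise).
icf = [20 + 0.15 * t + 0.3 * h - 0.5 * g + 1.2 * oz - 0.8 * dz
       for t, h, g, oz, dz in features]
evg = [0.5 + 0.09 * c for c in icf]

# Regression 1: EVG as a function of individual Corsi For (iCF).
goals_model = fit_ols([[c] for c in icf], evg)
# Regression 2: iCF as a function of hits, giveaways, TOI and zone starts.
corsi_model = fit_ols(features, icf)


def expected_goals(player_features):
    """Expected Corsi from regression 2, fed into regression 1 to get xG."""
    expected_icf = predict(corsi_model, player_features)
    return predict(goals_model, [expected_icf])


print(round(expected_goals((1000, 80, 30, 35, 32)), 2))
```

Chaining the models this way means a player's xG depends on how many attempts his usage suggests he should generate, rather than on the attempts he actually took.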
https://ibb.co/0mBZSCB
Here is a plot of the new model in action. The blue line represents the intersection of xG and EVG, so players close to the line are performing as expected. You can see there is still plenty of variance, which is to be expected -- we can't be perfectly accurate, otherwise we're cheating. But notice the heavy concentration of players near the line from 0 to 10 xG and EVG -- this indicates that we have improved accuracy for players who are not prolific goal scorers. You can also see that there are a large number of players who stick close to the line as they pass 10 xG/EVG. This again shows that we are properly accounting for stats that are important for elite scorers when predicting their xG. |
Quote:
|
Quote:
This is a difficult thing to overcome. In real life advanced stats, it’s not necessarily clear how we value a player who gets a ton of assists because whether a pass results in a goal is almost entirely dependent on the goal scorer, not the passer. And, in the end, goals create wins, not assists, so whether a pass led to a goal is almost never considered in most metrics. But that doesn’t mean players who set up goal scorers have zero offensive value! Evolving Hockey has made some awesome strides in this area with their RAPM and SPM models (regularized adjusted plus minus and standardized plus minus). These metrics seek to quantify each player’s impact on the shot attempts taken while he’s on the ice, convert those to a general “offense” stat, translate the result to a value in goals, and convert *that* to wins. The end result ensures that players who are prolific passers are not punished as heavily because their presence on the ice increases the shot attempts for their team, but again, whether they passed the puck into a shot attempt is not part of the analysis. I would love to find a way to do something similar, especially with the new xG model being based on Corsi. I currently have a system that I’m working on that seems promising — essentially looking at a player’s Corsi above replacement differential, and finding the xG of that number. However, I’m finding that the biggest problem with this is that nobody ends up with a WAR that is less than 1.0. This doesn’t make sense, as there should be roughly 23% of players at or below 0 WAR. I haven’t worked out the math yet, but when I do, there should be some improvements to how playmakers are valued by the advanced stats. |
This is amazing, thanks so much for all your hard work.
I know the current setup isn't really compatible with custom games, but what about standard games with a custom team in them? For example, I added an expansion team to my franchise mode and I don't see it come up in the advanced metrics dropdown. Is there a way I could make that work within the current sheet? |
Quote:
|
Major updates to my advanced stat tracker! A new webapp is available to help clean your CSVs prior to loading them, which entirely eliminates the slow loading and occasional freezing of the current Sheet. This is thanks to my switch to using R as my data manipulation program, rather than sheets, which has also provided me with a collection of tools for much more effective modeling for xG, GAR, and WAR! Now, a script will do all the calculations, and export a few CSVs that can be put into Sheets, taking the labor off of Google and improving performance.
Second, I have made the necessary changes to allow for expansion teams to be added! Instructions are in the video. Thanks to the FHM team for the most recent update, which added in a bunch of crazy useful stats! Please use this video to see how to use the new tools, as well as how to add expansion teams. For those interested, I also cover my modeling process later in the video. https://youtu.be/1TSe41OWWwI |
Quote:
|
When I am cleaning the files there are only 3 upload buttons instead of 4 in the video. The missing one is cleaned team records. How do I fix this?
|
Quote:
|
Quote:
Appears to be working correctly now. |
Def working now. Just wanna say you're an absolute hero for this tool! Not many would put in this kind of work for such a nice project
|
Quote:
|
Yes, it's working now. Awesome!
|
Thanks everyone! It means so much that you all are finding good use out of this tool. It's a labor of love, and like I have said before, it gives me so much fulfillment to add this level of "custom" input into one of my favorite games. Being able to share it with the community takes that to a new level.
Here's hoping that the developers continue to work with us to provide even more trackable metrics to the CSVs so that we can continue to produce statistically sound advanced stats. |
Update: after doing some cleanup and checking out the usage metrics for the webapp, it looks like we are already at 20 active hours from January 2 to January 8 lol. That is *incredible*! Unfortunately, it means the site may become unavailable within the next couple days.
Not to worry, I am going to immediately start working on a desktop, offline webapp to replace the browser version. This will introduce some new challenges, like pushing updates and ensuring compatibility with different versions of Windows, but it seems like it will be necessary. Keep an eye out on this thread for updates. Thanks everyone! |
Copyright © 2024 Out of the Park Developments