I have a list of soccer games and 3 statistics. From these stats I'm trying to come up with the most accurate number of expected goals scored for both teams.
The three stats are shots, TIL, and HW.
To come up with an average shots-per-goal rate for home and away teams, we simply take the total home shots and divide by the total home goals, and the total away shots divided by the total away goals. I already did this in the table in Sheet3.
Once we have these two rates, knowing how many shots each team took in a game, we can work out the expected goals for each team (shots taken / shots-per-goal rate).
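To make the calculation concrete, here is a minimal Python sketch. The function names and all the numbers are made up for illustration and do not come from the spreadsheet. Since the rate is shots per goal, expected goals come from dividing the shots taken by the rate (the same as multiplying by goals per shot).

```python
def shots_per_goal(total_shots, total_goals):
    """Average number of shots needed to score one goal."""
    if total_goals == 0:
        raise ValueError("no goals scored, so the rate is undefined")
    return total_shots / total_goals


def expected_goals(shots_taken, rate):
    """Expected goals for one team in one game, given a shots-per-goal rate."""
    return shots_taken / rate


# Made-up season totals, for illustration only.
home_rate = shots_per_goal(total_shots=1200, total_goals=150)  # 8.0 shots per goal
away_rate = shots_per_goal(total_shots=1000, total_goals=100)  # 10.0 shots per goal

# Made-up single game: the home team took 16 shots, the away team took 9.
home_xg = expected_goals(16, home_rate)  # 2.0
away_xg = expected_goals(9, away_rate)   # 0.9

print(home_xg, away_xg)
```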
Now, I want to make this expected goals number more accurate by incorporating two other stats, TIL and HW. I believe these stats can be useful because when filtering for teams with a high TIL and HW, we can see that their shots-per-goal rate goes down, meaning it takes them fewer shots to score a goal.
I tried to do this myself with basic linear regression but for some reason it made the expected goals less accurate. Maybe a different form of regression would work better, or something else altogether.
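One candidate for "a different form of regression" is Poisson regression with shots as the exposure, which models goals as shots × exp(b0 + b1·TIL + b2·HW), so TIL and HW adjust the goals-per-shot rate instead of being added to the goal count directly. The sketch below is only an illustration under assumptions: it treats TIL and HW as plain numbers per team per game, the rows are made up, and the function names are hypothetical. Whether it actually beats the simple rate on the real data would have to be checked against the spreadsheet.

```python
import math


def solve(a, b):
    """Solve the small linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def predict_goals(beta, shots, til, hw):
    """Expected goals = shots * exp(b0 + b1*TIL + b2*HW)."""
    return shots * math.exp(beta[0] + beta[1] * til + beta[2] * hw)


def fit_poisson(rows, iterations=25):
    """Fit [b0, b1, b2] by Newton's method. rows: (goals, shots, til, hw)."""
    total_goals = sum(r[0] for r in rows)
    total_shots = sum(r[1] for r in rows)
    # Start from the plain goals-per-shot rate with no TIL/HW effect.
    beta = [math.log(total_goals / total_shots), 0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0, 0.0, 0.0]
        hess = [[0.0] * 3 for _ in range(3)]
        for goals, shots, til, hw in rows:
            x = (1.0, til, hw)
            mu = predict_goals(beta, shots, til, hw)
            for i in range(3):
                grad[i] += (goals - mu) * x[i]
                for j in range(3):
                    hess[i][j] += mu * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Made-up rows (goals, shots, TIL, HW), for illustration only.
rows = [
    (1, 10, 0.2, 0.1), (2, 12, 0.8, 0.6), (0, 8, 0.1, 0.3), (3, 15, 0.9, 0.7),
    (1, 11, 0.4, 0.5), (2, 9, 0.7, 0.9), (0, 7, 0.3, 0.2), (1, 13, 0.5, 0.4),
]
beta = fit_poisson(rows)
print(beta)
print(predict_goals(beta, shots=12, til=0.8, hw=0.6))
```

Unlike a linear regression on goals, this form can never predict negative goals, and a team with zero shots always gets zero expected goals.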
I will share the spreadsheet, so you can see the results of my attempts in columns AL and AM of the Games sheet. Sheet3 is where I ran the regression. The Lineups sheet can be ignored.
Most qualified candidate for the best price will be hired in the next 24 hours. When applying, let me know that you've read the job post and understand what I'm trying to do.
About the recruiter
Member since Mar 14, 2020
Umasankar Mariy
from Nordjylland, Denmark