22 February 2025

13 thoughts on “What is the Ideal Sample Size for Accurate Predictions?”

  1. I’d love to dig deeper into the formula you used. Could you point me in the right direction? Is there anything I can specifically Google? Or any additional articles that you have that elaborate more on this process?

  2. I don’t know whether this question has already been asked or addressed; if it has, I apologize. My question is: why the last 25 games and at least 6 years of head-to-head matches? Are these simply figures you picked, or is there a mathematical formula behind them? Thanks and cheers!

    1. You are talking about the Value Calculator. The 25 games are important, and the 6 H2Hs are a correction factor. If you use the VC at the beginning of a season, the calculations include data from the last season plus a few records from the season before that. The further the current season progresses, the more obsolete the previous season becomes.

      Yes, there is a lot of mathematical research behind it.

  3. Hi Soccerwidow,

    I understand that a large sample size is necessary to reduce error and improve the confidence level. However, considering that players and circumstances change frequently from season to season, would historical data be a good measure for future probabilities?

    Thank you!

    1. There is no other option than to use the little historical data available for calculating future probabilities. Just keep in mind that football clubs are professional entities, and even if players change, the management will certainly employ suitable replacements.

  4. Have you tried incorporating an exponential scaling function that weights recent matches over-proportionally? The further back in time a match lies, the smaller its weight in the overall calculation. It works pretty well, and it lets you use a very large dataset.

    Regards,
    Dennis

    1. Hi Dennis, thanks for your input 🙂

      You are on the right track… results become more accurate if matches further back in time are given a smaller weight.

    2. Hi,

      This sounds very interesting. It reminds me of technical analysis in trading, where there is the EMA (Exponential Moving Average), which puts more weight on recent prices (in case you didn’t already know). Any tips on how to implement such a thing?

      Thanks.

      1. The EMA is a moving average that places a greater weight and significance on the most recent data points. This technical indicator is used to produce buy and sell signals based on crossovers and divergences from the historical average. In trading it can be applied to, for instance, 5-minute, 15-minute, 30-minute, and 1-hour timeframes.

        However, it has nothing to do with Value Betting, which is what Soccerwidow is all about, and I’m not planning to extend the content into trading. Sorry, no time.

  5. Hi, the True Odds & Value Detector spreadsheet uses the past 25 games for analysis, but this seems a small number for, say, Correct Score analysis. Would it not be better to use a larger set of data for this, say 100 games?

    1. Generally speaking you are right, if betting decisions for correct scores were based only on the scores from the last 25 games. However, the value bet detector also calculates the goal expectation (please scroll down in the value bet detector spreadsheet: lines 88 to 98 in the ‘ValueCalc’ tab) based on home goals scored/conceded as well as away team goals and H2Hs. These numbers also need to be taken into consideration.
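
For readers asking in comment 4 how exponential weighting of recent matches might be implemented, here is a minimal Python sketch. It is not the method used in the Value Calculator or the spreadsheet. The function names, the sample goal counts, and the `decay` values are all assumptions chosen purely for illustration. Each match's weight is `decay` raised to the power of its age, so the most recent match gets weight 1 and older matches count for progressively less.

```python
def exponential_weights(n_matches, decay=0.9):
    """Return one weight per match, ordered most recent first.

    The most recent match (age 0) gets weight 1.0; each step further
    back in time multiplies the weight by `decay` (assumed 0 < decay <= 1).
    """
    if not 0 < decay <= 1:
        raise ValueError("decay must be in the interval (0, 1]")
    return [decay ** age for age in range(n_matches)]


def weighted_goal_average(goals, decay=0.9):
    """Exponentially weighted average of goals per match.

    `goals` is a list of goals scored, ordered most recent match first.
    """
    if not goals:
        raise ValueError("at least one match is required")
    weights = exponential_weights(len(goals), decay)
    weighted_sum = sum(w * g for w, g in zip(weights, goals))
    return weighted_sum / sum(weights)


# Hypothetical goals scored in a team's last six matches, most recent first.
recent_first = [3, 1, 0, 2, 1, 0]

# decay=1.0 gives every match the same weight, i.e. the plain average.
print(weighted_goal_average(recent_first, decay=1.0))

# A smaller decay makes the most recent matches dominate the average.
print(weighted_goal_average(recent_first, decay=0.8))
```

Because old matches fade out gradually instead of being cut off at a fixed number of games, a much longer match history can be fed in without stale results dominating. The right value for `decay` would have to be found by testing against historical results.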
