Prediction vs Inference

Statistical Learning, Machine Learning, Prediction, Inference, Function Estimation, Correlation

Published: 2023-09-06

Last modified: 2025-06-11

Summary

Explains the differences between prediction and inference, the two main purposes of statistical learning, covering their characteristics and the trade-off between them. Also covers why function estimation matters for analyzing relationships between variables, and the role of interpretability.

Correlation

A correlation may exist between late-night snack consumption and weight changes. To analyze such a correlation quantitatively, it helps to identify a specific functional relationship. For example, if we could establish that 'X instances of late-night snack consumption are associated with Y (kg) of weight gain', this would provide valuable information for developing weight management strategies.

To analyze such correlations, we can set weight as the output variable Y and the frequency of late-night snack consumption as the input variable X, then express the relationship between Y and X as a function.

Statistical Learning

$Y = f(X)$
The set of techniques for estimating function f is called Statistical Learning. Statistical learning developed alongside Machine Learning, and the two fields are closely related.

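Even if function f is estimated accurately, Y cannot be predicted perfectly from X alone, because Y also depends on factors that X does not capture, such as measurement noise or unmeasured variables. This remaining part is written as $\epsilon$ and is called the error term; it is usually assumed to be independent of X and to have mean zero. It is distinct from the estimation error between the estimated function $\hat{f}$ (the hat symbol indicates an estimated value) and the actual function f, which can in principle be reduced with better methods or more data.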

Therefore, statistical learning is more accurately expressed as a collection of techniques for estimating function f in $Y = f(X) + \epsilon$.
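To make $Y = f(X) + \epsilon$ concrete, here is a minimal sketch in Python with NumPy. Everything specific in it is an assumption made for illustration: the "true" function `f` (an 80 kg baseline plus 1 kg per weekly snack), the normally distributed error term with standard deviation 0.5, and the choice of a least-squares straight line as the estimation technique. It simulates data, estimates f, and shows that some error remains even when the estimate is close to f.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical "true" relationship (an assumption, not real data):
    # 80 kg baseline plus 1 kg per weekly late-night snack
    return 80.0 + 1.0 * x

# Simulated observations: snacks per week (0..7) and the resulting weight
x = rng.integers(0, 8, size=200).astype(float)
eps = rng.normal(0.0, 0.5, size=x.size)   # error term with mean zero
y = f(x) + eps

# Least-squares straight line as the estimate of f
# (np.polyfit returns the highest-degree coefficient first)
slope, intercept = np.polyfit(x, y, deg=1)

def f_hat(x):
    return intercept + slope * x

# Spread of what is left over after subtracting the estimate
residual_sd = float(np.std(y - f_hat(x)))

print(f"estimated f: {intercept:.2f} + {slope:.2f} * x")
print(f"residual sd: {residual_sd:.2f}")
```

In this simulation the fitted line lands close to the assumed f, yet the residual spread stays near the standard deviation of the error term, because that part of Y does not depend on X.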

The Purpose of Statistical Learning: Prediction vs Inference

The main purposes of statistical learning can be broadly classified into two categories. The first is the inference purpose, which aims to explain and understand the relationships between variables, and the second is the prediction purpose, which aims to forecast future outcomes.

For example, analyzing the correlation between late-night snack consumption and weight changes to understand the underlying mechanism corresponds to the inference purpose. Conversely, predicting weight changes based on the frequency of late-night snack consumption over a specific period corresponds to the prediction purpose.

Prediction

Let us examine a case with prediction as the purpose.

Research subject A is experiencing weight gain due to an increased frequency of late-night snack consumption. A has observed weeks with between 1 and 4 late-night snacks, along with the corresponding weight changes. A wants to predict the weight change from consuming late-night snacks every day for a week, but has decided not to run the actual experiment for health reasons. Therefore, A intends to use statistical learning to estimate the relationship between late-night snack consumption and weight, and to predict weight for this hypothetical scenario based on that estimate. Because seven snacks a week lies outside the observed range of 1 to 4, the prediction is an extrapolation and relies on the estimated relationship continuing to hold.

This is a typical case of prediction through statistical learning and serves as a major motivation for estimating function f.
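A's situation could be sketched as follows. The numbers in `snacks` and `weight_change` are invented for illustration, and fitting a straight line is only one possible choice of technique; the point is that once f has been estimated, it can be evaluated at an input that was never observed.

```python
import numpy as np

# Hypothetical observations (invented for illustration):
# late-night snacks in a week, and the weight change in kg for that week
snacks = np.array([1, 2, 3, 4, 1, 2, 3, 4], dtype=float)
weight_change = np.array([0.2, 0.5, 0.6, 0.9, 0.1, 0.4, 0.7, 0.8])

# Estimate f with a least-squares straight line
slope, intercept = np.polyfit(snacks, weight_change, deg=1)

def predict(n_snacks):
    """Predicted weekly weight change (kg) for a given number of snacks."""
    return intercept + slope * n_snacks

# Seven snacks a week was never observed, so this is an extrapolation
print(f"predicted change for 7 snacks/week: {predict(7.0):.2f} kg")
```

The prediction is only as trustworthy as the assumption that the pattern seen for 1 to 4 snacks continues up to 7.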

Inference

Let us examine a case with inference as the purpose.

Research subject B is experiencing recent weight gain but cannot clearly identify the cause. Colleagues have suggested that either late-night snack consumption or pork belly consumption might be the cause. B possesses historical data on late-night snack and pork belly consumption schedules and corresponding daily weight measurements. B intends to analyze the relationship between weight and late-night snacks and pork belly consumption to identify which of the two factors has a greater impact on weight.

This is a case of inference through statistical learning, with the purpose of explaining and understanding the relationship between Y and X. This also serves as an important motivation for estimating function f.
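B's question could be approached with a sketch like the one below. The data is simulated, and the effect sizes used to generate it (0.30 kg for a late-night snack, 0.10 kg for pork belly) are assumptions chosen purely for illustration. The model is an ordinary least-squares fit with one coefficient per factor, so the fitted coefficients can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days = 120

# Hypothetical daily indicators: 1.0 if the item was eaten that day, else 0.0
snack = rng.integers(0, 2, size=n_days).astype(float)
pork_belly = rng.integers(0, 2, size=n_days).astype(float)

# Assumed data-generating process, for illustration only
noise = rng.normal(0.0, 0.1, size=n_days)
weight_change = 0.30 * snack + 0.10 * pork_belly + noise

# Design matrix: intercept column plus one column per factor
X = np.column_stack([np.ones(n_days), snack, pork_belly])
coef, *_ = np.linalg.lstsq(X, weight_change, rcond=None)
intercept, b_snack, b_pork = coef

# Each coefficient is the estimated change in weight associated with
# that factor, holding the other factor fixed
print(f"late-night snack: {b_snack:+.2f} kg per day eaten")
print(f"pork belly:       {b_pork:+.2f} kg per day eaten")
```

Here the goal is not the predicted weight itself but the size of each coefficient, which is what makes a simple model form useful for inference.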

The reason it is important to distinguish between these two purposes is that the appropriate form of the estimated function $\hat{f}$ varies depending on the purpose.

Changes in Function According to Purpose

When prediction is the primary purpose, function f can take complex forms such as:
$Y=\beta_{1}X^{128902398}+\beta_{2}X^{8998}+\beta_{3}X^{139}+\cdots$

Such functions are difficult to interpret intuitively regarding the relationship between Y and X. However, if prediction accuracy is the primary goal, prediction performance takes priority over interpretability.

Conversely, when inference is the primary purpose, function f must be constrained to interpretable forms. Complex functions are difficult for humans to understand and explain.

For inference purposes, function f should be simple and interpretable:
$Y = X + 80$ (kg)
where X represents weekly late-night snack consumption frequency and Y represents weight in kg.

In this case, the relationship between late-night snacks and weight can be clearly explained as follows:

Weight increases linearly with the frequency of late-night snack consumption: starting from a baseline of 80 kg, each additional late-night snack per week is associated with a 1 kg increase in weight.

The Trade-off Between Prediction and Inference

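The prediction performance of the highly interpretable model $Y = X + 80$ (kg) is likely to be relatively low, because a straight line cannot capture more complicated patterns if the true relationship is not linear.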

Conversely, flexible models such as:
$Y=\beta_{1}X^{128902398}+\beta_{2}X^{8998}+\beta_{3}X^{139}+\cdots$
can achieve much higher prediction performance in practice, provided there is enough data to avoid overfitting. However, an intuitive explanation of the relationship between Y and X becomes practically impossible.

Thus, there exists a trade-off relationship between prediction and inference. As prediction performance improves, inference and interpretability tend to decrease.

In actual research and applications, both purposes often need to be satisfied at once. Therefore, finding a function f that balances prediction performance and interpretability is a core challenge in statistical learning research.

Interestingly, many current artificial intelligence systems, such as deep neural networks, offer limited interpretability of how they work, in part because they have been developed with a focus on prediction performance.