P-Value

What Is P-Value?

In statistics, the p-value is the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct. The p-value is used as an alternative to rejection points to provide the smallest level of significance at which the null hypothesis would be rejected. A smaller p-value means that there is stronger evidence in favor of the alternative hypothesis.

A p-value is the probability of seeing a difference at least as large as the one observed if random chance alone were at work, that is, if the null hypothesis were true.
The lower the p-value, the greater the statistical significance of the observed difference.
A p-value can be used as an alternative to, or in addition to, pre-selected confidence levels for hypothesis testing.

How Is P-Value Calculated?

P-values are usually found using p-value tables or spreadsheets/statistical software. These calculations are based on the assumed or known probability distribution of the specific statistic being tested. P-values are calculated from the deviation between the observed value and a chosen reference value, given the probability distribution of the statistic, with a greater difference between the two values corresponding to a lower p-value.

Mathematically, the p-value is calculated using integral calculus from the area under the probability distribution curve for all values of the statistic that are at least as far from the reference value as the observed value is, relative to the total area under the probability distribution curve. In a nutshell, the greater the difference between the observed value and the reference value, the less likely it is that the difference is due to simple random chance, and this is reflected by a lower p-value.
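The tail-area calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a general recipe: it assumes the test statistic is normally distributed, and the function name, parameter names, and input numbers are hypothetical.

```python
from statistics import NormalDist


def two_sided_p_value(observed, reference, std_error):
    """Two-sided p-value, ASSUMING the statistic is normally distributed."""
    # Standardize the deviation between the observed and reference values.
    z = (observed - reference) / std_error
    # Area under the standard normal curve in both tails, at least as far
    # from the reference as the observed value (total area under the curve is 1).
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical inputs: same reference value and standard error, different gaps.
p_small_gap = two_sided_p_value(observed=0.5, reference=0.0, std_error=0.5)
p_large_gap = two_sided_p_value(observed=1.5, reference=0.0, std_error=0.5)
print(f"small gap: p = {p_small_gap:.4f}")
print(f"large gap: p = {p_large_gap:.4f}")
```

With the standard error held fixed, the larger gap between the observed and reference values produces the smaller p-value. Other statistics (t, chi-square, F) follow the same idea with a different distribution in place of the normal.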

P-Value Approach to Hypothesis Testing

The p-value approach to hypothesis testing uses the calculated probability to determine whether there is evidence to reject the null hypothesis. The null hypothesis, also known as the conjecture, is the initial claim about a population (or data generating process). The alternative hypothesis states that the population parameter differs from the value stated in the conjecture.

In practice, the significance level is stated in advance to determine how small the p-value must be in order to reject the null hypothesis. Because different researchers use different levels of significance when examining a question, a reader may sometimes have difficulty comparing results from two different tests. P-values provide a solution to this problem.

For example, suppose a study comparing returns from two particular assets was undertaken by different researchers who used the same data but different significance levels. The researchers might come to opposite conclusions regarding whether the assets differ. Suppose one researcher used a confidence level of 90% and the other required a confidence level of 95% to reject the null hypothesis, and the p-value of the observed difference between the two returns was 0.08 (corresponding to a confidence level of 92%). The first researcher would find that the two assets have a difference that is statistically significant, while the second would find no statistically significant difference between the returns.
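The two researchers' decisions can be reproduced with a short Python sketch. The helper name `rejects_null` is hypothetical; the only rule it encodes is the usual one, that the null hypothesis is rejected when the p-value falls below the significance level (one minus the confidence level).

```python
def rejects_null(p_value, confidence_level):
    """Return True if the null hypothesis is rejected at this confidence level."""
    # The significance level is the complement of the confidence level.
    significance_level = 1.0 - confidence_level
    return p_value < significance_level


p_value = 0.08  # p-value from the example above

first_researcher = rejects_null(p_value, confidence_level=0.90)
second_researcher = rejects_null(p_value, confidence_level=0.95)

print("90% confidence: reject null?", first_researcher)
print("95% confidence: reject null?", second_researcher)
```

The same p-value leads to a rejection at 90% confidence and no rejection at 95%, which is why the two researchers disagree.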

To avoid this problem, the researchers could report the p-value of the hypothesis test and allow readers to interpret the statistical significance themselves. This is called a p-value approach to hypothesis testing. An independent observer could note the p-value and decide for themselves whether it represents a statistically significant difference.

Example of P-Value

An investor claims that their investment portfolio's performance is equivalent to that of the Standard & Poor's (S&P) 500 Index. To determine this, the investor conducts a two-tailed test. The null hypothesis states that the portfolio's returns are equivalent to the S&P 500's returns over a specified period, while the alternative hypothesis states that the portfolio's returns and the S&P 500's returns are not equivalent. If the investor instead conducted a one-tailed test, the alternative hypothesis would name a single direction: either that the portfolio's returns are less than the S&P 500's returns, or that they are greater.
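The difference between the two-tailed and one-tailed alternatives can be made concrete with a small Python sketch. It assumes, purely for illustration, that the difference in returns has already been reduced to a standard normal test statistic z; the function name and the value of z are hypothetical.

```python
from statistics import NormalDist


def tail_p_values(z):
    """p-values for the three possible alternatives, ASSUMING z is standard normal."""
    cdf = NormalDist().cdf(z)
    greater = 1.0 - cdf               # one-tailed: portfolio returns are greater
    less = cdf                        # one-tailed: portfolio returns are less
    two_tailed = 2.0 * min(greater, less)  # two-tailed: returns are not equivalent
    return {"two_tailed": two_tailed, "greater": greater, "less": less}


# Hypothetical test statistic for portfolio returns minus S&P 500 returns.
results = tail_p_values(z=1.5)
for alternative, p in results.items():
    print(f"{alternative}: p = {p:.4f}")
```

For a positive z, the one-tailed "greater" p-value is half the two-tailed p-value, while the "less" alternative receives almost no support from the same data.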

The p-value hypothesis test does not necessarily make use of a pre-selected confidence level at which the investor should reject the null hypothesis that the returns are equivalent. Instead, it provides a measure of how much evidence there is to reject the null hypothesis. The smaller the p-value, the greater the evidence against the null hypothesis. Thus, if the investor finds that the p-value is 0.001, there is strong evidence against the null hypothesis, and the investor can confidently conclude the portfolio's returns and the S&P 500's returns are not equivalent.

Although this does not provide an exact threshold as to when the investor should accept or reject the null hypothesis, it does have another very practical advantage. P-value hypothesis testing offers a direct way to compare the relative confidence that the investor can have when choosing among multiple different types of investments or portfolios, relative to a benchmark such as the S&P 500.

For example, for two portfolios, A and B, whose performance differs from the S&P 500 with p-values of 0.10 and 0.01, respectively, the investor can be much more confident that portfolio B, with a lower p-value, will actually show consistently different results.
