Mann-Whitney U test

The **Mann-Whitney U test** (also known as the Wilcoxon rank-sum test) is a non-parametric statistical test used to assess whether two independent samples come from the same distribution. Unlike parametric tests such as the t-test, it does not assume that the data are normally distributed, which makes it particularly useful for data that are ordinal, ranked, or non-normally distributed. It is a fundamental tool in Statistical Analysis and a frequent companion to other techniques like Regression Analysis.

1. When to Use the Mann-Whitney U Test

The Mann-Whitney U test is appropriate in the following situations:

**Two Independent Samples:** You have two distinct groups of participants or observations. The data from one group should not influence the data from the other.
**Non-Normal Data:** The data in one or both samples do not meet the assumptions of normality required for parametric tests. Checking for normality can be done with tests like the Shapiro-Wilk Test.
**Ordinal Data:** The data are ranked or represent ordered categories (e.g., levels of agreement: strongly disagree, disagree, neutral, agree, strongly agree).
**Continuous Data (with caveats):** While often used with ordinal data, it can also be applied to continuous data when normality cannot be assumed. However, it's crucial to understand it tests for *stochastic dominance*, meaning it detects if values in one group are generally higher than the other, not necessarily if the means are different.
**Comparing Medians:** The test assesses whether the distributions of the two samples differ, and this is often interpreted as a comparison of the medians, though it's more accurate to say it tests for differences in the overall distributions.
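One common (if debated) workflow is to screen each sample for normality and fall back to the Mann-Whitney U test when the screen fails. The sketch below illustrates that idea with SciPy; the helper name `compare_two_groups`, the 0.05 threshold, and the simulated data are assumptions made for illustration, not a prescribed procedure.

```python
import numpy as np
from scipy import stats


def compare_two_groups(a, b, alpha=0.05):
    """Choose a two-sample test from a Shapiro-Wilk screen (illustrative rule only)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Shapiro-Wilk: a small p-value suggests the sample is not normally distributed.
    _, p_normal_a = stats.shapiro(a)
    _, p_normal_b = stats.shapiro(b)

    if p_normal_a > alpha and p_normal_b > alpha:
        # Both samples look roughly normal: use Welch's t-test.
        result = stats.ttest_ind(a, b, equal_var=False)
        return "welch_t", result.pvalue

    # At least one sample looks non-normal: use the rank-based test.
    result = stats.mannwhitneyu(a, b, alternative="two-sided")
    return "mann_whitney_u", result.pvalue


# Hypothetical, strongly skewed data generated only for this demonstration.
rng = np.random.default_rng(0)
skewed_a = rng.exponential(scale=1.0, size=200)
skewed_b = rng.exponential(scale=1.5, size=200)

test_name, p_value = compare_two_groups(skewed_a, skewed_b)
print(test_name, p_value)
```

For genuinely ordinal data, such as survey ratings, the normality screen is unnecessary and the rank-based test can be applied directly.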

1. Null and Alternative Hypotheses

The Mann-Whitney U test is a hypothesis test, meaning it aims to determine whether there is enough evidence to reject a null hypothesis.

**Null Hypothesis (H₀):** The two samples come from the same distribution. In other words, there is no difference between the two populations from which the samples are drawn. Statistically, this means the probability of a randomly selected value from one population being greater than a randomly selected value from the other population is 0.5.
**Alternative Hypothesis (H₁):** The two samples come from different distributions. There is a difference between the two populations. This can be one-tailed or two-tailed:

   * **Two-tailed:**  The distributions are different (values in one group are generally higher *or* lower than the other).
   * **One-tailed:** The values in one group are generally higher than the other *or* the values in one group are generally lower than the other.  You must specify the direction of the difference *before* conducting the test.  This requires careful consideration based on the research question. The directionality is often informed by understanding Market Sentiment or pre-existing research.
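In SciPy, the choice between a two-tailed and a one-tailed test is made through the `alternative` argument of `scipy.stats.mannwhitneyu`. The sketch below uses made-up measurements (an assumption for illustration) to show the three settings side by side.

```python
from scipy import stats

# Hypothetical measurements for two independent groups.
group_a = [12, 15, 11, 18, 14, 17, 16]
group_b = [10, 9, 13, 8, 11.5, 7, 10.5]

# Two-tailed: the distributions differ in either direction.
two_sided = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# One-tailed: values in group_a tend to be larger than values in group_b.
a_greater = stats.mannwhitneyu(group_a, group_b, alternative="greater")

# One-tailed: values in group_a tend to be smaller than values in group_b.
a_less = stats.mannwhitneyu(group_a, group_b, alternative="less")

for label, res in [("two-sided", two_sided), ("greater", a_greater), ("less", a_less)]:
    print(f"{label}: U={res.statistic}, p={res.pvalue:.4f}")
```

The direction passed to `alternative` should be fixed before looking at the data; choosing it afterwards inflates the false-positive rate.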

1. How the Mann-Whitney U Test Works

The Mann-Whitney U test works by ranking all the observations from both samples together, from lowest to highest. It then calculates the sum of the ranks for each sample and derives a U statistic for each sample from these rank sums. The further the smaller of the two U values falls below its expected value under the null hypothesis (n₁ * n₂ / 2), the greater the evidence against the null hypothesis.

Here's a step-by-step breakdown:

1. **Combine and Rank:** Pool all observations from both samples into a single dataset. Rank all the observations from 1 to N (where N is the total number of observations). Assign average ranks in case of ties.
2. **Calculate Rank Sums:** Calculate the sum of the ranks (R₁) for the first sample and the sum of the ranks (R₂) for the second sample.
3. **Calculate U Statistics:** Calculate the U statistic for each sample using the following formulas:

  * U₁ = n₁ * n₂ + (n₁ * (n₁ + 1)) / 2 – R₁
  * U₂ = n₁ * n₂ + (n₂ * (n₂ + 1)) / 2 – R₂
  where:
     * n₁ is the sample size of the first sample
     * n₂ is the sample size of the second sample
     * R₁ is the sum of the ranks for the first sample
     * R₂ is the sum of the ranks for the second sample

4. **Determine the Test Statistic:** For a two-tailed test, the test statistic is conventionally the smaller of U₁ and U₂, because published critical-value tables are indexed by the smaller value. As a check, U₁ + U₂ = n₁ * n₂.
5. **Determine the p-value:** The p-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming the null hypothesis is true. The p-value can be determined using:

  * **Exact Distribution:** For small samples (roughly fewer than 20 observations per group) without ties, the exact distribution of the U statistic can be used.
  * **Normal Approximation:** For larger samples (roughly 20 or more observations per group), the distribution of the U statistic can be approximated by a normal distribution with mean n₁ * n₂ / 2 and standard deviation √(n₁ * n₂ * (n₁ + n₂ + 1) / 12). When ties are present, the standard deviation is reduced by a tie correction.

6. **Compare p-value to Significance Level (α):** If the p-value is less than or equal to the significance level (α), typically 0.05, then the null hypothesis is rejected. This indicates that there is a statistically significant difference between the two populations. The choice of α depends on the desired level of confidence. In Technical Indicators, a similar concept of confidence levels is used for band width calculations.
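The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `mann_whitney_manual` and the sample values are hypothetical, and the p-value uses the normal approximation with tie and continuity corrections (one common variant) even though the demonstration samples are small.

```python
import numpy as np
from scipy import stats


def mann_whitney_manual(sample1, sample2):
    """Follow the ranking steps by hand and return rank sums, U values, and an approximate p-value."""
    x = np.asarray(sample1, dtype=float)
    y = np.asarray(sample2, dtype=float)
    n1, n2 = len(x), len(y)
    n_total = n1 + n2

    # Step 1: pool the samples and rank them; tied values get the average rank.
    ranks = stats.rankdata(np.concatenate([x, y]))

    # Step 2: rank sums for each sample.
    r1 = ranks[:n1].sum()
    r2 = ranks[n1:].sum()

    # Step 3: U statistics, using the formulas given in the text.
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

    # Step 4: the smaller U is the two-tailed test statistic.
    u = min(u1, u2)

    # Step 5: normal approximation with tie correction and continuity correction.
    mean_u = n1 * n2 / 2
    _, tie_counts = np.unique(ranks, return_counts=True)
    tie_term = (tie_counts**3 - tie_counts).sum() / (n_total * (n_total - 1))
    sd_u = np.sqrt(n1 * n2 / 12 * ((n_total + 1) - tie_term))
    z = (u + 0.5 - mean_u) / sd_u
    p_two_sided = min(1.0, 2 * stats.norm.cdf(z))

    return {"r1": r1, "r2": r2, "u1": u1, "u2": u2, "u": u, "p_approx": p_two_sided}


# Hypothetical measurements, chosen only to exercise the function.
result = mann_whitney_manual([1.1, 2.4, 3.0, 4.7], [2.9, 5.2, 6.3, 7.8, 8.1])
print(result)
```

Step 6 is then a comparison of `result["p_approx"]` with the chosen α. For samples this small, an exact p-value would normally be preferred over the approximation.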

1. Example

Let's say we want to compare the performance of two different Trading Strategies: Strategy A and Strategy B. We collect data on the percentage return for 5 trades using each strategy.

**Strategy A Returns:** 2%, 5%, 1%, 3%, 4%
**Strategy B Returns:** -1%, 0%, 2%, 6%, 3%

1. **Combine and Rank:**

  Combined data: -1%, 0%, 1%, 2%, 2%, 3%, 3%, 4%, 5%, 6%
  Ranks: 1, 2, 3, 4.5, 4.5, 6.5, 6.5, 8, 9, 10

  The two 2% values share ranks 4 and 5 (average 4.5), and the two 3% values share ranks 6 and 7 (average 6.5).

2. **Calculate Rank Sums:**

  R₁ (Strategy A): 4.5 + 9 + 3 + 6.5 + 8 = 31
  R₂ (Strategy B): 1 + 2 + 4.5 + 10 + 6.5 = 24

  As a check, R₁ + R₂ = 55 = 10 * 11 / 2.

3. **Calculate U Statistics:**

  U₁ = (5 * 5) + (5 * 6) / 2 – 31 = 25 + 15 – 31 = 9
  U₂ = (5 * 5) + (5 * 6) / 2 – 24 = 25 + 15 – 24 = 16

  As a check, U₁ + U₂ = 25 = n₁ * n₂.

4. **Determine the Test Statistic:**

  U = min(U₁, U₂) = 9

5. **Determine the p-value:** Using a Mann-Whitney U table or statistical software (like R Programming Language or Python for Data Science), with n₁ = 5 and n₂ = 5, and U = 9, we find the two-tailed p-value to be roughly 0.5. The exact figure depends on how ties and the continuity correction are handled. For comparison, the two-tailed critical value at α = 0.05 for n₁ = n₂ = 5 is U ≤ 2.

6. **Compare p-value to Significance Level:**

  Since the p-value of roughly 0.5 is greater than 0.05, we fail to reject the null hypothesis.  There is no statistically significant difference in the performance of Strategy A and Strategy B based on this data.
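A hand calculation like this one can be cross-checked with SciPy. The sketch below passes the example returns to `scipy.stats.mannwhitneyu`; the variable names are illustrative. SciPy reports the U statistic for its first argument, counting the pairs in which the first sample's value is larger (ties count one half), so the smaller of that value and n₁ * n₂ minus it corresponds to min(U₁, U₂) in the convention used here.

```python
from scipy import stats

# Percentage returns from the worked example.
strategy_a = [2, 5, 1, 3, 4]
strategy_b = [-1, 0, 2, 6, 3]

result = stats.mannwhitneyu(strategy_a, strategy_b, alternative="two-sided")

# SciPy's statistic belongs to the first sample; derive the complementary value.
u_first = result.statistic
u_second = len(strategy_a) * len(strategy_b) - u_first
u_min = min(u_first, u_second)

print(f"U (first sample) = {u_first}, U (second sample) = {u_second}")
print(f"min U = {u_min}, two-sided p = {result.pvalue:.3f}")
```

Because the data contain ties, SciPy's default settings use the normal approximation here rather than the exact distribution.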

1. Assumptions of the Mann-Whitney U Test

While the Mann-Whitney U test is less restrictive than parametric tests, it still has some assumptions:

**Independence:** The observations within each sample must be independent of each other, and the two samples must be independent of one another.
**Ordinal Scale:** The data should be measured on at least an ordinal scale.
**Similar Shape:** If the result is to be interpreted as a difference in location (e.g., medians), the distributions of the two populations should have a similar shape. When the shapes differ, a significant result may reflect differences in spread or skewness rather than in location. Analyzing Candlestick Patterns also relies on shape recognition.
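**Outliers:** Because the test works on ranks, an extreme value influences the result only through its rank position, not its magnitude. Outliers are therefore much less of a concern than in parametric tests, but they should still be checked as possible data errors.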

1. Advantages and Disadvantages

**Advantages:**

**Non-parametric:** Does not assume a particular distributional form, such as normality, for the underlying data.
**Suitable for Ordinal Data:** Well-suited for analyzing ranked or ordered data.
**Less Sensitive to Outliers:** Less affected by extreme values than parametric tests.
**Easy to Understand:** The concept is relatively straightforward to grasp.

**Disadvantages:**

**Less Powerful:** Generally less powerful than parametric tests when the assumptions of the parametric tests are met. This means it may be less likely to detect a true difference when one exists. Power analysis is important in Algorithmic Trading to optimize strategy performance.
**Only Tests for Differences in Distribution:** Interpreting the results as a simple difference in medians can be misleading. It tests for stochastic dominance.
**Can be Computationally Intensive:** Calculating the exact p-value can be computationally intensive for large sample sizes.
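The power trade-off can be explored with a small simulation. The sketch below is only an illustration under assumed settings (normally distributed data, 20 observations per group, a shift of 0.8 standard deviations, 2000 simulated experiments); the estimated rejection rates will change with the seed and with these choices.

```python
import numpy as np
from scipy import stats

# Assumed simulation settings, chosen only for illustration.
rng = np.random.default_rng(123)
n_sims, n_per_group, shift, alpha = 2000, 20, 0.8, 0.05

# Each row is one simulated experiment with normally distributed data,
# so the t-test's assumptions hold and a true difference exists.
group_a = rng.normal(0.0, 1.0, size=(n_sims, n_per_group))
group_b = rng.normal(shift, 1.0, size=(n_sims, n_per_group))

# Run both tests once per simulated experiment (row-wise).
p_t = stats.ttest_ind(group_a, group_b, axis=1).pvalue
p_u = stats.mannwhitneyu(group_a, group_b, alternative="two-sided", axis=1).pvalue

# Estimated power: the share of experiments in which H0 is rejected.
power_t = float(np.mean(p_t < alpha))
power_u = float(np.mean(p_u < alpha))

print(f"t-test power: {power_t:.3f}")
print(f"Mann-Whitney U power: {power_u:.3f}")
```

Replacing the normal draws with a heavy-tailed or skewed distribution is a quick way to see how the comparison changes when the parametric assumptions no longer hold.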

1. Mann-Whitney U Test vs. Other Tests

**T-test:** The t-test is a parametric test that assumes normality. If the normality assumption is violated, the Mann-Whitney U test is a more appropriate choice. The t-test focuses on differences in means, while the Mann-Whitney U test focuses on differences in distributions. Understanding Volatility can help determine if a t-test is appropriate.
**Wilcoxon Signed-Rank Test:** The Wilcoxon signed-rank test is used for *paired* samples (e.g., before-and-after measurements on the same individuals), while the Mann-Whitney U test is used for *independent* samples. Correlation Analysis often precedes paired tests.
**Kruskal-Wallis Test:** The Kruskal-Wallis test is an extension of the Mann-Whitney U test for comparing more than two independent samples. It’s analogous to ANOVA. Analyzing Trend Lines often involves comparing multiple data points.
**Kolmogorov-Smirnov Test:** The Kolmogorov-Smirnov test is another non-parametric test that can compare two distributions, but it is more sensitive to differences in the entire shape of the distributions, while the Mann-Whitney U test is more sensitive to differences in location. The Fibonacci Sequence is used to identify potential shifts in trend.
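For orientation, each of the tests compared above has a counterpart in `scipy.stats`. The sketch below calls them on simulated data; the data are an assumption for illustration, and the first two groups are treated as paired in the Wilcoxon signed-rank call purely to show the function signature.

```python
import numpy as np
from scipy import stats

# Simulated groups, generated only to demonstrate the function calls.
rng = np.random.default_rng(42)
group_a = rng.normal(0.0, 1.0, 30)
group_b = rng.normal(0.5, 1.0, 30)
group_c = rng.normal(1.0, 1.0, 30)

results = {
    # Parametric comparison of means (Welch's variant, unequal variances allowed).
    "t-test (independent)": stats.ttest_ind(group_a, group_b, equal_var=False),
    # Rank-based comparison of two independent samples.
    "Mann-Whitney U (independent)": stats.mannwhitneyu(group_a, group_b, alternative="two-sided"),
    # Rank-based comparison of paired samples (equal lengths required).
    "Wilcoxon signed-rank (paired)": stats.wilcoxon(group_a, group_b),
    # Extension to more than two independent samples.
    "Kruskal-Wallis (3 groups)": stats.kruskal(group_a, group_b, group_c),
    # Sensitive to any difference in distribution shape.
    "Kolmogorov-Smirnov (two-sample)": stats.ks_2samp(group_a, group_b),
}

for name, res in results.items():
    print(f"{name}: statistic={res.statistic:.3f}, p={res.pvalue:.4f}")
```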

1. Applications in Finance and Trading

The Mann-Whitney U test can be applied in various finance and trading contexts:

**Comparing Strategy Performance:** As illustrated in the example, it can compare the returns of different trading strategies.
**Analyzing Expert Opinions:** Comparing the ratings or rankings of stocks or assets by different analysts.
**Evaluating Risk Tolerance:** Comparing the risk preferences of different investor groups.
**Assessing Market Sentiment:** Comparing the sentiment scores (e.g., based on news articles or social media) before and after a significant market event.
**Backtesting Trading Rules:** Determining if a particular trading rule consistently outperforms a benchmark. Monte Carlo Simulation can be used to validate backtesting results.
**Comparing Volatility:** Assessing if the volatility of two assets is significantly different. Bollinger Bands are a common tool for volatility analysis.
**Analyzing Order Book Data:** Comparing the distribution of order sizes or price levels. Limit Order Books are crucial for understanding market depth.
**Evaluating the Effectiveness of Educational Programs:** Determining if a training program improves traders' performance. Moving Averages can be used to smooth out performance data.
**Comparing the Performance of Different Brokers:** Assessing if there is a significant difference in the execution quality of different brokers.
**Identifying Statistical Arbitrage Opportunities:** Detecting discrepancies in the price distributions of related assets. Pairs Trading is a common application.
**Seasonality Analysis:** Determining if certain months or days of the week consistently produce higher or lower returns. Seasonal Patterns are important for long-term investment strategies.
**Sector Rotation Analysis:** Comparing the performance of different sectors of the stock market. Economic Indicators often drive sector rotation.
**Momentum Investing:** Identifying stocks with consistently high relative strength. Relative Strength Index (RSI) is a popular momentum indicator.
**Mean Reversion Strategies:** Identifying assets that are likely to revert to their historical average price. Oscillators are often used in mean reversion strategies.
**Trend Following Strategies:** Identifying and capitalizing on established trends. MACD (Moving Average Convergence Divergence) is a common trend-following indicator.
**Gap Analysis:** Comparing the distribution of price gaps between different assets. Chart Patterns can help identify potential gap trading opportunities.
**Support and Resistance Levels:** Assessing the strength of support and resistance levels based on price distribution. Pivot Points are used to identify potential support and resistance levels.
**Elliott Wave Theory:** Analyzing the distribution of wave patterns to predict future price movements. Wave Analysis is a complex but potentially rewarding technique.
**Ichimoku Cloud Analysis:** Comparing the position of price relative to the Ichimoku Cloud to identify potential trading signals. Ichimoku Kinko Hyo is a comprehensive technical analysis system.
**Harmonic Patterns:** Identifying specific geometric patterns in price charts to predict future price movements. Butterfly Pattern is a popular harmonic pattern.
**Wyckoff Method:** Analyzing price and volume action to identify accumulation and distribution phases. Volume Spread Analysis is a key component of the Wyckoff Method.
**Renko Charts:** Comparing the distribution of Renko bricks to identify trends and reversals. Renko Charts filter out noise and focus on price movements.
**Heikin Ashi Charts:** Comparing the distribution of Heikin Ashi candles to identify trends and reversals. Heikin Ashi smooths out price data and highlights trends.

Data Mining techniques can also be combined with Mann-Whitney U tests to uncover hidden patterns in financial data.

1. Software Packages

Many software packages can perform the Mann-Whitney U test:

**R:** `wilcox.test()` function
**Python (SciPy):** `scipy.stats.mannwhitneyu()` function
**SPSS:** Nonparametric Tests -> Independent Samples
**Excel:** While not directly available, it can be calculated using formulas and statistical add-ins.
**MATLAB:** `ranksum()` function

Statistical Software is essential for accurate and efficient data analysis.
