BUG: Boxplot does not apply colors set by Matplotlib rcParams for certain plot elements #57709

@thetestspecimen

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    df.plot.box(patch_artist=True) # OR df.plot(kind='box', patch_artist=True)
    plt.show()

Issue Description

Running the 'Reproducible Example' code produces the following:

[image: pandas-example]

If run directly through Matplotlib like so:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    plt.boxplot(df, patch_artist=True)
    plt.show()

You end up with this:

[image: matplotlib-example]

As you can see, Pandas completely ignores the rcParams assignment and sets its own colours.

This example includes only the elements whose rcParams are ignored (box, whiskers, caps, medians and box face). Because the rcParams are ignored, Matplotlib stylesheets that set colours for these elements are ignored as well.

As Pandas does this only for these specific elements in the boxplot, it can result in some terrible-looking plots if someone uses a comprehensive set of rcParams (or a stylesheet) with a significantly different set of colours.
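Because the screenshots may not render everywhere, the comparison can also be made programmatically. The following is a minimal sketch, assuming the non-`patch_artist` case so that every element is a `Line2D` with `get_color()`; the helper name `line_colors` is hypothetical and not part of pandas or Matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex
import numpy as np


def line_colors(bp):
    """Return the hex colour of the first artist of each line element.

    `bp` is the dict of artists returned by a boxplot call. Hypothetical
    helper for this report only.
    """
    return {
        key: to_hex(bp[key][0].get_color())
        for key in ("boxes", "whiskers", "caps", "medians")
    }


data = np.random.default_rng(12).random((10, 2))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan'}):
    fig, ax = plt.subplots()
    mpl_colors = line_colors(ax.boxplot(data))  # plain Matplotlib path
    plt.close(fig)

print(mpl_colors)
```

The same helper can be applied to the dict returned by `df.plot.box(return_type="dict")` to see which colours Pandas ends up applying on a given version.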

A solution?

I have looked into where this occurs, and all the relevant code resides in:

https://github.com/pandas-dev/pandas/blob/main/pandas/plotting/_matplotlib/boxplot.py

Specifically, it is in the methods _get_colors() and _color_attrs(self). These two methods (among other bits of linked code) pick specific colours from the assigned colormap and apply them to the plot.

I know what needs adjusting, and could put in a PR. However, due to the nature of rcParams being the "default" and hence having the lowest priority in terms of application, I see no way to adjust the code without changing the current default colours (i.e. blue, and a green median taken from the "tab10" colormap).

That is why I am filing this 'bug', as I can see this change might be objectionable, and as such will require further discussion on the appropriate solution. The solution I am proposing, of using matplotlib rcParam defaults, would result in the following "default" plot:

[image: matplotlib-default]

My personal opinion is that this visual change is minor, and therefore should be implemented. I would also argue that accessibility is hindered by the current implementation (colour blindness being an example).
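The priority argument above can be sketched in plain Python. This is only an illustration of the proposed lookup order, not the code in boxplot.py: the function `resolve_colors` and the `RC_DEFAULTS` dict are hypothetical, and the values in `RC_DEFAULTS` are assumed to mirror Matplotlib's shipped boxplot defaults rather than being read from Matplotlib.

```python
# Hypothetical stand-in for Matplotlib's default boxplot rcParams
# (assumed values, not read from matplotlib at runtime).
RC_DEFAULTS = {
    "boxes": "black",
    "whiskers": "black",
    "caps": "black",
    "medians": "C1",
}


def resolve_colors(explicit=None, rc_overrides=None):
    """Explicit `color` entries win; everything else falls back to rcParams.

    `rc_overrides` plays the role of a user's rc_context or stylesheet.
    """
    rc = {**RC_DEFAULTS, **(rc_overrides or {})}
    explicit = explicit or {}
    return {key: explicit.get(key, rc[key]) for key in rc}


# Nothing set: the result is the rc default, not a pandas-specific colour.
print(resolve_colors())
# A user rcParam is respected.
print(resolve_colors(rc_overrides={"medians": "cyan"}))
# An explicit argument still has the highest priority.
print(resolve_colors(explicit={"medians": "blue"},
                     rc_overrides={"medians": "cyan"}))
```

With this order there is no slot left for a separate Pandas default: when the user sets nothing, the rc default is what gets used, which is why the proposal changes the current default look.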

Items to note

While reviewing the code I noticed the following:

  1. "BUG: Min/max markers on box plot are not visible with 'dark_background' theme" #40769 is not completely solved: it was only fixed for the method plot.box and not for boxplot (the two methods use different code within boxplot.py). See line 376 of boxplot.py, where the method boxplot hardcodes black for the caps: result = np.append(result, "k")
  2. The section of code refactored by "color attribute of medianprops is not correctly understand in a boxplot" #30346 does not distinguish between edgecolor and facecolor when patch_artist is set to True. This may or may not have been intentional, but the two should probably be separated out, as this is the only reason patch.facecolor features in this bug report.

Expected Behavior

If colours are set in matplotlib rcParams (or stylesheets) by the user, they should be applied to the plot, not ignored.
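Until that behaviour exists, one possible workaround is to read the values from rcParams and pass them through the existing `color` dict argument of `df.plot.box`. This is a sketch of a workaround under that assumption, not the fix discussed above: the helper name `rc_box_colors` is hypothetical, and it covers the line colours only, not `patch.facecolor`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import numpy as np
import pandas as pd


def rc_box_colors():
    """Collect the current boxplot line colours from rcParams.

    Hypothetical helper; the keys match those accepted by the pandas
    `color` dict (boxes, whiskers, medians, caps).
    """
    return {
        "boxes": mpl.rcParams["boxplot.boxprops.color"],
        "whiskers": mpl.rcParams["boxplot.whiskerprops.color"],
        "medians": mpl.rcParams["boxplot.medianprops.color"],
        "caps": mpl.rcParams["boxplot.capprops.color"],
    }


df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan'}):
    # Passing the rc values explicitly forces pandas to use them.
    bp = df.plot.box(color=rc_box_colors(), return_type="dict")
    whisker_color = to_rgba(bp["whiskers"][0].get_color())
    median_color = to_rgba(bp["medians"][0].get_color())
    plt.close("all")
```

The drawback is that every call site has to opt in, which is exactly what rcParams and stylesheets are meant to avoid.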

Installed Versions

commit : 69f03a3
python : 3.10.13.final.0
python-bits : 64
OS : Linux
OS-release : 6.7.4-2-MANJARO
Version : #1 SMP PREEMPT_DYNAMIC Sat Feb 10 09:41:20 UTC 2024
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8

pandas : 3.0.0.dev0+448.g69f03a39ec
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 69.1.1
pip : 24.0
Cython : 3.0.8
pytest : 8.0.2
hypothesis : 6.98.15
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.9
jinja2 : 3.1.3
IPython : 8.22.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.3.8
fastparquet : 2024.2.0
fsspec : 2024.2.0
gcsfs : 2024.2.0
matplotlib : 3.8.3
numba : 0.59.0
numexpr : 2.9.0
odfpy : None
openpyxl : 3.1.2
pyarrow : 15.0.0
pyreadstat : 1.2.6
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2024.2.0
scipy : 1.12.0
sqlalchemy : 2.0.27
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.2.0
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2024.1
qtpy : None
pyqt5 : None

Labels: Bug, Needs Triage, Visualization
