Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    df.plot.box(patch_artist=True)  # OR df.plot(kind='box', patch_artist=True)
plt.show()
Issue Description
If the 'Reproducible Example' code is run it will result in the following:
If run directly through Matplotlib like so:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib as mpl
df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
with mpl.rc_context({'boxplot.boxprops.color': 'red',
                     'boxplot.whiskerprops.color': 'green',
                     'boxplot.capprops.color': 'orange',
                     'boxplot.medianprops.color': 'cyan',
                     'patch.facecolor': 'grey'}):
    plt.boxplot(df, patch_artist=True)
plt.show()
You end up with this:
As you can see, Pandas completely ignores the rcParams assignment and sets its own colours.
I have only included in this example the exact elements (box, whiskers, caps, medians and box-face) that are ignored. It should also be noted that as the rcParams are ignored, Matplotlib stylesheets are also ignored if applied to these elements.
As Pandas does this only for these specific elements in the boxplot, it can result in some terrible looking plots if someone uses a comprehensive set of rcParams (or stylesheet) that have a significantly different set of colours.
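The visual comparison above can also be checked programmatically. The following is a minimal sketch, not part of pandas: the helper name `rc_colours_respected` and the dictionaries `RC` and `ARTIST_TO_RC_KEY` are hypothetical, and it inspects only the line artists (so `patch_artist` and `patch.facecolor` are left out). It assumes `df.plot.box` accepts `return_type="dict"` and returns the Matplotlib artists grouped by element.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.colors import to_rgba

# Same rcParams as in the reproducible example (line elements only).
RC = {
    "boxplot.boxprops.color": "red",
    "boxplot.whiskerprops.color": "green",
    "boxplot.capprops.color": "orange",
    "boxplot.medianprops.color": "cyan",
}

# Hypothetical mapping from the artist groups to the rcParams key
# each group would be expected to follow.
ARTIST_TO_RC_KEY = {
    "boxes": "boxplot.boxprops.color",
    "whiskers": "boxplot.whiskerprops.color",
    "caps": "boxplot.capprops.color",
    "medians": "boxplot.medianprops.color",
}


def rc_colours_respected(df):
    """Return {artist group: True if every artist matches its rcParams colour}."""
    with mpl.rc_context(RC):
        artists = df.plot.box(return_type="dict")
        result = {}
        for group, rc_key in ARTIST_TO_RC_KEY.items():
            expected = to_rgba(mpl.rcParams[rc_key])
            result[group] = all(
                to_rgba(line.get_color()) == expected for line in artists[group]
            )
        plt.close("all")
    return result


df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))
print(rc_colours_respected(df))
```

If the behaviour described in this report is present, the groups should be reported as not matching; if pandas followed rcParams, they should all match.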
A solution?
I have looked into where this occurs, and all the relevant code resides in:
Specifically, the methods _get_colors() and _color_attrs(self). These two methods (among other bits of linked code) basically pick specific colours from the assigned colormap and apply them to the plot.
I know what needs adjusting, and could put in a PR. However, due to the nature of rcParams being the "default" and hence having the lowest priority in terms of application, I see no way to adjust the code without changing the current default colours (i.e. blue, and a green median taken from the "tab10" colormap).
That is why I am filing this 'bug', as I can see this change might be objectionable, and as such will require further discussion on the appropriate solution. The solution I am proposing, of using matplotlib rcParam defaults, would result in the following "default" plot:
My personal opinion is that this visual change is minor, and therefore should be implemented. I would also argue that accessibility is hindered by the current implementation (colour blindness being an example).
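The priority problem mentioned above can be illustrated with a small sketch. This is not pandas code: `resolve_colour` and `STAND_IN_RC` are hypothetical names, and the dictionary is a plain stand-in for matplotlib's rcParams rather than the real object. The point is that rcParams always hold a value, so once they are used as the fallback there is no remaining slot for a separate pandas-specific default.

```python
# Stand-in for matplotlib's rcParams (assumption: a plain dict is enough
# to show the lookup order; the values mirror matplotlib's shipped defaults).
STAND_IN_RC = {
    "boxplot.boxprops.color": "black",
    "boxplot.medianprops.color": "C1",
}


def resolve_colour(explicit, rc_params, rc_key):
    """Pick a colour: an explicit user argument wins, otherwise rcParams.

    Because rc_params always contains rc_key, a library-specific default
    (such as a colour taken from a colormap) can never be reached here.
    """
    if explicit is not None:
        return explicit
    return rc_params[rc_key]


# An explicit colour overrides everything.
print(resolve_colour("purple", STAND_IN_RC, "boxplot.boxprops.color"))
# With no explicit colour, the rcParams value is used.
print(resolve_colour(None, STAND_IN_RC, "boxplot.boxprops.color"))
```

Under this lookup order, keeping the current blue/green defaults would require pandas to place its own colours above rcParams, which is the behaviour this report describes as the bug.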
Items to note
While reviewing the code I noticed the following:
- BUG: Min/max markers on box plot are not visible with 'dark_background' theme #40769 is not completely solved, as it was only fixed for the method plot.box and not boxplot (the two methods use different code within boxplot.py). See line 376 of boxplot.py for the hardcoded black value for the caps using the method boxplot:
  result = np.append(result, "k")
- The section of code refactored by color attribute of medianprops is not correctly understand in a boxplot #30346 does not distinguish between edgecolor and facecolor when patch_artist is set to True. This may or may not have been intentional, but should probably be separated out, as it is the only reason patch.facecolor features in this current bug report.
Expected Behavior
If colours are set in matplotlib rcParams (or stylesheets) by the user, they should be applied to the plot, not ignored.
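Until the behaviour changes, one possible workaround is to read the values from rcParams and pass them explicitly through the `color` argument, which accepts a dict with the keys `boxes`, `whiskers`, `medians` and `caps`. The helper name `rc_box_colours` below is hypothetical, and this sketch does not address `patch.facecolor`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def rc_box_colours():
    """Build a pandas-style colour dict from the current boxplot rcParams."""
    return {
        "boxes": mpl.rcParams["boxplot.boxprops.color"],
        "whiskers": mpl.rcParams["boxplot.whiskerprops.color"],
        "medians": mpl.rcParams["boxplot.medianprops.color"],
        "caps": mpl.rcParams["boxplot.capprops.color"],
    }


df = pd.DataFrame(np.random.default_rng(12).random((10, 2)))

with mpl.rc_context({"boxplot.boxprops.color": "red",
                     "boxplot.whiskerprops.color": "green",
                     "boxplot.capprops.color": "orange",
                     "boxplot.medianprops.color": "cyan"}):
    colours = rc_box_colours()  # read while the rc_context is active
    ax = df.plot.box(color=colours)

plt.close("all")
```

This keeps the rcParams (or a stylesheet) as the single source of truth, at the cost of having to forward the colours on every call.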
Installed Versions
commit : 69f03a3
python : 3.10.13.final.0
python-bits : 64
OS : Linux
OS-release : 6.7.4-2-MANJARO
Version : #1 SMP PREEMPT_DYNAMIC Sat Feb 10 09:41:20 UTC 2024
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_GB.UTF-8
LOCALE : en_GB.UTF-8
pandas : 3.0.0.dev0+448.g69f03a39ec
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0
setuptools : 69.1.1
pip : 24.0
Cython : 3.0.8
pytest : 8.0.2
hypothesis : 6.98.15
sphinx : 7.2.6
blosc : None
feather : None
xlsxwriter : 3.1.9
lxml.etree : 5.1.0
html5lib : 1.1
pymysql : 1.4.6
psycopg2 : 2.9.9
jinja2 : 3.1.3
IPython : 8.22.1
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : 1.3.8
fastparquet : 2024.2.0
fsspec : 2024.2.0
gcsfs : 2024.2.0
matplotlib : 3.8.3
numba : 0.59.0
numexpr : 2.9.0
odfpy : None
openpyxl : 3.1.2
pyarrow : 15.0.0
pyreadstat : 1.2.6
python-calamine : None
pyxlsb : 1.0.10
s3fs : 2024.2.0
scipy : 1.12.0
sqlalchemy : 2.0.27
tables : 3.9.2
tabulate : 0.9.0
xarray : 2024.2.0
xlrd : 2.0.1
zstandard : 0.22.0
tzdata : 2024.1
qtpy : None
pyqt5 : None