]` where the columns are
prioritised, to limit ``data`` to *before* applying the function.
na_rep : str, optional
Representation for missing values.
If ``na_rep`` is None, no special formatting is applied.
precision : int, optional
Floating point precision to use for display purposes, if not determined by
the specified ``formatter``.
.. versionadded:: 1.3.0
decimal : str, default "."
Character used as decimal separator for floats, complex and integers.
.. versionadded:: 1.3.0
thousands : str, optional, default None
Character used as thousands separator for floats, complex and integers.
.. versionadded:: 1.3.0
escape : str, optional
Use 'html' to replace the characters ``&``, ``<``, ``>``, ``'``, and ``"``
in cell display string with HTML-safe sequences.
Use 'latex' to replace the characters ``&``, ``%``, ``$``, ``#``, ``_``,
``{``, ``}``, ``~``, ``^``, and ``\`` in the cell display string with
LaTeX-safe sequences.
Use 'latex-math' to replace the characters the same way as in 'latex' mode,
except for math substrings, which either are surrounded
by two characters ``$`` or start with the character ``\(`` and
end with ``\)``. Escaping is done before ``formatter``.
.. versionadded:: 1.3.0
hyperlinks : {"html", "latex"}, optional
Convert string patterns containing https://, http://, ftp:// or www. to
HTML tags as clickable URL hyperlinks if "html", or LaTeX \href
commands if "latex".
.. versionadded:: 1.4.0
Returns
-------
Styler
See Also
--------
Styler.format_index: Format the text display value of index labels.
Notes
-----
This method assigns a formatting function, ``formatter``, to each cell in the
DataFrame. If ``formatter`` is ``None``, then the default formatter is used.
If a callable then that function should take a data value as input and return
a displayable representation, such as a string. If ``formatter`` is
given as a string this is assumed to be a valid Python format specification
and is wrapped to a callable as ``string.format(x)``. If a ``dict`` is given,
keys should correspond to column names, and values should be string or
callable, as above.
The default formatter currently expresses floats and complex numbers with the
pandas display precision unless using the ``precision`` argument here. The
default formatter does not adjust the representation of missing values unless
the ``na_rep`` argument is used.
The ``subset`` argument defines which region to apply the formatting function
to. If the ``formatter`` argument is given in dict form but does not include
all columns within the subset then these columns will have the default formatter
applied. Any columns in the formatter dict excluded from the subset will
be ignored.
When using a ``formatter`` string the dtypes must be compatible, otherwise a
`ValueError` will be raised.
When instantiating a Styler, default formatting can be applied be setting the
``pandas.options``:
- ``styler.format.formatter``: default None.
- ``styler.format.na_rep``: default None.
- ``styler.format.precision``: default 6.
- ``styler.format.decimal``: default ".".
- ``styler.format.thousands``: default None.
- ``styler.format.escape``: default None.
.. warning::
`Styler.format` is ignored when using the output format `Styler.to_excel`,
since Excel and Python have inherrently different formatting structures.
However, it is possible to use the `number-format` pseudo CSS attribute
to force Excel permissible formatting. See examples.
Examples
--------
Using ``na_rep`` and ``precision`` with the default ``formatter``
>>> df = pd.DataFrame([[np.nan, 1.0, 'A'], [2.0, np.nan, 3.0]])
>>> df.style.format(na_rep='MISS', precision=3) # doctest: +SKIP
0 1 2
0 MISS 1.000 A
1 2.000 MISS 3.000
Using a ``formatter`` specification on consistent column dtypes
>>> df.style.format('{:.2f}', na_rep='MISS', subset=[0,1]) # doctest: +SKIP
0 1 2
0 MISS 1.00 A
1 2.00 MISS 3.000000
Using the default ``formatter`` for unspecified columns
>>> df.style.format({0: '{:.2f}', 1: '£ {:.1f}'}, na_rep='MISS', precision=1)
... # doctest: +SKIP
0 1 2
0 MISS £ 1.0 A
1 2.00 MISS 3.0
Multiple ``na_rep`` or ``precision`` specifications under the default
``formatter``.
>>> (df.style.format(na_rep='MISS', precision=1, subset=[0])
... .format(na_rep='PASS', precision=2, subset=[1, 2])) # doctest: +SKIP
0 1 2
0 MISS 1.00 A
1 2.0 PASS 3.00
Using a callable ``formatter`` function.
>>> func = lambda s: 'STRING' if isinstance(s, str) else 'FLOAT'
>>> df.style.format({0: '{:.1f}', 2: func}, precision=4, na_rep='MISS')
... # doctest: +SKIP
0 1 2
0 MISS 1.0000 STRING
1 2.0 MISS FLOAT
Using a ``formatter`` with HTML ``escape`` and ``na_rep``.
>>> df = pd.DataFrame([['', '"A&B"', None]])
>>> s = df.style.format(
... '{0}', escape="html", na_rep="NA"
... )
>>> s.to_html() # doctest: +SKIP
...
<div></div> |
"A&B" |
NA |
...
Using a ``formatter`` with ``escape`` in 'latex' mode.
>>> df = pd.DataFrame([["123"], ["~ ^"], ["$%#"]])
>>> df.style.format("\\textbf{{{}}}", escape="latex").to_latex()
... # doctest: +SKIP
\begin{tabular}{ll}
& 0 \\
0 & \textbf{123} \\
1 & \textbf{\textasciitilde \space \textasciicircum } \\
2 & \textbf{\$\%\#} \\
\end{tabular}
Applying ``escape`` in 'latex-math' mode. In the example below
we enter math mode using the character ``$``.
>>> df = pd.DataFrame([[r"$\sum_{i=1}^{10} a_i$ a~b $\alpha \
... = \frac{\beta}{\zeta^2}$"], ["%#^ $ \$x^2 $"]])
>>> df.style.format(escape="latex-math").to_latex()
... # doctest: +SKIP
\begin{tabular}{ll}
& 0 \\
0 & $\sum_{i=1}^{10} a_i$ a\textasciitilde b $\alpha = \frac{\beta}{\zeta^2}$ \\
1 & \%\#\textasciicircum \space $ \$x^2 $ \\
\end{tabular}
We can use the character ``\(`` to enter math mode and the character ``\)``
to close math mode.
>>> df = pd.DataFrame([[r"\(\sum_{i=1}^{10} a_i\) a~b \(\alpha \
... = \frac{\beta}{\zeta^2}\)"], ["%#^ \( \$x^2 \)"]])
>>> df.style.format(escape="latex-math").to_latex()
... # doctest: +SKIP
\begin{tabular}{ll}
& 0 \\
0 & \(\sum_{i=1}^{10} a_i\) a\textasciitilde b \(\alpha
= \frac{\beta}{\zeta^2}\) \\
1 & \%\#\textasciicircum \space \( \$x^2 \) \\
\end{tabular}
If we have in one DataFrame cell a combination of both shorthands
for math formulas, the shorthand with the sign ``$`` will be applied.
>>> df = pd.DataFrame([[r"\( x^2 \) $x^2$"], \
... [r"$\frac{\beta}{\zeta}$ \(\frac{\beta}{\zeta}\)"]])
>>> df.style.format(escape="latex-math").to_latex()
... # doctest: +SKIP
\begin{tabular}{ll}
& 0 \\
0 & \textbackslash ( x\textasciicircum 2 \textbackslash ) $x^2$ \\
1 & $\frac{\beta}{\zeta}$ \textbackslash (\textbackslash
frac\{\textbackslash beta\}\{\textbackslash zeta\}\textbackslash ) \\
\end{tabular}
Pandas defines a `number-format` pseudo CSS attribute instead of the `.format`
method to create `to_excel` permissible formatting. Note that semi-colons are
CSS protected characters but used as separators in Excel's format string.
Replace semi-colons with the section separator character (ASCII-245) when
defining the formatting here.
>>> df = pd.DataFrame({"A": [1, 0, -1]})
>>> pseudo_css = "number-format: 0§[Red](0)§-§@;"
>>> filename = "formatted_file.xlsx"
>>> df.style.map(lambda v: pseudo_css).to_excel(filename) # doctest: +SKIP
.. figure:: ../../_static/style/format_excel_css.png
"""
if all(
(
formatter is None,
subset is None,
precision is None,
decimal == ".",
thousands is None,
na_rep is None,
escape is None,
hyperlinks is None,
)
):
self._display_funcs.clear()
return self # clear the formatter / revert to default and avoid looping
subset = slice(None) if subset is None else subset
subset = non_reducing_slice(subset)
data = self.data.loc[subset]
if not isinstance(formatter, dict):
formatter = {col: formatter for col in data.columns}
cis = self.columns.get_indexer_for(data.columns)
ris = self.index.get_indexer_for(data.index)
for ci in cis:
format_func = _maybe_wrap_formatter(
formatter.get(self.columns[ci]),
na_rep=na_rep,
precision=precision,
decimal=decimal,
thousands=thousands,
escape=escape,
hyperlinks=hyperlinks,
)
for ri in ris:
self._display_funcs[(ri, ci)] = format_func
return self
def format_index(
self,
formatter: ExtFormatter | None = None,
axis: Axis = 0,
level: Level | list[Level] | None = None,
na_rep: str | None = None,
precision: int | None = None,
decimal: str = ".",
thousands: str | None = None,
escape: str | None = None,
hyperlinks: str | None = None,
) -> StylerRenderer:
r"""
Format the text display value of index labels or column headers.
.. versionadded:: 1.4.0
Parameters
----------
formatter : str, callable, dict or None
Object to define how values are displayed. See notes.
axis : {0, "index", 1, "columns"}
Whether to apply the formatter to the index or column headers.
level : int, str, list
The level(s) over which to apply the generic formatter.
na_rep : str, optional
Representation for missing values.
If ``na_rep`` is None, no special formatting is applied.
precision : int, optional
Floating point precision to use for display purposes, if not determined by
the specified ``formatter``.
decimal : str, default "."
Character used as decimal separator for floats, complex and integers.
thousands : str, optional, default None
Character used as thousands separator for floats, complex and integers.
escape : str, optional
Use 'html' to replace the characters ``&``, ``<``, ``>``, ``'``, and ``"``
in cell display string with HTML-safe sequences.
Use 'latex' to replace the characters ``&``, ``%``, ``$``, ``#``, ``_``,
``{``, ``}``, ``~``, ``^``, and ``\`` in the cell display string with
LaTeX-safe sequences.
Escaping is done before ``formatter``.
hyperlinks : {"html", "latex"}, optional
Convert string patterns containing https://, http://, ftp:// or www. to
HTML tags as clickable URL hyperlinks if "html", or LaTeX \href
commands if "latex".
Returns
-------
Styler
See Also
--------
Styler.format: Format the text display value of data cells.
Notes
-----
This method assigns a formatting function, ``formatter``, to each level label
in the DataFrame's index or column headers. If ``formatter`` is ``None``,
then the default formatter is used.
If a callable then that function should take a label value as input and return
a displayable representation, such as a string. If ``formatter`` is
given as a string this is assumed to be a valid Python format specification
and is wrapped to a callable as ``string.format(x)``. If a ``dict`` is given,
keys should correspond to MultiIndex level numbers or names, and values should
be string or callable, as above.
The default formatter currently expresses floats and complex numbers with the
pandas display precision unless using the ``precision`` argument here. The
default formatter does not adjust the representation of missing values unless
the ``na_rep`` argument is used.
The ``level`` argument defines which levels of a MultiIndex to apply the
method to. If the ``formatter`` argument is given in dict form but does
not include all levels within the level argument then these unspecified levels
will have the default formatter applied. Any levels in the formatter dict
specifically excluded from the level argument will be ignored.
When using a ``formatter`` string the dtypes must be compatible, otherwise a
`ValueError` will be raised.
.. warning::
`Styler.format_index` is ignored when using the output format
`Styler.to_excel`, since Excel and Python have inherrently different
formatting structures.
However, it is possible to use the `number-format` pseudo CSS attribute
to force Excel permissible formatting. See documentation for `Styler.format`.
Examples
--------
Using ``na_rep`` and ``precision`` with the default ``formatter``
>>> df = pd.DataFrame([[1, 2, 3]], columns=[2.0, np.nan, 4.0])
>>> df.style.format_index(axis=1, na_rep='MISS', precision=3) # doctest: +SKIP
2.000 MISS 4.000
0 1 2 3
Using a ``formatter`` specification on consistent dtypes in a level
>>> df.style.format_index('{:.2f}', axis=1, na_rep='MISS') # doctest: +SKIP
2.00 MISS 4.00
0 1 2 3
Using the default ``formatter`` for unspecified levels
>>> df = pd.DataFrame([[1, 2, 3]],
... columns=pd.MultiIndex.from_arrays([["a", "a", "b"],[2, np.nan, 4]]))
>>> df.style.format_index({0: lambda v: v.upper()}, axis=1, precision=1)
... # doctest: +SKIP
A B
2.0 nan 4.0
0 1 2 3
Using a callable ``formatter`` function.
>>> func = lambda s: 'STRING' if isinstance(s, str) else 'FLOAT'
>>> df.style.format_index(func, axis=1, na_rep='MISS')
... # doctest: +SKIP
STRING STRING
FLOAT MISS FLOAT
0 1 2 3
Using a ``formatter`` with HTML ``escape`` and ``na_rep``.
>>> df = pd.DataFrame([[1, 2, 3]], columns=['"A"', 'A&B', None])
>>> s = df.style.format_index('$ {0}', axis=1, escape="html", na_rep="NA")
... # doctest: +SKIP
$ "A" |
$ A&B |
NA
...
Using a ``formatter`` with LaTeX ``escape``.
>>> df = pd.DataFrame([[1, 2, 3]], columns=["123", "~", "$%#"])
>>> df.style.format_index("\\textbf{{{}}}", escape="latex", axis=1).to_latex()
... # doctest: +SKIP
\begin{tabular}{lrrr}
{} & {\textbf{123}} & {\textbf{\textasciitilde }} & {\textbf{\$\%\#}} \\
0 & 1 & 2 & 3 \\
\end{tabular}
"""
axis = self.data._get_axis_number(axis)
if axis == 0:
display_funcs_, obj = self._display_funcs_index, self.index
else:
display_funcs_, obj = self._display_funcs_columns, self.columns
levels_ = refactor_levels(level, obj)
if all(
(
formatter is None,
level is None,
precision is None,
decimal == ".",
thousands is None,
na_rep is None,
escape is None,
hyperlinks is None,
)
):
display_funcs_.clear()
return self # clear the formatter / revert to default and avoid looping
if not isinstance(formatter, dict):
formatter = {level: formatter for level in levels_}
else:
formatter = {
obj._get_level_number(level): formatter_
for level, formatter_ in formatter.items()
}
for lvl in levels_:
format_func = _maybe_wrap_formatter(
formatter.get(lvl),
na_rep=na_rep,
precision=precision,
decimal=decimal,
thousands=thousands,
escape=escape,
hyperlinks=hyperlinks,
)
for idx in [(i, lvl) if axis == 0 else (lvl, i) for i in range(len(obj))]:
display_funcs_[idx] = format_func
return self
def relabel_index(
self,
labels: Sequence | Index,
axis: Axis = 0,
level: Level | list[Level] | None = None,
) -> StylerRenderer:
r"""
Relabel the index, or column header, keys to display a set of specified values.
.. versionadded:: 1.5.0
Parameters
----------
labels : list-like or Index
New labels to display. Must have same length as the underlying values not
hidden.
axis : {"index", 0, "columns", 1}
Apply to the index or columns.
level : int, str, list, optional
The level(s) over which to apply the new labels. If `None` will apply
to all levels of an Index or MultiIndex which are not hidden.
Returns
-------
Styler
See Also
--------
Styler.format_index: Format the text display value of index or column headers.
Styler.hide: Hide the index, column headers, or specified data from display.
Notes
-----
As part of Styler, this method allows the display of an index to be
completely user-specified without affecting the underlying DataFrame data,
index, or column headers. This means that the flexibility of indexing is
maintained whilst the final display is customisable.
Since Styler is designed to be progressively constructed with method chaining,
this method is adapted to react to the **currently specified hidden elements**.
This is useful because it means one does not have to specify all the new
labels if the majority of an index, or column headers, have already been hidden.
The following produce equivalent display (note the length of ``labels`` in
each case).
.. code-block:: python
# relabel first, then hide
df = pd.DataFrame({"col": ["a", "b", "c"]})
df.style.relabel_index(["A", "B", "C"]).hide([0,1])
# hide first, then relabel
df = pd.DataFrame({"col": ["a", "b", "c"]})
df.style.hide([0,1]).relabel_index(["C"])
This method should be used, rather than :meth:`Styler.format_index`, in one of
the following cases (see examples):
- A specified set of labels are required which are not a function of the
underlying index keys.
- The function of the underlying index keys requires a counter variable,
such as those available upon enumeration.
Examples
--------
Basic use
>>> df = pd.DataFrame({"col": ["a", "b", "c"]})
>>> df.style.relabel_index(["A", "B", "C"]) # doctest: +SKIP
col
A a
B b
C c
Chaining with pre-hidden elements
>>> df.style.hide([0,1]).relabel_index(["C"]) # doctest: +SKIP
col
C c
Using a MultiIndex
>>> midx = pd.MultiIndex.from_product([[0, 1], [0, 1], [0, 1]])
>>> df = pd.DataFrame({"col": list(range(8))}, index=midx)
>>> styler = df.style # doctest: +SKIP
col
0 0 0 0
1 1
1 0 2
1 3
1 0 0 4
1 5
1 0 6
1 7
>>> styler.hide((midx.get_level_values(0)==0)|(midx.get_level_values(1)==0))
... # doctest: +SKIP
>>> styler.hide(level=[0,1]) # doctest: +SKIP
>>> styler.relabel_index(["binary6", "binary7"]) # doctest: +SKIP
col
binary6 6
binary7 7
We can also achieve the above by indexing first and then re-labeling
>>> styler = df.loc[[(1,1,0), (1,1,1)]].style
>>> styler.hide(level=[0,1]).relabel_index(["binary6", "binary7"])
... # doctest: +SKIP
col
binary6 6
binary7 7
Defining a formatting function which uses an enumeration counter. Also note
that the value of the index key is passed in the case of string labels so it
can also be inserted into the label, using curly brackets (or double curly
brackets if the string if pre-formatted),
>>> df = pd.DataFrame({"samples": np.random.rand(10)})
>>> styler = df.loc[np.random.randint(0,10,3)].style
>>> styler.relabel_index([f"sample{i+1} ({{}})" for i in range(3)])
... # doctest: +SKIP
samples
sample1 (5) 0.315811
sample2 (0) 0.495941
sample3 (2) 0.067946
"""
axis = self.data._get_axis_number(axis)
if axis == 0:
display_funcs_, obj = self._display_funcs_index, self.index
hidden_labels, hidden_lvls = self.hidden_rows, self.hide_index_
else:
display_funcs_, obj = self._display_funcs_columns, self.columns
hidden_labels, hidden_lvls = self.hidden_columns, self.hide_columns_
visible_len = len(obj) - len(set(hidden_labels))
if len(labels) != visible_len:
raise ValueError(
"``labels`` must be of length equal to the number of "
f"visible labels along ``axis`` ({visible_len})."
)
if level is None:
level = [i for i in range(obj.nlevels) if not hidden_lvls[i]]
levels_ = refactor_levels(level, obj)
def alias_(x, value):
if isinstance(value, str):
return value.format(x)
return value
for ai, i in enumerate([i for i in range(len(obj)) if i not in hidden_labels]):
if len(levels_) == 1:
idx = (i, levels_[0]) if axis == 0 else (levels_[0], i)
display_funcs_[idx] = partial(alias_, value=labels[ai])
else:
for aj, lvl in enumerate(levels_):
idx = (i, lvl) if axis == 0 else (lvl, i)
display_funcs_[idx] = partial(alias_, value=labels[ai][aj])
return self
def _element(
html_element: str,
html_class: str | None,
value: Any,
is_visible: bool,
**kwargs,
) -> dict:
"""
Template to return container with information for a | | or | element.
"""
if "display_value" not in kwargs:
kwargs["display_value"] = value
return {
"type": html_element,
"value": value,
"class": html_class,
"is_visible": is_visible,
**kwargs,
}
def _get_trimming_maximums(
rn,
cn,
max_elements,
max_rows=None,
max_cols=None,
scaling_factor: float = 0.8,
) -> tuple[int, int]:
"""
Recursively reduce the number of rows and columns to satisfy max elements.
Parameters
----------
rn, cn : int
The number of input rows / columns
max_elements : int
The number of allowable elements
max_rows, max_cols : int, optional
Directly specify an initial maximum rows or columns before compression.
scaling_factor : float
Factor at which to reduce the number of rows / columns to fit.
Returns
-------
rn, cn : tuple
New rn and cn values that satisfy the max_elements constraint
"""
def scale_down(rn, cn):
if cn >= rn:
return rn, int(cn * scaling_factor)
else:
return int(rn * scaling_factor), cn
if max_rows:
rn = max_rows if rn > max_rows else rn
if max_cols:
cn = max_cols if cn > max_cols else cn
while rn * cn > max_elements:
rn, cn = scale_down(rn, cn)
return rn, cn
def _get_level_lengths(
index: Index,
sparsify: bool,
max_index: int,
hidden_elements: Sequence[int] | None = None,
):
"""
Given an index, find the level length for each element.
Parameters
----------
index : Index
Index or columns to determine lengths of each element
sparsify : bool
Whether to hide or show each distinct element in a MultiIndex
max_index : int
The maximum number of elements to analyse along the index due to trimming
hidden_elements : sequence of int
Index positions of elements hidden from display in the index affecting
length
Returns
-------
Dict :
Result is a dictionary of (level, initial_position): span
"""
if isinstance(index, MultiIndex):
levels = index._format_multi(sparsify=lib.no_default, include_names=False)
else:
levels = index._format_flat(include_name=False)
if hidden_elements is None:
hidden_elements = []
lengths = {}
if not isinstance(index, MultiIndex):
for i, value in enumerate(levels):
if i not in hidden_elements:
lengths[(0, i)] = 1
return lengths
for i, lvl in enumerate(levels):
visible_row_count = 0 # used to break loop due to display trimming
for j, row in enumerate(lvl):
if visible_row_count > max_index:
break
if not sparsify:
# then lengths will always equal 1 since no aggregation.
if j not in hidden_elements:
lengths[(i, j)] = 1
visible_row_count += 1
elif (row is not lib.no_default) and (j not in hidden_elements):
# this element has not been sparsified so must be the start of section
last_label = j
lengths[(i, last_label)] = 1
visible_row_count += 1
elif row is not lib.no_default:
# even if the above is hidden, keep track of it in case length > 1 and
# later elements are visible
last_label = j
lengths[(i, last_label)] = 0
elif j not in hidden_elements:
# then element must be part of sparsified section and is visible
visible_row_count += 1
if visible_row_count > max_index:
break # do not add a length since the render trim limit reached
if lengths[(i, last_label)] == 0:
# if previous iteration was first-of-section but hidden then offset
last_label = j
lengths[(i, last_label)] = 1
else:
# else add to previous iteration
lengths[(i, last_label)] += 1
non_zero_lengths = {
element: length for element, length in lengths.items() if length >= 1
}
return non_zero_lengths
def _is_visible(idx_row, idx_col, lengths) -> bool:
"""
Index -> {(idx_row, idx_col): bool}).
"""
return (idx_col, idx_row) in lengths
def format_table_styles(styles: CSSStyles) -> CSSStyles:
"""
looks for multiple CSS selectors and separates them:
[{'selector': 'td, th', 'props': 'a:v;'}]
---> [{'selector': 'td', 'props': 'a:v;'},
{'selector': 'th', 'props': 'a:v;'}]
"""
return [
{"selector": selector, "props": css_dict["props"]}
for css_dict in styles
for selector in css_dict["selector"].split(",")
]
def _default_formatter(x: Any, precision: int, thousands: bool = False) -> Any:
"""
Format the display of a value
Parameters
----------
x : Any
Input variable to be formatted
precision : Int
Floating point precision used if ``x`` is float or complex.
thousands : bool, default False
Whether to group digits with thousands separated with ",".
Returns
-------
value : Any
Matches input type, or string if input is float or complex or int with sep.
"""
if is_float(x) or is_complex(x):
return f"{x:,.{precision}f}" if thousands else f"{x:.{precision}f}"
elif is_integer(x):
return f"{x:,}" if thousands else str(x)
return x
def _wrap_decimal_thousands(
formatter: Callable, decimal: str, thousands: str | None
) -> Callable:
"""
Takes a string formatting function and wraps logic to deal with thousands and
decimal parameters, in the case that they are non-standard and that the input
is a (float, complex, int).
"""
def wrapper(x):
if is_float(x) or is_integer(x) or is_complex(x):
if decimal != "." and thousands is not None and thousands != ",":
return (
formatter(x)
.replace(",", "§_§-") # rare string to avoid "," <-> "." clash.
.replace(".", decimal)
.replace("§_§-", thousands)
)
elif decimal != "." and (thousands is None or thousands == ","):
return formatter(x).replace(".", decimal)
elif decimal == "." and thousands is not None and thousands != ",":
return formatter(x).replace(",", thousands)
return formatter(x)
return wrapper
def _str_escape(x, escape):
"""if escaping: only use on str, else return input"""
if isinstance(x, str):
if escape == "html":
return escape_html(x)
elif escape == "latex":
return _escape_latex(x)
elif escape == "latex-math":
return _escape_latex_math(x)
else:
raise ValueError(
f"`escape` only permitted in {{'html', 'latex', 'latex-math'}}, \
got {escape}"
)
return x
def _render_href(x, format):
"""uses regex to detect a common URL pattern and converts to href tag in format."""
if isinstance(x, str):
if format == "html":
href = '{0}'
elif format == "latex":
href = r"\href{{{0}}}{{{0}}}"
else:
raise ValueError("``hyperlinks`` format can only be 'html' or 'latex'")
pat = r"((http|ftp)s?:\/\/|www.)[\w/\-?=%.:@]+\.[\w/\-&?=%.,':;~!@#$*()\[\]]+"
return re.sub(pat, lambda m: href.format(m.group(0)), x)
return x
def _maybe_wrap_formatter(
formatter: BaseFormatter | None = None,
na_rep: str | None = None,
precision: int | None = None,
decimal: str = ".",
thousands: str | None = None,
escape: str | None = None,
hyperlinks: str | None = None,
) -> Callable:
"""
Allows formatters to be expressed as str, callable or None, where None returns
a default formatting function. wraps with na_rep, and precision where they are
available.
"""
# Get initial func from input string, input callable, or from default factory
if isinstance(formatter, str):
func_0 = lambda x: formatter.format(x)
elif callable(formatter):
func_0 = formatter
elif formatter is None:
precision = (
get_option("styler.format.precision") if precision is None else precision
)
func_0 = partial(
_default_formatter, precision=precision, thousands=(thousands is not None)
)
else:
raise TypeError(f"'formatter' expected str or callable, got {type(formatter)}")
# Replace chars if escaping
if escape is not None:
func_1 = lambda x: func_0(_str_escape(x, escape=escape))
else:
func_1 = func_0
# Replace decimals and thousands if non-standard inputs detected
if decimal != "." or (thousands is not None and thousands != ","):
func_2 = _wrap_decimal_thousands(func_1, decimal=decimal, thousands=thousands)
else:
func_2 = func_1
# Render links
if hyperlinks is not None:
func_3 = lambda x: func_2(_render_href(x, format=hyperlinks))
else:
func_3 = func_2
# Replace missing values if na_rep
if na_rep is None:
return func_3
else:
return lambda x: na_rep if (isna(x) is True) else func_3(x)
def non_reducing_slice(slice_: Subset):
"""
Ensure that a slice doesn't reduce to a Series or Scalar.
Any user-passed `subset` should have this called on it
to make sure we're always working with DataFrames.
"""
# default to column slice, like DataFrame
# ['A', 'B'] -> IndexSlices[:, ['A', 'B']]
kinds = (ABCSeries, np.ndarray, Index, list, str)
if isinstance(slice_, kinds):
slice_ = IndexSlice[:, slice_]
def pred(part) -> bool:
"""
Returns
-------
bool
True if slice does *not* reduce,
False if `part` is a tuple.
"""
# true when slice does *not* reduce, False when part is a tuple,
# i.e. MultiIndex slice
if isinstance(part, tuple):
# GH#39421 check for sub-slice:
return any((isinstance(s, slice) or is_list_like(s)) for s in part)
else:
return isinstance(part, slice) or is_list_like(part)
if not is_list_like(slice_):
if not isinstance(slice_, slice):
# a 1-d slice, like df.loc[1]
slice_ = [[slice_]]
else:
# slice(a, b, c)
slice_ = [slice_] # to tuplize later
else:
# error: Item "slice" of "Union[slice, Sequence[Any]]" has no attribute
# "__iter__" (not iterable) -> is specifically list_like in conditional
slice_ = [p if pred(p) else [p] for p in slice_] # type: ignore[union-attr]
return tuple(slice_)
def maybe_convert_css_to_tuples(style: CSSProperties) -> CSSList:
"""
Convert css-string to sequence of tuples format if needed.
'color:red; border:1px solid black;' -> [('color', 'red'),
('border','1px solid red')]
"""
if isinstance(style, str):
s = style.split(";")
try:
return [
(x.split(":")[0].strip(), x.split(":")[1].strip())
for x in s
if x.strip() != ""
]
except IndexError:
raise ValueError(
"Styles supplied as string must follow CSS rule formats, "
f"for example 'attr: val;'. '{style}' was given."
)
return style
def refactor_levels(
level: Level | list[Level] | None,
obj: Index,
) -> list[int]:
"""
Returns a consistent levels arg for use in ``hide_index`` or ``hide_columns``.
Parameters
----------
level : int, str, list
Original ``level`` arg supplied to above methods.
obj:
Either ``self.index`` or ``self.columns``
Returns
-------
list : refactored arg with a list of levels to hide
"""
if level is None:
levels_: list[int] = list(range(obj.nlevels))
elif isinstance(level, int):
levels_ = [level]
elif isinstance(level, str):
levels_ = [obj._get_level_number(level)]
elif isinstance(level, list):
levels_ = [
obj._get_level_number(lev) if not isinstance(lev, int) else lev
for lev in level
]
else:
raise ValueError("`level` must be of type `int`, `str` or list of such")
return levels_
class Tooltips:
"""
An extension to ``Styler`` that allows for and manipulates tooltips on hover
of ```` cells in the HTML result.
Parameters
----------
css_name: str, default "pd-t"
Name of the CSS class that controls visualisation of tooltips.
css_props: list-like, default; see Notes
List of (attr, value) tuples defining properties of the CSS class.
tooltips: DataFrame, default empty
DataFrame of strings aligned with underlying Styler data for tooltip
display.
Notes
-----
The default properties for the tooltip CSS class are:
- visibility: hidden
- position: absolute
- z-index: 1
- background-color: black
- color: white
- transform: translate(-20px, -20px)
Hidden visibility is a key prerequisite to the hover functionality, and should
always be included in any manual properties specification.
"""
def __init__(
self,
css_props: CSSProperties = [
("visibility", "hidden"),
("position", "absolute"),
("z-index", 1),
("background-color", "black"),
("color", "white"),
("transform", "translate(-20px, -20px)"),
],
css_name: str = "pd-t",
tooltips: DataFrame = DataFrame(),
) -> None:
self.class_name = css_name
self.class_properties = css_props
self.tt_data = tooltips
self.table_styles: CSSStyles = []
@property
def _class_styles(self):
"""
Combine the ``_Tooltips`` CSS class name and CSS properties to the format
required to extend the underlying ``Styler`` `table_styles` to allow
tooltips to render in HTML.
Returns
-------
styles : List
"""
return [
{
"selector": f".{self.class_name}",
"props": maybe_convert_css_to_tuples(self.class_properties),
}
]
def _pseudo_css(self, uuid: str, name: str, row: int, col: int, text: str):
"""
For every table data-cell that has a valid tooltip (not None, NaN or
empty string) must create two pseudo CSS entries for the specific
| element id which are added to overall table styles:
an on hover visibility change and a content change
dependent upon the user's chosen display string.
For example:
[{"selector": "T__row1_col1:hover .pd-t",
"props": [("visibility", "visible")]},
{"selector": "T__row1_col1 .pd-t::after",
"props": [("content", "Some Valid Text String")]}]
Parameters
----------
uuid: str
The uuid of the Styler instance
name: str
The css-name of the class used for styling tooltips
row : int
The row index of the specified tooltip string data
col : int
The col index of the specified tooltip string data
text : str
The textual content of the tooltip to be displayed in HTML.
Returns
-------
pseudo_css : List
"""
selector_id = "#T_" + uuid + "_row" + str(row) + "_col" + str(col)
return [
{
"selector": selector_id + f":hover .{name}",
"props": [("visibility", "visible")],
},
{
"selector": selector_id + f" .{name}::after",
"props": [("content", f'"{text}"')],
},
]
def _translate(self, styler: StylerRenderer, d: dict):
"""
Mutate the render dictionary to allow for tooltips:
- Add ```` HTML element to each data cells ``display_value``. Ignores
headers.
- Add table level CSS styles to control pseudo classes.
Parameters
----------
styler_data : DataFrame
Underlying ``Styler`` DataFrame used for reindexing.
uuid : str
The underlying ``Styler`` uuid for CSS id.
d : dict
The dictionary prior to final render
Returns
-------
render_dict : Dict
"""
self.tt_data = self.tt_data.reindex_like(styler.data)
if self.tt_data.empty:
return d
name = self.class_name
mask = (self.tt_data.isna()) | (self.tt_data.eq("")) # empty string = no ttip
self.table_styles = [
style
for sublist in [
self._pseudo_css(styler.uuid, name, i, j, str(self.tt_data.iloc[i, j]))
for i in range(len(self.tt_data.index))
for j in range(len(self.tt_data.columns))
if not (
mask.iloc[i, j]
or i in styler.hidden_rows
or j in styler.hidden_columns
)
]
for style in sublist
]
if self.table_styles:
# add span class to every cell only if at least 1 non-empty tooltip
for row in d["body"]:
for item in row:
if item["type"] == "td":
item["display_value"] = (
str(item["display_value"])
+ f''
)
d["table_styles"].extend(self._class_styles)
d["table_styles"].extend(self.table_styles)
return d
def _parse_latex_table_wrapping(table_styles: CSSStyles, caption: str | None) -> bool:
"""
Indicate whether LaTeX {tabular} should be wrapped with a {table} environment.
Parses the `table_styles` and detects any selectors which must be included outside
of {tabular}, i.e. indicating that wrapping must occur, and therefore return True,
or if a caption exists and requires similar.
"""
IGNORED_WRAPPERS = ["toprule", "midrule", "bottomrule", "column_format"]
# ignored selectors are included with {tabular} so do not need wrapping
return (
table_styles is not None
and any(d["selector"] not in IGNORED_WRAPPERS for d in table_styles)
) or caption is not None
def _parse_latex_table_styles(table_styles: CSSStyles, selector: str) -> str | None:
"""
Return the first 'props' 'value' from ``tables_styles`` identified by ``selector``.
Examples
--------
>>> table_styles = [{'selector': 'foo', 'props': [('attr','value')]},
... {'selector': 'bar', 'props': [('attr', 'overwritten')]},
... {'selector': 'bar', 'props': [('a1', 'baz'), ('a2', 'ignore')]}]
>>> _parse_latex_table_styles(table_styles, selector='bar')
'baz'
Notes
-----
The replacement of "§" with ":" is to avoid the CSS problem where ":" has structural
significance and cannot be used in LaTeX labels, but is often required by them.
"""
for style in table_styles[::-1]: # in reverse for most recently applied style
if style["selector"] == selector:
return str(style["props"][0][1]).replace("§", ":")
return None
def _parse_latex_cell_styles(
latex_styles: CSSList, display_value: str, convert_css: bool = False
) -> str:
r"""
Mutate the ``display_value`` string including LaTeX commands from ``latex_styles``.
This method builds a recursive latex chain of commands based on the
CSSList input, nested around ``display_value``.
If a CSS style is given as ('', '') this is translated to
'\{display_value}', and this value is treated as the
display value for the next iteration.
The most recent style forms the inner component, for example for styles:
`[('c1', 'o1'), ('c2', 'o2')]` this returns: `\c1o1{\c2o2{display_value}}`
Sometimes latex commands have to be wrapped with curly braces in different ways:
We create some parsing flags to identify the different behaviours:
- `--rwrap` : `\{}`
- `--wrap` : `{\ }`
- `--nowrap` : `\ `
- `--lwrap` : `{\} `
- `--dwrap` : `{\}{}`
For example for styles:
`[('c1', 'o1--wrap'), ('c2', 'o2')]` this returns: `{\c1o1 \c2o2{display_value}}
"""
if convert_css:
latex_styles = _parse_latex_css_conversion(latex_styles)
for command, options in latex_styles[::-1]: # in reverse for most recent style
formatter = {
"--wrap": f"{{\\{command}--to_parse {display_value}}}",
"--nowrap": f"\\{command}--to_parse {display_value}",
"--lwrap": f"{{\\{command}--to_parse}} {display_value}",
"--rwrap": f"\\{command}--to_parse{{{display_value}}}",
"--dwrap": f"{{\\{command}--to_parse}}{{{display_value}}}",
}
display_value = f"\\{command}{options} {display_value}"
for arg in ["--nowrap", "--wrap", "--lwrap", "--rwrap", "--dwrap"]:
if arg in str(options):
display_value = formatter[arg].replace(
"--to_parse", _parse_latex_options_strip(value=options, arg=arg)
)
break # only ever one purposeful entry
return display_value
def _parse_latex_header_span(
cell: dict[str, Any],
multirow_align: str,
multicol_align: str,
wrap: bool = False,
convert_css: bool = False,
) -> str:
r"""
Refactor the cell `display_value` if a 'colspan' or 'rowspan' attribute is present.
'rowspan' and 'colspan' do not occur simultaneouly. If they are detected then
the `display_value` is altered to a LaTeX `multirow` or `multicol` command
respectively, with the appropriate cell-span.
``wrap`` is used to enclose the `display_value` in braces which is needed for
column headers using an siunitx package.
Requires the package {multirow}, whereas multicol support is usually built in
to the {tabular} environment.
Examples
--------
>>> cell = {'cellstyle': '', 'display_value':'text', 'attributes': 'colspan="3"'}
>>> _parse_latex_header_span(cell, 't', 'c')
'\\multicolumn{3}{c}{text}'
"""
display_val = _parse_latex_cell_styles(
cell["cellstyle"], cell["display_value"], convert_css
)
if "attributes" in cell:
attrs = cell["attributes"]
if 'colspan="' in attrs:
colspan = attrs[attrs.find('colspan="') + 9 :] # len('colspan="') = 9
colspan = int(colspan[: colspan.find('"')])
if "naive-l" == multicol_align:
out = f"{{{display_val}}}" if wrap else f"{display_val}"
blanks = " & {}" if wrap else " &"
return out + blanks * (colspan - 1)
elif "naive-r" == multicol_align:
out = f"{{{display_val}}}" if wrap else f"{display_val}"
blanks = "{} & " if wrap else "& "
return blanks * (colspan - 1) + out
return f"\\multicolumn{{{colspan}}}{{{multicol_align}}}{{{display_val}}}"
elif 'rowspan="' in attrs:
if multirow_align == "naive":
return display_val
rowspan = attrs[attrs.find('rowspan="') + 9 :]
rowspan = int(rowspan[: rowspan.find('"')])
return f"\\multirow[{multirow_align}]{{{rowspan}}}{{*}}{{{display_val}}}"
if wrap:
return f"{{{display_val}}}"
else:
return display_val
def _parse_latex_options_strip(value: str | float, arg: str) -> str:
"""
Strip a css_value which may have latex wrapping arguments, css comment identifiers,
and whitespaces, to a valid string for latex options parsing.
For example: 'red /* --wrap */ ' --> 'red'
"""
return str(value).replace(arg, "").replace("/*", "").replace("*/", "").strip()
def _parse_latex_css_conversion(styles: CSSList) -> CSSList:
"""
Convert CSS (attribute,value) pairs to equivalent LaTeX (command,options) pairs.
Ignore conversion if tagged with `--latex` option, skipped if no conversion found.
"""
def font_weight(value, arg):
if value in ("bold", "bolder"):
return "bfseries", f"{arg}"
return None
def font_style(value, arg):
if value == "italic":
return "itshape", f"{arg}"
if value == "oblique":
return "slshape", f"{arg}"
return None
def color(value, user_arg, command, comm_arg):
"""
CSS colors have 5 formats to process:
- 6 digit hex code: "#ff23ee" --> [HTML]{FF23EE}
- 3 digit hex code: "#f0e" --> [HTML]{FF00EE}
- rgba: rgba(128, 255, 0, 0.5) --> [rgb]{0.502, 1.000, 0.000}
- rgb: rgb(128, 255, 0,) --> [rbg]{0.502, 1.000, 0.000}
- string: red --> {red}
Additionally rgb or rgba can be expressed in % which is also parsed.
"""
arg = user_arg if user_arg != "" else comm_arg
if value[0] == "#" and len(value) == 7: # color is hex code
return command, f"[HTML]{{{value[1:].upper()}}}{arg}"
if value[0] == "#" and len(value) == 4: # color is short hex code
val = f"{value[1].upper()*2}{value[2].upper()*2}{value[3].upper()*2}"
return command, f"[HTML]{{{val}}}{arg}"
elif value[:3] == "rgb": # color is rgb or rgba
r = re.findall("(?<=\\()[0-9\\s%]+(?=,)", value)[0].strip()
r = float(r[:-1]) / 100 if "%" in r else int(r) / 255
g = re.findall("(?<=,)[0-9\\s%]+(?=,)", value)[0].strip()
g = float(g[:-1]) / 100 if "%" in g else int(g) / 255
if value[3] == "a": # color is rgba
b = re.findall("(?<=,)[0-9\\s%]+(?=,)", value)[1].strip()
else: # color is rgb
b = re.findall("(?<=,)[0-9\\s%]+(?=\\))", value)[0].strip()
b = float(b[:-1]) / 100 if "%" in b else int(b) / 255
return command, f"[rgb]{{{r:.3f}, {g:.3f}, {b:.3f}}}{arg}"
else:
return command, f"{{{value}}}{arg}" # color is likely string-named
CONVERTED_ATTRIBUTES: dict[str, Callable] = {
"font-weight": font_weight,
"background-color": partial(color, command="cellcolor", comm_arg="--lwrap"),
"color": partial(color, command="color", comm_arg=""),
"font-style": font_style,
}
latex_styles: CSSList = []
for attribute, value in styles:
if isinstance(value, str) and "--latex" in value:
# return the style without conversion but drop '--latex'
latex_styles.append((attribute, value.replace("--latex", "")))
if attribute in CONVERTED_ATTRIBUTES:
arg = ""
for x in ["--wrap", "--nowrap", "--lwrap", "--dwrap", "--rwrap"]:
if x in str(value):
arg, value = x, _parse_latex_options_strip(value, x)
break
latex_style = CONVERTED_ATTRIBUTES[attribute](value, arg)
if latex_style is not None:
latex_styles.extend([latex_style])
return latex_styles
def _escape_latex(s: str) -> str:
r"""
Replace the characters ``&``, ``%``, ``$``, ``#``, ``_``, ``{``, ``}``,
``~``, ``^``, and ``\`` in the string with LaTeX-safe sequences.
Use this if you need to display text that might contain such characters in LaTeX.
Parameters
----------
s : str
Input to be escaped
Return
------
str :
Escaped string
"""
return (
s.replace("\\", "ab2§=§8yz") # rare string for final conversion: avoid \\ clash
.replace("ab2§=§8yz ", "ab2§=§8yz\\space ") # since \backslash gobbles spaces
.replace("&", "\\&")
.replace("%", "\\%")
.replace("$", "\\$")
.replace("#", "\\#")
.replace("_", "\\_")
.replace("{", "\\{")
.replace("}", "\\}")
.replace("~ ", "~\\space ") # since \textasciitilde gobbles spaces
.replace("~", "\\textasciitilde ")
.replace("^ ", "^\\space ") # since \textasciicircum gobbles spaces
.replace("^", "\\textasciicircum ")
.replace("ab2§=§8yz", "\\textbackslash ")
)
def _math_mode_with_dollar(s: str) -> str:
r"""
All characters in LaTeX math mode are preserved.
The substrings in LaTeX math mode, which start with
the character ``$`` and end with ``$``, are preserved
without escaping. Otherwise regular LaTeX escaping applies.
Parameters
----------
s : str
Input to be escaped
Return
------
str :
Escaped string
"""
s = s.replace(r"\$", r"rt8§=§7wz")
pattern = re.compile(r"\$.*?\$")
pos = 0
ps = pattern.search(s, pos)
res = []
while ps:
res.append(_escape_latex(s[pos : ps.span()[0]]))
res.append(ps.group())
pos = ps.span()[1]
ps = pattern.search(s, pos)
res.append(_escape_latex(s[pos : len(s)]))
return "".join(res).replace(r"rt8§=§7wz", r"\$")
def _math_mode_with_parentheses(s: str) -> str:
r"""
All characters in LaTeX math mode are preserved.
The substrings in LaTeX math mode, which start with
the character ``\(`` and end with ``\)``, are preserved
without escaping. Otherwise regular LaTeX escaping applies.
Parameters
----------
s : str
Input to be escaped
Return
------
str :
Escaped string
"""
s = s.replace(r"\(", r"LEFT§=§6yzLEFT").replace(r"\)", r"RIGHTab5§=§RIGHT")
res = []
for item in re.split(r"LEFT§=§6yz|ab5§=§RIGHT", s):
if item.startswith("LEFT") and item.endswith("RIGHT"):
res.append(item.replace("LEFT", r"\(").replace("RIGHT", r"\)"))
elif "LEFT" in item and "RIGHT" in item:
res.append(
_escape_latex(item).replace("LEFT", r"\(").replace("RIGHT", r"\)")
)
else:
res.append(
_escape_latex(item)
.replace("LEFT", r"\textbackslash (")
.replace("RIGHT", r"\textbackslash )")
)
return "".join(res)
def _escape_latex_math(s: str) -> str:
r"""
All characters in LaTeX math mode are preserved.
The substrings in LaTeX math mode, which either are surrounded
by two characters ``$`` or start with the character ``\(`` and end with ``\)``,
are preserved without escaping. Otherwise regular LaTeX escaping applies.
Parameters
----------
s : str
Input to be escaped
Return
------
str :
Escaped string
"""
s = s.replace(r"\$", r"rt8§=§7wz")
ps_d = re.compile(r"\$.*?\$").search(s, 0)
ps_p = re.compile(r"\(.*?\)").search(s, 0)
mode = []
if ps_d:
mode.append(ps_d.span()[0])
if ps_p:
mode.append(ps_p.span()[0])
if len(mode) == 0:
return _escape_latex(s.replace(r"rt8§=§7wz", r"\$"))
if s[mode[0]] == r"$":
return _math_mode_with_dollar(s.replace(r"rt8§=§7wz", r"\$"))
if s[mode[0] - 1 : mode[0] + 1] == r"\(":
return _math_mode_with_parentheses(s.replace(r"rt8§=§7wz", r"\$"))
else:
return _escape_latex(s.replace(r"rt8§=§7wz", r"\$"))
|