Fix annotations of str methods that accept regular expressions #1278

Open
wants to merge 1 commit into
base: main
from

hamdanal
Contributor

  • Closes #xxxx (Replace xxxx with the Github issue number)
  • Tests added: Please use assert_type() to assert the type of any return value
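As a rough illustration of the `assert_type()` convention named in the checklist, the sketch below uses a hypothetical stand-in function, `findall_each`, rather than the real `StringMethods` stubs. It assumes Python 3.11+ (or `typing_extensions`) for `assert_type`, which does nothing at runtime beyond returning its argument and is only verified by a static type checker.

```python
from __future__ import annotations

import re

try:  # assert_type is in the standard library from Python 3.11
    from typing import assert_type
except ImportError:  # older interpreters: assumes typing_extensions is installed
    from typing_extensions import assert_type


def findall_each(values: list[str], pat: str | re.Pattern[str]) -> list[list[str]]:
    """Hypothetical stand-in for a str method that accepts a compiled regex."""
    regex = re.compile(pat)  # an already compiled pattern is returned unchanged
    return [regex.findall(value) for value in values]


# A type checker verifies the declared return type for both argument kinds;
# at runtime assert_type simply returns its first argument.
from_str = assert_type(findall_each(["a1b2"], r"\d"), list[list[str]])
from_pattern = assert_type(findall_each(["a1b2"], re.compile(r"\d")), list[list[str]])
```

If the `pat` annotation were narrowed back to `str`, a type checker would be expected to reject the second call, which is the kind of regression such a test guards against.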

These methods accept compiled regular expressions as documented in pandas and tested at runtime.
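A minimal runtime sketch of this claim, assuming pandas is installed and using an object-dtype Series so that the plain Python regex path is exercised. The sample strings and variable names are made up for illustration.

```python
import re

import pandas as pd

# Object dtype is an assumption made here to stay on the
# pure-Python string implementation.
s = pd.Series(["a1b", "c22d"], dtype=object)
digits = re.compile(r"\d+")

# regex=True is passed explicitly; pandas rejects a compiled pattern
# combined with regex=False.
replaced = s.str.replace(digits, "-", regex=True)
parts = s.str.split(digits)
found = s.str.findall(digits)

print(replaced.tolist())
print(parts.tolist())
print(found.tolist())
```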

@@ -83,12 +83,17 @@ class StringMethods(
     ) -> _T_STR: ...
     @overload
     def split(
-        self, pat: str = ..., *, n: int = ..., expand: Literal[True], regex: bool = ...
+        self,
+        pat: str | re.Pattern[str] = ...,

     ) -> _T_BOOL: ...
     def replace(
         self,
-        pat: str,
+        pat: str | re.Pattern[str],

@@ -180,18 +189,26 @@ class StringMethods(
     def count(self, pat: str, flags: int = ...) -> _T_INT: ...
     def startswith(self, pat: str | tuple[str, ...], na: Any = ...) -> _T_BOOL: ...
     def endswith(self, pat: str | tuple[str, ...], na: Any = ...) -> _T_BOOL: ...
-    def findall(self, pat: str, flags: int = ...) -> _T_LIST_STR: ...
+    def findall(self, pat: str | re.Pattern[str], flags: int = ...) -> _T_LIST_STR: ...

     @overload
     def extract(
-        self, pat: str, flags: int = ..., *, expand: Literal[True] = ...
+        self,
+        pat: str | re.Pattern[str],

     ) -> _T_OBJECT: ...
-    def extractall(self, pat: str, flags: int = ...) -> pd.DataFrame: ...
+    def extractall(
+        self, pat: str | re.Pattern[str], flags: int = ...

@@ -214,7 +231,11 @@ class StringMethods(
     def isnumeric(self) -> _T_BOOL: ...
     def isdecimal(self) -> _T_BOOL: ...
     def fullmatch(
-        self, pat: str, case: bool = ..., flags: int = ..., na: Any = ...
+        self,
+        pat: str | re.Pattern[str],

@@ -133,11 +138,15 @@ class StringMethods(
         regex: bool = ...,
     ) -> _T_BOOL: ...
     def match(
-        self, pat: str, case: bool = ..., flags: int = ..., na: Any = ...
+        self,
+        pat: str | re.Pattern[str],
Contributor Author

This is the only one of the changed methods that is not documented as accepting a compiled regex, but that is an oversight in the documentation: https://pandas.pydata.org/docs/reference/api/pandas.Series.str.match.html. See pandas-dev/pandas#61879.
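The runtime behaviour this comment refers to can be sketched as follows, assuming pandas is installed and an object-dtype Series. Behaviour on other string dtypes or pandas versions is not checked here, and the sample data is illustrative only.

```python
import re

import pandas as pd

# Object dtype is an assumption; it keeps the example on the
# pure-Python string implementation.
s = pd.Series(["apple", "banana"], dtype=object)

# match tests for a match at the start of each string.
matched = s.str.match(re.compile(r"a"))
# fullmatch requires the whole string to match the pattern.
fully_matched = s.str.fullmatch(re.compile(r"a\w+"))

print(matched.tolist())
print(fully_matched.tolist())
```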

@twoertwein
Member

Thank you for your well-tested PR!

> These methods accept compiled regular expressions as documented in pandas

It is not clear to me if the documentation suggests that re.Pattern objects are officially accepted (for example here https://pandas.pydata.org/docs/reference/api/pandas.Series.str.extractall.html). The type is often explicitly annotated as str in pandas and the fact that not all functions accept it might hint that it is not officially accepted. I will defer to @Dr-Irv.

Feel free to open an issue/PR at pandas about it.
