Skip to content

BUG: Fix lost precision with common type of uint64/int64 #61679

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

pbrochart
Copy link
Contributor

@pbrochart pbrochart commented Jun 19, 2025

@pbrochart
Copy link
Contributor Author

pre-commit.ci autofix

@simonjayhawkins simonjayhawkins added Bug Dtype Conversions Unexpected or buggy dtype conversions isin isin method labels Jun 25, 2025
@@ -59,6 +60,7 @@
"is_float",
"is_float_dtype",
"is_hashable",
"is_implicit_conversion_to_float64",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To fix the original issue, we shouldn't need to create a new function (and expose it as a public API)

@pbrochart
Copy link
Contributor Author

I think this patch is more safe and provides better performance

@pbrochart
Copy link
Contributor Author

I test another fix that can be benefit for other issues (e.g #61688) without any changes
But it breaks some tests (the fix changes the behavior of np_find_common_type)

@pbrochart pbrochart marked this pull request as draft June 29, 2025 20:22
@pbrochart pbrochart changed the title BUG: Fix implicit conversion to float64 with isin() BUG: Fix lost precision with common type of uint64/int64 Jun 30, 2025
@pbrochart
Copy link
Contributor Author

pre-commit.ci autofix

@pbrochart pbrochart marked this pull request as ready for review July 1, 2025 10:04
@@ -404,6 +404,7 @@ Other API changes
- Index set operations (like union or intersection) will now ignore the dtype of
an empty ``RangeIndex`` or empty ``Index`` with object dtype when determining
the dtype of the resulting Index (:issue:`60797`)
- ``np_find_common_type`` will now return ``object`` for mixed ``int64`` and ``uint64`` dtypes to avoid precision lost (:issue:`61676`, :issue:`61688`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

I'm just skimming issues so have not looked in detail.

By glancing through I see we are changing tested behavior. This sort of suggests intentional behavior and not a bug per se. If it is a longstanding behavior we may want to deprecate first as a breaking API change?

In pandas we normally want to avoid object type, I don't see any discussion that this is an agreed way forward?

@pbrochart
Copy link
Contributor Author

Hi,

Numpy documentation is not explicit about it but the API give the best compromise for this kind of types as it's a hardware limitation.
IIRC the precision start to lose if the uint64 number is above 2^53.
The API should not be deprecated in my opinion but rather be aware of this limitation.
I don't see any other solution than using an object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions isin isin method
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Merge duplicates and validation failure when columns have type int64 and uint64 BUG: Implicit conversion to float64 with isin()
3 participants