-
Notifications
You must be signed in to change notification settings - Fork 34
Split out numpy
and numpy-tests
, and update to NumPy v2.3.1
#86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
It still takes |
I checked the build log of numpy-tests and it looks like it still builds all the source codes in numpy (you can download it from the GHA artifacts). Is it intended behavior? |
Yes, this is intended behaviour. The install tags only affect what files are installed, i.e., copied into the final wheels. In this case, these are the test-specific extension modules and Python files, but all files are still built. I checked the artifacts and it is working as expected: the Now I have to figure out why it doesn't work… because if/when it does, we have previously seen that we can compile from a shared directory in around ~8 seconds, and we can take advantage of that. |
Ah, okay, I got why it doesn't work by checking the logs. The build dir for
when it should be:
We need to make the source:
path: ../numpy/build/numpy-2.2.5/ work to do so. The documentation for |
No, I think it is how pyodide-build works (for now). When one sets So what happens now is
One possibility I can think of is that the mtime of the files was modified when the file was copied, and because of that, meson thinks it needs to recompile all the files, which causes all the compilation to happen again. Or, if recompilation is the expected behavior, it could be that the path of the file has changed, causing a cache miss. |
I see, that makes sense. Considering that we've been advertising/using this for local testing and debugging only, would you be willing to change this behaviour? That is, we could modify I think such a way would be easier to manage in comparison to comparing mtimes. However, one issue will still exist which we haven't sorted out: |
Yeah, I think you can try to see if it works. I guess changing the code in srcdir = self.source_metadata.path.resolve()
+ self.src_extract_dir = srcdir
- if not srcdir.is_dir():
- raise ValueError(f"path={srcdir} must point to a directory that exists")
-
- def ignore(path: str, names: list[str]) -> list[str]:
- ignored: list[str] = []
-
- if fnmatch.fnmatch(path, "*/dist"):
- # Do not copy dist/*.whl files from a dirty source tree;
- # this can lead to "Exception: Unexpected number of wheels" later.
- ignored.extend(name for name in names if name.endswith(".whl"))
- return ignored
-
- shutil.copytree(srcdir, self.src_extract_dir, ignore=ignore) would do the job, but not 100% sure. Reusing the src directory for the build directory will make cleaning up the build directory or rebuilding a little bit more complex, but I think it is not so big problem.
I think the current approach (setting numpy as a host dependency) is sufficient, at least for now. It's a very special case for two recipes to share source code like this, and I don't want to add too much complex behavior to handle this. |
This reverts commit 9ac87e0.
Sorry for getting back to this a little late, and thanks for this patch! Yes, it worked for the build case perfectly – barring one minor problem, where the However, I do have my reservations around if we need such a patch at all in this case (especially for such a special case). I've pushed a few commits, please feel free to take a look! |
I resolved this by using |
numpy
and numpy-tests
Okay, there's just one test to get through: @pytest.mark.skip_pyproxy_check
def test_runpythonasync_numpy(selenium_standalone):
selenium_standalone.run_async(
"""
import numpy as np
x = np.zeros(5)
"""
)
for i in range(5):
assert selenium_standalone.run_js(
f"return pyodide.globals.get('x').toJs()[{i}] == 0"
) which says
I'm not too sure why, but I can debug it with NumPy outside this build – probably broken from the test splits. |
Package Build ResultsTotal packages built: 30 Package Build Times (click to expand)
Longest build: openssl (4m 16s) |
numpy
and numpy-tests
numpy
and numpy-tests
, and update to NumPy v2.3.1
I saw this happening time to time. I think there is an unknown flakiness in our build system, but I really don't understand why. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thanks for your work @agriyakhetarpal, and also thanks for updating the numpy version.
After merging this, we should also update the numpy version in pyodide/pyodide to align the numpy version in xbuildenv (yes, it is a bit annoying but should be done until we implmenent pyodide/pyodide-build#43).
packages/numpy/test_numpy.py
Outdated
@@ -3,7 +3,7 @@ | |||
|
|||
|
|||
def test_numpy(selenium): | |||
selenium.load_package("numpy") | |||
selenium.load_package(["numpy", "numpy-tests"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to import numpy-tests
in this file.
Would you like to add a separate test file under packages/numpy-tests
that actually runs some tests that are included in numpy-tests instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thanks! I added the full test suite in ae85b6d, and I will reduce it in subsequent commits before we merge this, so that CI time is not impacted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a subset of the tests that would be fine for us would be those from numpy.linalg
, numpy.fft
, numpy.polynomial
, numpy.random
, and numpy.lib
. We can leave out numpy.f2py
(won't work anyway), numpy.strings
, numpy.char
, and numpy.ma
. This is, unless you may have any other ideas here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, numpy is very important so I am okay with having more tests, but running all test suite will be too time consuming, so it would be nice if we could find a good Goldilocks point.
I tried to run the tests locally after splitting them out, but I don't think this is the right approach for testing them. For example, https://docs.scipy.org/doc/scipy/building/redistributable_binaries.html does not mention how to run the tests from the split wheels. I notice that the One way to get around this is to install I tried to follow the approach in pandas-dev/pandas#53007, but I don't think this is something we should be doing here – we should wait for developments on numpy/numpy#26289 first. It is only after So, would you be okay with proceeding without these tests, given that NumPy functionality is being tested out-of-tree to a reasonable extent, and also here in the |
Sure, that is okay with me. |
Thanks! I'll open a follow-up issue to discuss more testing for NumPy and a follow-up PR to update NumPy on the Pyodide repository side. |
This PR splits out
numpy
into two packages using the Meson install tags feature, one that installsruntime,python-runtime,devel
and another that installstests
. This is an easier way to unvendor the tests from NumPy's wheels. It is to be noted that the package is still built fully – it is just that the relevant files are not installed into the wheel.I've also updated to NumPy version 2.3.1, which was recently released, and dropped the associated patch from numpy/numpy#28936 as it is no longer needed.