Skip to content

Add Nutrient benchmark #92

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

pweiskircher
Copy link

@pweiskircher pweiskircher commented Jul 11, 2025

As discussed on our call, a Nutrient benchmark!

This is a pretty real-world workload. We import some annotations (same as if a user adds them, but less messy in the code). We render pages. We extract the text.

The benchmark doesn't require any license keys and just uses a trial version. This has no limitations except a runtime limit (1 hour) and a watermark in the rendered pages.

Copy link

netlify bot commented Jul 11, 2025

Deploy Preview for webkit-jetstream-preview ready!

Name Link
🔨 Latest commit da89895
🔍 Latest deploy log https://app.netlify.com/projects/webkit-jetstream-preview/deploys/688a5e06f55c5f0008f7b1a6
😎 Deploy Preview https://deploy-preview-92--webkit-jetstream-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Contributor

@danleh danleh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot, that was super quick! Overall it looks great to me.

I had a quick look at the CPU profile/flamegraph with 40 iterations (see comment), and in d8 we spend ~40% of the cycles compiling. That's quite compilation heavy, so we could consider increasing the runtime of the workload further. We also spend ~20% of the cycles in the top 3 hottest functions, but that LGTM.

CC @kmiller68 @eqrion for input from the other engines.

@pweiskircher
Copy link
Author

Thanks a lot, that was super quick! Overall it looks great to me.

I had a quick look at the CPU profile/flamegraph with 40 iterations (see comment), and in d8 we spend ~40% of the cycles compiling. That's quite compilation heavy, so we could consider increasing the runtime of the workload further. We also spend ~20% of the cycles in the top 3 hottest functions, but that LGTM.

I'm glad to hear that! We have a lot more we can benchmark. Just wanted to make sure we're going in the right direction. Will work on that and ping you again soon!

@pweiskircher pweiskircher force-pushed the add-nutrient-benchmark branch from d6fb3ce to 1a8a584 Compare July 15, 2025 18:15
@pweiskircher pweiskircher marked this pull request as ready for review July 16, 2025 20:34
@pweiskircher
Copy link
Author

Marking it ready for review. This is a pretty real-world workload. We import some annotations (same as if a user adds them, but less messy in the code). We render pages. We extract the text. What do you think?

@danleh danleh requested a review from kmiller68 July 17, 2025 08:16
@pweiskircher pweiskircher force-pushed the add-nutrient-benchmark branch from 8d38abd to 7ccf24f Compare July 24, 2025 13:08
@pweiskircher
Copy link
Author

@danleh @kmiller68 Anything I can do to get this reviewed? :)

@danleh
Copy link
Contributor

danleh commented Jul 30, 2025

(Sorry for the late reply.) Thanks for addressing the comments and adding a license for the binary!

From our (Google's) side, this is a nice real-world workload to have, but there have been some concerns in the last meeting regarding using a "non-standard"/proprietary license and not having source code available.

You mentioned in an earlier email that you might be able to use the original, unmodified Wasm binary from your publicly released NPM package in this benchmark. Would that be possible? If we just 1:1 copy the Wasm binary from an upstream package and have open sourced JavaScript code for the rest (setup, polyfills if required), that might make things easier wrt. licensing and also makes it clear that this is the exact Wasm code that real users are running in the wild.

@kmiller68 @eqrion Happy to discuss this here or in the next meeting.

@pweiskircher
Copy link
Author

(Sorry for the late reply.) Thanks for addressing the comments and adding a license for the binary!

From our (Google's) side, this is a nice real-world workload to have, but there have been some concerns in the last meeting regarding using a "non-standard"/proprietary license and not having source code available.

You mentioned in an earlier email that you might be able to use the original, unmodified Wasm binary from your publicly released NPM package in this benchmark. Would that be possible? If we just 1:1 copy the Wasm binary from an upstream package and have open sourced JavaScript code for the rest (setup, polyfills if required), that might make things easier wrt. licensing and also makes it clear that this is the exact Wasm code that real users are running in the wild.

I'm happy to do that! Our latest release - 1.5.0 - unfortunately doesn't have the changes in it to work in the shell, but our next one will (freezes next Tuesday). I can use a nightly build for now and adjust when our next release is out or I can wait. Either one is fine with me.

Happy to help in any way to get this into JetStream. The only thing I can't do it open source the code unfortunately.

@pweiskircher pweiskircher force-pushed the add-nutrient-benchmark branch from 8bf0f73 to da89895 Compare July 30, 2025 18:01
@eqrion
Copy link

eqrion commented Jul 30, 2025

@danleh I opened up that NPM package and the license points here.

I am not a lawyer, but I do not think the terms will be acceptable to be included in this repo.

Specifically:

LICENSEE MAY ONLY USE LICENSED TECHNOLOGY FOR EVALUATION PURPOSES, WITH
A VIEW TO PURCHASING A FULL COMMERCIAL DEVELOPMENT LICENSE. IN ALL CASES,
LICENSED TECHNOLOGY'S OBJECT CODE MAY NOT BE SUBMITTED TO APPLE'S APP
STORE, GOOGLE’S PLAY STORE, ANY THIRD PARTY WHATSOEVER, OR MADE
AVAILABLE ON ANY WEBSITE, PRIVATE OR PUBLIC, OR USED IN PRODUCTION. ANY
REDISTRIBUTION OF LICENSED TECHNOLOGY IN EITHER SOURCE OR BINARY FORM IS
STRICTLY PROHIBITED.
3.4. ANY DISTRIBUTION OR MAKING AVAILABLE OF DERIVED WORKS OR LICENSEE
APPLICATIONS TO LICENSEE'S END USERS IS STRICTLY PROHIBITED.
3.2. Licensee shall not (and is not licensed to):
3.2.1. reverse engineer, decompile, disassemble, bypass any code obfuscation, or otherwise
attempt to derive the source code of any part of the Licensed Technology. (To the
extent that applicable law expressly permits you to decompile the Licensed
Technology in order to obtain information necessary to render the object code libraries
interoperable with other software, you must first obtain written permission from us to
provide the necessary information)

I followed up internally at Mozilla and we agreed that we do not want to include any benchmark where we cannot view the source code. It could maybe be okay for a disabled workload that's not part of the main score, but not the main score.

@pweiskircher
Copy link
Author

@danleh I opened up that NPM package and the license points here.

I am not a lawyer, but I do not think the terms will be acceptable to be included in this repo.

I talked with our legal team and everyone is very excited at the possibility to get this included here. I'm sure we could adjust licensing to allow this usage.

I followed up internally at Mozilla and we agreed that we do not want to include any benchmark where we cannot view the source code. It could maybe be okay for a disabled workload that's not part of the main score, but not the main score.

This is more problematic. There's no way we can open source our solution.

@eqrion
Copy link

eqrion commented Jul 30, 2025

This is more problematic. There's no way we can open source our solution.

That's very understandable, I'm definitely not realistically asking that you do that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants