Skip to content

Feat/dictionary #69

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
May 19, 2025
Merged

Feat/dictionary #69

merged 13 commits into from
May 19, 2025

Conversation

kjdev
Copy link
Owner

@kjdev kjdev commented May 16, 2025

Summary by CodeRabbit

  • New Features

    • Added support for Brotli compression and decompression using custom dictionaries, including new configuration options and constants.
    • Introduced the ability to specify a dictionary file for output compression.
    • Added new Brotli constants for advanced compression modes and dictionary support.
    • Enhanced stream and output handlers to work with dictionary-based compression.
  • Documentation

    • Updated documentation to reflect new dictionary features, configuration options, and function parameters.
    • Added comprehensive usage examples for dictionary-based compression and decompression.
  • Tests

    • Added extensive test coverage for dictionary support, including basic, incremental, named argument, stream, and output buffer scenarios.
    • Updated existing tests to verify output against external Brotli-compressed files and check for correct HTTP headers.
  • Bug Fixes

    • Improved error handling and validation for invalid or unavailable dictionaries during compression.
  • Chores

    • Updated build configuration scripts to enable dictionary support based on Brotli library version.
    • Added .gitattributes to handle binary file extensions appropriately.

Copy link

coderabbitai bot commented May 16, 2025

Walkthrough

This update introduces dictionary support to the Brotli PHP extension. The change adds optional dictionary parameters to compression and decompression functions, integrates dictionary handling into the output handler and stream wrappers, and adds new configuration directives and constants. Comprehensive documentation and tests are included to validate dictionary-based compression and decompression workflows.

Changes

File(s) Change Summary
README.md Updated documentation to describe new dictionary support, API changes, new constants, configuration options, and example usage.
brotli.c, php_brotli.h Added dictionary support to compression/decompression functions, output handler, stream wrappers; new INI entry and constant; internal refactoring for dictionary management.
brotli.stub.php Updated function signatures to include optional dictionary parameter; added BROTLI_DICTIONARY_SUPPORT constant and related doc comments.
config.m4, config.w32 Build scripts updated to conditionally enable dictionary support based on Brotli library version; macro USE_BROTLI_DICTIONARY defined accordingly.
tests/data.dict Added static dictionary file with Shakespearean text for testing purposes.
tests/dictionary_basic.phpt, tests/dictionary_args.phpt, tests/dictionary_named_args.phpt, tests/dictionary_incremental.phpt, tests/dictionary_named_args_incremental.phpt, tests/dictionary_streams.phpt Added new tests to verify dictionary-based Brotli compression and decompression, including basic, named arguments, incremental, and stream context usage.
tests/dictionary_ob.phpt, tests/ob_dcb_001.phpt, tests/ob_dcb_002.phpt, tests/ob_dcb_003.phpt, tests/ob_dcb_004.phpt Added/updated tests for output handler behavior with dictionary compression, including error cases and HTTP header validation.
tests/files/ob_hi.br, tests/ob_dcb.br Added external Brotli-compressed files for use in output verification by tests.
tests/ob_001.phpt, tests/ob_003.phpt, tests/ob_004.phpt Updated tests to use external file-based output verification instead of inline binary data.
.gitattributes Added to configure Git to treat .br and .dic files as binary and normalize line endings for text files.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant PHP_Brotli_Extension
    participant Brotli_Library
    participant Dictionary_File

    User->>PHP_Brotli_Extension: brotli_compress(data, level, mode, dict)
    alt dict provided and supported
        PHP_Brotli_Extension->>Dictionary_File: Load dictionary
        PHP_Brotli_Extension->>Brotli_Library: Compress with dictionary
    else no dict
        PHP_Brotli_Extension->>Brotli_Library: Compress without dictionary
    end
    Brotli_Library-->>PHP_Brotli_Extension: Compressed data
    PHP_Brotli_Extension-->>User: Return compressed data

    User->>PHP_Brotli_Extension: brotli_uncompress(data, length, dict)
    alt dict provided and supported
        PHP_Brotli_Extension->>Dictionary_File: Load dictionary
        PHP_Brotli_Extension->>Brotli_Library: Decompress with dictionary
    else no dict
        PHP_Brotli_Extension->>Brotli_Library: Decompress without dictionary
    end
    Brotli_Library-->>PHP_Brotli_Extension: Decompressed data
    PHP_Brotli_Extension-->>User: Return decompressed data
Loading
sequenceDiagram
    participant HTTP_Client
    participant PHP_Application
    participant Brotli_Output_Handler
    participant Dictionary_File

    HTTP_Client->>PHP_Application: Request with Accept-Encoding: dcb, Available-Dictionary: <digest>
    PHP_Application->>Brotli_Output_Handler: Output data
    Brotli_Output_Handler->>Dictionary_File: Load dictionary (if configured)
    alt Dictionary matches Available-Dictionary
        Brotli_Output_Handler->>Brotli_Library: Compress with dictionary
        Brotli_Output_Handler-->>PHP_Application: Set Content-Encoding: dcb, Vary: Accept-Encoding, Available-Dictionary
    else Dictionary mismatch or not found
        Brotli_Output_Handler->>Brotli_Library: Compress without dictionary
        Brotli_Output_Handler-->>PHP_Application: Set Content-Encoding: br, Vary: Accept-Encoding
    end
    PHP_Application-->>HTTP_Client: Compressed response with headers
Loading

Poem

🐰
With a hop and a skip, a new trick in my hat—
Brotli learns dictionaries, imagine that!
Compressing with wisdom from Shakespearean lore,
Now streams and handlers are smarter than before.
Tests and docs join in the fun,
Brotli and bunny—together as one!
🥕✨

Note

⚡️ AI Code Reviews for VS Code, Cursor, Windsurf

CodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback.
Learn more here.


Note

⚡️ Faster reviews with caching

CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.
Enjoy the performance boost—your workflow just got faster.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@kjdev
Copy link
Owner Author

kjdev commented May 16, 2025

@coderabbitai review

Copy link

coderabbitai bot commented May 16, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🔭 Outside diff range comments (1)
brotli.c (1)

341-364: 🛠️ Refactor suggestion

strstr misses case-insensitive header tokens

Accept-Encoding token names are case-insensitive per RFC 9110.
Using strstr() means clients sending BR / Br / DcB, etc. will be ignored.

- if (strstr(Z_STRVAL_P(enc), "br")) {
+ if (php_stristr(Z_STRVAL_P(enc), "br")) {
...
- if (strstr(Z_STRVAL_P(enc), "dcb")) {
+ if (php_stristr(Z_STRVAL_P(enc), "dcb")) {

You already link against ext/standard; php_stristr is available there.

🧹 Nitpick comments (9)
tests/dictionary_ob.phpt (1)

18-19: Add error handling for missing data file

The test includes data.inc without checking if it exists. Consider adding error handling or a skip condition if the data file is missing:

-include(dirname(__FILE__) . '/data.inc');
-echo "{$data}";
+$data_file = dirname(__FILE__) . DIRECTORY_SEPARATOR . 'data.inc';
+if (!file_exists($data_file)) {
+    die("Skip: data.inc file not found");
+}
+include($data_file);
+echo "{$data}";
README.md (5)

123-128: Fix incomplete sentence in documentation

The explanation for the dictionary parameter needs a complete sentence.

- > can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
+ > This parameter can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
🧰 Tools
🪛 LanguageTool

[style] ~127-~127: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


154-159: Fix incomplete sentence in documentation

Similar to the previous comment, the explanation needs a complete sentence.

- > can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
+ > This parameter can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
🧰 Tools
🪛 LanguageTool

[style] ~158-~158: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


187-192: Fix incomplete sentence in documentation

Same issue with incomplete sentence.

- > can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
+ > This parameter can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
🧰 Tools
🪛 LanguageTool

[style] ~191-~191: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


242-247: Fix incomplete sentence in documentation

Same issue with incomplete sentence.

- > can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
+ > This parameter can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
🧰 Tools
🪛 LanguageTool

[style] ~246-~246: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


405-410: Consistent list formatting for HTTP headers

The markdown list uses dashes (-) while the rest of the document uses asterisks (*) for lists. For consistency, consider using asterisks.

-> `Accept-Encoding: dcb`
-> `Available-Dictionary: :<base64-hash>:`
+* `Accept-Encoding: dcb`
+* `Available-Dictionary: :<base64-hash>:`
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

408-408: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


409-409: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

brotli.stub.php (1)

59-65: Constant should be declared as literal bool instead of UNKNOWN in generated stub

Leaving the value as UNKNOWN works for generate_stubs.php, but once the extension is distributed the constant will appear as null, which defeats the purpose of an easy run-time check (if (BROTLI_DICTIONARY_SUPPORT) …).
Provide the literal true/false so that users do not need to rely on run-time reflection or defined() work-arounds.

-const BROTLI_DICTIONARY_SUPPORT = UNKNOWN;
+const BROTLI_DICTIONARY_SUPPORT = true;  // overwritten at compile time
brotli.c (2)

238-260: Error message wording is confusing

The fallback branch emits
"failed to not supported compression dictionary"
and similarly for decompression. The double-negative makes the message hard to read.

-php_error_docref(NULL, E_WARNING,
-                 "%sfailed to not supported compression dictionary",
+php_error_docref(NULL, E_WARNING,
+                 "%sdictionary support is not enabled in this build",

Apply the same wording fix in php_brotli_context_create_decoder_ex.


546-571: HTTP headers risk multiple Vary duplicates

When dcb is not accepted you send Vary: Accept-Encoding in the “else”
branch, but the “start” branch (lines 478-489) may have already added the
same header, leading to duplicate Vary entries.
Consider adding the header only once or using sapi_update_header to merge
values.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between e29e59f and 5abfffc.

📒 Files selected for processing (22)
  • README.md (14 hunks)
  • brotli.c (25 hunks)
  • brotli.stub.php (1 hunks)
  • config.m4 (2 hunks)
  • config.w32 (2 hunks)
  • php_brotli.h (1 hunks)
  • tests/data.dict (1 hunks)
  • tests/dictionary_args.phpt (1 hunks)
  • tests/dictionary_basic.phpt (1 hunks)
  • tests/dictionary_incremental.phpt (1 hunks)
  • tests/dictionary_named_args.phpt (1 hunks)
  • tests/dictionary_named_args_incremental.phpt (1 hunks)
  • tests/dictionary_ob.phpt (1 hunks)
  • tests/dictionary_streams.phpt (1 hunks)
  • tests/files/ob_hi.br (1 hunks)
  • tests/ob_001.phpt (1 hunks)
  • tests/ob_003.phpt (1 hunks)
  • tests/ob_004.phpt (1 hunks)
  • tests/ob_dcb_001.phpt (1 hunks)
  • tests/ob_dcb_002.phpt (1 hunks)
  • tests/ob_dcb_003.phpt (1 hunks)
  • tests/ob_dcb_004.phpt (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
brotli.c (1)
brotli.stub.php (3)
  • Context (121-123)
  • Context (129-131)
  • compress (96-96)
🪛 GitHub Actions: Windows
tests/ob_dcb_001.phpt

[error] 1-85: Output handler test failed with invalid available-dictionary PHP request shutdown notice.

tests/ob_dcb_004.phpt

[error] 1-10: Output handler test failed with invalid available-dictionary and output mismatch.

tests/ob_dcb_003.phpt

[error] 1-83: Output handler test failed with invalid available-dictionary PHP request shutdown notice and output mismatch.

tests/dictionary_ob.phpt

[error] 1-1: Test failed due to output mismatch in dictionary output handler test.

brotli.c

[warning] 21-21: Warning C4068: unknown pragma 'GCC'


[warning] 25-25: Warning C4068: unknown pragma 'GCC'


[warning] 28-28: Warning C4068: unknown pragma 'GCC'


[warning] 626-626: Warning C4133: 'function': incompatible types - from 'int (__cdecl *)(void **,php_output_context *)' to 'php_output_handler_context_func_t'


[warning] 1295-1295: Warning C4133: 'function': incompatible types - from 'int (__cdecl *)(const char *,size_t)' to 'php_output_handler_conflict_check_t'

🪛 GitHub Actions: Linux
tests/ob_dcb_003.phpt

[error] 1-1: Test failed due to invalid available-dictionary in output handler: dcb. Notice: PHP Request Shutdown: brotli: invalid available-dictionary: request(:test:) != actual(4SycXHyLhvKPL/nNaYXl60myMwris2L4X1O+q5tW1XE=).

tests/ob_dcb_002.phpt

[error] 1-1: Test failed due to invalid available-dictionary in output handler: dcb. Notice: PHP Request Shutdown: brotli: not found available-dictionary.

🪛 LanguageTool
README.md

[style] ~127-~127: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~158-~158: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~191-~191: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~246-~246: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)

🪛 markdownlint-cli2 (0.17.2)
README.md

408-408: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


409-409: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: ci (8.1-zts-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.3, x86, nts, vs16, true)
🔇 Additional comments (32)
tests/files/ob_hi.br (1)

1-2: Binary reference file added for test expectations.

This is a binary test file containing Brotli-compressed content of "hi\n" used as an external reference for output buffering tests. This approach of using external files for binary content expectations is a good practice to keep test files clean and maintainable.

php_brotli.h (1)

37-37: New global variable added for dictionary support.

The addition of output_compression_dict to the module globals structure properly supports the new dictionary compression feature. This variable will store a path to the dictionary file that can be used during output compression.

config.w32 (1)

20-20: Dictionary support unconditionally enabled for bundled library.

The dictionary feature is correctly enabled for the bundled Brotli library case, which is appropriate since you can control the version bundled with the extension.

tests/ob_003.phpt (1)

19-20: Improved test structure using external binary reference.

The test now references an external binary file for expectations rather than embedding binary content directly in the test file. This is a better practice for maintaining test files with binary expectations.

tests/data.dict (1)

1-81: Well-chosen dictionary content for compression testing

The Shakespearean excerpts make for an excellent dictionary file with diverse vocabulary and natural language patterns that will effectively demonstrate Brotli's dictionary compression capabilities. The three distinct passages with sufficient spacing between them provide good test cases for the dictionary functionality.

tests/ob_001.phpt (1)

18-22: Good improvement to test verification approach

Switching from inline binary pattern matching to external file verification is a better practice for testing binary output like Brotli-compressed content. The explicit header checks also ensure proper HTTP response attributes are verified.

tests/ob_004.phpt (1)

19-20: Consistent improvement to test verification

This change follows the same pattern as in ob_001.phpt, moving to external file verification for binary compressed output, which improves maintainability and consistency across the test suite.

tests/dictionary_ob.phpt (2)

16-16: ⚠️ Potential issue

Fix Windows path handling to resolve pipeline failure

The Windows pipeline failure suggests an issue with the dictionary path. Windows uses backslashes for file paths, which can cause problems. Use DIRECTORY_SEPARATOR for cross-platform compatibility:

-ini_set('brotli.output_compression_dict', dirname(__FILE__) . '/data.dict');
+ini_set('brotli.output_compression_dict', dirname(__FILE__) . DIRECTORY_SEPARATOR . 'data.dict');

Likely an incorrect or invalid review comment.


24-25:

❌ Incorrect review comment

Verify header expectations for dictionary-based compression

The headers being checked don't seem to fully reflect dictionary-based compression. According to the AI summary for this file, we might expect Content-Encoding: dcb and Vary: Accept-Encoding, Available-Dictionary for dictionary compression. Please verify if these headers should be updated to match the expected behavior.


🏁 Script executed:

#!/bin/bash
# Check if other dictionary tests verify different headers
grep -r "Content-Encoding:" tests/dictionary_*.phpt
grep -r "Vary:" tests/dictionary_*.phpt

Length of output: 580


🏁 Script executed:

#!/bin/bash
# Display the content of the dictionary_ob.phpt test to understand its context
sed -n '1,200p' tests/dictionary_ob.phpt

Length of output: 600


No changes needed for headers in dictionary_ob.phpt
Dictionary-based Brotli output still uses Content-Encoding: br and only varies on Accept-Encoding. None of the other tests/dictionary_*.phpt tests expect a dcb encoding or an Available-Dictionary Vary header. Please disregard the suggestion to update these headers.

Likely an incorrect or invalid review comment.

tests/ob_dcb_003.phpt (1)

1-25:

❓ Verification inconclusive

Test case for invalid dictionary handling is well-designed.

This test appropriately validates the behavior when a client requests compression with a dictionary that doesn't match the server's dictionary. It correctly:

  1. Checks for dictionary support and appropriate SAPI
  2. Sets up the environment to simulate a client request with an invalid dictionary
  3. Validates the error message showing the mismatch between requested and actual dictionary

The pipeline failures indicate this test might be failing. Run the following to check if the expected output pattern matches the actual output:


🏁 Script executed:

#!/bin/bash
# Compare the actual vs expected output patterns for this test
php run-tests.php -q tests/ob_dcb_003.phpt

Length of output: 85


🏁 Script executed:

#!/bin/bash
# Locate the PHPT test runner and run the test
RUNNER=$(find . -maxdepth 3 -type f -name run-tests.php | head -n1)
echo "Using runner: $RUNNER"
if [ -n "$RUNNER" ]; then
  php "$RUNNER" -q tests/ob_dcb_003.phpt || true
else
  echo "run-tests.php not found in repository."
fi

Length of output: 213


Confirm invalid available-dictionary error output

I wasn’t able to locate a PHPT test runner (run-tests.php) in this repo. Please manually execute tests/ob_dcb_003.phpt using your project’s PHPT harness or CI setup and verify that it produces a Notice similar to:

PHP Request Shutdown: brotli: invalid available-dictionary: request(:test:) != actual(<hash>) in Unknown on line 0

This will ensure the server correctly detects when the client’s HTTP_AVAILABLE_DICTIONARY doesn’t match the server’s dictionary.

🧰 Tools
🪛 GitHub Actions: Linux

[error] 1-1: Test failed due to invalid available-dictionary in output handler: dcb. Notice: PHP Request Shutdown: brotli: invalid available-dictionary: request(:test:) != actual(4SycXHyLhvKPL/nNaYXl60myMwris2L4X1O+q5tW1XE=).

🪛 GitHub Actions: Windows

[error] 1-83: Output handler test failed with invalid available-dictionary PHP request shutdown notice and output mismatch.

tests/ob_dcb_002.phpt (1)

1-24:

❓ Verification inconclusive

Test case for missing dictionary handling is well-structured.

This test properly validates the behavior when a client requests dictionary-based compression without specifying which dictionary to use. It:

  1. Verifies dictionary support and SAPI requirements
  2. Sets up a request for dictionary compression without providing an Available-Dictionary header
  3. Checks for the appropriate error notice

The pipeline failures suggest this test might be failing. Run the following to examine the actual output:


🏁 Script executed:

#!/bin/bash
# Check the actual output against expected pattern
php run-tests.php -q tests/ob_dcb_002.phpt

Length of output: 85


Test runner missing – manual verification needed

The tests/ob_dcb_002.phpt file correctly covers the case where Brotli dictionary compression is requested without specifying an available dictionary. However, I wasn’t able to locate or invoke the run-tests.php harness in this repository, so I can’t confirm the actual output matches the expected notice.

Please verify this test by running it with your PHP test harness (e.g., php run-tests.php -q tests/ob_dcb_002.phpt or the appropriate test runner) and ensure it produces:

Notice: PHP Request Shutdown: brotli: not found available-dictionary in Unknown on line 0
🧰 Tools
🪛 GitHub Actions: Linux

[error] 1-1: Test failed due to invalid available-dictionary in output handler: dcb. Notice: PHP Request Shutdown: brotli: not found available-dictionary.

tests/ob_dcb_004.phpt (1)

1-27:

❓ Verification inconclusive

Good test for fallback behavior from invalid dictionary to standard compression.

This test properly validates that when dictionary-based compression fails due to an invalid dictionary, the system falls back to standard Brotli compression if supported by the client. It correctly:

  1. Sets up a client request supporting both standard and dictionary-based compression
  2. Provides an invalid dictionary identifier
  3. Verifies the compressed output matches expectations
  4. Confirms the response headers indicate standard Brotli compression was used

The Windows pipeline failure suggests platform-specific issues. Run the following to check file path handling or other Windows-specific behaviors:


🏁 Script executed:

#!/bin/bash
# Examine differences in test execution between platforms
php run-tests.php -q tests/ob_dcb_004.phpt

Length of output: 85


Unable to locate test runner – please verify manually on a proper PHP source checkout

Good test for fallback behavior from an invalid dictionary to standard Brotli compression. It correctly:

  • Simulates a client that accepts both br and dcb
  • Supplies an invalid dictionary identifier
  • Confirms the output matches the pre-compressed reference
  • Checks that headers show Content-Encoding: br and Vary: Accept-Encoding

However, the sandbox wasn’t able to run the test because run-tests.php wasn’t found. To ensure this works and to diagnose the Windows pipeline failures (likely due to path or line-ending differences), please run in your PHP source tree:

php run-tests.php -q tests/ob_dcb_004.phpt

and report any errors or mismatches on Windows so we can address them.

🧰 Tools
🪛 GitHub Actions: Windows

[error] 1-10: Output handler test failed with invalid available-dictionary and output mismatch.

config.m4 (2)

66-73: Proper conditional dictionary support based on library version.

The implementation correctly checks for Brotli library version 1.1.0 or newer before enabling dictionary support, which ensures the feature is only enabled when the underlying library supports it.


117-117: Appropriate unconditional dictionary support for bundled Brotli.

For the bundled Brotli library, dictionary support is unconditionally enabled, which is appropriate since the bundled version presumably supports dictionaries.

tests/dictionary_basic.phpt (1)

1-50: Well-structured and comprehensive test for dictionary support!

This test effectively verifies Brotli dictionary-based compression and decompression functionality:

  • Properly checks for dictionary support availability
  • Tests both procedural functions and namespaced equivalents
  • Verifies successful compression/decompression with the correct dictionary
  • Validates failure cases when no dictionary or incorrect dictionary is provided

The test structure and expected output correctly align with how dictionary-based compression should work.

tests/dictionary_args.phpt (1)

1-81: Comprehensive test covering all parameter combinations!

This test thoroughly verifies dictionary-based compression with various parameters:

  • Tests all compression levels (0-11)
  • Tests all compression modes (generic, text, font)
  • Tests compression both with and without dictionary
  • Uses a well-structured helper function to test each combination

The test methodology is excellent and provides good coverage of all parameter combinations.

tests/ob_dcb_001.phpt (2)

16-17: Well-implemented dictionary configuration for output compression

The new INI setting brotli.output_compression_dict is appropriately used to specify the dictionary file path for the output handler.


24-26: Proper header expectations for dictionary-based compression

The test correctly expects:

  1. Content-Encoding: dcb to indicate dictionary-based compression
  2. Vary header with both Accept-Encoding and Available-Dictionary

This follows proper HTTP protocol for negotiated compression with dictionaries.

tests/dictionary_named_args_incremental.phpt (2)

15-30: Well-implemented incremental compression with named arguments

The implementation correctly demonstrates:

  • Proper use of PHP 8.0+ named arguments syntax
  • Incremental compression with dictionary support
  • Appropriate finalization with BROTLI_FINISH mode

The code structure is clean and showcases the new dictionary parameter effectively.


39-50: Well-implemented incremental decompression with named arguments

The implementation correctly demonstrates:

  • Proper initialization of the decompression context with a dictionary
  • Incremental decompression of compressed data in chunks
  • Appropriate finalization with BROTLI_FINISH mode

The test effectively validates the round-trip functionality of the dictionary-based compression.

tests/dictionary_streams.phpt (1)

1-47: Well-structured test for dictionary-based compression with streams

This test comprehensively verifies that dictionary-based compression and decompression works correctly with PHP stream contexts and the compress.brotli:// stream wrapper. It checks:

  1. Dictionary-based compression through stream context
  2. Size verification of compressed data
  3. Decompression through stream context
  4. Direct decompression with brotli_uncompress using dictionary

The test design is solid with proper error checking and cleanup.

tests/dictionary_named_args.phpt (1)

1-128: Thorough test for named arguments with dictionary support

This test comprehensively covers all combinations of named arguments for both compression and decompression functions with dictionary support. It properly:

  1. Verifies behavior when required arguments are missing
  2. Tests different combinations of optional parameters
  3. Confirms data integrity through round-trip compression/decompression
  4. Validates correct error messages

The test has good coverage for PHP 8.0+ named arguments syntax with the new dictionary parameter.

tests/dictionary_incremental.phpt (1)

1-55: Solid test for incremental dictionary-based compression/decompression

This test effectively validates incremental compression and decompression using a dictionary across different Brotli modes:

  1. Tests both BROTLI_PROCESS and BROTLI_FLUSH modes
  2. Validates correct context initialization with dictionary
  3. Checks chunk-by-chunk processing (6-byte chunks)
  4. Verifies data integrity through complete round-trip workflow

The approach of splitting data into small chunks ensures the incremental API works correctly with dictionaries.

README.md (9)

47-48: Documentation updates for configuration options

The default compression level has been updated to 11 (BROTLI_COMPRESS_LEVEL_DEFAULT), and a new configuration directive for dictionary-based output compression has been added. This aligns with the newly implemented dictionary support.


64-68: Clear documentation for the new dictionary directive

The documentation clearly explains the purpose of the new brotli.output_compression_dict directive.


73-86: Well-documented constants including dictionary support flag

The added constants are well-documented, including mode constants and the critical BROTLI_DICTIONARY_SUPPORT flag that indicates whether dictionary support is available. The note about requiring brotli library version 1.1.0+ is important for users.


102-103: Updated function signature for brotli_compress

The function signature has been correctly updated to include the optional dictionary parameter.


139-140: Updated function signature for brotli_uncompress

The function signature has been correctly updated to include the optional dictionary parameter.


170-171: Updated function signature for brotli_compress_init

The function signature has been correctly updated to include the optional dictionary parameter.


235-236: Updated function signature for brotli_uncompress_init

The function signature has been correctly updated to include the optional dictionary parameter.


289-294: Updated namespace function signatures

All namespace function signatures have been correctly updated to include the optional dictionary parameter.


364-403: Excellent examples for dictionary usage

The examples section provides comprehensive coverage of dictionary usage with:

  1. Basic compression/decompression
  2. Incremental API
  3. Stream wrapper with context
  4. Output handler configuration

This will be very helpful for users adopting the new dictionary feature.

@kjdev kjdev force-pushed the feat/dictionary branch from 86c6a8a to 54b2410 Compare May 18, 2025 23:37
@kjdev kjdev force-pushed the feat/dictionary branch from 54b2410 to 7733074 Compare May 19, 2025 00:22
@kjdev kjdev force-pushed the feat/dictionary branch from 7733074 to 48c719a Compare May 19, 2025 00:34
@kjdev
Copy link
Owner Author

kjdev commented May 19, 2025

@coderabbitai review

Copy link

coderabbitai bot commented May 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
README.md (2)

69-69: Improve document readability with complete sentences

The notes about BROTLI_DICTIONARY_SUPPORT are written as sentence fragments without a subject. This occurs in multiple places throughout the documentation.

Add a subject to form complete sentences:

-> can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled
+> This feature can be used when `BROTLI_DICTIONARY_SUPPORT` is enabled

Also applies to: 129-129, 160-160, 193-193, 248-248

🧰 Tools
🪛 LanguageTool

[style] ~69-~69: To form a complete sentence, be sure to include a subject.
Context: ... used by the output handler. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


407-411: Fix inconsistent list style

The unordered list in the notes about Compression Dictionary Transport uses dash (-) instead of asterisk (*), which is inconsistent with the rest of the document.

-> `Accept-Encoding: dcb`
-> `Available-Dictionary: :<base64-hash>:`
+* `Accept-Encoding: dcb`
+* `Available-Dictionary: :<base64-hash>:`
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

410-410: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


411-411: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 5abfffc and 48c719a.

📒 Files selected for processing (23)
  • .gitattributes (1 hunks)
  • README.md (14 hunks)
  • brotli.c (25 hunks)
  • brotli.stub.php (1 hunks)
  • config.m4 (2 hunks)
  • config.w32 (2 hunks)
  • php_brotli.h (1 hunks)
  • tests/data.dict (1 hunks)
  • tests/dictionary_args.phpt (1 hunks)
  • tests/dictionary_basic.phpt (1 hunks)
  • tests/dictionary_incremental.phpt (1 hunks)
  • tests/dictionary_named_args.phpt (1 hunks)
  • tests/dictionary_named_args_incremental.phpt (1 hunks)
  • tests/dictionary_ob.phpt (1 hunks)
  • tests/dictionary_streams.phpt (1 hunks)
  • tests/files/ob_hi.br (1 hunks)
  • tests/ob_001.phpt (1 hunks)
  • tests/ob_003.phpt (1 hunks)
  • tests/ob_004.phpt (1 hunks)
  • tests/ob_dcb_001.phpt (1 hunks)
  • tests/ob_dcb_002.phpt (1 hunks)
  • tests/ob_dcb_003.phpt (1 hunks)
  • tests/ob_dcb_004.phpt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitattributes
🚧 Files skipped from review as they are similar to previous changes (20)
  • php_brotli.h
  • tests/files/ob_hi.br
  • tests/ob_003.phpt
  • config.w32
  • tests/ob_004.phpt
  • tests/ob_001.phpt
  • tests/data.dict
  • tests/dictionary_basic.phpt
  • tests/ob_dcb_003.phpt
  • tests/ob_dcb_004.phpt
  • tests/ob_dcb_001.phpt
  • tests/dictionary_ob.phpt
  • tests/dictionary_args.phpt
  • tests/dictionary_named_args_incremental.phpt
  • tests/ob_dcb_002.phpt
  • tests/dictionary_incremental.phpt
  • tests/dictionary_streams.phpt
  • tests/dictionary_named_args.phpt
  • config.m4
  • brotli.stub.php
🧰 Additional context used
🧬 Code Graph Analysis (1)
brotli.c (1)
brotli.stub.php (3)
  • Context (121-123)
  • Context (129-131)
  • compress (96-96)
🪛 LanguageTool
README.md

[style] ~69-~69: To form a complete sentence, be sure to include a subject.
Context: ... used by the output handler. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~129-~129: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~160-~160: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~193-~193: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)


[style] ~248-~248: To form a complete sentence, be sure to include a subject.
Context: ... * dict The dictionary data. > can be used when `BROTLI_DICTIONARY_SUPPORT...

(MISSING_IT_THERE)

🪛 markdownlint-cli2 (0.17.2)
README.md

410-410: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)


411-411: Unordered list style
Expected: asterisk; Actual: dash

(MD004, ul-style)

⏰ Context from checks skipped due to timeout of 90000ms (34)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.0-alpine, ghcr.io/kjdev/php, linux/amd64)
  • GitHub Check: ci (7.0-alpine, ghcr.io/kjdev/php, linux/amd64)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.3-zts-alpine, php, linux/amd64)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.0-alpine, ghcr.io/kjdev/php, linux/amd64)
  • GitHub Check: ci (7.0-alpine, ghcr.io/kjdev/php, linux/amd64)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.3-zts-alpine, php, linux/amd64)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (7.0-alpine, ghcr.io/kjdev/php, linux/amd64)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.3-zts-alpine, php, linux/amd64)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.4-alpine, apcu, php, linux/amd64)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.1-alpine, apcu, php, linux/386)
  • GitHub Check: ci (8.2-alpine, apcu, php, linux/386)
🔇 Additional comments (24)
README.md (1)

365-405: Well-documented examples for the new dictionary feature

The added dictionary examples thoroughly cover all use cases (basic usage, incremental compression/decompression, streams, and output handler), providing clear guidance for users.

brotli.c (23)

8-10: Appropriate header inclusions for new features

The added includes for hash functions and base64 support are necessary for the dictionary digest functionality.


119-121: Good macro definition for nullable strings

Creating a macro for nullable string parameters improves code consistency and maintainability.


155-159: Well-structured context enhancements for dictionary support

The changes to the php_brotli_context struct properly add dictionary support, correctly using conditional compilation with #if defined(USE_BROTLI_DICTIONARY) to ensure backward compatibility.


171-175: Proper initialization handling for dictionary member

The context initialization function correctly initializes the new dictionary member and digest buffer, ensuring no uninitialized memory is used.


238-259: Complete error handling for dictionary support

The implementation correctly handles both cases where dictionary support is enabled and disabled, providing appropriate error messages to users.


287-304: Consistent implementation for decoder dictionary support

The decoder dictionary implementation mirrors the encoder implementation, maintaining consistency in the codebase.


321-326: Proper resource cleanup for dictionaries

The context cleanup function correctly frees the dictionary resources when dictionary support is enabled, preventing memory leaks.


341-343: Good use of macros for encoding flags

Using macros for encoding flags improves code readability and maintainability.


362-366: Proper conditional compilation for DCB support

The DCB (Dictionary Compression for Brotli) encoding detection is correctly guarded by the USE_BROTLI_DICTIONARY macro.


373-467: Well-implemented dictionary loading functionality

The dictionary loading function is comprehensive, handling:

  1. Path validation
  2. Stream operations
  3. Dictionary verification using SHA-256
  4. Error handling
  5. Conditional dictionary support

Note: The commented-out dictionary size limit code (lines 401-411) is consistent with the maintainer's decision in a previous review to not impose size limits on dictionaries.


469-491: Clean implementation of output handler context initialization

The implementation properly initializes the output handler context, loads dictionaries when needed, and handles errors appropriately.


503-511: Proper header handling for dictionary compression

The output handler correctly sets different headers based on whether dictionary compression is enabled, using the appropriate Vary header values.


571-593: Comprehensive handling of dictionary-based compression

The implementation:

  1. Properly creates the DCB format with marker, identifier, and SHA-256 digest
  2. Sets appropriate headers for both standard and dictionary-based compression
  3. Handles the output buffer correctly

767-769: Good use of string macros

Using string conversion macros to convert numeric values to strings is a good practice.


775-781: Properly defined INI settings

The INI settings are properly defined with appropriate default values and handlers.


1074-1077: Stream context handling for dictionaries

The implementation correctly extracts the dictionary from stream context options.


1085-1087: Memory management for dictionary strings

All allocated dictionary strings are properly released, preventing memory leaks.

Also applies to: 1102-1104, 1121-1123, 1127-1129


1306-1312: Proper constant registration based on support

The BROTLI_DICTIONARY_SUPPORT constant is registered with the correct value based on whether dictionary support is enabled.


1427-1431: Information about dictionary support in phpinfo

The implementation adds dictionary support information to the PHP info output, making it easy for users to check if the feature is available.


1478-1486: Parameter handling in brotli_compress

The implementation correctly parses and handles the new dictionary parameter.


1536-1543: Parameter handling in brotli_compress_init

The implementation correctly parses and handles the new dictionary parameter for compression initialization.


1670-1677: Parameter handling in brotli_uncompress

The implementation correctly parses and handles the new dictionary parameter for decompression.


1725-1730: Parameter handling in brotli_uncompress_init

The implementation correctly parses and handles the new dictionary parameter for decompression initialization.

@kjdev kjdev merged commit 48c719a into master May 19, 2025
157 of 197 checks passed
@kjdev kjdev deleted the feat/dictionary branch May 19, 2025 01:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant