feat: Add Opt-in Formula Reading Support #137
Open
Overview
This PR introduces optional formula reading capabilities to python-calamine, allowing users to access the underlying formulas in spreadsheet cells rather than just their calculated values.
Motivation
Previously, python-calamine only provided access to cell values (the calculated results of formulas). Users had no way to access the actual formula expressions, which is essential for:
Implementation
Core Changes
Opt-in Design: Formula reading is disabled by default (`read_formulas=False`) to maintain backward compatibility and avoid performance overhead for users who don't need formulas.
New API Parameter: Added a `read_formulas` parameter to all workbook creation methods:
`CalamineWorkbook.from_object(path_or_filelike, read_formulas=False)`
`CalamineWorkbook.from_path(path, read_formulas=False)`
`CalamineWorkbook.from_filelike(filelike, read_formulas=False)`
`load_workbook(path_or_filelike, read_formulas=False)`
Formula Iterator: Added a `CalamineSheet.iter_formulas()` method that returns a `CalamineFormulaIterator` with consistent dimensions matching the data iterator.
Performance Optimization: Implemented on-demand coordinate mapping instead of pre-allocating expanded ranges, ensuring minimal memory overhead.
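The on-demand mapping described above can be illustrated in plain Python. This is only a sketch under stated assumptions, not the PR's actual implementation: `SparseFormulas`, its fields, and `iter_formula_rows` are hypothetical names. It stores formulas only for the cells that have one and looks each coordinate up while iterating over the data range's dimensions, so no full-size grid is allocated up front.

```python
from typing import Dict, Iterator, List, Tuple

Coord = Tuple[int, int]  # absolute (row, column), zero-based


class SparseFormulas:
    """Hypothetical stand-in for a formula range: only formula cells are stored."""

    def __init__(self, cells: Dict[Coord, str]) -> None:
        self._cells = cells

    def get_value(self, coord: Coord) -> str:
        # On-demand lookup; cells without a formula yield an empty string.
        return self._cells.get(coord, "")


def iter_formula_rows(
    formulas: SparseFormulas, start: Coord, height: int, width: int
) -> Iterator[List[str]]:
    """Yield one row at a time, sized to the data range rather than the formula range."""
    start_row, start_col = start
    for r in range(height):
        yield [
            formulas.get_value((start_row + r, start_col + c))
            for c in range(width)
        ]


# Data range starts at A1 and is 2 rows x 3 columns; only C2 holds a formula.
formulas = SparseFormulas({(1, 2): "SUM(A2:B2)"})
rows = list(iter_formula_rows(formulas, start=(0, 0), height=2, width=3))
print(rows)
```

Because rows are produced lazily, memory use scales with the number of formula cells plus one row at a time, while the yielded rows still line up with the data iterator's dimensions.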
Technical Details
`get_value()` lookups rather than expanding formula ranges upfront
`CalamineCellIterator` and `CalamineFormulaIterator` expose `position`, `start`, `width`, and `height` properties for coordinate recovery
File Format Support
Supports formula reading across all major spreadsheet formats:
Formula syntax varies by format (e.g., ODS uses `of:=SUM([.A1:.B1])` vs Excel's `SUM(A1:B1)`).
API Examples
Backward Compatibility
Formula reading remains disabled by default (`read_formulas=False`).
Testing
Comprehensive test suite covering:
Performance Impact