Skip to content

Implement hash_map macro #144070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

stifskere
Copy link

Implementation of #144032

Implements the hash_map macro under std/src/macros.rs.

@rustbot
Copy link
Collaborator

rustbot commented Jul 17, 2025

r? @thomcc

rustbot has assigned @thomcc.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jul 17, 2025
@rustbot

This comment has been minimized.

@rustbot rustbot added has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 17, 2025
@stifskere stifskere force-pushed the feat/macros/hash_map branch from 3d020f6 to b5692af Compare July 17, 2025 12:57
@rustbot rustbot removed has-merge-commits PR has merge commits, merge with caution. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jul 17, 2025
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@NyxCode
Copy link

NyxCode commented Jul 17, 2025

Wouldn't using one of HashMap::with_capacity(), HashMap::reserve(), FromIterator<(K, V)>, Extend<(K, V)> or From<[(K, V); N]> be better, avoiding unnecessary allocations?

@stifskere
Copy link
Author

stifskere commented Jul 17, 2025

Wouldn't using one of HashMap::with_capacity(), HashMap::reserve(), FromIterator<(K, V)>, Extend<(K, V)> or From<[(K, V); N]> be better, avoiding unnecessary allocations?

I was thinking about HashMap::with_capacity() but I couldn't see how I could convert a repetition into a number.

From... would essentially be the same, just reallocation, so for clarity I just expanded into insert.

I could see that 0 $(+ 1)* could be evaluated constantly by the compiler, thus we could get a size, but I'm not sure if it's the best way.
It could not be the best way, because that expansion doesn't at all include a reference to the repetition declaration, so it can't be. There is no logical way of getting the size with a macro_rules unless with more complexity we make a recursive macro, which would limit the hash_map size to 254 without extra compiler parameters.

I will add a question about the size in the tracking issue.

@NyxCode
Copy link

NyxCode commented Jul 17, 2025

All the options i mentioned should be equivalent to with_capacity as far as I can tell, though I don't know if the optimizer can look through all options equally well (e.g. wrt. stack usage).
Both these should work, for example:

let len = 1 $(+ {stringify!($k); 1})*;
let mut map = HashMap::with_capacity(len);
$( map.insert($k, $v); )*
HashMap::from([$(($k, $v)),*])

/// [`HashMap::new`]: crate::collections::HashMap::new
/// [`HashMap::insert`]: crate::collections::HashMap::insert
#[macro_export]
#[allow_internal_unstable(hint_must_use)]
Copy link
Member

@Noratrieb Noratrieb Jul 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you don't use of this feature (maybe it would make sense to?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it because it makes sense that if you create a hash map you must use it, otherwise don't create it, I don't see the point otherwise.

@stifskere
Copy link
Author

stifskere commented Jul 18, 2025

All the options i mentioned should be equivalent to with_capacity as far as I can tell, though I don't know if the optimizer can look through all options equally well (e.g. wrt. stack usage). Both these should work, for example:

let len = 1 $(+ {stringify!($k); 1})*;
let mut map = HashMap::with_capacity(len);
$( map.insert($k, $v); )*
HashMap::from([$(($k, $v)),*])

I think I'd rather go for intrinsics, except since HashMap is a wrapper, I don't think there are many. I feel that if someone decides to expand the macro via IDE or cargo expand, they will find a lot of strings and be like "???". Probably open an issue later.

Otherwise, we could generate a constant count probably, the compiler should optimize this, and it should not have runtime cost, but a bit extra of constant cost, which may mean that if there are a lot of invocations the compiler might go a bit slower.

macro_rules! hash_map {
    () => {{
        ::std::collections::HashMap::new()
    }};

    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut map = ::std::collections::HashMap::with_capacity(
            const { [$({ let _ = &$key; () }),*].len() }
        );

        $( map.insert($key, $value); )*
        map
    }}
}
image

@Noratrieb
Copy link
Member

I don't think intrinsics make sense here, this is a library type. I think the repeat insert method makes the most sense as it doesn't have the potential stack usage problem of from. Repeat insert is also how extend is implemented under the hood: https://github.com/rust-lang/hashbrown/blob/670213fb32d208759d0331996894e05604de0c18/src/map.rs#L4562.

with_capacity definitely makes sense though. You can count the elements like this: https://lukaswirth.dev/tlborm/decl-macros/building-blocks/counting.html

@Noratrieb
Copy link
Member

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

@stifskere
Copy link
Author

stifskere commented Jul 18, 2025

I don't think intrinsics make sense here, this is a library type. I think the repeat insert method makes the most sense as it doesn't have the potential stack usage problem of from. Repeat insert is also how extend is implemented under the hood: https://github.com/rust-lang/hashbrown/blob/670213fb32d208759d0331996894e05604de0c18/src/map.rs#L4562.

with_capacity definitely makes sense though. You can count the elements like this: https://lukaswirth.dev/tlborm/decl-macros/building-blocks/counting.html

I would go for the slice length approach, but resolved constantly, which is technically what I'm doing just that instead of making a variable using replace_expr!

So probably

macro_rules! hash_map {
    () => {{
        ::std::collections::HashMap::new()
    }};

    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        let mut map = ::std::collections::HashMap::with_capacity(
            const { [$($crate::hash_map!(@internal count $key)),*].len() }
        );

        $( map.insert($key, $value); )*
        map
    }};

    (@internal count $i:tt) => {()}
}

This would work best if there is some sort of thing inside the standard library so that the last branch is only used internally in an explicit way, which would also filter out the usage inside the second branch.

@Noratrieb
Copy link
Member

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

You should try this, which I only remembered after posting the first message

@stifskere
Copy link
Author

Oh actually you might be able to use the unstable macro_metavar_expr_count for the counting too, I forgot about that :D

You should try this, which I only remembered after posting the first message

The problem with this is that it would delay stabilization a lot, so I'm not sure, I don't think adding unstable to unstable helps

@Noratrieb
Copy link
Member

You're allowed to use unstable features internally with #[allow_internal_unstable()] and listing out the feature names there, on the macro.

@tmke8
Copy link

tmke8 commented Jul 19, 2025

Would it be possible to use HashMap::default() instead of HashMap::new() so that this macro can be used with non-default hashing algorithms as well? (Though this might cause type inference problems...)

EDIT: or rather

HashMap::with_capacity_and_hasher(capacity, Default::default())

for specifying capacity.

@stifskere
Copy link
Author

You're allowed to use unstable features internally with #[allow_internal_unstable()] and listing out the feature names there, on the macro.

Ok, I will check that now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants