Core-corrosion: Adding gitoxide to git-protocol integration tests #780
When you say "all", do you mean all? There are some uses in the tests which function like fixtures (but have random keys, so are not easy to generate upfront). But there are also other uses which are tied to the fact that |
😅 All in this file except for the
Absolutely, I want it done yesterday to finally finish that git-fetch workblock I am still in, technically. I will keep that as MVP'ish as possible too to get the upload-pack building block more quickly. |
It feels like I have been working on this for an eternity, so I did hope that it would 'just work' with some success. Will see what's causing the issue where gitoxide can't find an object that is definitely there but only if the repository was handled by On the bright side, one can already see how the Edit: The issue was caused by |
I think for the most part, using To make it nicer, I would love to remove some of the Besides that, what do you think 🥺? |
As a side-note, while preparing this PR with crates from There are ways to improve the situation already:
)
.unwrap();
.unwrap()
.detach();
repo.reference(
I always found it a bit odd that `reference` creates a reference, but well, names
Me too! And it was aptly called `create_reference` before. But as I worked with this it dawned on me how `libgit2` (or `git2`) is thinking and I started liking it and thought it was beneficial to not unnecessarily change names.
This also means that `reference**s**` now creates an intermediate object for iterators, and you guessed it, that was called `iter_references()` before I took the U-turn there as well.
I don't know if it's good to be so `git2` inspired there, and already thought providing an alternative name for the same functionality potentially even behind feature
[2000 words later]… names are hard.
@@ -317,13 +335,17 @@ where
assert!(out.pack.is_some());
}

let remote_repo = git2::Repository::open(&remote).unwrap();
let mut remote_repo = git::open(remote.clone()).unwrap().into_easy_arc_exclusive(); // without GATs we need the arc version
GATs would be needed on gitoxide's side?
Yes, just there. Here is the code for that and the reason mutable `RefMut<'argh_a_lifetime_that_cannot_be_expressed, Repository>` can't be done yet.
@@ -341,6 +363,8 @@ where
update_tips(&local_repo, &out.wanted_refs).unwrap();
}

// Need to refresh it as it didn't notice the new pack
Interesting... what exactly does it mean to `open` a repository, in terms of I/O, file handles, etc?
I'm asking because `git2` actually does a lot (including resolving and parsing config files), so we ended up using `git2::Repository` from a (database) pool to avoid fd exhaustion and other issues. I suspect that we'd not use easy mode in a lot of places in order to be able to carefully manage resource consumption, but I think documentation about what the cost of an easy repo is would help.
Indeed, documentation is non-existent in places and I plan to rectify that with a quick pass of doc-writing before I move on.
Right now `open(…)` doesn't do much except for mapping all packs in full and reading the local repository config file once. Everything else is lazy.
On the server you would do the same, but instead of holding a `git::Repository` you would hold a `git::EasyArcExclusive`. That way, access to the persistent pack memory maps is shared, but caches and on-demand maps (like packed-refs) are not.
Edit: An `Easy` repo is nothing more than a shared `Repository` and a cache; cloning the `Easy` shares the repo and creates a new cache.
Even though it's not implemented, I would like to have a `refresh()` to reload the object database. And thinking about it, it's easy to demo this API here but would then force the `local` repo to be an `EasyArcExclusive` as well.
Probably worth sketching this out while we are here.
Note that `git-config` will be fully loaded into memory so no handles will be left open.
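To illustrate the "shared repository plus per-handle cache" idea described above, here is a minimal sketch. It does not use gitoxide's actual types: `Repository`, `EasyHandle`, and the `u32` object ids are hypothetical stand-ins, and the `HashMap` merely plays the role of the memory-mapped packs.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the expensive, shared state (think: mapped packs).
/// This is NOT gitoxide's `Repository`.
struct Repository {
    objects: HashMap<u32, String>,
}

/// Hypothetical "easy" handle: a shared repository plus a private cache.
struct EasyHandle {
    repo: Arc<RwLock<Repository>>,
    cache: HashMap<u32, String>,
}

impl EasyHandle {
    fn new(repo: Repository) -> Self {
        EasyHandle {
            repo: Arc::new(RwLock::new(repo)),
            cache: HashMap::new(),
        }
    }

    /// Look up an object, filling this handle's private cache on a miss.
    fn find(&mut self, id: u32) -> Option<String> {
        if let Some(hit) = self.cache.get(&id) {
            return Some(hit.clone());
        }
        let found = self.repo.read().unwrap().objects.get(&id).cloned()?;
        self.cache.insert(id, found.clone());
        Some(found)
    }
}

impl Clone for EasyHandle {
    /// Cloning shares the repository but starts with an empty cache.
    fn clone(&self) -> Self {
        EasyHandle {
            repo: Arc::clone(&self.repo),
            cache: HashMap::new(),
        }
    }
}
```

In this sketch, a server would keep one handle around and clone it per request: the clone is cheap because only the `Arc` is copied, while each request gets its own cache.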
One of my builds got bitten by this too 😬 |
* consume all `gitoxide` code through `git-repository`
If I understand the problem correctly, then `git-repository` is depending on
_newer_ git-* crates, which we don't pull in because they are breaking as per
semver. Going forward, however, I would expect `git-repository` to be **more**
conservative than lower level crates, so we may not get functionality that is
considered unstable until there is a release of `git-repository`.
So I'm not sure: either `git-repository` re-exports everything else, and we can
control via an `unstable` feature whether we want the bleeding edge or not, or
we would probably maintain our own prelude crate with pinned versions, to avoid
having to manage those versions across multiple of our own crates. But I _think_
the latter would preclude depending on `git-repository`, as we enter crate hell
through the other door then.
|
That's correct, it's in stability tier 1. It re-exports all crates in the same stability tier and hides everything else.
Such a feature toggle exists and is used to re-export all lower tier crates. Along with a new version bumping scheme described in this cargo smart-release ticket, if any of the plumbing crates have breaking changes with a minor bump, For a lack of a better idea, this would be my preference, as it seems more powerful than pinning all Thinking into the future of post-release crates, we would eventually stop using the |
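As a rough illustration of a facade that hides lower-tier crates unless a feature is enabled, consider the single-file sketch below. The modules `plumbing` and `facade` and their functions are hypothetical stand-ins for separate crates; only the feature name `unstable` is taken from this PR's `Cargo.toml`.

```rust
/// Hypothetical stand-in for a lower-tier plumbing crate whose API may
/// still change in breaking ways.
#[allow(dead_code)]
mod plumbing {
    pub fn pack_entry_count() -> usize {
        0
    }
}

/// Hypothetical stand-in for the stable, high-level facade crate.
pub mod facade {
    /// Always available: part of the stable surface.
    pub fn open(path: &str) -> String {
        format!("repo at {}", path)
    }

    /// Only re-exported when the consumer opts in to the `unstable` feature.
    #[cfg(feature = "unstable")]
    pub use super::plumbing;
}
```

Without the feature, consumers can only reach `facade::open`; with it, `facade::plumbing` becomes visible and they accept whatever breakage the lower tier brings.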
Pending Changes
repo.commit(
"refs/namespaces/foo/refs/heads/main",
&auth.to_ref(),
&auth.to_ref(),
While I was doing breaking changes, I unified parameter order 😅
let base_id = {
repo.commit(
"refs/namespaces/foo/refs/heads/main",
&auth.to_ref(),
Unfortunately the `to_ref()` is necessary as we can't have real refs. The `*Ref` is borrowed, too, for maximum flexibility, or else those who can't move these have to clone them. That's cheap, but I guess it's the question of who has to make the call.
I could imagine introducing a trait for `ToRef` which eliminates the `to_ref()` call.
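One way such a trait could look is sketched below. This is only an assumption about the shape of the idea: `Signature`, `SignatureRef`, the `ToRef` trait, and `format_author` are hypothetical and simplified, not the types from `git::actor`.

```rust
/// Hypothetical owned signature.
#[derive(Debug, Clone, PartialEq)]
struct Signature {
    name: String,
    email: String,
}

/// Hypothetical borrowed view of a signature.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SignatureRef<'a> {
    name: &'a str,
    email: &'a str,
}

/// Hypothetical trait: anything that can hand out a borrowed signature.
trait ToRef {
    fn to_ref(&self) -> SignatureRef<'_>;
}

impl ToRef for Signature {
    fn to_ref(&self) -> SignatureRef<'_> {
        SignatureRef {
            name: &self.name,
            email: &self.email,
        }
    }
}

impl<'a> ToRef for SignatureRef<'a> {
    fn to_ref(&self) -> SignatureRef<'_> {
        *self
    }
}

/// A `commit`-like function that accepts either form, so the conversion
/// happens inside the callee instead of at every call site.
fn format_author(author: &impl ToRef) -> String {
    let author = author.to_ref();
    format!("{} <{}>", author.name, author.email)
}
```

With this shape, a caller could pass `&auth` directly; the borrowed form still works for those who only have a `SignatureRef`.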
Oh, and before I forget: the refs are a lie; internally it still creates an owned `Commit` as it's easier to work with. Namely, object ids are in hex but nobody really has these, so they would need internal conversion and memory to hold that. A possible optimization, but one I skipped 😅.
let auth = git::actor::Signature::now_local_or_utc("apollo", "[email protected]");

let empty_tree_id = repo
.write_object(&git::objs::Tree::empty())
Now one can write object refs as well as owned objects, and there is a trait for that so calls look as good as they can.
@@ -341,6 +360,8 @@ where
update_tips(&local_repo, &out.wanted_refs).unwrap();
}

// Need to refresh it as it didn't notice the new pack
local_repo.refresh_object_database().unwrap();
This is exactly what would have to be done on the server as well once it causes new packs to be written. Internally it currently replaces itself, which is certainly something that could be improved one day.
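The "replaces itself" behaviour can be pictured with the small sketch below. It is an illustration under stated assumptions, not the real implementation: `Disk` and `ObjectDatabase` are hypothetical types, and a `Vec<String>` of pack names stands in for the pack directory.

```rust
/// Hypothetical stand-in for the on-disk pack directory.
struct Disk {
    packs: Vec<String>,
}

/// Hypothetical object database that snapshots the pack list when opened.
struct ObjectDatabase {
    packs: Vec<String>,
}

impl ObjectDatabase {
    fn open(disk: &Disk) -> Self {
        ObjectDatabase {
            packs: disk.packs.clone(),
        }
    }

    fn has_pack(&self, name: &str) -> bool {
        self.packs.iter().any(|p| p == name)
    }

    /// Refresh by re-opening and replacing `self` wholesale.
    fn refresh(&mut self, disk: &Disk) {
        *self = ObjectDatabase::open(disk);
    }
}
```

A pack written after `open` stays invisible until `refresh` is called, which mirrors why the test needs the explicit refresh after fetching.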
Signed-off-by: Sebastian Thiel <[email protected]>
I squashed everything into one commit and rebased against master. Is there anything else that would prevent merging this PR? |
LGTM for the tests, just have the one question :)
link-git-protocol/Cargo.toml
Outdated
default-features = false # leave out max-performance as it doesn't build on msvc win due to sha1-asm. See https://github.com/RustCrypto/asm-hashes/issues/17 for details
features = [ "one-stop-shop", "async-network-client", "unstable" ]

[dependencies.git-features]
Why are we adding these back rather than accessing them through `git-repository`?
It adds all of the default features back except for `max-performance`, which unfortunately breaks the build.
I was thinking that it might be better to remove `max-performance` from the default feature set for now, but felt bad about it.
However, now that I wrote this I revisited the issue and fixed it on the `git-repository` side as the superior choice :).
I'm confused, too. It seems like we need `git-repository` and `git-packetline`, but not the rest. Maybe a rebase went wrong or something?
Great catch! And even though I don't know how that happened, removing the superfluous packages did indeed not cause any trouble.
That way no special feature configuration is necessary.
I approve this message 👍
Closed via 07013fc Thanks! |
Please note in order to provide actual protection from unexpected breakage coming up through the crates exposed by |
A draft PR to get feedback early. Its goal is to add additional Rust to the core of git which is currently accessed through a thin corrosive layer provided by the `git2` crate.
By the end of this PR I'd expect all usages of `git2` to be replaced with at least equivalently readable substitutes provided by `gitoxide`, except for where `git2` is explicitly under test.
Doing so is extremely valuable to `gitoxide` as it gets to develop its high-level-yet-performant API based on actual usage of a proven and mature library.