Where this was done already, in `bail!` and `anyhow!`.
A newly enforced clippy lint warns about this, making the good
point that the debug representation can change in the future. But
the display representation it recommends using is less suitable in
these places, because it can result in more ambiguous output due to
the absence of quoting and escaping.
(In some cases, it seems like the display representation could
slightly exacerbate CVE-2024-43785 / #1534, but using the debug
representation on these two macro calls is peripheral to the cases
of greatest concern, and this doesn't really mitigate that.)
The clippy lint being suppressed here in two specific places is:
https://rust-lang.github.io/rust-clippy/master/index.html#unnecessary_debug_formatting
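
As a minimal sketch of the difference (using only `std` rather than
the `bail!` and `anyhow!` macros, and with hypothetical function
names and message text), the debug representation of a path quotes
and escapes it, while the display representation does not:

```rust
use std::path::Path;

// Hypothetical helper: formats a path with its debug representation,
// which adds quotes and escapes. The `allow` mirrors the kind of
// local suppression described above.
#[allow(clippy::unnecessary_debug_formatting)]
fn describe_debug(path: &Path) -> String {
    format!("Could not open {path:?}")
}

// Hypothetical helper: formats the same path with `Path::display()`,
// which is what the lint recommends. No quoting or escaping is done.
fn describe_display(path: &Path) -> String {
    format!("Could not open {}", path.display())
}
```

With a path such as `dir ` (ending in a space), the display form
ends in whitespace that is easy to miss, while the debug form keeps
it inside the quotes.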
By running `cargo update`.
(It looks like holding back `getrandom`, as included in #2201, does
not enable Dependabot to cover all the transitive dependencies that
`cargo update` can update in its PRs.)
This allows them to have patch version updates but not major and
minor version updates. Because they are not yet at 1.0.0, this has
the effect of allowing SemVer-compatible but not SemVer-breaking
updates to them via Dependabot version update PRs.
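
The reasoning above can be sketched as follows. This is only an
illustration of Cargo's usual compatibility convention (the leftmost
nonzero version component must stay the same), with a hypothetical
`Version` type and function name. It is not code from Dependabot or
Cargo:

```rust
// Hypothetical minimal version type. The derived ordering compares
// major, then minor, then patch.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Version {
    major: u64,
    minor: u64,
    patch: u64,
}

// Returns true if updating `from` to `to` is SemVer-compatible under
// the convention Cargo applies to default (caret) requirements.
fn is_compatible_update(from: Version, to: Version) -> bool {
    if to < from {
        return false; // Downgrades are not updates.
    }
    if from.major > 0 {
        to.major == from.major
    } else if from.minor > 0 {
        // Before 1.0.0, a minor version bump is treated as breaking.
        to.major == 0 && to.minor == from.minor
    } else {
        // For 0.0.z, every change is treated as breaking.
        to == from
    }
}
```

So for a crate at 0.y.z with y > 0, the compatible updates are
exactly the patch version updates.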
This may be temporary and, in the case of `getrandom`, is intended
to be temporary. For details, see comments in `dependabot.yml`, and:
- https://github.com/GitoxideLabs/gitoxide/pull/2200#issuecomment-3361407300
- https://github.com/GitoxideLabs/gitoxide/pull/2093#issuecomment-3361449228
This also expands the comment for holding back `imara-diff` (all
referenced PRs in the file now have full URLs, not just numbers).
The effect of keeping back `expectrl` was tested and verified to be
the only new hold needed to allow version updates to work, in:
https://github.com/EliahKagan/gitoxide/pull/111
This fixes spelling errors in the non-policy files `README.md` and
`threat-model-notes.md` in `etc/security`. (In contrast, `irp.md`
and `threat-model.md` do not appear to contain spelling errors.)
It also fixes some minor Markdown code style inconsistencies in
`threat-model.md`. This does not affect the text of that document,
nor probably even how it is rendered.
- Give the workflow a shorter name
- Also trigger on "run-ci" branches (in addition to main)
- Also allow it to be triggered from the Actions tab
- Comment out currently unneeded permissions
- Use v5 of actions/checkout (rather than v4)
- Don't persist auth token after checkout (see #2187)
Because the basic configuration doesn't run on PRs from forks.
So far this is just the advanced configuration workflow file
written automatically when enabling it through the GitHub interface
for doing so, with no customizations. This should already be
sufficient to let it run on PRs from forks, but the immediately
forthcoming commits shall apply some customizations.
The main purpose of the readme is to help anybody who gets to this
directory with the goal of reporting a vulnerability to find the
documentation for doing so (which remains in `SECURITY.md`, not in
this directory). But it also briefly explains the contents.
The threat model notes also added here are not, even in principle,
a policy, but I think they may be useful to have, to be able to
refer to alongside the threat model outline.
While these have gone through some revision and discussion so far,
they are always subject to iteration and improvement, both in
general and in the particularly highlighted ways noted within them.
Also, the IRP essentially only covers vulnerability handling right
now, and the threat model outline is an outline (it does not model
the attack surfaces or detailed use cases of each individual
`gix-*` library crate).
Reviewed-by: Sebastian Thiel <sebastian.thiel@icloud.com>
The upwards search for the repository directory takes a directory as
input and then walks through the parents. It turns out that it was
broken when the repository was the same as the working directory.
The code checked whether the directory components had been stripped
to "", in which case the directory was replaced with `cwd.parent()`,
so the loop never checked `cwd` itself. If the input directory was
"./something", then "." was checked, which then succeeded.
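
A minimal sketch of the intended behavior, assuming a hypothetical
`find_upwards` function and an `is_repo` predicate standing in for
the real repository check (neither is the actual code): once a
relative path has been stripped to "", the search continues from
`cwd` itself rather than from its parent.

```rust
use std::path::{Path, PathBuf};

// Hypothetical upward search. `is_repo` stands in for whatever check
// decides that a directory is a repository.
fn find_upwards(
    start: &Path,
    cwd: &Path,
    is_repo: &dyn Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let mut cursor = start.to_path_buf();
    loop {
        if cursor.as_os_str().is_empty() {
            // The relative path is exhausted, so it now refers to the
            // working directory. Check `cwd` itself, not its parent.
            cursor = cwd.to_path_buf();
        }
        if is_repo(&cursor) {
            return Some(cursor);
        }
        // `pop` returns false when there is no parent left to try.
        if !cursor.pop() {
            return None;
        }
    }
}
```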
This also tests the job by manually trying out several ways it
should fail to make sure it does, but I squashed those out. They can
be seen at EliahKagan#105 and are summarized as follows:
* Test that we always have `actions/checkout` not persist credentials
* Check that we catch `actions/checkout` with no `with`
* Improve `check-no-persist-credentials` output and maintainability
* Check that we catch checkout `with` without `persist-credentials`
* Check that we catch `persist-credentials` not set to boolean false
* Having tested the new check, restore `persist-credentials: false`
When `actions/checkout` is used to check out the repository on CI,
it persists credentials related to the GitHub token in the local
scope configuration at `.git/config`, unless `persist-credentials`
is explicitly set to `false`. This facilitates subsequent remote
operations on the repository that could otherwise fail, but we have
no such operations in any of our workflows.
As an added layer of protection to keep these credentials from
leaking into logs (or otherwise being displayed or subject to
exfiltration) in case there is ever unintended coupling between the
operation of the test suite (or any step subsequent to checkout
that is used to prepare or run tests or other checks) and the
cloned `gitoxide` repository itself, this:
- Adds `persist-credentials: false` in a `with` mapping on every
step that uses `actions/checkout`.
- Adds a new CI job that checks that every `actions/checkout` step
in any job in any workflow sets `persist-credentials` to `false`.
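
To illustrate what the new job verifies, here is a rough sketch. It
is not the actual check: it is a naive line-based scan with a
hypothetical function name, it does not parse YAML, and it assumes
that `uses` appears before `with` in each step and that there are no
trailing comments on the relevant lines.

```rust
// Returns the 1-based line numbers of `uses: actions/checkout@...`
// lines whose step does not also contain `persist-credentials: false`.
// Naive heuristic: any line starting with "- " begins a new step.
fn checkout_steps_persisting_credentials(workflow: &str) -> Vec<usize> {
    let mut offenders = Vec::new();
    // Line number of the current checkout step's `uses` line, and
    // whether the opt-out has been seen in that step so far.
    let mut current: Option<(usize, bool)> = None;

    for (index, line) in workflow.lines().enumerate() {
        let trimmed = line.trim_start();
        if trimmed.starts_with("- ") {
            // A new step begins, so close out the previous one.
            if let Some((line_number, false)) = current.take() {
                offenders.push(line_number);
            }
        }
        let item = trimmed.strip_prefix("- ").unwrap_or(trimmed).trim();
        if let Some(action) = item.strip_prefix("uses:") {
            if action.trim().starts_with("actions/checkout@") {
                current = Some((index + 1, false));
            }
        } else if item == "persist-credentials: false" {
            // Only the boolean `false` counts, not `"false"` or `true`.
            if let Some((_, seen)) = current.as_mut() {
                *seen = true;
            }
        }
    }
    if let Some((line_number, false)) = current {
        offenders.push(line_number);
    }
    offenders
}
```

An empty result means every checkout step found by the scan opts
out of persisting credentials.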
In addition to usual testing on CI, the `release.yml` workflow is
among the workflows changed here, and it has also been tested:
https://github.com/EliahKagan/gitoxide/actions/runs/17899238656
See also:
- https://github.com/actions/checkout/blob/main/README.md
(Covers what happens with/without `persist-credentials: false`).
- https://github.com/actions/checkout/issues/485
The `small` feature of the `gitoxide` crate does not directly or
indirectly include the `gitoxide-core-blocking-client` feature, and
therefore does not provide `gix clone`. Although it was documented
to provide only local operations, this was ambiguous: in the sense
of local and remote repository operations, cloning is arguably a
remote operation even when one clones from a filesystem path, or
file URL. But in the broader meaning of "local," this could mean
merely that network transport is omitted and that local cloning is
included.
This adds a short explicit note that `small` does not include
`gix clone`. This is a minimal fix for #2185 and it may make sense
to improve the description further (unless `small` is to change).
The `copy-royal` algorithm maintains the patterns and "shape" of
text sufficiently to keep diffs the same (in the vast majority of
cases). It is used in `internal-tools` to help prepare test cases
with what is important and relevant to a regression test of diff
behavior, rather than the exact original repository content in a
tree that has been found to trigger a bug. It avoids needless
verbatim reproduction, while preserving aspects that are useful and
necessary for testing. It keeps the focus on patterns, preventing
irrelevant details of code in a tree that triggered a bug from
being confused with the logic of gitoxide itself, and makes it less
likely to be touched inadvertently in efforts to fix bugs or
improve style (which, in test data, would cause subtle breakage).
Although these benefits are substantial and we intend to continue
using copy-royal in the preparation of test cases as needed if or
when regressions arise, some of the guidance and rationale we had
given for its use was inaccurate or misleading. Most importantly,
copy-royal cannot be used in practice to redact sensitive
information: if you have a repository whose contents should not be
made public, then it is not safe to share the output of copy-royal
run on that repository either.
Copy-royal is implemented (roughly speaking) by mapping alphabetic
characters down to ten letters. This removes some information, at
least in principle: that is, if it were given totally random
letters as input, then it would be impossible to reverse it to get
those letters back. Even on input that is much more structured and
predictable, such as real-world input, it obfuscates it, making it
look garbled and nonsensical. However, even when one intuitively
feels that it has destroyed information, it is possible to reverse
it in many cases, and possibly even in all practical cases.
The reason is that, in real world source code and natural language,
some sequences of letters are overwhelmingly more likely to occur
than others, both in general and (especially) contextually given
what surrounding text is present. The information that is removed
by mapping into ten letters could often be reconstructed by:
1. Building a grammar of possible inputs, which can be done in a
simple manner by translating the copy-royal output one wishes to
reverse into a regular expression in which every symbol in the
copy-royal output becomes a character class of characters that
map to it. In effect, for every output of the copy-royal
algorithm, there is a regex that matches the possible inputs.
2. Predicting, stepwise, what code or text is likely to have arisen
that matches that grammar. In principle this could be done with
a variety of techniques or even manually. But one fruitful
approach would be to use an autoregressive large language model,
and apply constrained decoding[1] to sample only logits
consistent with the regex. Small experiments carried out so far
suggest[2] this to be a workable technique when combined with
beam search[3]. (This technique does not require the specific
text or code being reconstructed to have existed when the model
was trained.)
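
Step 1 can be sketched as follows. The mapping used here (letter
index modulo 10) is an assumption made purely for illustration and
is not the actual copy-royal table, and the function names are
hypothetical. The point is only that any fixed many-to-one letter
mapping yields a regex of possible inputs:

```rust
// Hypothetical stand-in for copy-royal's mapping: each ASCII letter
// is mapped to one of ten letters, preserving case. Other characters
// are left unchanged.
fn obfuscate_char(c: char) -> char {
    match c {
        'a'..='z' => (b'a' + (c as u8 - b'a') % 10) as char,
        'A'..='Z' => (b'A' + (c as u8 - b'A') % 10) as char,
        _ => c,
    }
}

fn obfuscate(text: &str) -> String {
    text.chars().map(obfuscate_char).collect()
}

// Builds a regex matching every input that `obfuscate` could have
// mapped to `obfuscated`: each output letter becomes a character
// class of the letters that map to it.
fn preimage_regex(obfuscated: &str) -> String {
    let mut regex = String::new();
    for c in obfuscated.chars() {
        match c {
            'a'..='j' | 'A'..='J' => {
                let base = if c.is_ascii_lowercase() { b'a' } else { b'A' };
                regex.push('[');
                let mut byte = c as u8;
                while byte < base + 26 {
                    regex.push(byte as char);
                    byte += 10;
                }
                regex.push(']');
            }
            _ => {
                // Escape characters that are special in most regex
                // dialects so they match literally.
                if r"\.+*?()|[]{}^$".contains(c) {
                    regex.push('\\');
                }
                regex.push(c);
            }
        }
    }
    regex
}
```

Under this assumed mapping, `cat` becomes `caj`, and the regex for
`caj` is `[cmw][aku][jt]`, which matches `cat` among a small number
of other candidates. Step 2 is about choosing the most plausible
candidate in context.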
Accordingly, this modifies the documentation of copy-royal to avoid
claiming that the input of copy-royal cannot be recovered, or
anything that recommends or may appear to recommend the use of
copy-royal to redact sensitive information. It also clarifies and
adjusts the explanation of when it makes sense to use copy-royal,
and describes some of its benefits that do not rely on the
assumption that it is infeasible (or even difficult) to reverse.
In the comment documenting `BlameCopyRoyal`, which is among those
edited in the above ways, this also edits its top line to make
clear more generally how `BlameCopyRoyal` relates to `git blame`.
[1]: https://github.com/Saibo-creator/Awesome-LLM-Constrained-Decoding
[2]: See link(s) in https://github.com/GitoxideLabs/gitoxide/pull/2180
[3]: https://en.wikipedia.org/wiki/Beam_search
Co-authored-by: Sebastian Thiel <sebastian.thiel@icloud.com>
The remote has a couple of "builder" methods to change
its fields, e.g. `push_url` for setting the push URL.
A builder method for changing the fetch URL of a remote
was missing. This made it impossible to fully replicate
the functionality of `git remote set-url`.
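
As a rough sketch of the builder pattern in question, here is a
self-contained example with a hypothetical `Remote` type and
hypothetical method names. It is not the actual API of the crate,
only an illustration of why both setters are needed:

```rust
// Hypothetical, simplified remote with a fetch URL and an optional
// separate push URL.
#[derive(Debug, Default, Clone, PartialEq)]
struct Remote {
    url: Option<String>,
    push_url: Option<String>,
}

impl Remote {
    // Builder method for the fetch URL (the kind that was missing).
    fn with_url(mut self, url: impl Into<String>) -> Self {
        self.url = Some(url.into());
        self
    }

    // Builder method for the push URL.
    fn with_push_url(mut self, url: impl Into<String>) -> Self {
        self.push_url = Some(url.into());
        self
    }

    // As in Git, pushing falls back to the fetch URL when no push
    // URL has been set.
    fn effective_push_url(&self) -> Option<&str> {
        self.push_url.as_deref().or(self.url.as_deref())
    }
}
```

With both methods available, a caller can change either URL
independently, which is what `git remote set-url` (with and without
`--push`) allows.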