Bitcoin Core Github
42 subscribers
126K links
Download Telegram
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627886053)
This is actually doing the opposite.. it's keeping the vectorized impl from having to worry about this case. In the current code, we have:
```c++
++j12;
if (!j12) ++j13;
...
input[8] = j12;
input[9] = j13;
```

So effectively `input[8]` and `input[9]` are treated as a single `uint64_t`. It's not possible to express "cast to `uint64_t` elements and increment" or "increment and overflow over there" with the vector extensions, so it turned out to be easier to just forbid the overflow cases
...
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627890205)
Thanks, yes, I meant to change that to `internal_bswap_32` before pushing. Will do.
🚀 hebasto merged a pull request: "move-only: MAX_BLOCK_TIME_GAP to src/qt"
(https://github.com/bitcoin-core/gui/pull/919)
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627891854)
Yep, will do.
🚀 fanquake merged a pull request: "Make `transaction_indentifier` hex string constructor evaluated at comptime"
(https://github.com/bitcoin/bitcoin/pull/34063)
💬 fanquake commented on pull request "ci: remove `doc/release-notes.md` from lint-spelling.py":
(https://github.com/bitcoin/bitcoin/pull/33968#issuecomment-3666368283)
Closing, as this code will be removed in #34053.
fanquake closed a pull request: "ci: remove `doc/release-notes.md` from lint-spelling.py"
(https://github.com/bitcoin/bitcoin/pull/33968)
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627906491)
Thanks for catching this! I forgot to mention in the PR description that big-endian was best-effort and untested. I figured our c-i would catch any obvious bugs. Agree it's not great that it didn't :(

Thanks for the quick fix too :)
💬 fanquake commented on pull request "lint: Remove confusing, redundant, and brittle lint-spelling":
(https://github.com/bitcoin/bitcoin/pull/34053#issuecomment-3666374736)
I think it's ok for linters to be removed as long as they aren't causing failures / enforcing anything.
👍 brunoerg approved a pull request: "[30.x] Finalise v30.1"
(https://github.com/bitcoin/bitcoin/pull/34092#pullrequestreview-3588777441)
code review ACK 2a21824b1190968ee8ba3120400c00e56be10591
🚀 fanquake merged a pull request: "lint: Remove confusing, redundant, and brittle lint-spelling"
(https://github.com/bitcoin/bitcoin/pull/34053)
📝 fanquake converted_to_draft a pull request: "test: p2p: check that peer's announced starting height is remembered"
(https://github.com/bitcoin/bitcoin/pull/33990)
Note that the announced starting height of a peer is neither verified nor used in any other logic, so reporting a bogus value [1] as done in the test doesn't have any consequences on the connection -- it's only used for inspection via the `getpeerinfo` RPC and in some debug messages (see also e.g. https://github.com/bitcoin-dot-org/Bitcoin.org/issues/1387#issuecomment-252934859). Admittedly this is not terribly important, but I'd still think having test coverage for this simple reporting functio
...
💬 l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627921902)
I had the same disappointment when implementing the obfuscation optimization. :)

@maflcko do we have a big-endian nightly that supports vectorized operations? Would it have caught this?
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627930434)
See updated title description where I touched on this.

I found (with lots of experimentation here) that different compilers use complicated and unpredictable heuristics to decide whether or not to unroll loops. Even if an unroll-able loop is detected, unrolling may be skipped because of the function size (as is the case here because it's huge).

So, I'd like to not take chances on unrolling. That means one of:
- Manual unrolling
- Macro-based (as-in `REPEAT10()`)
- A pragma
- Recursive
...
🤔 marcofleon reviewed a pull request: "fuzz: doc: remove any mention to `address_deserialize_v2`"
(https://github.com/bitcoin/bitcoin/pull/34091#pullrequestreview-3588807616)
ACK caf4843a59a9d2512d69f8fd88a9672112bd80ac

Those are the only three instances.
💬 vasild commented on pull request "Broadcast own transactions only via short-lived Tor or I2P connections":
(https://github.com/bitcoin/bitcoin/pull/29415#discussion_r2627936962)
<details>
<summary>[patch] cumulative patch to use optional</summary>

```diff
diff --git a/src/net_processing.cpp b/src/net_processing.cpp
index 512f99f0c5..4e6d77cfba 100644
--- a/src/net_processing.cpp
+++ b/src/net_processing.cpp
@@ -1086,13 +1086,13 @@ private:
void AddAddressKnown(Peer& peer, const CAddress& addr) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);
void PushAddress(Peer& peer, const CAddress& addr) EXCLUSIVE_LOCKS_REQUIRED(g_msgproc_mutex);

void LogBloc
...
💬 l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627942778)
Is there any way we could help that would make you reconsider? I don't mind running the benchmarks on a few platforms, as long as we can have simpler code. The current template magic is not something I would like to see more of - modern C++20 should be able to handle the situations we have here. I don't mind experimenting with this if you don't want to.
💬 maflcko commented on pull request "log: Use `__func__` for -logsourcelocations":
(https://github.com/bitcoin/bitcoin/pull/34088#discussion_r2627975576)
> > This seems like it could silently make some places use the wrong function name.
>
> Not with `function_name_short()`, though? I think `SourceLocation::function_name()` behaving identical to `std::source_location::function_name()` is desirable, letting users explicitly choose between the stdlib behaviour or our custom behaviour?

I am pretty sure that reverting a function argument incorrectly to `const std::source_location&` and incorrectly using `funciton_name()` will compile fine. Mayb
...
💬 theuni commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627980132)
I avoided lambdas as I've historically observed lots of optims being skipped (with clang, at least) when using them. But since you/@ajtowns/@l0rinc have all made the same comment, I'll play around and see if these indeed compile down to nothing as one would hope.
💬 sipa commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627980586)
@theuni See my `std::make_index_sequence` based I demonstrated above. It's template based, but doesn't need recursion, and doesn't rely on compiler unrolling. It simply expands to an expression `(a[0] = v, a[1] = v, a[2] = v, ...)` e.g.
💬 sipa commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627980657)
@theuni See my `std::make_index_sequence` based I demonstrated above. It's template based, but doesn't need recursion, and doesn't rely on compiler unrolling. It simply expands to an expression `(a[0] = v, a[1] = v, a[2] = v, ...)` e.g.