π¬ maflcko commented on pull request "log: Use `__func__` for -logsourcelocations":
(https://github.com/bitcoin/bitcoin/pull/34088#discussion_r2627471486)
Not sure about the implicit `operator const std::source_location&(){ return m_loc; };`. This seems like it could silently make some places use the wrong function name.
> function_name_short
Thx, done
(https://github.com/bitcoin/bitcoin/pull/34088#discussion_r2627471486)
Not sure about the implicit `operator const std::source_location&(){ return m_loc; };`. This seems like it could silently make some places use the wrong function name.
> function_name_short
Thx, done
π fanquake merged a pull request: "ci: Pin native tests on cross-builds to same commit"
(https://github.com/bitcoin/bitcoin/pull/34080)
(https://github.com/bitcoin/bitcoin/pull/34080)
π brunoerg opened a pull request: "fuzz: doc: remove any mention to `address_deserialize_v2`"
(https://github.com/bitcoin/bitcoin/pull/34091)
We don't have `address_deserialize_v2` target anymore since fac81affb527132945773a5315bd27fec61ec52f (we used to have `address_deserialize_v1_notime`, `address_deserialize_v1_withtime` and `address_deserialize_v2` but now we only have a single `address_deserialize` target) so it removes any mention to it.
(https://github.com/bitcoin/bitcoin/pull/34091)
We don't have `address_deserialize_v2` target anymore since fac81affb527132945773a5315bd27fec61ec52f (we used to have `address_deserialize_v1_notime`, `address_deserialize_v1_withtime` and `address_deserialize_v2` but now we only have a single `address_deserialize` target) so it removes any mention to it.
π¬ furszy commented on pull request "rpc, doc: clarify the response of listtransactions RPC":
(https://github.com/bitcoin/bitcoin/pull/32737#discussion_r2627466348)
Could use "transaction" instead of "blockchain transaction".
Also, I think the example could be improved, something like:
"For instance, a wallet transaction that sends funds to three addresses - one belonging to the wallet itself and two externalβ will produce four entries (one per output, including the change output).
As a result, the response of 'listransactions' will contain one entry in the 'receive' category and three entries in the 'send' category."
(feel free to add stuff to it
...
(https://github.com/bitcoin/bitcoin/pull/32737#discussion_r2627466348)
Could use "transaction" instead of "blockchain transaction".
Also, I think the example could be improved, something like:
"For instance, a wallet transaction that sends funds to three addresses - one belonging to the wallet itself and two externalβ will produce four entries (one per output, including the change output).
As a result, the response of 'listransactions' will contain one entry in the 'receive' category and three entries in the 'send' category."
(feel free to add stuff to it
...
π fanquake merged a pull request: "contrib: asmap-tool.py - Don't write binary to TTY"
(https://github.com/bitcoin/bitcoin/pull/34089)
(https://github.com/bitcoin/bitcoin/pull/34089)
π¬ brunoerg commented on pull request "fuzz: doc: remove any mention to `address_deserialize_v2`":
(https://github.com/bitcoin/bitcoin/pull/34091#issuecomment-3665817873)
cc @maflcko
(https://github.com/bitcoin/bitcoin/pull/34091#issuecomment-3665817873)
cc @maflcko
π€ danielabrozzoni reviewed a pull request: "log: show reindex progress in `ImportBlocks`"
(https://github.com/bitcoin/bitcoin/pull/33353#pullrequestreview-3588239911)
tACK d7de5b109f69293b375729363b4e5038f3fb6b15 - code reviewed and tested on my archival node.
Nice to see this!
Itβs true that not every file will take the same amount of time to reindex, but I think itβs better to have a rough estimate than no estimate at all.
(https://github.com/bitcoin/bitcoin/pull/33353#pullrequestreview-3588239911)
tACK d7de5b109f69293b375729363b4e5038f3fb6b15 - code reviewed and tested on my archival node.
Nice to see this!
Itβs true that not every file will take the same amount of time to reindex, but I think itβs better to have a rough estimate than no estimate at all.
π€ maflcko reviewed a pull request: "log: show reindex progress in `ImportBlocks`"
(https://github.com/bitcoin/bitcoin/pull/33353#pullrequestreview-3588272613)
review ACK d7de5b109f69293b375729363b4e5038f3fb6b15 π
<details><summary>Show signature</summary>
Signature:
```
untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: review ACK d7de5b109f69293b37572
...
(https://github.com/bitcoin/bitcoin/pull/33353#pullrequestreview-3588272613)
review ACK d7de5b109f69293b375729363b4e5038f3fb6b15 π
<details><summary>Show signature</summary>
Signature:
```
untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: review ACK d7de5b109f69293b37572
...
π¬ furszy commented on issue "GUI crashes in regtest":
(https://github.com/bitcoin-core/gui/issues/918#issuecomment-3665898521)
Can you check the directory permissions please
(https://github.com/bitcoin-core/gui/issues/918#issuecomment-3665898521)
Can you check the directory permissions please
π€ l0rinc requested changes to a pull request: "Add initial vectorized chacha20 implementation for 2-3x speedup"
(https://github.com/bitcoin/bitcoin/pull/34083#pullrequestreview-3586783659)
I went through the code roughly, I like that we're focusing on this and looking forward to specializing other parts of the code that are this critical (looking at you, SipHash!).
My biggest objection currently is that it's broken on big-endian systems - left a suggestion how to reproduce and fix it. This also reveals the lack of testing - we have to find a way to selectively enable and disable these optimizations to make sure we have tests that compare their outputs.
I agree with AJ that we
...
(https://github.com/bitcoin/bitcoin/pull/34083#pullrequestreview-3586783659)
I went through the code roughly, I like that we're focusing on this and looking forward to specializing other parts of the code that are this critical (looking at you, SipHash!).
My biggest objection currently is that it's broken on big-endian systems - left a suggestion how to reproduce and fix it. This also reveals the lack of testing - we have to find a way to selectively enable and disable these optimizations to make sure we have tests that compare their outputs.
I agree with AJ that we
...
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2626253726)
can we assume that all big endian systems have `__builtin_bswap32` available.
https://github.com/bitcoin/bitcoin/blob/432b18ca8d0654318a8d882b28b20af2cb2d2e5d/src/compat/byteswap.h#L14-L16 indicates we could use our existing helpers here instead:
```C++
ALWAYS_INLINE void vec_byteswap(vec256& vec)
{
if constexpr (std::endian::native == std::endian::big) {
for (size_t i = 0; i < 8; ++i) {
vec[i] = internal_bswap_32(vec[i]);
}
}
}
```
A small, fixed-
...
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2626253726)
can we assume that all big endian systems have `__builtin_bswap32` available.
https://github.com/bitcoin/bitcoin/blob/432b18ca8d0654318a8d882b28b20af2cb2d2e5d/src/compat/byteswap.h#L14-L16 indicates we could use our existing helpers here instead:
```C++
ALWAYS_INLINE void vec_byteswap(vec256& vec)
{
if constexpr (std::endian::native == std::endian::big) {
for (size_t i = 0; i < 8; ++i) {
vec[i] = internal_bswap_32(vec[i]);
}
}
}
```
A small, fixed-
...
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627477691)
Hmmm, it seems to me we're doing heavy recursive template instantiationshere to unroll loops.
My understanding (and experience with the mentioned Obfuscation PR) is that modern compilers are good at unrolling fixed-bound loops.
Can you please try if this also works and results in the same speedup?
```suggestion
template <size_t BITS, size_t I>
ALWAYS_INLINE void arr_add_xor_rot(std::array<vec256, I>& arr0, const std::array<vec256, I>& arr1, std::array<vec256, I>& arr2)
{
for (size_t
...
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627477691)
Hmmm, it seems to me we're doing heavy recursive template instantiationshere to unroll loops.
My understanding (and experience with the mentioned Obfuscation PR) is that modern compilers are good at unrolling fixed-bound loops.
Can you please try if this also works and results in the same speedup?
```suggestion
template <size_t BITS, size_t I>
ALWAYS_INLINE void arr_add_xor_rot(std::array<vec256, I>& arr0, const std::array<vec256, I>& arr1, std::array<vec256, I>& arr2)
{
for (size_t
...
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627310095)
Forcing vectorized operations on an emulated big-endian system (otherwise it falls back to the scalar implementation due to the safety checks):
```patch
diff --git a/src/crypto/chacha20_vec_base.cpp b/src/crypto/chacha20_vec_base.cpp
index 9fda9452a1..a2aa1e5552 100644
--- a/src/crypto/chacha20_vec_base.cpp
+++ b/src/crypto/chacha20_vec_base.cpp
@@ -20,7 +20,7 @@
# define CHACHA20_VEC_DISABLE_STATES_8
# define CHACHA20_VEC_DISABLE_STATES_6
# define CHACHA20_VEC_DISABLE_STATES_4
...
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627310095)
Forcing vectorized operations on an emulated big-endian system (otherwise it falls back to the scalar implementation due to the safety checks):
```patch
diff --git a/src/crypto/chacha20_vec_base.cpp b/src/crypto/chacha20_vec_base.cpp
index 9fda9452a1..a2aa1e5552 100644
--- a/src/crypto/chacha20_vec_base.cpp
+++ b/src/crypto/chacha20_vec_base.cpp
@@ -20,7 +20,7 @@
# define CHACHA20_VEC_DISABLE_STATES_8
# define CHACHA20_VEC_DISABLE_STATES_6
# define CHACHA20_VEC_DISABLE_STATES_4
...
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2626256140)
It seems to me the existing standard test vectors are too short (mostly < 128 bytes) to trigger the vectorized optimization - can we extend the tests to runa and compare the new specializations (could show as `skipped` when the given architecture isn't available locally)?
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2626256140)
It seems to me the existing standard test vectors are too short (mostly < 128 bytes) to trigger the vectorized optimization - can we extend the tests to runa and compare the new specializations (could show as `skipped` when the given architecture isn't available locally)?
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627488702)
Could we use https://github.com/bitcoin/bitcoin/blob/e16c22fe025f82166c7f3f15a37c96bf4a06e4cf/src/attributes.h#L19-L25 instead?
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627488702)
Could we use https://github.com/bitcoin/bitcoin/blob/e16c22fe025f82166c7f3f15a37c96bf4a06e4cf/src/attributes.h#L19-L25 instead?
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627561084)
I haven't used this before but the internets tells me we could the attribute syntax here, something like:
```suggestion
template <typename T, size_t N>
using VectorType [[gnu::vector_size(sizeof(T) * N)]] = T;
using vec256 = VectorType<uint32_t, 8>;
```
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627561084)
I haven't used this before but the internets tells me we could the attribute syntax here, something like:
```suggestion
template <typename T, size_t N>
using VectorType [[gnu::vector_size(sizeof(T) * N)]] = T;
using vec256 = VectorType<uint32_t, 8>;
```
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627508017)
There's a lot of repetition here - could we extract that to an `ALWAYS_INLINE` lambda and have something like:
```C++
if constexpr (ENABLE_16) process_blocks(16);
else if constexpr (ENABLE_8) process_blocks(8);
...
```
I haven't implemented it locally, maybe it's naive, let me know what you think.
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627508017)
There's a lot of repetition here - could we extract that to an `ALWAYS_INLINE` lambda and have something like:
```C++
if constexpr (ENABLE_16) process_blocks(16);
else if constexpr (ENABLE_8) process_blocks(8);
...
```
I haven't implemented it locally, maybe it's naive, let me know what you think.
π¬ l0rinc commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627510708)
+1 for `if constexpr`, should simplify the code a lot (especially if we migrate from recursive templates as well)
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627510708)
+1 for `if constexpr`, should simplify the code a lot (especially if we migrate from recursive templates as well)
π¬ sipa commented on pull request "Add initial vectorized chacha20 implementation for 2-3x speedup":
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627635541)
I believe it's possible to use compile-time loops here instead of recursive templates (if runtime loops don't get optimized sufficiently):
```c++
/** Store a vector in all array elements */
template <size_t I>
ALWAYS_INLINE void arr_set_vec256(std::array<vec256, I>& arr, const vec256& vec)
{
[&]<size_t... ITER>(std::index_sequence<ITER...>) {
((std::get<ITER>(arr) = vec),...);
}(std::make_index_sequence<I>());
}
```
(https://github.com/bitcoin/bitcoin/pull/34083#discussion_r2627635541)
I believe it's possible to use compile-time loops here instead of recursive templates (if runtime loops don't get optimized sufficiently):
```c++
/** Store a vector in all array elements */
template <size_t I>
ALWAYS_INLINE void arr_set_vec256(std::array<vec256, I>& arr, const vec256& vec)
{
[&]<size_t... ITER>(std::index_sequence<ITER...>) {
((std::get<ITER>(arr) = vec),...);
}(std::make_index_sequence<I>());
}
```
π fanquake opened a pull request: "[30.x] Finalise v30.1"
(https://github.com/bitcoin/bitcoin/pull/34092)
Finalise `v30.1`.
(https://github.com/bitcoin/bitcoin/pull/34092)
Finalise `v30.1`.
π¬ fanquake commented on pull request "log: Use `__func__` for -logsourcelocations":
(https://github.com/bitcoin/bitcoin/pull/34088#issuecomment-3666065446)
https://github.com/bitcoin/bitcoin/actions/runs/20307712389/job/58333931476?pr=34088#step:11:3126:
```bash
/home/admin/actions-runner/_work/_temp/src/test/logging_tests.cpp:135:120: error: no member named 'function_name' in 'SourceLocation'
135 | expected.push_back(tfm::format("[%s:%s] [%s] %s%s", util::RemovePrefix(loc.file_name(), "./"), loc.line(), loc.function_name(), prefix, msg));
|
...
(https://github.com/bitcoin/bitcoin/pull/34088#issuecomment-3666065446)
https://github.com/bitcoin/bitcoin/actions/runs/20307712389/job/58333931476?pr=34088#step:11:3126:
```bash
/home/admin/actions-runner/_work/_temp/src/test/logging_tests.cpp:135:120: error: no member named 'function_name' in 'SourceLocation'
135 | expected.push_back(tfm::format("[%s:%s] [%s] %s%s", util::RemovePrefix(loc.file_name(), "./"), loc.line(), loc.function_name(), prefix, msg));
|
...