💬 sipa commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731169)
I understand now. My point stands that both versions are doing something unnecessary, but I see now it's already a strict improvement:
* For `erase=false`, you erase spent coins where iterating, which involves a hashtable lookup, even though you already have a pointer to the element you're erasing (and it's just due to an interface limitation that this cannot be turned into an iterator that can be erased directly).
* For `erase=true`, you still iterate a second time (inside `mapCoins.clear()`)
...
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731169)
I understand now. My point stands that both versions are doing something unnecessary, but I see now it's already a strict improvement:
* For `erase=false`, you erase spent coins where iterating, which involves a hashtable lookup, even though you already have a pointer to the element you're erasing (and it's just due to an interface limitation that this cannot be turned into an iterator that can be erased directly).
* For `erase=true`, you still iterate a second time (inside `mapCoins.clear()`)
...
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731296)
Do we want to add more Boost dependencies? I thought we wanted to go the other way.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731296)
Do we want to add more Boost dependencies? I thought we wanted to go the other way.
💬 sipa commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731593)
Boost multi-index is headers-only, so it's just a build-time dependency, not a runtime one. For the mempool we already realistically can't avoid having something that's functionally equivalent to boost multiindex, so I'm not too worried about adding more reliance on it. I do know @theuni is looking into having a simple ad-hoc replacement for it, though.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731593)
Boost multi-index is headers-only, so it's just a build-time dependency, not a runtime one. For the mempool we already realistically can't avoid having something that's functionally equivalent to boost multiindex, so I'm not too worried about adding more reliance on it. I do know @theuni is looking into having a simple ad-hoc replacement for it, though.
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731791)
Err, not `master`, but the previous version of this PR that I force pushed over an hour ago.
In `master` it is looping through the entire `cacheCoins` map, and deleting spent entries. With this approach we do have to do a hash and lookup, but it's constant time instead of linear.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667731791)
Err, not `master`, but the previous version of this PR that I force pushed over an hour ago.
In `master` it is looping through the entire `cacheCoins` map, and deleting spent entries. With this approach we do have to do a hash and lookup, but it's constant time instead of linear.
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#issuecomment-2212512381)
Thank you for your thorough review @sipa! Will run some more extensive benchmarks for all the recent changes that should be done in the coming weeks.
(https://github.com/bitcoin/bitcoin/pull/28280#issuecomment-2212512381)
Thank you for your thorough review @sipa! Will run some more extensive benchmarks for all the recent changes that should be done in the coming weeks.
💬 sipa commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733013)
Hmm, right. But so there are still two possible variants for the `erase=false` case?
* 1. One where they are erased during BatchWrite, using outpoint-based erase
* 2. One where all the erasing is done inside `Sync`, and `BatchWrite` only does flag clearing.
(1) has higher CPU cost (due to hashing and hash-table search), while (2) will involves accessing entries' memory twice. Which of the two is faster isn't clear to me, as it'll depend on CPU cache sizes etc.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733013)
Hmm, right. But so there are still two possible variants for the `erase=false` case?
* 1. One where they are erased during BatchWrite, using outpoint-based erase
* 2. One where all the erasing is done inside `Sync`, and `BatchWrite` only does flag clearing.
(1) has higher CPU cost (due to hashing and hash-table search), while (2) will involves accessing entries' memory twice. Which of the two is faster isn't clear to me, as it'll depend on CPU cache sizes etc.
💬 sipa commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733399)
Ah, but (2) necessarily also involves an outpoint-based erase, as you only want to erase flagged entries without iterating the non-flagged ones. Right.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733399)
Ah, but (2) necessarily also involves an outpoint-based erase, as you only want to erase flagged entries without iterating the non-flagged ones. Right.
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733508)
2 also required an outpoint based deletion, so this approach just moves that deletion into the same loop instead of clearing unspent flags in the first loop and then looping through spent entries again in a second loop and deleting them via outpoint.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733508)
2 also required an outpoint based deletion, so this approach just moves that deletion into the same loop instead of clearing unspent flags in the first loop and then looping through spent entries again in a second loop and deleting them via outpoint.
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733587)
If only we could convert a ConsCachePair back into a map iterator...
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733587)
If only we could convert a ConsCachePair back into a map iterator...
💬 sipa commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733924)
In boost::multi_index you can :D (see https://www.boost.org/doc/libs/1_77_0/libs/multi_index/doc/reference/hash_indices.html#iterators, the `iterator_to` function).
I'll shut up now and mark this resolved. Thanks for bearing with me.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667733924)
In boost::multi_index you can :D (see https://www.boost.org/doc/libs/1_77_0/libs/multi_index/doc/reference/hash_indices.html#iterators, the `iterator_to` function).
I'll shut up now and mark this resolved. Thanks for bearing with me.
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667734233)
> you still iterate a second time (inside mapCoins.clear()),
Hmm I added this in the latest revision because we not had the mapCoins and could clear it.
Before this patch was just not erasing the entries and removing checking that it is empty in ReallocateCache. If the map is destroyed does it not essentially do a clear as well, so this is a wash whether we clear now or on destruction?
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667734233)
> you still iterate a second time (inside mapCoins.clear()),
Hmm I added this in the latest revision because we not had the mapCoins and could clear it.
Before this patch was just not erasing the entries and removing checking that it is empty in ReallocateCache. If the map is destroyed does it not essentially do a clear as well, so this is a wash whether we clear now or on destruction?
💬 paplorinc commented on pull request "optimization: Reduce cache lookups in CCoinsViewCache::FetchCoin":
(https://github.com/bitcoin/bitcoin/pull/30326#issuecomment-2212546977)
Thanks a lot for checking, @andrewtoth!
(https://github.com/bitcoin/bitcoin/pull/30326#issuecomment-2212546977)
Thanks a lot for checking, @andrewtoth!
💬 Prabhat1308 commented on pull request "p2p: Fill reconciliation sets (Erlay) attempt 2":
(https://github.com/bitcoin/bitcoin/pull/30116#discussion_r1667753973)
```suggestion
for (int i = 1; i < inbound_peers; ++i) {
// create 4000 random wtxids so that FANOUT_TARGETS_PER_TX_CACHE_SIZE is breached
for (int j = 0; j < 4000; ++j) {
tracker.ShouldFanoutTo(Wtxid::FromUint256(frc.rand256()), i,
/*inbounds_fanout_tx_relay=*/0, /*outbounds_fanout_tx_relay=*/0);
}
}
```
increases coverage of the units tests to check the cache_size check in ```IsFanoutTarget``` .
(https://github.com/bitcoin/bitcoin/pull/30116#discussion_r1667753973)
```suggestion
for (int i = 1; i < inbound_peers; ++i) {
// create 4000 random wtxids so that FANOUT_TARGETS_PER_TX_CACHE_SIZE is breached
for (int j = 0; j < 4000; ++j) {
tracker.ShouldFanoutTo(Wtxid::FromUint256(frc.rand256()), i,
/*inbounds_fanout_tx_relay=*/0, /*outbounds_fanout_tx_relay=*/0);
}
}
```
increases coverage of the units tests to check the cache_size check in ```IsFanoutTarget``` .
💬 tdb3 commented on issue "init: torcontrol argument should be validated":
(https://github.com/bitcoin/bitcoin/issues/23589#issuecomment-2212578536)
Looks like we currently allow lookups for the `-torcontrol` host based on the state of `-dns` (default 1) (see below). If we don't want to change that (e.g. to disallow allow hostnames, and allow only IP addresses), we're somewhat limited in what would be considered invalid for `-torcontrol`. I'm thinking this gets clarified this first.
https://github.com/bitcoin/bitcoin/blob/bd5d1688b4311e21c0e0ff89a3ae02ef7d0543b8/src/torcontrol.cpp#L151
(https://github.com/bitcoin/bitcoin/issues/23589#issuecomment-2212578536)
Looks like we currently allow lookups for the `-torcontrol` host based on the state of `-dns` (default 1) (see below). If we don't want to change that (e.g. to disallow allow hostnames, and allow only IP addresses), we're somewhat limited in what would be considered invalid for `-torcontrol`. I'm thinking this gets clarified this first.
https://github.com/bitcoin/bitcoin/blob/bd5d1688b4311e21c0e0ff89a3ae02ef7d0543b8/src/torcontrol.cpp#L151
💬 sipa commented on pull request "cluster mempool: cluster linearization algorithm":
(https://github.com/bitcoin/bitcoin/pull/30126#discussion_r1667779316)
No, the serializer is serializing the `this` object; the `rebuild` object represents what the deserializer already knows when it has deserialized up to whatever has been serialized already.
(https://github.com/bitcoin/bitcoin/pull/30126#discussion_r1667779316)
No, the serializer is serializing the `this` object; the `rebuild` object represents what the deserializer already knows when it has deserialized up to whatever has been serialized already.
👍 hodlinator approved a pull request: "random: add benchmarks and drop unnecessary Shuffle function"
(https://github.com/bitcoin/bitcoin/pull/30396#pullrequestreview-2161950964)
ACK 6ecda04fefad980872c72fba89844393f5581120
Thanks for taking the time to add valuable context!!
(I read up a bit, understand you implemented a 32-bit version of Daniel Lemire's nearly divisionless algo. Nice).
> I did notice that `std::shuffle` is a bit slower than the custom Shuffle on my system too (AMD 5950X, GCC 13.2)
Maybe worth adding some nuance to the PR summary which currently says "now that it appears that standard library `std::shuffle` beats it.".
### # `rand64()` invo
...
(https://github.com/bitcoin/bitcoin/pull/30396#pullrequestreview-2161950964)
ACK 6ecda04fefad980872c72fba89844393f5581120
Thanks for taking the time to add valuable context!!
(I read up a bit, understand you implemented a 32-bit version of Daniel Lemire's nearly divisionless algo. Nice).
> I did notice that `std::shuffle` is a bit slower than the custom Shuffle on my system too (AMD 5950X, GCC 13.2)
Maybe worth adding some nuance to the PR summary which currently says "now that it appears that standard library `std::shuffle` beats it.".
### # `rand64()` invo
...
💬 cobratbq commented on pull request "doc: external-signer, example 'createwallet' RPC call requires eight argument":
(https://github.com/bitcoin/bitcoin/pull/30354#issuecomment-2212614821)
> Do you happen to know which pull request added a new positional argument in the middle? Or was this instruction always broken? cc @achow101
I don't know. I'm new to this. However, in working with this documentation, it seems that there are more discrepancies or inaccuracies. It seems now the account-ID in `getdescriptors` is provided using `--account 0`, for example.
I'm currently going through the bitcoin-core sources, to figure out where my other mistakes are, because the _external-sig
...
(https://github.com/bitcoin/bitcoin/pull/30354#issuecomment-2212614821)
> Do you happen to know which pull request added a new positional argument in the middle? Or was this instruction always broken? cc @achow101
I don't know. I'm new to this. However, in working with this documentation, it seems that there are more discrepancies or inaccuracies. It seems now the account-ID in `getdescriptors` is provided using `--account 0`, for example.
I'm currently going through the bitcoin-core sources, to figure out where my other mistakes are, because the _external-sig
...
💬 cobratbq commented on pull request "doc: external-signer, example 'createwallet' RPC call requires eight argument":
(https://github.com/bitcoin/bitcoin/pull/30354#discussion_r1667792505)
New version pushed with `blank`, `avoid_reuse` and `load_on_startup` removed.
(https://github.com/bitcoin/bitcoin/pull/30354#discussion_r1667792505)
New version pushed with `blank`, `avoid_reuse` and `load_on_startup` removed.
💬 pythcoiner commented on pull request "descriptors: Be able to specify change and receiving in a single descriptor string":
(https://github.com/bitcoin/bitcoin/pull/22838#issuecomment-2212989535)
I'm not sure if it's the intended behavior, but passing the same `NUM` twice (here `<1;1>`) to `importdescriptors` does not fail and ends up with a single change descriptor:
```python
self.log.info("Multipath descriptors: duplicate NUM")
self.nodes[1].createwallet(wallet_name="multipath3", descriptors=True, blank=True)
w_multipath = self.nodes[1].get_wallet_rpc("multipath3")
self.test_importdesc({"desc": descsum_create(f"wpkh({xpriv}/<1;1>/*)"),
"active": True,
...
(https://github.com/bitcoin/bitcoin/pull/22838#issuecomment-2212989535)
I'm not sure if it's the intended behavior, but passing the same `NUM` twice (here `<1;1>`) to `importdescriptors` does not fail and ends up with a single change descriptor:
```python
self.log.info("Multipath descriptors: duplicate NUM")
self.nodes[1].createwallet(wallet_name="multipath3", descriptors=True, blank=True)
w_multipath = self.nodes[1].get_wallet_rpc("multipath3")
self.test_importdesc({"desc": descsum_create(f"wpkh({xpriv}/<1;1>/*)"),
"active": True,
...
💬 andrewtoth commented on pull request "Don't empty dbcache on prune flushes: >30% faster IBD":
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667968459)
Ok, but is there no C++-fu incantation that can get us from a `std::pair<const COutpoint, CCoinsCacheEntry>*` to a `CCoinsMap::iterator`? I can't figure out a way to get that to compile but I'm sure there's a way using `void*` or something.
(https://github.com/bitcoin/bitcoin/pull/28280#discussion_r1667968459)
Ok, but is there no C++-fu incantation that can get us from a `std::pair<const COutpoint, CCoinsCacheEntry>*` to a `CCoinsMap::iterator`? I can't figure out a way to get that to compile but I'm sure there's a way using `void*` or something.