Keep all the content related to the user when purging the DB #987

Open
opened 2025-04-02 08:34:19 +13:00 by dtonon · 13 comments
dtonon commented 2025-04-02 08:34:19 +13:00 (Migrated from github.com)

Advanced after implementing https://github.com/mikedilger/gossip/issues/986

When purging the DB, it would be nice to keep all of the user's content related to discussions they have participated in and notifications, as I do in https://github.com/dtonon/chronicle
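
One way to picture the requested rule is a keep-predicate that is checked for each event before it is deleted. The sketch below is only an illustration, not Gossip's or chronicle's actual code: the `Event` shape and the helpers `participated_roots` and `should_keep` are names invented for this example. It keeps the user's own events, events that tag the user with a `p` tag (notifications), and events in threads the user has posted in.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Minimal stand-in for a Nostr event (hypothetical shape for this sketch)."""
    id: str
    pubkey: str
    tags: list = field(default_factory=list)  # e.g. [["e", "<id>"], ["p", "<pubkey>"]]


def referenced_ids(event, tag_name):
    """Return the values of all tags with the given name, e.g. "e" or "p"."""
    return {t[1] for t in event.tags if len(t) >= 2 and t[0] == tag_name}


def participated_roots(events, user):
    """Collect ids of events the user wrote or referenced via "e" tags."""
    roots = set()
    for ev in events:
        if ev.pubkey == user:
            roots.add(ev.id)
            roots |= referenced_ids(ev, "e")
    return roots


def should_keep(event, user, roots):
    """Keep the user's own events, notifications, and thread members."""
    if event.pubkey == user:
        return True
    if user in referenced_ids(event, "p"):
        return True  # someone tagged the user
    if event.id in roots:
        return True  # an event the user replied to or referenced
    return bool(referenced_ids(event, "e") & roots)  # other replies in the same thread
```

A prune pass would first compute `roots` once over the whole store and then delete only events for which `should_keep` returns `False`.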

mikedilger commented 2025-04-03 12:43:38 +13:00 (Migrated from github.com)

This is on unstable. See the new docs/PRUNING.md for pruning instructions.

dtonon commented 2025-04-03 18:06:45 +13:00 (Migrated from github.com)

Great, I will test it today.

dtonon commented 2025-04-03 21:22:47 +13:00 (Migrated from github.com)

> See the new docs/PRUNING.md
> 3. Run `gossip prune_old_events`

Damn, why didn't I RTFM?!
I used the button in the settings; isn't that the same thing?

Now I reopened Gossip, it said that the password was wrong, and after a few attempts it presented me with the wizard.

dtonon commented 2025-04-03 21:27:13 +13:00 (Migrated from github.com)

I tried again. This time it accepted the password but showed me the wizard again, and I'm stuck without any continue button at this step:

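<img width="600" alt="Image" src="https://github.com/user-attachments/assets/1934493f-980c-48aa-be60-f24ea11f8ccb" />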

Maybe I should try to recover from an old backup.

Edit: By pressing Enter I was able to continue. I suspect the continue button was off-screen to the right, because I noticed the same thing in the "follow users" step.

dtonon commented 2025-04-03 21:54:28 +13:00 (Migrated from github.com)

Now the account loads fine.

Some stats.
Before pruning (using the settings button, with 90 days setting):

9580756992 Apr 3 09:59 data.mdb

After pruning:

Database has been pruned. 2782964 events removed.
12647088128 Apr 3 10:08 data.mdb

Something is wrong here: the file grew.

Running `gossip prune_old_events` afterwards removed only 602 more events, so the first pruning did work.

So I ran `mdb_copy` to compact the DB and recovered some space:

6686621696 Apr 3 10:42 data.mdb

These are the stats:

General: 49152 bytes
Events: 1214365696 bytes, 470190 events
Event Index (Author + Kind): 459423744 bytes
Event Index (Kind): 10928128 bytes
Event Index (Tags): 3081912320 bytes
Event Seen on Relay: 1266696192 bytes
Event Viewed: 851968 bytes
Hashtags: 2703360 bytes
Relays: 2539520 bytes
People: 33243136 bytes
Person-Relays: 123797504 bytes
Person-Lists: 114688 bytes
Event Relationships By Id: 149831680 bytes
Event Relationships By Addr: 31784960 bytes
Nip46 Servers: 32768 bytes
Followings: 9666560 bytes
FoF: 2179072 bytes
Handlers: 49152 bytes
Configured Handlers: 49152 bytes

The total (6,390,218,752 bytes) is close to the `ls` output.

The raw events occupy ~20% of the space; all the rest is used by relations.

Are we sure there isn't a bug in purging `Event Index (Tags)`? 2939 MiB seems too much.
I would also check `Event Seen on Relay` (1208 MiB) and `Event Index (Author + Kind)` (438 MiB).
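
To make the proportions easier to check, here is a small sketch that sums the figures exactly as listed above and prints the largest tables as a share of the total. The `shares` helper is just a name chosen for this example. Summed this way the tables come to roughly 6.39 GB, with the tag index near 48% and the raw events near 19%.

```python
# Table sizes in bytes, copied from the stats listed above.
SIZES = {
    "General": 49152,
    "Events": 1214365696,
    "Event Index (Author + Kind)": 459423744,
    "Event Index (Kind)": 10928128,
    "Event Index (Tags)": 3081912320,
    "Event Seen on Relay": 1266696192,
    "Event Viewed": 851968,
    "Hashtags": 2703360,
    "Relays": 2539520,
    "People": 33243136,
    "Person-Relays": 123797504,
    "Person-Lists": 114688,
    "Event Relationships By Id": 149831680,
    "Event Relationships By Addr": 31784960,
    "Nip46 Servers": 32768,
    "Followings": 9666560,
    "FoF": 2179072,
    "Handlers": 49152,
    "Configured Handlers": 49152,
}


def shares(sizes):
    """Return the total size and each table's fraction of that total."""
    total = sum(sizes.values())
    return total, {name: size / total for name, size in sizes.items()}


if __name__ == "__main__":
    total, share = shares(SIZES)
    print(f"total: {total:,} bytes")
    # Show the four largest tables first.
    for name, frac in sorted(share.items(), key=lambda kv: kv[1], reverse=True)[:4]:
        print(f"{name}: {frac:.1%}")
```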

dtonon commented 2025-04-03 22:04:41 +13:00 (Migrated from github.com)

So I lowered the setting to 30 days and ran `gossip prune_old_events`; the DB grew:

7846035456 Apr 3 10:55 data.mdb

So I tried to compress the db again and this helped:

5855723520 Apr 3 10:58 data.mdb

Stats:

General: 49152 bytes

Events: 470106112 bytes, 197508 events
Event Index (Author + Kind): 459423744 bytes
Event Index (Kind): 10928128 bytes
Event Index (Tags): 3081977856 bytes
Event Seen on Relay: 1266696192 bytes
Event Viewed: 360448 bytes
Hashtags: 688128 bytes
Relays: 2539520 bytes
People: 33243136 bytes
Person-Relays: 123813888 bytes
Person-Lists: 114688 bytes
Event Relationships By Id: 66797568 bytes
Event Relationships By Addr: 31784960 bytes
Nip46 Servers: 32768 bytes
Followings: 9650176 bytes
FoF: 2179072 bytes
Handlers: 49152 bytes
Configured Handlers: 49152 bytes

`Event Index (Tags)` (2939.2 MiB) and `Event Seen on Relay` (1208.0 MiB) are essentially the same size as in the previous run, even though the event count dropped from 470190 to 197508. So there is definitely a bug.

Some other suggestions:

  • I would output the stats in MB, with separators, so they are easier to read
  • I would immediately compress the db after the prune
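
As a rough illustration of the first suggestion, the stats could be rendered in MiB with thousands separators. This is a Python sketch for discussion only (Gossip itself is not written in Python), and `format_size` is a made-up helper name.

```python
def format_size(num_bytes):
    """Render a byte count as MiB with thousands separators."""
    return f"{num_bytes / 2**20:,.1f} MiB"


if __name__ == "__main__":
    for name, size in [
        ("Event Index (Tags)", 3081977856),
        ("Event Seen on Relay", 1266696192),
    ]:
        print(f"{name}: {format_size(size)}")
```

Very small tables (a few dozen KiB) would round to 0.0 MiB with one decimal, so a real implementation might switch units for them.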
dtonon commented 2025-04-03 22:12:02 +13:00 (Migrated from github.com)

Another note: all my DMs vanished, so I suppose there is a bug here too.

mikedilger commented 2025-04-04 10:29:42 +13:00 (Migrated from github.com)

I don't know what happened! How did the password not work? Why is it showing you the wizard? How did your DMs vanish? None of that makes any sense to me. Did you save the original?

The button in the settings does the same thing, but it competes with active gossip processes, so it is better to do it from the command line. The button should still work, just more slowly. I'll go ahead and remove them though.

Also you'll get another shrink after you rebuild relationships.

mikedilger commented 2025-04-04 10:40:23 +13:00 (Migrated from github.com)

LMDB space always increases, even when deleting things. It was a design tradeoff to be super fast. You have to `mdb_copy -c` to reclaim space.
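
For anyone scripting this step, a minimal sketch of wrapping `mdb_copy -c` is shown below. The directory paths are placeholders and the helper names are invented for this example; it assumes `mdb_copy` is on the PATH and that the source directory contains `data.mdb`. `mdb_copy` wants the destination to be an empty directory, so the sketch creates it first.

```python
import subprocess
from pathlib import Path


def compact_command(src_dir, dst_dir):
    """Build the mdb_copy invocation; -c omits freed pages while copying."""
    return ["mdb_copy", "-c", str(src_dir), str(dst_dir)]


def compact(src_dir, dst_dir):
    """Write a compacted copy of the LMDB environment in src_dir into dst_dir.

    The caller is expected to swap the directories afterwards, while Gossip
    is not running, and to keep the original as a backup.
    """
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=False)  # must be a fresh, empty directory
    subprocess.run(compact_command(src_dir, dst), check=True)
    return dst / "data.mdb"
```

Calling `compact("path/to/lmdb", "path/to/lmdb-compacted")` leaves the original untouched, which also gives you a backup before the swap.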

mikedilger commented 2025-04-04 10:49:41 +13:00 (Migrated from github.com)

Ok wow my prune code is bad. It has been bad for a long time. It really should not prune certain important kinds of events. I guess I always have newish ones so I never noticed. I'm making fixes.
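
To illustrate what "should not prune certain important kinds" could look like, here is a hedged sketch of an age check with a protected-kinds exception. It is not the fix that went into unstable: the `PROTECTED_KINDS` set and the `is_prunable` helper are assumptions chosen for this example, using well-known Nostr kind numbers.

```python
import time

# Kinds treated as "never prune" in this sketch (an assumption, not Gossip's list):
# 0 = profile metadata, 3 = follow list, 4 = NIP-04 direct message,
# 1059 = NIP-59 gift wrap, 10002 = NIP-65 relay list.
PROTECTED_KINDS = {0, 3, 4, 1059, 10002}

SECONDS_PER_DAY = 86400


def is_prunable(kind, created_at, keep_days, now=None):
    """Return True if an event is old enough to prune and not a protected kind."""
    now = time.time() if now is None else now
    if kind in PROTECTED_KINDS:
        return False
    return created_at < now - keep_days * SECONDS_PER_DAY
```

With a 90-day setting, a 100-day-old text note (kind 1) would be prunable, while a 100-day-old DM (kind 4) would be kept.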

mikedilger commented 2025-04-04 13:40:47 +13:00 (Migrated from github.com)

Ok I think pruning is fixed on unstable. If you still have your pre-pruned database maybe try again if you dare.

dtonon commented 2025-04-04 21:49:09 +13:00 (Migrated from github.com)

> I don't know what happened! How did the password not work? Why is it showing you the wizard? How did your DMs vanish?
> None of that makes any sense to me.

I really don't know!
PS: Only DMs older than 3-4 weeks vanished, so I suppose they are not preserved by the prune.

> Did you save the original?

Stupidly, I did not back up in advance, but I have a one-month-old backup; I will run a test with that one.

mikedilger commented 2025-04-05 09:18:26 +13:00 (Migrated from github.com)

I'm building a command to import the events from a different (backup) LMDB. It works but I'm just refining the commit now.

Reference
nostr/gossip#987