Add cron method to gc LFS MetaObjects (#22385)

This PR adds a task to the cron service to allow garbage collection of
LFS meta objects. As repositories may have a large number of
LFSMetaObjects, an updated column is added to this table and it is used
to perform a generational GC to attempt to reduce the amount of work.
(There may need to be a bit more work here but this is probably enough
for the moment.)

Fix #7045

Signed-off-by: Andrew Thornton <art27@cantab.net>
This commit is contained in:
zeripath
2023-01-16 19:50:53 +00:00
committed by GitHub
parent 04c97aa364
commit 2cc3a6381c
9 changed files with 255 additions and 35 deletions

View File

@ -2213,6 +2213,28 @@ ROUTER = console
;SCHEDULE = @every 168h
;OLDER_THAN = 8760h
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Garbage collect LFS pointers in repositories
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;[cron.gc_lfs]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;ENABLED = false
;; Garbage collect LFS pointers in repositories (default false)
;RUN_AT_START = false
;; Interval as a duration between each gc run (default every 24h)
;SCHEDULE = @every 24h
;; Only attempt to garbage collect LFSMetaObjects older than this (default 7 days)
;OLDER_THAN = 168h
;; Only attempt to garbage collect LFSMetaObjects that have not been attempted to be garbage collected for this long (default 3 days)
;LAST_UPDATED_MORE_THAN_AGO = 72h
; Minimum number of stale LFSMetaObjects to check per repo. Set to `0` to always check all.
;NUMBER_TO_CHECK_PER_REPO = 100
;Check at least this proportion of LFSMetaObjects per repo. (This may cause all stale LFSMetaObjects to be checked.)
;PROPORTION_TO_CHECK_PER_REPO = 0.6
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Git Operation timeout in seconds