cc @taigrr
This PR will _hopefully_ help fix some critical issues seen in the real world, where several [Yarn.social](https://yarn.social) pods running [yarnd](https://git.mills.io/yarnsocial/yarn) have sometimes come back up after a power failure or crash with an empty `config.json`, an empty `meta.json`, or both!
I'm not actually sure how this can arise, and as yet I haven't been able to reproduce it (_I can only assume these are failure cases outside of our control_); but in any case the application and database are recoverable by simply running `rm config.json` and/or `rm meta.json`.
So this PR turns errors from loading the config and metadata into first-class, exported error types that consumers of the library can use to perform automated recovery without requiring human intervention.
Basically, in this case it's no big deal that we lost the database config or metadata; we can simply carry on.
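To illustrate the kind of automated recovery described above, here is a minimal sketch. The type names `BadConfigError` and `BadMetadataError` and the helper `damagedFile` are hypothetical stand-ins defined locally for the example; the error types this PR actually exports may be named and shaped differently.

```go
package main

import (
	"errors"
	"fmt"
)

// BadConfigError and BadMetadataError are hypothetical stand-ins for the
// exported error types this PR describes; the real names may differ.
type BadConfigError struct{ Err error }

func (e *BadConfigError) Error() string { return fmt.Sprintf("bad config: %v", e.Err) }
func (e *BadConfigError) Unwrap() error { return e.Err }

type BadMetadataError struct{ Err error }

func (e *BadMetadataError) Error() string { return fmt.Sprintf("bad metadata: %v", e.Err) }
func (e *BadMetadataError) Unwrap() error { return e.Err }

// damagedFile maps an open error to the file a consumer could remove
// before retrying. It reports false for errors it does not recognise.
func damagedFile(err error) (string, bool) {
	var cfgErr *BadConfigError
	var metaErr *BadMetadataError
	switch {
	case errors.As(err, &cfgErr):
		return "config.json", true
	case errors.As(err, &metaErr):
		return "meta.json", true
	}
	return "", false
}

func main() {
	// Simulate the error a consumer might see after a crash left an
	// empty config.json behind.
	cause := errors.New("unexpected end of JSON input")
	openErr := fmt.Errorf("opening database: %w", &BadConfigError{Err: cause})

	if name, ok := damagedFile(openErr); ok {
		fmt.Printf("would remove %s and retry\n", name)
	}
}
```

Matching on the error type with `errors.As` keeps working even when the error has been wrapped by intermediate layers, which is why a typed error is easier to act on than a string comparison.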
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/241
Co-authored-by: James Mills <james@mills.io>
Co-committed-by: James Mills <james@mills.io>
Co-authored-by: James Mills <1290234+prologic@users.noreply.github.com>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: Tai Groot <tai@taigrr.com>
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/160
Co-authored-by: James Mills <james@mills.io>
Co-committed-by: James Mills <james@mills.io>
Added Sift and ScanSift functions for review without tests (for now)
fix docstrings
Added tests for Sift and ScanSift
Note: this also fixes a bug in the Scan() function where the read lock (RWMutex) was not taken, allowing a potential race condition
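The Scan() locking fix noted above can be pictured with a minimal sketch. The `store` type here is a hypothetical, map-backed stand-in written only for illustration; it is not bitcask's index or its actual Scan() signature.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// store is a hypothetical stand-in for the database, used only to show
// the locking pattern.
type store struct {
	mu   sync.RWMutex
	keys map[string]struct{}
}

func newStore() *store {
	return &store{keys: make(map[string]struct{})}
}

// Put takes the write lock because it mutates the key set.
func (s *store) Put(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.keys[key] = struct{}{}
}

// Scan calls f for every key with the given prefix, in sorted order.
// The read lock is held for the whole iteration so that a concurrent
// Put cannot mutate the key set mid-scan. f must not call Put, or it
// would deadlock waiting for the write lock.
func (s *store) Scan(prefix string, f func(key string) error) error {
	s.mu.RLock()
	defer s.mu.RUnlock()

	var matched []string
	for k := range s.keys {
		if strings.HasPrefix(k, prefix) {
			matched = append(matched, k)
		}
	}
	sort.Strings(matched)

	for _, k := range matched {
		if err := f(k); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	s := newStore()
	s.Put("user:1")
	s.Put("user:2")
	s.Put("post:1")

	_ = s.Scan("user:", func(key string) error {
		fmt.Println(key)
		return nil
	})
}
```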
Closes #231
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/232
Co-authored-by: Tai Groot <tai@taigrr.com>
Co-committed-by: Tai Groot <tai@taigrr.com>
Fixes #228
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/229
Co-authored-by: James Mills <james@mills.io>
Co-committed-by: James Mills <james@mills.io>
[ADDED] new tests for TTL expiration race condition, see #216
[REMOVED] removes cleanup / automatic expiration from get() function to resolve #216
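A read path that also deletes expired keys has to mutate shared state, which is one plausible way a race like the one in #216 could arise. The sketch below is an assumption-laden illustration rather than the project's code: `store`, `errKeyExpired`, `errKeyNotFound` and `removeExpired` are hypothetical names, and expiry is modelled as plain Unix seconds to keep the example deterministic.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Hypothetical sentinel errors for this sketch.
var (
	errKeyNotFound = errors.New("key not found")
	errKeyExpired  = errors.New("key expired")
)

type entry struct {
	value     string
	expiresAt int64 // Unix seconds
}

type store struct {
	mu   sync.RWMutex
	data map[string]entry
}

func newStore() *store {
	return &store{data: make(map[string]entry)}
}

func (s *store) put(key, value string, expiresAt int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = entry{value: value, expiresAt: expiresAt}
}

// get only reads: an expired key is reported but not deleted, so the
// read lock is sufficient and get never races with other readers.
func (s *store) get(key string, now int64) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	e, ok := s.data[key]
	if !ok {
		return "", errKeyNotFound
	}
	if now >= e.expiresAt {
		return "", errKeyExpired
	}
	return e.value, nil
}

// removeExpired is the separate cleanup step. It takes the write lock
// and returns how many keys it deleted.
func (s *store) removeExpired(now int64) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for k, e := range s.data {
		if now >= e.expiresAt {
			delete(s.data, k)
			removed++
		}
	}
	return removed
}

func main() {
	s := newStore()
	now := time.Now().Unix()
	s.put("session", "abc", now+60)

	_, err := s.get("session", now+120)
	fmt.Println(err)
	fmt.Println(s.removeExpired(now + 120))
}
```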
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/227
Co-authored-by: Tai Groot <tai@taigrr.com>
Co-committed-by: Tai Groot <tai@taigrr.com>
Supersedes #219 after rebasing on master following the migration off GitHub.
Co-authored-by: Nicolò Santamaria <nicolo.santamaria@protonmail.com>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: Tai Groot <taigrr@noreply@mills.io>
Reviewed-on: https://git.mills.io/prologic/bitcask/pulls/224
Co-authored-by: James Mills <prologic@noreply@mills.io>
Co-committed-by: James Mills <prologic@noreply@mills.io>
* Add test case for Locking after Merge
* retain lock file after merge
* remove replacing lock file (not needed)
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
Co-authored-by: yash <yash.chandra@grabpay.com>
* add failing test case to highlight the race condition bug
note: the test "TestLock" is non-deterministic; its outcome depends
on the sequence of instructions yielded by the Go scheduler on each run.
There are two values, "goroutines" and "succesfulLockCount", which can
be edited to see how the test performs.
With the committed values, respectively "20" and "50", I had a 100% failure rate on
my local machine, running Linux (Ubuntu 20.04).
Sample test output:
$ go test . -run TestLock
--- FAIL: TestLock (0.17s)
lock_test.go:91: [runner 14] lockCounter was > 1 on 5 occasions, max seen value was 2
lock_test.go:91: [runner 03] lockCounter was > 1 on 2 occasions, max seen value was 3
lock_test.go:91: [runner 02] lockCounter was > 1 on 3 occasions, max seen value was 3
lock_test.go:91: [runner 00] lockCounter was > 1 on 1 occasions, max seen value was 2
lock_test.go:91: [runner 12] lockCounter was > 1 on 7 occasions, max seen value was 3
lock_test.go:91: [runner 01] lockCounter was > 1 on 8 occasions, max seen value was 2
lock_test.go:91: [runner 04] lockCounter was > 1 on 6 occasions, max seen value was 4
lock_test.go:91: [runner 13] lockCounter was > 1 on 1 occasions, max seen value was 2
lock_test.go:91: [runner 17] lockCounter was > 1 on 4 occasions, max seen value was 2
lock_test.go:91: [runner 10] lockCounter was > 1 on 3 occasions, max seen value was 2
lock_test.go:91: [runner 08] lockCounter was > 1 on 6 occasions, max seen value was 2
lock_test.go:91: [runner 09] lockCounter was > 1 on 4 occasions, max seen value was 2
lock_test.go:91: [runner 05] lockCounter was > 1 on 1 occasions, max seen value was 2
lock_test.go:91: [runner 19] lockCounter was > 1 on 3 occasions, max seen value was 3
lock_test.go:91: [runner 07] lockCounter was > 1 on 4 occasions, max seen value was 3
lock_test.go:91: [runner 11] lockCounter was > 1 on 9 occasions, max seen value was 2
lock_test.go:91: [runner 15] lockCounter was > 1 on 1 occasions, max seen value was 3
lock_test.go:91: [runner 16] lockCounter was > 1 on 1 occasions, max seen value was 3
FAIL
FAIL github.com/prologic/bitcask 0.176s
FAIL
* flock: create a wrapper module, local to bitcask, around gofrs.Flock
the racy TestLock has been moved to bitcask/flock
* flock: add test for expected regular locking behavior
* flock: replace gofrs/flock with local implementation
* update go.sum
* Add build constraint for flock_unix.go
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* new merge approach
* code refactor
* comment added
* isMerging flag added to allow 1 merge operation at a time
* get api modified. merge updated (no recursive read locks)
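The "one merge at a time" guard from the list above might look roughly like the sketch below. Only the `isMerging` name comes from the commit message; the `db` type, the `merge` signature and `errMergeInProgress` are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errMergeInProgress is a hypothetical sentinel for this sketch.
var errMergeInProgress = errors.New("merge already in progress")

type db struct {
	mu        sync.Mutex
	isMerging bool
}

// merge runs work unless another merge is already running. The lock is
// held only while checking and flipping the flag, never during the
// long-running work itself, so other operations are not blocked.
func (d *db) merge(work func()) error {
	d.mu.Lock()
	if d.isMerging {
		d.mu.Unlock()
		return errMergeInProgress
	}
	d.isMerging = true
	d.mu.Unlock()

	// Always clear the flag, even if work panics.
	defer func() {
		d.mu.Lock()
		d.isMerging = false
		d.mu.Unlock()
	}()

	work()
	return nil
}

func main() {
	d := &db{}
	err := d.merge(func() {
		// A second merge attempted while the first is running is rejected.
		fmt.Println(d.merge(func() {}))
	})
	fmt.Println(err)
}
```

Returning an error instead of blocking lets the caller decide whether to retry later or skip the merge.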
Co-authored-by: yash <yash.chandra@grabpay.com>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* live backup first commit
* exclude lock file in backup
* create path if not exist for backup
Co-authored-by: yash <yash.chandra@grabpay.com>
Co-authored-by: James Mills <prologic@shortcircuit.net.au>
* Add configuration options for FileMode
Add two additional configuration values, and their corresponding default values:
* DirFileModeBeforeUmask - Dir FileMode is used on all directories created. DefaultDirFileModeBeforeUmask is 0700.
* FileFileModeBeforeUmask - File FileMode is used on all files created, except for the "lock" file (managed by the Flock library). DefaultFileFileModeBeforeUmask is 0600.
When using these bits of configuration, keep in mind that these FileMode values are set BEFORE any umask rules are applied. For example, if the user's umask is 022, setting DirFileModeBeforeUmask to 0777 will result in directories with FileMode set to 0755 (this umask prevents the write bit from being applied to group and world permissions).
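The umask arithmetic described above can be checked with a few lines of Go. `effectiveMode` is a hypothetical helper written for this example, not part of bitcask; it only reproduces the bit-clearing the operating system performs when a file or directory is created.

```go
package main

import (
	"fmt"
	"os"
)

// effectiveMode returns the permission bits that remain after the
// umask has been applied to the requested mode: every bit set in the
// umask is cleared from the request.
func effectiveMode(requested, umask os.FileMode) os.FileMode {
	return requested &^ umask
}

func main() {
	const umask = os.FileMode(0022)

	// Requesting 0777 for a directory under umask 022 yields 0755.
	fmt.Printf("%04o\n", uint32(effectiveMode(0777, umask)))

	// The defaults mentioned above are unaffected by umask 022.
	fmt.Printf("%04o\n", uint32(effectiveMode(0700, umask)))
	fmt.Printf("%04o\n", uint32(effectiveMode(0600, umask)))
}
```

Note the leading zero on the literals: in Go, `0777` is octal, whereas `777` would be the decimal number and produce unintended permission bits.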
* moving defer statements after checking for errors
use os.ModePerm const instead of os.FileMode(777)
* fix spelling/grammar
* skip these tests for Windows as they appear to break - Windows is less POSIX-y than it claims
* ignore "lock" file for default case too -- this was incorrectly passing before including this, as my local dev station has umask 022
* internal/data: comment exported functions
* internal/data: make smaller codec exported api surface
* make key and value size serialization bubble up to everything
* Makefile setup & go mod tidy
* Add Unit Test for testing a corrupted config
* Add Unit Test for testing errors from .Stats()
* Refactor Datafile into an interface and add Unit Tests for testing Merge() errors
* Refactor indexer into an interface and add Unit Tests for .Close() errors
* Add Unit Tests for .Delete() errors
* Add Unit Tests for testing Put/Get errors
* Add Unit Test for testing Open errors (bad path for example)
* Refactor out bitcask.writeConfig
* Add more tests for config errors
* Add unit test for options that might error
* Add more test cases for close errors
* Add test case for rotating datafiles
* Fix a possible data race in .Stats()
* Add test case for checksum errors
* Add test case for Sync errors with Put and WithSync enabled
* Refactor and use testify.mock for mocks and generate mocks for all interfaces
* Refactor TestCloseErrors
* Refactored TestDeleteErrors
* Refactored TestGetErrors
* Refactored TestPutErrors
* Refactored TestMergeErrors and fixed a bug with .Fold()
* Add test case for Scan() errors
* Apparently only Scan() can return nil Node()s?
* bitcask/codec_index: check key and data sizes
* codec_index: tests for key and data size overflows
* codec_index: simplify internal funcs for unused returns
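As an illustration of what checking key and data sizes while decoding can mean, here is a minimal sketch. The wire layout (a 4-byte big-endian length prefix), the function names and the sentinel errors are all assumptions made for this example; they are not taken from bitcask's codec_index.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors for this sketch.
var (
	errKeySizeTooLarge = errors.New("key size exceeds configured maximum")
	errTruncated       = errors.New("buffer shorter than declared size")
)

// encodeKey writes a 4-byte big-endian length prefix followed by the key.
func encodeKey(key []byte) []byte {
	buf := make([]byte, 4+len(key))
	binary.BigEndian.PutUint32(buf, uint32(len(key)))
	copy(buf[4:], key)
	return buf
}

// decodeKey validates the declared size before trusting it: first
// against the configured maximum, then against the bytes actually
// available, so a corrupted prefix cannot cause an out-of-range read.
func decodeKey(buf []byte, maxKeySize uint32) ([]byte, error) {
	if len(buf) < 4 {
		return nil, errTruncated
	}
	size := binary.BigEndian.Uint32(buf)
	if size > maxKeySize {
		return nil, errKeySizeTooLarge
	}
	if uint64(len(buf)-4) < uint64(size) {
		return nil, errTruncated
	}
	return buf[4 : 4+size], nil
}

func main() {
	key, err := decodeKey(encodeKey([]byte("hello")), 64)
	fmt.Println(string(key), err)

	// A prefix claiming far more bytes than the limit allows is rejected.
	_, err = decodeKey([]byte{0xff, 0xff, 0xff, 0xff}, 64)
	fmt.Println(err)
}
```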