move WAL tailing code from Prometheus to TSDB WAL package #606
Generally I think moving all the record-related code to a record package makes a lot of sense. I would try to limit code in that package to just types/methods directly interacting with a record, though. Left a couple more specific comments around things you might be able to move out.
Thanks for all the responses, they helped me get my head around all this! As you said, the amount of code in the root package can be problematic to wrap your head around. My main concern with the current state of this PR is that the After thinking about this for a couple hours:
I am definitely curious what you and the maintainers think!
As I am not very familiar with WAL tailing yet, I am still wrapping my head around what the
I will take a better look before the weekend hits.
I've moved the tombstones code to its own package. Unfortunately keeping
I like the tombstone move. I think you could remove the
What is the problem you are trying to solve with this PR?
@krasi-georgiev the code for tailing/watching the WAL currently lives in the remote write package in prometheus/prometheus. We think it makes more sense for it to live in the WAL package, which is in the tsdb repo right now. Unfortunately the WAL tailing code relies on a couple of types that live in the root
Hm, I see.
Did another pass tonight, generally I am liking the refactor and beginnings of pulling things out of the base package.
@csmarchbanks thanks for the review and for the sync.Pool suggestion.
Not familiar with WAL Watcher, but other changes look fine.
Starting to look good from my perspective! Left another handful of comments, mostly pretty minor.
@csmarchbanks @codesome could use another review when you have some time
A couple more comments.
wal/watcher.go (outdated)
	SeriesReset(int)
}

type watcherMetrics struct {
I would prefer if this were WatcherMetrics to make it easier to pass around elsewhere.
wal/watcher.go (outdated)
}

// NewWatcher creates a new WAL watcher for a given WriteTo.
func NewWatcher(reg prometheus.Registerer, logger log.Logger, name string, writer WriteTo, walDir string) *Watcher {
This should take a WatcherMetrics, not a registerer, otherwise creating two watchers with the same registerer would panic.
Both NewWatcherMetrics and NewLiveReaderMetrics call reg.Register, not reg.MustRegister. But I guess the real question here is whether it's reasonable for consumers of the WAL Watcher to have to create and pass the metrics struct rather than just passing a Registerer.
True, reg.Register won't panic, but it also won't register the new metrics, so we would be blind for those queues/after a reload.
I think it is reasonable to have a consumer create and pass the metrics struct, just like NewLiveReader.
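The trade-off discussed in this thread can be sketched with a toy registry. This is a minimal illustration using only the standard library, not the client_golang API: `registry`, `counter`, `newWatcherMetrics`, and `newWatcher` are hypothetical stand-ins. It assumes a registry whose `Register` returns an error on a duplicate instead of panicking, and shows why building the metrics once and passing them to every watcher keeps all watchers visible.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyRegistered mimics the error a registry returns for a duplicate.
var errAlreadyRegistered = errors.New("collector already registered")

// counter and registry are toy stand-ins, not the client_golang types.
type counter struct{ value int }

type registry struct{ collectors map[string]*counter }

func newRegistry() *registry {
	return &registry{collectors: map[string]*counter{}}
}

// Register refuses duplicates by returning an error. The caller's new
// counter is then never exposed through the registry.
func (r *registry) Register(name string, c *counter) error {
	if _, ok := r.collectors[name]; ok {
		return errAlreadyRegistered
	}
	r.collectors[name] = c
	return nil
}

// watcherMetrics holds the metrics shared by every watcher.
type watcherMetrics struct{ recordsRead *counter }

// newWatcherMetrics creates and registers the metrics exactly once.
func newWatcherMetrics(r *registry) (*watcherMetrics, error) {
	m := &watcherMetrics{recordsRead: &counter{}}
	if err := r.Register("records_read_total", m.recordsRead); err != nil {
		return nil, err
	}
	return m, nil
}

type watcher struct {
	name    string
	metrics *watcherMetrics
}

// newWatcher takes the already-registered metrics instead of a registry.
func newWatcher(name string, m *watcherMetrics) *watcher {
	return &watcher{name: name, metrics: m}
}

func (w *watcher) readRecord() { w.metrics.recordsRead.value++ }

func main() {
	reg := newRegistry()
	m, err := newWatcherMetrics(reg)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	// Two watchers share one metrics struct, so both are observable.
	newWatcher("a", m).readRecord()
	newWatcher("b", m).readRecord()
	fmt.Println("records read:", reg.collectors["records_read_total"].value)

	// Registering a second set of metrics fails without panicking, and the
	// second set would stay invisible to the registry.
	_, err = newWatcherMetrics(reg)
	fmt.Println("duplicate rejected:", errors.Is(err, errAlreadyRegistered))
}
```

Passing the metrics struct moves the "register once" decision to the caller, which is the shape the reviewer compares to NewLiveReader.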
wal/watcher.go (outdated)
	maxSegment int
}

func NewWatcherMetrics(reg prometheus.Registerer) *watcherMetrics {
Should these take a consumer and provide an Unregister so that if a consumer disappears it can be removed from the metrics?
We can't Unregister, as all the Watcher metrics are under the same metric and differ only by their labels (WithLabels). Both QueueManager and Watcher Stop delete their metrics via DeleteLabelValues.
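A rough sketch of that cleanup pattern, assuming a single labelled vector shared by all watchers. `gaugeVec` here is a toy stand-in written for this example; its method names only mirror `WithLabelValues` and `DeleteLabelValues`, and the real vector types differ in signature and behavior.

```go
package main

import "fmt"

// gauge and gaugeVec are toy stand-ins for a labelled metric vector.
type gauge struct{ value float64 }

type gaugeVec struct{ children map[string]*gauge }

func newGaugeVec() *gaugeVec {
	return &gaugeVec{children: map[string]*gauge{}}
}

// WithLabelValues returns the child for a label value, creating it if needed.
func (v *gaugeVec) WithLabelValues(name string) *gauge {
	g, ok := v.children[name]
	if !ok {
		g = &gauge{}
		v.children[name] = g
	}
	return g
}

// DeleteLabelValues removes one child and reports whether it existed.
func (v *gaugeVec) DeleteLabelValues(name string) bool {
	_, ok := v.children[name]
	delete(v.children, name)
	return ok
}

// watcher owns one child of the shared vector, keyed by its name.
type watcher struct {
	name           string
	vec            *gaugeVec
	currentSegment *gauge
}

func newWatcher(name string, vec *gaugeVec) *watcher {
	return &watcher{name: name, vec: vec, currentSegment: vec.WithLabelValues(name)}
}

// Stop drops only this watcher's child; the vector itself stays registered.
func (w *watcher) Stop() { w.vec.DeleteLabelValues(w.name) }

func main() {
	vec := newGaugeVec()
	a := newWatcher("queue-a", vec)
	b := newWatcher("queue-b", vec)
	a.currentSegment.value = 3
	b.currentSegment.value = 5

	a.Stop()
	fmt.Println("children left:", len(vec.children))
}
```

Because every watcher writes to the same vector, unregistering the vector would remove the metrics of all watchers, while deleting one label value removes just the stopped one.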
👍
LGTM. I hope I did not miss anything, PR is big.
Forgot to mention I haven't reviewed
Signed-off-by: Callum Styan <[email protected]>
something from Prometheus.
manually creating a checkpoint dir and renaming files.
after refactoring.
refactor.
the live reader, expose WriteTo interface.
thanks @codesome, the only new addition to edit: do we want to merge this before TSDB is moved in prometheus/prometheus? or after, and just apply the diff from here to the other repo?
than creating and registering in the constructor. Signed-off-by: Callum Styan <[email protected]>
LGTM!
It looks good overall - I agree with code movement, thanks for this!
However, I have some suggestions:
- PR is huge. Can we just move things around in one PR and add some extra logic (seriesSample) in another? It will help massively even if it's tedious ): E.g. I don't really understand why this change.
- tombstones package can be simplified due to its nice, clear name.
- Some public methods/functions are missing comments.
@@ -607,7 +608,7 @@ func (c *LeveledCompactor) write(dest string, meta *BlockMeta, blocks ...BlockRe
 	}

 	// Create an empty tombstones file.
-	if _, err := writeTombstoneFile(c.logger, tmp, newMemTombstones()); err != nil {
+	if _, err := tombstones.WriteTombstoneFile(c.logger, tmp, tombstones.NewMemTombstones()); err != nil {
Are we sure tombstones.WriteTombstoneFile needs to be WriteTombstoneFile etc? Why not just tombstones.WriteFile if we have a clear package name?
same for NewMemTombstones
 	err error
 }

-func newCompactionSeriesSet(i IndexReader, c ChunkReader, t TombstoneReader, p index.Postings) *compactionSeriesSet {
+func newCompactionSeriesSet(i IndexReader, c ChunkReader, t tombstones.TombstoneReader, p index.Postings) *compactionSeriesSet {
ditto
-func newCompactionSeriesSet(i IndexReader, c ChunkReader, t tombstones.TombstoneReader, p index.Postings) *compactionSeriesSet {
+func newCompactionSeriesSet(i IndexReader, c ChunkReader, t tombstones.Reader, p index.Postings) *compactionSeriesSet {
	// ErrOutOfBounds is returned if an appended sample is out of the
	// writable time range.
	ErrOutOfBounds = errors.New("out of bounds")

	// ErrAmendSample is returned if an appended sample has the same timestamp
Why moving around?
	//lint:ignore SA6002 safe to ignore and actually fixing it has some performance penalty.
	h.appendPool.Put(b[:0])
}

func (h *Head) getSeriesBuffer() []*memSeries {
do we have to change this while just moving packages around?
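For readers unfamiliar with the buffer pooling touched by this hunk, here is a minimal sketch of the pattern, assuming a `sync.Pool` that stores slices by value. The `head` and `sample` types and the capacity of 512 are illustrative choices for this example, not taken from the diff.

```go
package main

import (
	"fmt"
	"sync"
)

// sample is a simplified stand-in for a WAL sample record.
type sample struct {
	ref uint64
	t   int64
	v   float64
}

// head holds a pool of reusable sample buffers.
type head struct {
	appendPool sync.Pool
}

// getAppendBuffer returns a pooled buffer, or a fresh one if the pool is empty.
func (h *head) getAppendBuffer() []sample {
	if b := h.appendPool.Get(); b != nil {
		return b.([]sample)
	}
	return make([]sample, 0, 512)
}

// putAppendBuffer resets the length and returns the buffer to the pool.
// Putting a slice value (rather than a pointer) boxes the slice header into
// an interface, which is what staticcheck's SA6002 warns about.
func (h *head) putAppendBuffer(b []sample) {
	h.appendPool.Put(b[:0])
}

func main() {
	h := &head{}

	buf := h.getAppendBuffer()
	buf = append(buf, sample{ref: 1, t: 1000, v: 0.5})
	h.putAppendBuffer(buf)

	// The pool may or may not hand back the same backing array, but the
	// buffer it returns is always empty and ready for reuse.
	reused := h.getAppendBuffer()
	fmt.Println(len(reused), cap(reused) > 0)
}
```

A pool gives no guarantee that a stored buffer is returned later, so callers should only rely on getting an empty, usable slice.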
@@ -30,7 +32,7 @@ import (
 	"github.com/prometheus/tsdb/fileutil"
 )

-const tombstoneFilename = "tombstones"
+const TombstoneFilename = "tombstones"
again, with the tombstones package name you don't need the tombstones prefix everywhere, we can simplify (:
@@ -54,8 +71,8 @@ type TombstoneReader interface {
 	Close() error
 }

-func writeTombstoneFile(logger log.Logger, dir string, tr TombstoneReader) (int64, error) {
-	path := filepath.Join(dir, tombstoneFilename)
+func WriteTombstoneFile(logger log.Logger, dir string, tr TombstoneReader) (int64, error) {
As it is exported, it needs a doc comment, e.g. for godoc ): Same for every other public function we add.
This is probably a bit of a pain to review, but moving the WAL watcher code from Prometheus to TSDB (into the WAL package) requires moving some code from the root tsdb package. Specifically the Watcher uses RefSeries and RefSample. Moving those requires a bunch of other code to be moved as well.

If anyone knows a nicer way to make the RefSeries/RefSamples change or has thoughts about the new package organization let me know, none of this is set in stone.

cc @tomwilkie @gouthamve