Component(s)
pkg/stanza
What happened?
Description
Consider the following scenario for fileconsumer:
- During a poll cycle, we read a file and emit all of its logs. Let's say we emit 10 lines.
- Before the next poll cycle begins, the following happens in the background:
  - More logs are written to the file.
  - The file is copied to a path outside the include pattern, and the original is truncated to 0 bytes.
  - More logs are written to the original (now truncated) file. This must be more than 10 lines (this is necessary to trigger the issue).
- The next poll() is called.
After performing the above steps, I noticed that the excess logs we wrote (those beyond the first 10 lines) were emitted twice, i.e. duplicated.
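The scenario above can be replayed with a small in-memory model. This is only a sketch of one plausible explanation, not the fileconsumer implementation: it assumes that a reader left over from the first poll resumes from its saved offset, and that the truncated file is also picked up as a new file and read from the start. `readLinesFrom` and `simulateCopyTruncate` are hypothetical names, and the file contents mirror the test in "Steps to Reproduce".

```go
package main

import (
	"fmt"
	"strings"
)

// readLinesFrom returns the complete lines found in content starting at the
// given byte offset, together with the offset just past the last line read.
// Hypothetical helper for illustration; not part of pkg/stanza.
func readLinesFrom(content string, offset int) ([]string, int) {
	if offset > len(content) {
		return nil, offset
	}
	var lines []string
	rest := content[offset:]
	for {
		i := strings.IndexByte(rest, '\n')
		if i < 0 {
			break
		}
		lines = append(lines, rest[:i])
		rest = rest[i+1:]
		offset += i + 1
	}
	return lines, offset
}

// simulateCopyTruncate replays the scenario and returns every emitted line.
func simulateCopyTruncate() []string {
	var emitted []string

	// First poll: the file holds one line, which is read and emitted.
	file := "testlog1\n"
	lines, offset := readLinesFrom(file, 0)
	emitted = append(emitted, lines...)

	// Between polls: more logs arrive, the file is copied away and
	// truncated, and then grows past the previously saved offset.
	file += "testlog2\n"
	file = "" // truncated to 0 after the copy
	file += "testlog4\ntestlog5\n"

	// Second poll (assumed behaviour): the old reader resumes at its
	// stale offset, which now points into unrelated content...
	lines, _ = readLinesFrom(file, offset)
	emitted = append(emitted, lines...)

	// ...and the file is also treated as new and read from the beginning.
	lines, _ = readLinesFrom(file, 0)
	emitted = append(emitted, lines...)

	return emitted
}

func main() {
	fmt.Println(simulateCopyTruncate())
	// prints: [testlog1 testlog5 testlog4 testlog5]
}
```

Under these assumptions the model emits `testlog5` twice, in the same order the test below waits for.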
Steps to Reproduce
- I used the following test to reproduce the issue.
- Unfortunately, the test passes, which means the duplicated log is actually emitted.
func TestOutOfPattern(t *testing.T) {
	tempDir := t.TempDir()
	cfg := NewConfig()
	cfg.Include = append(cfg.Include, fmt.Sprintf("%s/*.log1", tempDir))
	cfg.StartAt = "beginning"
	operator, emitCalls := buildTestManager(t, cfg)
	operator.persister = testutil.NewMockPersister("test")

	// First poll: the file contains a single line, which is emitted.
	temp := openTempWithPattern(t, tempDir, "*.log1")
	writeString(t, temp, "testlog1\n")
	operator.poll(context.Background())
	waitForToken(t, emitCalls, []byte("testlog1"))

	// Write more logs before the next poll() begins.
	writeString(t, temp, "testlog2\n")

	// Rotate: copy the file to another file that is out of the include pattern.
	temp2 := openTempWithPattern(t, tempDir, "*.log2")
	_, err := temp.Seek(0, io.SeekStart)
	require.NoError(t, err)
	_, err = io.Copy(temp2, temp)
	require.NoError(t, err)

	// Truncate the original file and write new logs past the old offset.
	_, err = temp.Seek(0, io.SeekStart)
	require.NoError(t, err)
	require.NoError(t, temp.Truncate(0))
	writeString(t, temp, "testlog4\ntestlog5\n")

	// Begin the next poll().
	operator.poll(context.Background())

	// INCORRECT: testlog5 should be emitted only once.
	waitForTokens(t, emitCalls, [][]byte{[]byte("testlog5"), []byte("testlog4"), []byte("testlog5")})
}
Expected Result
Each log line should be emitted only once.
Actual Result
Some logs are duplicated: in the test above, testlog5 is emitted twice.
Proposed fix
- A possible fix is to compare the previously recorded fingerprint with the file's current fingerprint, and only emit more logs from the existing reader if they are the same.
- I can work on a PR if that sounds okay.
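The proposed check could look roughly like the sketch below. It is not the actual fileconsumer code: `readerState`, `fingerprint`, `resumeOffset`, and the `fingerprintSize` value are all hypothetical. It also assumes that a fingerprint is the first bytes of the file, and it treats "the same" as a prefix match so that a short file that has only grown still counts as unchanged.

```go
package main

import (
	"bytes"
	"fmt"
)

// fingerprintSize is an assumed value chosen for illustration.
const fingerprintSize = 1000

// readerState is a hypothetical record of what was known about a file
// at the end of the previous poll.
type readerState struct {
	Fingerprint []byte
	Offset      int64
}

// fingerprint returns a copy of the first fingerprintSize bytes of content.
func fingerprint(content []byte) []byte {
	n := len(content)
	if n > fingerprintSize {
		n = fingerprintSize
	}
	fp := make([]byte, n)
	copy(fp, content[:n])
	return fp
}

// resumeOffset returns the saved offset if the file still begins with the
// previously recorded fingerprint, and 0 otherwise (the file was truncated
// or replaced, so it must be read from the start exactly once).
func resumeOffset(prev readerState, content []byte) int64 {
	current := fingerprint(content)
	sameFile := len(prev.Fingerprint) > 0 && bytes.HasPrefix(current, prev.Fingerprint)
	if sameFile && prev.Offset <= int64(len(content)) {
		return prev.Offset
	}
	return 0
}

func main() {
	// State saved after the first poll read "testlog1\n".
	prev := readerState{Fingerprint: fingerprint([]byte("testlog1\n")), Offset: 9}

	// The file only grew: keep reading from the saved offset.
	fmt.Println(resumeOffset(prev, []byte("testlog1\ntestlog2\n"))) // prints 9

	// The file was truncated and rewritten: start over from 0.
	fmt.Println(resumeOffset(prev, []byte("testlog4\ntestlog5\n"))) // prints 0
}
```

With a check like this, the stale offset is discarded after the copy and truncate, so the rewritten file would be read once from the beginning rather than once from the old offset and once from the start.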
Collector version
v0.85.0
Environment information
Environment
OS: (e.g., "Ubuntu 20.04")
Compiler(if manually compiled): (e.g., "go 14.2")
OpenTelemetry Collector configuration
No response
Log output
No response
Additional context
No response