Add custom split logic for scanner #125
79b8135 to 12a84c9
Really nice PR!
Just a few minor changes needed 👍
subtitles.go
Outdated
@@ -927,3 +930,32 @@ func escapeHTML(i string) string {
func unescapeHTML(i string) string {
	return htmlUnescaper.Replace(i)
}

func NewScanner(i io.Reader) *bufio.Scanner {
Unless I'm missing something, I don't see the point of exporting this method, so could you rename it to `newScanner` instead?
Good point. Updated.
subtitles.go
Outdated
	return scanner
}

func splitLines(data []byte, atEOF bool) (advance int, token []byte, err error) {
I don't really see the point of creating an extra `splitLines` function. Since `newScanner` is simple enough, could you add the split function directly as an anonymous function:
scanner.Split(func(data []byte, atEOF bool) (advance int, token []byte, err error) {
...
})
Updated.
subtitles_test.go
Outdated
func TestNewScanner(t *testing.T) {
	exts := []string{"vtt", "srt", "ssa"}
	for _, ext := range exts {
		s, err := astisub.OpenFile("./testdata/example-in-scan-line." + ext)
Could you rename the testdata files to `example-in-carriage-return.(srt|ssa|vtt)` instead?
12a84c9 to c46e2e6
@asticode all comments have been addressed. Let me know if there is anything else. Otherwise, can I get another tag 🙏?
Thanks again for the PR ❤️ I've created the |
This PR provides a custom `bufio.SplitFunc` that works with carriage returns (`\r`), allowing for parsing of files created by Windows software.

Closes #118
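For readers who want a feel for what such a split function can look like, here is a minimal sketch. It is not the code merged in this PR: the body of the anonymous function and the `scanLines` helper are assumptions made for illustration, and the sketch assumes that `\n`, `\r\n`, and a lone `\r` should all end a line. Only `bufio.Scanner`, `bufio.SplitFunc`, and `bytes.IndexAny` are standard library APIs.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// newScanner returns a scanner that treats "\n", "\r\n", and a lone "\r"
// as line terminators. Illustrative sketch only, not the PR's implementation.
func newScanner(r io.Reader) *bufio.Scanner {
	s := bufio.NewScanner(r)
	s.Split(func(data []byte, atEOF bool) (advance int, token []byte, err error) {
		if atEOF && len(data) == 0 {
			return 0, nil, nil
		}
		if i := bytes.IndexAny(data, "\r\n"); i >= 0 {
			if data[i] == '\n' {
				return i + 1, data[:i], nil
			}
			// data[i] is '\r': check whether a '\n' follows it.
			if i+1 < len(data) {
				if data[i+1] == '\n' {
					return i + 2, data[:i], nil
				}
				return i + 1, data[:i], nil
			}
			// '\r' is the last byte currently buffered.
			if atEOF {
				return i + 1, data[:i], nil
			}
			// Ask for more data so a "\r\n" pair is not split in two.
			return 0, nil, nil
		}
		if atEOF {
			// Final line without a terminator.
			return len(data), data, nil
		}
		return 0, nil, nil
	})
	return s
}

// scanLines is a hypothetical helper that collects every line from input.
func scanLines(input string) []string {
	var lines []string
	s := newScanner(strings.NewReader(input))
	for s.Scan() {
		lines = append(lines, s.Text())
	}
	return lines
}

func main() {
	fmt.Printf("%q\n", scanLines("a\rb\r\nc\nd"))
}
```

The standard `bufio.ScanLines` only splits on `\n` (dropping one trailing `\r`), so a file that uses bare `\r` as its line ending would come back as a single line; the sketch above splits it into separate lines instead.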