SCC > SRT error, Domesday LD Capture: AttributeError: 'NoneType' object has no attribute 'append_text' #394
Comments
@valnoel Can you look at this issue in the context of your work improving the SCC reader?
It was brought to my attention that the Japanese character sets do not appear to be present in the SCC codes file, so I imagine this might be difficult to achieve. It was also noted that the content in the example is "two byte unicode"; I am not sure whether that's helpful. Just passing on some information from the Domesday group conversation. Thanks to the maintainers for the assistance!
The SCC reader does not currently support Japanese characters, which do not appear in the CEA-608 specification. It seems an extension was once submitted to the specification, but I don't have any more information about it. Otherwise, it seems CEA-708 introduces Unicode character support, which allows the display of Japanese and other languages. @palemieux What do you think?
OK, I will look at this next week.
@rktcc Can you provide a link to the forum discussion thread? I could not find any specification for carrying arbitrary Unicode characters in SCC.
Hi, I am sorry for the delay. Here is the discussion on ttconv missing Japanese character sets: https://discord.com/channels/665557267189334046/676084498097766451/1140876443719577650
Here is a suggestion that the Norpak Non-Western addition may be what's needed: https://discord.com/channels/665557267189334046/676084498097766451/1141486766579265576
Someone mentions a reference to CEA-608 set 6.4, Table 4, for Asian languages; however, only PRC and (South) Korea are mentioned. https://discord.com/channels/665557267189334046/676084498097766451/1141484827619635240
There's also a thought that it could be CC/Teletext. However, other subtitle content has been extracted from LaserDiscs using the Domesday Duplicator and converted from SCC to plain-text SRT, so my guess is that the Japanese SCC data is in the same format and only the character sets are missing from ttconv.
I hope this is helpful, whether that means closing the ticket due to lack of project support or adding some kind of additional processing. If more info is needed, I can look further. The Discord is free to join; sadly, the discussion is not hosted on an actual forum. Alternatively, the general chat can be joined from IRC, on channel #domesday86 on the https://libera.chat IRC network; you would not need to sign up for Discord in that case, as a bot relays messages each way. Discord Invite: https://github.com/happycube/ld-decode#documentation Thank you again.
I have joined the Discord server. In the meantime, I have spent some quality time staring at the sample file and it does not look like CEA-608 at all, e.g.: Is that noise/errors from the LaserDisc capture? Could it be something totally different like bitmaps?
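One rough way to tell capture noise from valid data is a parity check: CEA-608 transmits each byte with odd parity in the high bit, so a file where many bytes fail that check is probably either corrupted or not CEA-608 at all. The sketch below is not part of ttconv; `has_odd_parity` and `parity_report` are hypothetical helper names, and it assumes the usual SCC layout of a timecode followed by four-hex-digit words.

```python
import re
from typing import Tuple

# Assumed SCC data line layout: timecode (":" or ";" before the frame count),
# whitespace, then one or more 4-hex-digit words (two bytes each).
SCC_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}[:;]\d{2})\s+((?:[0-9a-fA-F]{4}\s*)+)$")


def has_odd_parity(byte: int) -> bool:
    """Return True if the 8-bit value has an odd number of set bits."""
    return bin(byte & 0xFF).count("1") % 2 == 1


def parity_report(scc_text: str) -> Tuple[int, int]:
    """Return (total_bytes, bytes_failing_odd_parity) for the SCC text.

    Lines that do not match the assumed layout (header, blanks) are skipped.
    """
    total = 0
    bad = 0
    for line in scc_text.splitlines():
        match = SCC_LINE.match(line.strip())
        if match is None:
            continue
        for word in match.group(2).split():
            value = int(word, 16)
            # Check the high byte and the low byte of each word separately.
            for byte in (value >> 8, value & 0xFF):
                total += 1
                if not has_odd_parity(byte):
                    bad += 1
    return total, bad


if __name__ == "__main__":
    sample = "Scenarist_SCC V1.0\n\n00:00:00:00\t9420 9420 94ae\n"
    total, bad = parity_report(sample)
    print(f"{bad} of {total} bytes fail odd parity")
```

A high failure ratio would point toward capture errors or a different payload; a low one would suggest the bytes are well-formed and only the character mapping is missing. The check says nothing about which character set the bytes encode.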
head.scc.txt (rename from .txt to .scc since Github didn't like .scc.)
This is an SCC file extracted from a LaserDisc film captured using a Domesday Duplicator. Additionally, this is a Japanese language film.
This issue has occurred in the past with other Domesday captures, but I used https://github.com/atsampson/ttconv until it stopped working, and I can't sort out what changes they made before the updates were merged into 1.0.7.
I wonder whether the capture has errors or is flawed and this is causing the "unsupported characters", or whether it's simply because the Japanese character set is not supported.
Thank you.