The Wayback Machine - http://web.archive.org/web/20201214031458/https://github.com/smacke/ffsubsync
Automagically synchronize subtitles with video.
MIT License
4.7k
stars
176
forks
7
branches
18
tags
Latest commit
smacke
add path warning in readme for windows users
f63d0bf
Dec 6, 2020
Git stats
●
257
commits
Type
Name
Latest commit message
Commit time
.github
change ci workflow name to `master`
May 11, 2020
docs
fix subsync -> ffsubsync in docs index
Jun 6, 2020
ffsubsync
Merge pull request #107 from alucryd/batch
Nov 28, 2020
gui
better version handling when version string not available from git tag
Jun 21, 2020
resources
disable lfs and rm ff* binaries
Jun 14, 2020
scripts
better version handling when version string not available from git tag
Jun 21, 2020
test-data @ 4020413
bump test data submodule
Nov 27, 2020
tests
treat srtin as list in integration tests
Nov 28, 2020
.gitattributes
disable lfs and rm ff* binaries
Jun 14, 2020
.gitignore
support syncing several subs against the same ref
Nov 25, 2020
.gitmodules
add test-data as submodule and add integration testing step to CI
Mar 10, 2020
.readthedocs.yml
Add docs and rebrand project as `ffsubsync` which is available on PyPI
May 11, 2020
.travis.yml
pyinstaller / gooey packaging + remove sklearn and add skearn_shim
Jun 3, 2020
CODE_OF_CONDUCT.md
add code of conduct
Sep 5, 2020
HISTORY.rst
experimental golden section search; change max offset seconds to 60; …
Oct 11, 2020
LICENSE
small edits to readme
Feb 25, 2019
MANIFEST.in
use versioneer for versioning
Jun 8, 2020
Makefile
forgot to use new deploy script in Makefile; also automate rebase of …
Jun 9, 2020
README.md
add path warning in readme for windows users
Dec 6, 2020
pytest.ini
add integration marker and rename end_to_end -> integration
Mar 10, 2020
requirements-dev.txt
skip twine for 3.4
Sep 5, 2020
requirements.txt
use webrtcvad-wheels for windows via environment marker in requiremen…
Sep 23, 2020
setup.cfg
use versioneer for versioning
Jun 8, 2020
setup.py
use webrtcvad-wheels for windows via environment marker in requiremen…
Sep 23, 2020
versioneer.py
use versioneer for versioning
Jun 8, 2020
README.md
FFsubsync




Language-agnostic automatic synchronization of subtitles with video, so that
subtitles are aligned to the correct starting point within the video.
| Turn this: | Into this: |
| (before image) | (after image) |
Helping Development
At the request of some, you can now help cover my coffee expenses using the
GitHub Sponsors button at the top of the repository page, or using the PayPal
Donate button.
Install
First, make sure ffmpeg is installed. On macOS, this looks like:
brew install ffmpeg
(Windows users: make sure ffmpeg is on your path and can be referenced
from the command line!)
Next, grab the script. It should work with both Python 2 and Python 3:
pip install ffsubsync
If you want to live dangerously, you can grab the latest version as follows:
pip install git+https://github.com/smacke/ffsubsync@latest
Usage
ffs, subsync and ffsubsync all work as entrypoints:
ffs video.mp4 -i unsynchronized.srt -o synchronized.srt
There may be occasions where you have a correctly synchronized srt file in a
language you are unfamiliar with, as well as an unsynchronized srt file in your
native language. In this case, you can use the correctly synchronized srt file
directly as a reference for synchronization, instead of using the video as the
reference:
ffsubsync reference.srt -i unsynchronized.srt -o synchronized.srt
ffsubsync uses the file extension to decide whether to perform voice activity
detection on the audio or to directly extract speech from an srt file.
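As a rough illustration of this extension check, the sketch below picks a reference type from the file name. The function name `reference_kind` and the `SUBTITLE_EXTENSIONS` set are hypothetical and are not taken from the ffsubsync source; the actual tool may recognize more formats.

```python
import os

# Hypothetical: extensions treated as subtitle references. The real tool's
# list may differ; only .srt is mentioned in this README.
SUBTITLE_EXTENSIONS = {".srt"}


def reference_kind(path):
    """Return 'subtitles' for an srt reference, otherwise 'video'."""
    _, ext = os.path.splitext(path)
    if ext.lower() in SUBTITLE_EXTENSIONS:
        return "subtitles"  # speech intervals can be read from the cues
    return "video"          # speech must be detected in the audio track
```

With this sketch, `reference_kind("reference.srt")` selects the subtitle path and `reference_kind("video.mp4")` selects the audio path.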
Sync Issues
If the sync fails, there are a few recourses available. The best one to try
first is to specify --vad=auditok as a command line option, since auditok
sometimes works well with ffsubsync in the case of muffled or otherwise
low-quality audio. Auditok does not specifically detect voice, but instead
detects all audio; this can yield suboptimal syncing when a proper VAD would
work well, but it can be effective in some cases.
The next step is to try different values for --max-offset-seconds. By default
ffsubsync runs with --max-offset-seconds=60, since subtitles are unlikely
to be offset by more than 60 seconds in practice, and enforcing this constraint
typically leads to a better outcome. In the rare cases where subtitles are more
egregiously out of sync, increasing this value can help.
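The retries described above can be scripted. The following is a minimal sketch that only assembles command lines from the flags mentioned in this README (`-i`, `-o`, `--vad=auditok`, `--max-offset-seconds`); the helper names `build_ffs_command` and `retry_plan` and the example offset value are assumptions made for illustration.

```python
def build_ffs_command(reference, unsynced, output, vad=None, max_offset_seconds=None):
    """Build an ffs command line as a list of arguments (hypothetical helper)."""
    cmd = ["ffs", reference, "-i", unsynced, "-o", output]
    if vad is not None:
        cmd.append("--vad={}".format(vad))
    if max_offset_seconds is not None:
        cmd.append("--max-offset-seconds={}".format(max_offset_seconds))
    return cmd


def retry_plan(reference, unsynced, output, larger_max_offset):
    """Commands to try in order: defaults, then auditok, then a larger offset."""
    return [
        build_ffs_command(reference, unsynced, output),
        build_ffs_command(reference, unsynced, output, vad="auditok"),
        build_ffs_command(reference, unsynced, output,
                          max_offset_seconds=larger_max_offset),
    ]


# The value 1200 is an arbitrary example, not a recommendation.
plan = retry_plan("video.mp4", "unsynchronized.srt", "synchronized.srt", 1200)
```

Each entry of `plan` can be passed to `subprocess.call`, stopping as soon as the output looks correctly synchronized.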
If the sync still fails, consider trying one of the following similar tools:
●sc0ty/subsync: does speech-to-text and looks for matching word morphemes
●kaegi/alass: rust-based subtitle synchronizer with a fancy dynamic programming algorithm
●tympanix/subsync: neural net based approach that optimizes directly for alignment when performing speech detection
●oseiskar/autosubsync: performs speech detection with bespoke spectrogram + logistic regression
●pums974/srtsync: similar approach to ffsubsync (WebRTC's VAD + FFT to maximize signal cross correlation)
Speed
ffsubsync usually finishes in 20 to 30 seconds, depending on the length of the
video. The most expensive step is actually extraction of raw audio. If you
already have a correctly synchronized "reference" srt file (in which case audio
extraction can be skipped), ffsubsync typically runs in less than a second.
How It Works
The synchronization algorithm operates in 3 steps:
1. Discretize video and subtitles by time into 10ms windows.
2. For each 10ms window, determine whether that window contains speech. This
is trivial to do for subtitles (we just determine whether any subtitle is
"on" during each time window); for video, use an off-the-shelf voice
activity detector (VAD) like
the one built into webrtc.
3. Now we have two binary strings: one for the subtitles, and one for the
video. Try to align these strings by matching 0's with 0's and 1's with
1's. We score these alignments as (# video 1's matched with subtitle 1's) - (#
video 1's matched with subtitle 0's).
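The subtitle side of steps 1 and 2 can be sketched as follows, assuming cue times are already available as integer (start, end) pairs in milliseconds. The function name `subtitles_to_binary` and the input layout are hypothetical; this is not the project's own code.

```python
WINDOW_MS = 10  # window length from step 1


def subtitles_to_binary(cues_ms, total_ms):
    """Return a list of 0/1 flags, one per 10ms window.

    cues_ms: iterable of (start_ms, end_ms) pairs, one per subtitle cue.
    total_ms: length of the timeline to cover, in milliseconds.
    """
    n_windows = total_ms // WINDOW_MS
    speech = [0] * n_windows
    for start_ms, end_ms in cues_ms:
        first = max(0, start_ms // WINDOW_MS)
        last = min(n_windows, end_ms // WINDOW_MS)
        for i in range(first, last):
            speech[i] = 1  # some subtitle is "on" during this window
    return speech


bits = subtitles_to_binary([(20, 50), (80, 100)], total_ms=100)
print(bits)  # [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
```

The video side would produce a list of the same shape from the VAD output, which is what step 3 aligns against.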
The best-scoring alignment from step 3 determines how to offset the subtitles
in time so that they are properly synced with the video. Because the binary
strings are fairly long (360,000 digits per hour of video at 10ms per window), the
naive O(n^2) strategy for scoring all alignments is unacceptable. Instead, we
use the fact that "scoring all alignments" is a convolution operation and can
be implemented with the Fast Fourier Transform (FFT), bringing the complexity
down to O(n log n).
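A minimal numpy sketch of the FFT trick, under the scoring rule from step 3: subtitle 0's are mapped to -1 so that a video 1 matched with a subtitle 0 subtracts from the score. The names `alignment_scores` and `best_offset` are hypothetical, and the real implementation may differ in details such as padding and offset limits.

```python
import numpy as np


def alignment_scores(video, subs):
    """Score every offset of subs against video in O(n log n).

    Returns (offsets, scores). Offset d means subtitle window t lines up
    with video window t + d.
    """
    video = np.asarray(video, dtype=float)
    # Map subtitle bits 1 -> +1 and 0 -> -1, so each video 1 adds or subtracts 1.
    signed = 2.0 * np.asarray(subs, dtype=float) - 1.0
    n = len(video) + len(signed) - 1
    size = 1 << (n - 1).bit_length()  # power of two >= n, avoids wraparound
    # Cross-correlation theorem: corr[k] = sum_t video[t + k] * signed[t].
    spectrum = np.fft.rfft(video, size) * np.conj(np.fft.rfft(signed, size))
    corr = np.fft.irfft(spectrum, size)
    neg = len(signed) - 1  # number of negative offsets, stored at the end of corr
    offsets = np.concatenate([np.arange(len(video)), np.arange(-neg, 0)])
    scores = np.concatenate([corr[:len(video)], corr[size - neg:]])
    return offsets, scores


def best_offset(video, subs):
    """Return the offset (in windows) with the highest alignment score."""
    offsets, scores = alignment_scores(video, subs)
    return int(offsets[np.argmax(scores)])


# Toy example: the subtitle pattern appears 15 windows later in the video.
video = [0] * 50
subs = [0] * 30
for i in list(range(20, 30)) + list(range(35, 40)):
    video[i] = 1
for i in list(range(5, 15)) + list(range(20, 25)):
    subs[i] = 1
print(best_offset(video, subs))  # 15
```

With 10ms windows, an offset of 15 windows corresponds to delaying the subtitles by 0.15 seconds.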
Limitations
In most cases, inconsistencies between video and subtitles occur when starting
or ending segments present in video are not present in subtitles, or vice versa.
This can occur, for example, when a TV episode recap in the subtitles was pruned
from video. FFsubsync typically works well in these cases, and in my experience
this covers >95% of use cases. Handling breaks and splits outside of the beginning
and ending segments is left to future work (see below).
Future Work
Besides general stability and usability improvements, one line
of work aims to extend the synchronization algorithm to handle splits
/ breaks in the middle of video not present in subtitles (or vice versa).
Developing a robust solution will take some time (assuming one is possible).
See #10 for more details.
History
The implementation for this project was started during HackIllinois 2019, for
which it received an Honorable Mention (ranked in the top 5 projects,
excluding projects that won company-specific prizes).
Credits
This project would not be possible without the following libraries:
●ffmpeg and the ffmpeg-python wrapper, for extracting raw audio from video
●VAD from webrtc and the py-webrtcvad wrapper, for speech detection
●srt for operating on SRT files
●numpy and, indirectly, FFTPACK, which powers the FFT-based algorithm for fast scoring of alignments between subtitles (or subtitles and video)
●Other excellent Python libraries like argparse and tqdm, not related to the core functionality, but which enable much better experiences for developers and users.
License
Code in this project is MIT licensed.
About
Automagically synchronize subtitles with video.
Topics
subtitles
video
audio
ffmpeg
vad
fft
synchronization
sync
subtitle
captions
vlc
vlc-media-player
srt
srt-subtitles
voice-activity-detection
speech-detection
fast-fourier-transform
alignment
string-alignment
caption
0.4.8
Latest
Sep 23, 2020
+ 17 releases
Languages
●
Python
97.6%
●
Shell
2.1%
●
Makefile
0.3%