txt2srt

txt2srt is a tiny language that applies functions to a series of images in order to create special effects. It also creates SRT subtitles from a textual description.

Let me describe it via some simple examples.

T: 00:00:00,000
S: Kadath. By Eidon, Eidon@tutanota.com.
S: Original images produced by povray. All rights reserved.

T: 00:00:10,700
S: Not much to say here

T: 00:00:22,000
...

Lines such as these describe subtitles: each T: line gives the time at which a block starts, and each S: line gives one line of subtitle text. For instance, if I run txt2srt and type the text above, the following is written to the file subtitles.srt:

1
00:00:00,000 --> 00:00:10,700
Kadath. By Eidon, Eidon@tutanota.com.
Original images produced by povray. All rights reserved.

2
00:00:10,700 --> 00:00:22,000
Not much to say here
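
The mapping from T:/S: lines to SRT entries can be sketched in a few lines of Bash and awk. This is only an illustration of the rule visible in the example (a block ends where the next T: line begins, and the last T: line only closes the previous block); it is not the flex-based implementation, and the function name stxt_to_srt is made up for this sketch. Skipping blocks that have no S: lines is also an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads T:/S: lines on stdin, writes SRT entries to stdout.
stxt_to_srt() {
    awk '
        /^T: / {
            # A new timestamp closes the previous block, if it had any text.
            if (start != "" && text != "") {
                n++
                printf "%d\n%s --> %s\n%s\n", n, start, $2, text
            }
            start = $2      # this timestamp opens the next block
            text = ""
            next
        }
        /^S: / {
            # Collect the subtitle text, dropping the "S: " prefix.
            text = text substr($0, 4) "\n"
        }
    '
}
```

With the input shown earlier saved in a file, stxt_to_srt < thatfile prints the two numbered entries listed above.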

One could conclude that txt2srt is yet another subtitles description language. Well, it is, though it is more than that. Again, let me show this through examples:

IMAGE: 012223333333
FORMAT: %04d
TYPE: png
FPS: 23.976

T: 00:00:00,000
S: Kadath. By Eidon, Eidon@tutanota.com.
S: Original images produced by povray. All rights reserved.
CMD: echo cp $image output1/$image

T: 00:00:10,700
S: Not much to say here.
CMD: echo cp $image output2/$image

T: 00:00:22,000
...

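The above tells txt2srt the following:

  1. IMAGE is the common prefix of the frame file names.
  2. FORMAT is the printf-style format of the frame number that follows the prefix.
  3. TYPE is the file extension of the frames.
  4. FPS is the frame rate, which is used to convert the T: timestamps into frame numbers.
  5. Each CMD is the command to run on every frame of its block; $image stands for the file name of the current frame.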

This time, the output of txt2srt is twofold: subtitles are once more generated and end up in subtitles.srt; in addition, a Bash shell script, genscript.sh, is created. That script includes two loops, which execute echo cp $image output1/$image and echo cp $image output2/$image on each frame of block 1 and block 2, respectively. As a result, once the echoed commands are run, the frames of block i are stored in directory outputi.

So far, nothing spectacular. The next example is a little more interesting:

IMAGE: 012223333333
FORMAT: %04d
TYPE: png
FPS: 23.976

T: 00:00:00,000
S: Kadath. By Eidon, Eidon@tutanota.com.
S: Original images produced by povray. All rights reserved.
CMD: echo cp $image output/

T: 00:00:10,700
S: Not much to say here
CMD: echo convert $image -fill '"rgba(0,0,0,1)"' -colorize ${param}% output/$image
FROM: 1
TO: 80

T: 00:00:22,000
...

A first difference with the previous example is given by the two directives FROM and TO: they define the initial and the final value of a floating point number, which varies linearly in the loop that processes the frames of the current block.

As was already the case in the previous example, the second block above spans the time interval [10.700'', 22''], which lasts 11.3 seconds, or 11.3 * 23.976 ≈ 271 frames; more precisely, frames 0122233333330257.png ... 0122233333330527.png, with 527 - 257 = 270 steps between the first and the last one. Variable $param varies linearly from 1 to 80 with a step equal to 79/270, namely 0.292593. At the beginning of the processing loop $param is equal to 1.0, and at its end it is equal to 80.0. Note that $param is used in the CMD command to control the colorize parameter of the ImageMagick utility convert. This means that all images in the block get gradually colorized (in this case, darkened).
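
As a side note, the arithmetic in this paragraph can be reproduced with a short Bash sketch. The helper names ts_to_frame and param_step are hypothetical, and rounding to the nearest frame is an assumption: it happens to give the frame numbers 257 and 527 quoted above, but txt2srt itself may compute them differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert HH:MM:SS,mmm into a frame number at a given FPS.
# Assumption: the result is rounded to the nearest frame.
ts_to_frame() {
    LC_ALL=C awk -v ts="$1" -v fps="$2" 'BEGIN {
        split(ts, p, "[:,]")
        secs = p[1] * 3600 + p[2] * 60 + p[3] + p[4] / 1000
        printf "%d\n", int(secs * fps + 0.5)
    }'
}

# Hypothetical helper: per-frame increment of $param between FROM and TO.
param_step() {
    LC_ALL=C awk -v from="$1" -v to="$2" -v first="$3" -v last="$4" 'BEGIN {
        printf "%.6f\n", (to - from) / (last - first)
    }'
}

first=$(ts_to_frame 00:00:10,700 23.976)   # 257
last=$(ts_to_frame 00:00:22,000 23.976)    # 527
param_step 1 80 "$first" "$last"           # prints 0.292593
```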

This is the corresponding loop being created:

da=257
a=527
step="0.292593"
param=$da
from=1
to=80
for i in `seq $da $a` ; do
    param=$(bc -l <<< "$from + ($i - $da) * $step")
    iparam=$(round $param)
    n=$(printf '%04d' $i)
    image="012223333333${n}.png"
    echo convert $image -fill '"rgba(0,0,0,1)"' -colorize ${param}% output/$image
done

Inspecting the loop reveals that another variable is being updated: $iparam. That is simply the rounded value of $param, which is useful in commands expecting integer values, such as the -region parameter of ImageMagick's convert.
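
The loop calls round, which is not a standard shell utility, so genscript.sh presumably defines it somewhere. The following is only a guess at what such a helper could look like (round half away from zero, implemented with awk); the actual definition may differ.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the round helper used by the generated loop.
# Rounds a decimal number to the nearest integer, halves away from zero.
round() {
    LC_ALL=C awk -v x="$1" 'BEGIN {
        if (x >= 0) printf "%d\n", int(x + 0.5)
        else        printf "%d\n", -int(-x + 0.5)
    }'
}

round 1.292593     # prints 1
round 79.707407    # prints 80
```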


Now let me summarize the above examples through the following input:

IMAGE: 012222234
FORMAT: %04d
TYPE: png
FPS: 23.976

T: 00:00:00,000
S: No special effects here
S: Original images produced by povray
CMD: echo cp $image output/

T: 00:00:10,700
S: Here I start gradually darkening the pictures
S: The degree of darkening gradually grows from 1% to 81%
CMD: echo convert $image -fill '"rgba(0,0,0,1)"' -colorize ${param}% output/$image
FROM: 1
TO: 81

T: 00:00:22,000
S: Here a focus region shrinks from top to bottom, while the scene
S: gets un-darkened from 81% to about 10%
CMD: echo convert $image -region 1280x720+0+$((720-iparam)) -fill '"rgba(0,0,0,1)"' -colorize $((10 + iparam / 10))% output/$image
FROM: 719
TO: 1

T: 00:00:42,000
S: FLASH! No special effects again
S: for four seconds
CMD: echo cp $image output/

T: 00:00:46,000
S: And here I redden the scene, from 1% to 95%
S: for 14 seconds; and then I stop
CMD: echo convert $image -fill '"rgba(255,0,0,1)"' -colorize ${iparam}% output/$image
FROM: 1
TO: 95

T: 00:00:60,000

The above has been applied to the following 1 minute of 23.976fps frames: Input test video

The result is as follows: Output video

The example is also available in the file examples.stxt.
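
The third block of the example relies on Bash integer arithmetic inside the CMD line. The sketch below evaluates the same two expressions for a few values of $iparam, to show how the region offset and the colorize percentage move together; region_args is a hypothetical name used only here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the -region and -colorize arguments that the
# third block's CMD produces for a given integer $iparam.
region_args() {
    local iparam=$1
    # Integer division: iparam / 10 drops the remainder.
    echo "-region 1280x720+0+$((720 - iparam)) -colorize $((10 + iparam / 10))%"
}

region_args 719   # prints -region 1280x720+0+1 -colorize 81%
region_args 360   # prints -region 1280x720+0+360 -colorize 46%
region_args 1     # prints -region 1280x720+0+719 -colorize 10%
```

This matches the subtitle text: with FROM: 719 and TO: 1 the darkening goes from 81% down to 10%.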

Compilation instructions

The code requires flex. If you don't have it, execute sudo apt-get install flex.

Type make. The provided Makefile will invoke flex and gcc with the proper flags.

How to run txt2srt

If the frames that constitute your video and file examples.stxt are in the current directory, do as follows:

txt2srt < examples.stxt 

As already mentioned, this creates the subtitles in subtitles.srt and the Bash shell script genscript.sh. It also creates the Bash script output/ffmpeg.sh, which builds the video from the output frames and subtitles.srt. Now execute:

./genscript.sh > process-frames.sh
chmod +x process-frames.sh
./process-frames.sh
cd output
./ffmpeg.sh 

Let me explain a little what the above five lines do:

  1. genscript.sh “unrolls” its loops and creates a script that executes the “right” command on each input frame. The script is called process-frames.sh.
  2. That script is made executable...
  3. ...and is executed. This populates the output folder with output frames.
  4. We enter directory output
  5. and execute in there the ffmpeg.sh script.

The latter is a one-liner that invokes ffmpeg as follows:

ffmpeg -y -framerate 23.976000 -pattern_type glob $threads -i '012223333333*.png' $threads -c:v libx265 -vf subtitles=subtitles.srt -b:v 2500k -an -r 23.976000 -pix_fmt yuv420p output.mp4

That's it. The output video is called output.mp4.
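
If you run these steps often, they can be chained in a small wrapper that stops at the first failure. This is just a convenience sketch: run_pipeline is a hypothetical function, not something shipped with txt2srt, and it assumes that txt2srt is on your PATH and that the frames are in the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the commands shown above.
# Usage: run_pipeline examples.stxt
run_pipeline() {
    local input=$1
    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    # Each step runs only if the previous one succeeded.
    txt2srt < "$input" &&
    ./genscript.sh > process-frames.sh &&
    chmod +x process-frames.sh &&
    ./process-frames.sh &&
    ( cd output && ./ffmpeg.sh )   # subshell: the caller's directory is unchanged
}
```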

Where can I find it?

Here!

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.