Introduction
The DVD of The IT Crowd "Version 2.0" contains a few surprises besides the familiar features. Base64-encoded subtitles were already known from episode 4 of the first season, except that this season they do not contain the actual subtitles. The l33t subtitles (sid 0x21) are either plain text or data whose use is explained later.
- Title 1
  - EasterEgg1.jpg
- Title 2
  - Barber.z5
- Title 3
  - EggDump
- Title 4
  - SOPHIE
- Title 5
  - EasterEgg3.part.2
In addition, anyone who plays around wildly with the remote control can find a clip of an older BBC programme under title 10 of the DVD, which sheds light on a further hidden surprise. This knowledge can be applied to title 6 of the DVD to extract Hint.tiff and EasterEgg3.part.1.
Besides me, a few other buyers of these DVDs seem to have looked into this as well. They should of course not go unmentioned.
Since those two were only partially successful, and I only came across both of them after my own extraction, I will roughly summarise my approach. This should above all still give people from the United Kingdom and Ireland the chance to take part in the competition before 23 June 2008.
Extracting files
Dump subtitles
- Dump titles 1 to 6

    #! /bin/sh -e
    for title in `seq 1 6`; do
        mplayer dvd://${title} -dumpstream -dumpfile title${title}.vob
    done

- Extract raw subs from vob with transcode

    #! /bin/sh -e
    for title in `seq 1 6`; do
        tccat -i title${title}.vob -L | tcextract -x ps1 -t vob -a 0x21 > title${title}.sub
    done

- Convert to vobsub (needs VTS_01_0.IFO from DVD)

    #! /bin/sh -e
    for title in `seq 1 6`; do
        subtitle2vobsub -o itcrowd${title} -i VTS_01_0.IFO -a 2 < title${title}.sub
    done

- Convert every subtitle to image

    #! /bin/sh -e
    for title in `seq 1 6`; do
        mkdir -p itcrowd${title}
        vobsub2pgm -t 2 itcrowd${title} itcrowd${title}/itcrowd${title}_
        for image in `find itcrowd${title}/*.pgm`; do
            convert ${image} ${image}.pcx && rm ${image}
        done
        mmv "itcrowd${title}/*.pgm.pcx" "itcrowd${title}/#1.pcx"
    done
Decode
The subtitles of episodes 1, 2, 3 and 5 contain base64-encoded files. Tools like base64 from coreutils can decode them, but they only handle text, and the subtitles are stored as images. So get 100 secretaries and let them type each image into a text file. ;-)
I "created" a small program that memorizes every letter I converted manually and gives the same answer whenever it finds "the picture" again. So I had two runs: one to fill the brain of this little dumb OCR software and another to save the result.
If you want to start with an empty brain.db (no, you don't want that... trust me), don't pipe the output anywhere, because the first run is interactive (a full brain.db is inside the archive). Afterwards it is intelligent enough that you can redirect everything into a separate file.
    #! /bin/sh -e
    BRAIN=~/brain.db
    for title in `seq 1 5`; do
        for image in `find itcrowd${title}/*.pcx`; do
            itcrowd_decoder ${BRAIN} ${image} > ${image}.txt
        done
    done
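The "learn once, reuse afterwards" idea behind brain.db can be sketched roughly as follows. This is only an illustration, not the code of itcrowd_decoder: the names `recognise` and `decode_line` are hypothetical, glyphs are assumed to be already cut out of the image as `bytes`, and the brain is a plain dictionary instead of a file.

```python
def recognise(glyph, brain, ask):
    """Return the character for one glyph, asking only for unseen glyphs."""
    key = glyph.hex()  # the raw pixel data serves as the lookup key
    if key not in brain:
        # first encounter: let a human decide and remember the answer
        brain[key] = ask(glyph)
    return brain[key]


def decode_line(glyphs, brain, ask=lambda glyph: input("character? ")):
    """Turn a sequence of glyph bitmaps into text using the shared brain."""
    return "".join(recognise(glyph, brain, ask) for glyph in glyphs)
```

With a filled brain no question is asked any more, which is why the second run can be redirected into a file.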
Merging
It's important to know that some subtitles are duplicates and must be removed before we can continue by merging everything. There are different strategies for that, for example looking at the timestamps: a subtitle is a possible duplicate when there is no gap between the end of the previous subtitle and the start of the new one. I used a small script to do that.
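    #! /usr/bin/env python3
    # Print the first line of every entry that starts exactly when the
    # previous entry ended (a possible duplicate).
    lines = open("itcrowd6_.srtx").readlines()
    i = 0
    cur = ""
    last_end_time = ""
    for line in lines:
        if i % 4 == 0:
            # first line of each 4-line entry
            cur = line
        elif i % 4 == 1:
            # timing line: start, separator, end
            t = line.split()
            if t[0] == last_end_time:
                print(cur.strip())
            last_end_time = t[2]
        i += 1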
- itcrowd1_*.pcx.txt
  - 0125
  - 0140
  - 0236
  - 0433
  - 0494
  - 0538
- itcrowd2_*.pcx.txt
  - 0127
  - 0363
- itcrowd3_*.pcx.txt
  - 0009
- itcrowd5_*.pcx.txt
  - 0002
  - 0187
  - 0378 (needs to be replaced)
  - 0751
  - 0902
- itcrowd6_*.pcx.txt (different tool to decode that into txt files)
  - 0002
  - 0298 (needs to be replaced)
  - 0602
  - 0877
  - 1366
After removing the duplicates and concatenating the remaining files into itcrowd${title}.txt, "base64 -d" can do its job (don't forget to remove the MIME header from the beginning of the file).
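The merge and decode step can also be sketched in Python. This is a minimal sketch under assumptions: the function names are made up for illustration, the pages are expected as a dictionary of page number to recognised text, and the MIME header is assumed to be separated from the base64 body by the first blank line.

```python
import base64


def strip_mime_header(text):
    """Drop everything up to the first blank line (assumed header/body split)."""
    head, sep, body = text.partition("\n\n")
    return body if sep else text


def decode_pages(pages, skip=()):
    """Join the recognised pages in order, leaving out duplicate page numbers."""
    kept = [text for number, text in sorted(pages.items()) if number not in skip]
    merged = strip_mime_header("".join(kept))
    # b64decode silently ignores the line breaks left over from the pages
    return base64.b64decode(merged)
```

For title 1 the `skip` argument would hold the duplicate page numbers listed above, for example `{125, 140, 236, 433, 494, 538}`.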
Inspecting files
EasterEgg1.jpg
- picture of Moss
- Exif comment: You've found an Easter Egg!
- Exif GPS-Position: Pinewood Studios (I.T. Department set)
Barber.z5
The second episode brings us a game for Z-machine interpreters like frotz. My personal walkthrough to get 42/42 points (there are some bugs which make it even harder):
                      .--------.
                      |        |
                      | Toilet |
                      |        |
    .------.----------o----#---o---------------------------------------------.
    |      |          | Toilet |                                             |
    | Lift # Corridor #        # End of Corridor                             |
    |      |          |  Area  |                                             |
    o------o----------o----#---o------------o---------o-----------o----#-----o--------.
                      |        |            | Storage | Moss's    |          | Jen's  |
                      | Toilet | Kitchen    #         #           # Entrance #        |
                      |        |            | Corner  | Desk      |          | Office |
                      o--------o------------o----#----o----- -----o----#-----o--------o
                               | Richmond's | Roy's   |           | Games    |
                               |            #         # Workbench #          |
                               | Room       | Desk    |           | Corner   |
                               o------------o---------o-----------o----------o
- Lift
  - take screwdriver
  - press button
  - wait
  - east
- Basement Corridor
  - east
- The Toilet Area
  - east
- End of Corridor
  - take spectrum
  - check keypad
  - read note
  - type 3584 on keypad
  - south
- Office Entrance
  - take zx80
  - east
- Jen's Office
  - take computer
  - switch off computer
  - switch on computer
  - west
- Office Entrance
  - south
- Games Corner
  - take xbox
  - west
- Workbench
  - take radio
  - north
- Moss's Desk
  - take commodore amiga
  - west
- Storage Corner
  - take bitstik
  - west
- Kitchen
  - east
- Storage Corner
  - west
- Kitchen
  - take pendant
  - check ceiling
  - east
- Storage Corner
  - south
- Roy's Desk
  - west
  - say I am your boss to roy
  - open red door
  - plugh
- Richmond's Room
  - check lights
  - east
- Roy's Desk
  - use computer
  - plugh
- Richmond's Room
  - check lights
  - check lights
  - check lights
  - check lights
  - check lights
  - open red door
  - east
- Roy's Desk
  - east
- Workbench
  - east
- Games Corner
  - north
- Office Entrance
  - open door
  - north
- End of the Corridor
  - west
- The Toilet Area
  - type 5982 on keypad
  - north
- Toilet
  - take two-way radio
  - switch on two-way radio
  - tune two-way radio to 159.65
  - listen to radio
EggDump
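- BBC Master mode 0 image
- can be decoded by using SOPHIE, or converted into a PBM by

    #! /usr/bin/env python3
    # Convert the raw mode 0 screen dump into a plain PBM (P1) image on stdout.
    data = open("EggDump", "rb").read()

    print("P1")
    print("640 512")

    def print_byte(x):
        # emit the 8 pixels of one byte, most significant bit first
        for i in range(0, 8):
            print((x & 0x80) >> 7)
            x = x << 1

    def decode_blockline(x, y, line):
        # 80 character cells per row, 8 bytes per cell: 640 bytes per row
        byte_pos = y * 640 + x * 8 + line
        byte = data[byte_pos]
        # invert, because a set bit means black in PBM
        print_byte(~byte)

    for y in range(0, 32):
        # every scanline is written twice to get 512 rows
        for line in range(0, 16):
            for x in range(0, 80):
                decode_blockline(x, y, line // 2)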
SOPHIE
A small BBC BASIC program, so we need an emulator for BBC machines. I tried BeebEm to simulate the BASIC interpreter and the Acorn hardware. You can load the image with the BASIC program and EggDump from the previous episode, type CHAIN "SOPHIE" and press ENTER to watch a picture of Roy with an Easter egg and hear the IT Crowd melody.
DFS Explorer can be used to create that image (I used the Linux freeware version). It is extremely important to set Load of EGGDUMP to FF3000 and Exec to FF7FFF and to save SOPHIE as a BASIC program (I don't know how to do that without typing everything into the BASIC shell and saying SAVE "SOPHIE").
More about the usage of the BASIC shell can be found in the official User Guide.
EasterEgg3.part.2
After watching the episode again with mplayer, I noticed that subtitle page 378 differs from the one I got from subtitleripper. Even SubRip could not rip the subtitle image (ok, this thing exported nothing as it should). Only a typing monkey could create a working version with the missing bytes filled in.
You cannot use the data until you have extracted EasterEgg3.part.1 from title 6.
Thornes box
This one is very different. No base64-encoded stuff all over the screen, only a small box in the lower right corner. If you have watched the (hidden) title 10 on the DVD, you will (maybe) know what that means. I tried to work out what each line of the box could mean:
| Lines | Meaning |
|---|---|
| 01-02 | Border to mark the start of the box |
| 03-04 | Bit 7 |
| 05-06 | Blank line |
| 07-08 | Bit 6 |
| 09-10 | Blank line |
| 11-12 | Bit 5 |
| 13-14 | Blank line |
| 15-16 | Bit 4 |
| 17-18 | Blank line |
| 19-20 | Bit 3 |
| 21-22 | Blank line |
| 23-24 | Bit 2 |
| 25-26 | Blank line |
| 27-28 | Bit 1 |
| 29-30 | Blank line |
| 31-32 | Bit 0 |
| 33-34 | Blank line |
| 35-36 | Parity bit |
| 37-38 | Border to mark the end of the box |
9 bits is a little bit too much, so the lowest bit is probably a parity bit (I tried to check this by decoding everything, and it looks quite good). It seems to work the same way the overscan does (with an extra empty line after the begin marker and before the end marker). So mplayer -vf crop=1:42:719:38 ../itcrowd6.vob -vo png -nosound can be used to get the "raw" overscan images.
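The line layout from the table can be turned into a byte as sketched below. This is an illustration under assumptions, not the tool used here: `rows` is a hypothetical list of 38 values (one per line of the cropped box, truthy where the line is set), which colour counts as "set" has to be found out by trying, and since the kind of parity is not known for sure, even parity is only a default that can be switched.

```python
def bits_from_rows(rows):
    """Pick bit 7 .. bit 0 and the parity bit out of the 38 box lines."""
    if len(rows) != 38:
        raise ValueError("expected 38 lines")
    # lines 3, 7, ..., 35 (counted from 1) carry data; each is drawn twice
    return [int(bool(rows[2 + 4 * i])) for i in range(9)]


def decode_box(rows, even_parity=True):
    """Return (byte value, parity check result) for one frame."""
    bits = bits_from_rows(rows)
    value = 0
    for bit in bits[:8]:
        value = (value << 1) | bit  # most significant bit comes first
    ones = sum(bits)  # data bits plus parity bit
    parity_ok = (ones % 2 == 0) if even_parity else (ones % 2 == 1)
    return value, parity_ok
```

Frames whose parity check fails are good candidates for a second look in mplayer.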
And again, two bytes are missing. Where have they gone? You can look in the subtitle blob at offset 1002, see a nice message and guess where one byte is missing (0x49 at the beginning of the file). The other missing byte ruins the image file directory at offset 1186, so the missing byte must be in front of it. After some research, I noticed that subtitle2vobsub or vobsub2pgm is buggy, and we can replace itcrowd6_0298.pcx and some other broken exports with the version I saw in mplayer. This one will give us another hint about what to do next.
The overscan has many blank frames and some extra frames to learn the encoding (frames 36-290), which must be removed. After the conversion you will get EasterEgg3.part.1 with some greetings at the beginning of the file and a checksum + file length... wait a second, I am missing a byte again?!?
This time the frame for offset 17708 is missing. Just insert 0x03 there and everything looks fine.
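Patching in such a missing byte does not need a hex editor. A minimal sketch, where `insert_byte` is a hypothetical helper name and the offset is assumed to count from the start of the data you pass in:

```python
def insert_byte(data, offset, value):
    """Return a copy of data with one extra byte placed at offset."""
    if not 0 <= offset <= len(data):
        raise ValueError("offset outside of data")
    return data[:offset] + bytes([value]) + data[offset:]
```

For the case above that would be `insert_byte(part1, 17708, 0x03)`, with `part1` being the bytes decoded from the overscan frames.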
You can cat both parts together (without the comment at the beginning) and get a picture of Jen with her Easter egg. Take a look at it with a tool like exif to get more information out of the image, like the image unique ID. Looks suspicious, doesn't it? This hex string can easily be converted into an ASCII string (use a hex editor if you don't know the hex representation of all ASCII characters). If you have done it right, it should give you a smart hint for the competition: 'A: "Theobromine"'.
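Instead of a hex editor, a few lines of Python do the conversion as well. The function name is made up for illustration; the hex string is the image ID quoted in the competition answers below.

```python
def image_id_to_text(hex_id):
    """Interpret a hex string as ASCII text."""
    return bytes.fromhex(hex_id).decode("ascii")


print(image_id_to_text("413A20225468656F62726F6D696E6522"))
```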
The competition
There is/was a competition on howlonghaveyougot.com (read the terms and conditions and then press "Competition") where you can win some prizes (if you are from the United Kingdom, so I am not allowed to take part).
- Where's Moss's Easter Egg?
  - In the l33t subtitles of season 2 episode 1, or you could say somewhere in the Pinewood Studios (I.T. Department set).
- What's in Jen's Easter Egg?
  - A question "Q: What's unique about this image?" and the answer to the question in its image ID (413A20225468656F62726F6D696E6522). Translated into ASCII it means: A: "Theobromine"
- What's FNmesijos(89246,14164) in hex?
  - C9F90000
- What's the codec talkback frequency?
  - 159.65 (you can find it on the back cover of your DVD box)
- What was locked in the toilet cupboard?
  - Apple Newton PDA
- What is killing fashion?
  - Home sewing is killing fashion (and it's illegal)
- What was Roy's losing poker hand?
- What's your email address?
  - Never saw this in an IT Crowd episode, so it is impossible to answer ;)