
Video Content Recognition Systems
Attrasoft
Image and Video Recognition Experts
Attrasoft
P. O. Box 13051
Savannah, GA. 31406
USA
Phone: (912) 484-1717
January 2008
© Attrasoft 1998 - 2008
5 Performance (Default Setting)
5.1 Library Fingerprint Conversion Time
5.2 Unknown Video Fingerprint Conversion Time
7.2 Accuracy and Speed Trade Off
9.4 Evaluation Criteria and Outputs
Attrasoft provides products and services to monitor both video and image content. The Attrasoft pattern
matching algorithms search for matches within the content. This document systematically describes
Attrasoft’s content-based video recognition systems, which can be used to
identify video clips, movies, and television content by analyzing digital video
files. Attrasoft has robust content recognition technologies that use video
fingerprinting techniques.
Piracy is running rampant on the Internet.

Figure 1.1. Copyrighted material on YouTube.com.
Accurate automated identification of digital content is important in various applications,
including content owners’ anti-piracy efforts, Internet Service
Providers’ anti-piracy efforts, and media companies’ need to collect
critical market data for strategic planning.
There are several approaches to video
identification, including human labor, digital watermarking, keyword search, and
content-based recognition, which is further divided into audio and video
content recognition.
Human labor is
very good at detecting some extraordinary variations including:
Human labor,
however, is inadequate to handle the problem on a large scale because of three
factors: error, cost, and scalability. Beyond a certain limit, typically a few
hundred videos, humans have difficulty identifying video clips
accurately. The human approach has a long processing time per unit, and its cost
grows exponentially with the amount of video content. The human approach to
detecting video piracy also does not scale well. These problems (error, cost,
and scalability) make the human solution unfavorable in the long run.
A digital watermark is a signal embedded
into digital data (audio, video, images, and text) that can be detected or
extracted later. It is mostly used to insert copyright information into the data.
Using watermarking and/or
forensic marking requires modification of the original content or files. Video
identification based on digital watermarks can be used only in some simple
cases because a digital
watermark can be altered or removed. An identification system based on
something that content users can easily alter is not very reliable.
Keyword search requires attaching keywords to
image frames in a video. It is generally accepted that it takes 3 human hours
to process one hour of video for about 10 keywords. If what you want is not included
in the keywords, the search will fail. Also, this 3-to-1 ratio requires a large
amount of labor hours. Once again, this approach does not scale.
This leads to
a natural conclusion: an automatic content-based identification system (audio, video, images, and text) will be required to identify content.
The leading audio technology providers are limited by their approach: most cannot identify clips shorter than 5 seconds. Audio fingerprinting also does not work for modified video. For example, an NFL video clip with a music track laid over it will not be detected by audio alone. Therefore, while audio technology is an important part of the solution, it is simply not complete.
Attrasoft
issues this report for potential vendors that require video content
recognition systems to identify video content by analyzing digital files.
Attrasoft’s video fingerprinting technology is currently in large-scale
production and has been proven in real-world environments.
The
Attrasoft solution is an automated content identification platform based on
internally developed image and pattern recognition algorithms. Attrasoft converts digital video content to
fingerprints and performs fast, accurate, and scalable video content
recognition via the fingerprints.
Attrasoft video recognition
technology is rooted in Attrasoft image recognition technology. Video
identification is easier than image identification because a video supplies many
image frames to match rather than a single image. To
download the demo software used in this white paper, go to: http://attrasoft.com.
This white
paper intends to provide sufficient information for a full assessment of the
Attrasoft Technology against the anticipated requirements for video content
matching.
This white
paper also provides the Attrasoft Video Content Matching product road map, so
the reader can anticipate the next generation of Attrasoft products. Attrasoft
will adopt a phased approach to systematically improve the fingerprint
technology to meet specific marketplace requirements.
First of all, digital videos are
converted into digital fingerprints; see the following Figure:


Figure 2.1. Digital videos are converted
into digital fingerprints. (Diagram labels: Digital Contents, Fingerprints.)
A fingerprint
consists of a set of digital attributes computed from a video clip. Attrasoft’s
competitive advantage lies in its proprietary algorithms, which both:
(1) Create fingerprints, and
(2) Match fingerprints.
Later, when an
unknown video needs to be identified, it is converted into a
fingerprint at that time and then matched against the fingerprint library.

Figure 2.2. An unknown video is converted
into a fingerprint and then matched against the fingerprint library. (Diagram output: Yes/No?)
Attrasoft
video recognition technology is rooted in Attrasoft image recognition
technology. An image is converted into a set of attributes, called an image
signature. A video is decomposed into a set of images, which are in turn
converted into image signatures. The collection of these image signatures
forms a video fingerprint.
Image Signature
An Image Signature is a collection of digital attributes for an image.
Video Fingerprint
A Video Fingerprint is a collection of digital attributes for a video.
The Attrasoft video fingerprint consists of a set of image signatures computed from individual image frames. The computation depends on the time interval at which image signatures are computed and added to the video fingerprint.
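The structure described above (frames, image signatures, video fingerprint, library match) can be illustrated with a minimal sketch. Attrasoft's signature and matching algorithms are proprietary and are not described in this paper, so the sketch substitutes a simple brightness-threshold bit signature and a Hamming-distance comparison. The names `image_signature`, `video_fingerprint`, `matches`, and the `max_bit_errors` parameter are hypothetical.

```python
from typing import List, Sequence

# A grayscale frame: rows of pixel values in the range 0-255.
Frame = Sequence[Sequence[int]]


def image_signature(frame: Frame) -> int:
    """Stand-in signature (NOT Attrasoft's): one bit per pixel,
    set when the pixel is brighter than the frame's mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def video_fingerprint(frames: Sequence[Frame]) -> List[int]:
    """A video fingerprint is the collection of its image signatures."""
    return [image_signature(f) for f in frames]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two signatures."""
    return bin(a ^ b).count("1")


def matches(unknown: List[int], library_fp: List[int],
            max_bit_errors: int = 1) -> bool:
    """Assumed matching rule: every unknown signature must be close
    to at least one signature in the library fingerprint."""
    return all(
        any(hamming(u, s) <= max_bit_errors for s in library_fp)
        for u in unknown
    )


# Tiny 2x2 "frames" used purely for illustration.
frame_a = [[10, 200], [220, 30]]
frame_b = [[250, 240], [5, 15]]
library_fp = video_fingerprint([frame_a, frame_b])

# A slightly altered copy of frame_a, and an unrelated frame.
altered_copy = video_fingerprint([[[12, 190], [210, 40]]])
unrelated = video_fingerprint([[[0, 0], [255, 255]]])

print(matches(altered_copy, library_fp))  # True
print(matches(unrelated, library_fp))     # False
```

The library fingerprint is computed once and stored, while the unknown video is fingerprinted only at query time, which mirrors the two-stage flow in Figures 2.1 and 2.2.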
Sampling Interval/Sampling Rate
The Sampling Interval is the time between consecutive image signatures that are computed and added to the video fingerprint. The Sampling Rate is the number of image signatures per unit time that are computed and added to the video fingerprint.
The Sampling
Intervals are different for the video library and the unknown video. In
general, an unknown video does not require as many image signatures as the
library video.
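As a small worked example of these two definitions, the sketch below treats the sampling rate as the reciprocal of the sampling interval and counts the signatures produced for a clip of a given length. The interval values (0.5 s for library videos, 2.0 s for unknown videos) are hypothetical and are not Attrasoft defaults. The counting rule (one signature at t = 0 and one at each interval until the video ends) is likewise an assumption.

```python
import math


def sampling_rate(interval_s: float) -> float:
    """Image signatures per second for a sampling interval in seconds."""
    if interval_s <= 0:
        raise ValueError("sampling interval must be positive")
    return 1.0 / interval_s


def signature_count(duration_s: float, interval_s: float) -> int:
    """Signatures taken at t = 0, interval, 2*interval, ... before the end."""
    if interval_s <= 0:
        raise ValueError("sampling interval must be positive")
    return math.ceil(duration_s / interval_s)


# Hypothetical intervals, chosen only for illustration.
LIBRARY_INTERVAL_S = 0.5  # denser sampling for library videos
UNKNOWN_INTERVAL_S = 2.0  # sparser sampling for unknown videos

library_count = signature_count(60.0, LIBRARY_INTERVAL_S)
unknown_count = signature_count(60.0, UNKNOWN_INTERVAL_S)

print(sampling_rate(LIBRARY_INTERVAL_S))  # 2.0 signatures per second
print(library_count, unknown_count)       # 120 30
```

With these assumed values, a 60-second unknown clip carries a quarter as many signatures as the same clip in the library, which is consistent with the statement that an unknown video needs fewer signatures.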
Attrasoft
provides products including software, complete systems, and developer tools.
For those interested in the Attrasoft core technology, ImageFinder for Windows is off-the-shelf application software that enables system integrators, solution developers, and individuals to quickly test their own image recognition ideas.
Attrasoft VideoFinder
is stand-alone software that provides video content recognition.
Attrasoft VideoFinder is currently available and can be purchased from
Attrasoft.com.

Figure 2.3. Attrasoft VideoFinder is
currently available and can be obtained via email from Attrasoft.com.
Attrasoft KeepWatch
is a stand-alone system, which consists of both software and hardware for
content recognition.
Developers
interested in obtaining the video fingerprint component can take a look at the
Attrasoft TransApplet:
Information on the VideoFinder, the off-the-shelf product:
http://attrasoft.com/videofinder70/help/
Information on the VideoFinder demo:
http://attrasoft.com/videofinder70/index.htm
Information on the ImageFinder, the off-the-shelf product:
http://www.imagequery.net/imagefinder70/
Information on the TransApplet, the off-the-shelf product:
http://attrasoft.com/oldsite/transapplet70/
To order, go to:
http://attrasoft.com/oldsite/gs.html
The Attrasoft Technology provides highly accurate identification of video files, with low rates of misidentified content (false acceptance) and unidentified content (false rejection).
The foundation of video identification is image
identification: a video is broken into image frames, and image
identification is performed on those frames. Attrasoft Image Identification technology is currently in
large-scale production.
a. False-Positives (content not in the identifier database incorrectly identified).
The False-Positives rate of Attrasoft Technology for large libraries of movie and
television content is expected to be less than 0.1%, if no variant versions are involved.
b. False-Negatives (content in the identifier database not identified).
The False-Negatives rate is expected to be less than 0.1%, if no variant versions are involved.
c. Attrasoft solution accuracy is not affected by:
(i) The size of the content library: for example, 10,000 video clips are just as good as 1,000 video clips;
(ii) The resolution and quality of the digital video file: for example, high compression is just as good as low compression; and
(iii) The length of video available for analysis: for example, a 5-second clip is just as good as a 30-second clip.
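The two error rates defined in (a) and (b) can be computed from a set of labeled test queries, as in the sketch below. The paper does not state which denominators it uses, so the sketch assumes the False-Positives rate is taken over queries whose content is not in the database and the False-Negatives rate over queries whose content is in the database. The function name `error_rates` and the sample outcomes are hypothetical and are not measured results.

```python
from typing import List, Tuple

# Each outcome is (content_is_in_database, system_reported_a_match).
Outcome = Tuple[bool, bool]


def error_rates(outcomes: List[Outcome]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate).

    False positive: content NOT in the database, but identified.
    False negative: content in the database, but NOT identified.
    """
    not_in_db = [o for o in outcomes if not o[0]]
    in_db = [o for o in outcomes if o[0]]
    false_pos = sum(1 for _, identified in not_in_db if identified)
    false_neg = sum(1 for _, identified in in_db if not identified)
    fp_rate = false_pos / len(not_in_db) if not_in_db else 0.0
    fn_rate = false_neg / len(in_db) if in_db else 0.0
    return fp_rate, fn_rate


# Made-up outcomes for illustration only.
sample = (
    [(True, True)] * 3      # in database, identified
    + [(True, False)]       # in database, missed (false negative)
    + [(False, False)] * 4  # not in database, rejected
    + [(False, True)]       # not in database, identified (false positive)
)

fp_rate, fn_rate = error_rates(sample)
print(fp_rate, fn_rate)  # 0.2 0.25
```

Checking an expected rate below 0.1% with this kind of measurement would require at least several thousand test queries in each group.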
When a video
has multiple versions to start with, it may have multiple fingerprints. If a
variant version can be identified, then no new fingerprint will be required. If
a variant version cannot be identified, then a new fingerprint will be
required. Generally speaking, adding the fingerprint of the variant version
to the master library will increase the accuracy.
The Attrasoft solution identifies the time
section of a video that the recognized video segment matches. By default, the
position of the identified video segment is specified to a resolution of one second.
The Attrasoft solution offers options for
tuning the Technology to decrease the rate of False-Positives, potentially at
the expense of more False-Negatives, and vice versa. In some applications, a False-Negative is more acceptable than a
False-Positive; in others, a False-Negative is less acceptable
than a False-Positive.
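One common way such a trade-off arises is through a match threshold on a similarity score. The sketch below is an illustration under that assumption. The paper does not say which options Attrasoft exposes, and the scores, the threshold values, and the `count_errors` function are all hypothetical.

```python
from typing import List, Tuple

# Each query is (similarity_score, content_is_in_database).
ScoredQuery = Tuple[float, bool]


def count_errors(queries: List[ScoredQuery],
                 threshold: float) -> Tuple[int, int]:
    """Accept a query as a match when score >= threshold.

    Returns (false_positives, false_negatives) at that threshold.
    """
    false_pos = sum(1 for score, in_db in queries
                    if score >= threshold and not in_db)
    false_neg = sum(1 for score, in_db in queries
                    if score < threshold and in_db)
    return false_pos, false_neg


# Made-up scores for illustration only.
queries = [
    (0.95, True), (0.80, True), (0.65, True),
    (0.70, False), (0.40, False), (0.20, False),
]

for threshold in (0.5, 0.75, 0.9):
    print(threshold, count_errors(queries, threshold))
# 0.5 (1, 0)
# 0.75 (0, 1)
# 0.9 (0, 2)
```

Raising the threshold removes the false positive but begins to reject genuine matches, so the appropriate setting depends on which error is more costly in a given application.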
Attrasoft Technology does not require
changes to digital video files.
While it is the intention of Attrasoft to support all common video codecs for a wide range of resolutions and bit rates, the current version of the Attrasoft VideoFinder supports:
*.avi,
*.wmv, and
limited *.mpeg (*.mpg).
Attrasoft will
adopt a phased approach to gradually support all common video codecs.
The unsupported files include:
*.mp4
*.mp3
*.mov
*.wma
*.flv
*.ac3
some *.mpg, *.mpeg
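A pre-check against these lists can save a failed conversion attempt. The sketch below is a hypothetical helper, not part of VideoFinder. It transcribes the extension lists above, reads the second supported format as Windows Media Video (.wmv), and reports MPEG files as "limited" because only some of them are supported.

```python
from pathlib import Path

# Extension sets transcribed from the lists above (helper is hypothetical).
SUPPORTED = {".avi", ".wmv"}
LIMITED = {".mpg", ".mpeg"}  # only some MPEG files are supported
UNSUPPORTED = {".mp4", ".mp3", ".mov", ".wma", ".flv", ".ac3"}


def support_level(filename: str) -> str:
    """Classify a file by extension: supported, limited, unsupported, unknown."""
    ext = Path(filename).suffix.lower()
    if ext in SUPPORTED:
        return "supported"
    if ext in LIMITED:
        return "limited"
    if ext in UNSUPPORTED:
        return "unsupported"
    return "unknown"


for name in ("trailer.AVI", "episode.mpg", "clip.flv", "notes.txt"):
    print(name, support_level(name))
# trailer.AVI supported
# episode.mpg limited
# clip.flv unsupported
# notes.txt unknown
```

An extension check is only a first filter, since the codec inside a container determines whether a file can actually be decoded.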