Video Content Recognition Systems

Attrasoft White Paper

 

 

 

Attrasoft

Image and Video Recognition Experts

 

 

Attrasoft

P. O. Box 13051

Savannah, GA. 31406

USA

 

http://attrasoft.com

gina@attrasoft.com

Phone:  (912) 484-1717

 

 

January 2008

 

 

 

© Attrasoft 1998 - 2008

Table of Contents

Abstract

1. Introduction

2. Attrasoft Technology

2.1 System Description

2.2 Terminology

2.3 Product Description

2.4 Additional Information

3. System Functionality

3.1 Accuracy

3.2 Variation

3.3 Time Section

3.4 Parameters

3.5 Passive

3.6 Codecs

3.7 Survivability

3.8 Minimal Sample Size

3.9 Hinting

3.10 Robustness

3.11 Content Identifier

3.12 International

3.13 Additional Information

4. Hardware

5. Performance (Default Setting)

5.1 Library Fingerprint Conversion Time

5.2 Unknown Video Fingerprint Conversion Time

5.3 Atomic Matching Speed

5.4 Physical Storage

5.5 RAM Requirements

5.6 Summary

6. Performance (Other Setting)

7. Scalability

7.1 Default Rate

7.2 Accuracy and Speed Trade Off

8. Product Road Map

9. Your Testing Cases

9.1 File Sets

9.2 Functionality Test Cases

9.3 Testing

9.4 Evaluation Criteria and Outputs

 

 


Abstract

 

Attrasoft provides products and services to monitor both video and image content. The Attrasoft pattern matching algorithms search for matches within the content. This document systematically describes Attrasoft’s content-based video recognition systems, which can be used to identify video clips, movies, and television content by analyzing digital video files. Attrasoft has robust content recognition technologies that use video fingerprinting techniques.

 


 

1. Introduction

 

Piracy is running rampant on the Internet.

 

 

 

Figure 1.1. Copyrighted material on YouTube.com.

 

Accurate automated identification of digital content is important in various applications, including the anti-piracy efforts of content owners and Internet Service Providers, as well as media companies’ need to collect critical market data for strategic planning.

There are several approaches to video identification, including human labor, digital watermarking, keyword search, and content-based recognition, which is further divided into audio and video content recognition.

Human labor is very good at detecting some extraordinary variations including:

 

 

Human labor, however, is inadequate for handling the problem on a large scale because of three factors: error, cost, and scalability. Beyond a certain limit, typically a few hundred videos, people have difficulty identifying video clips accurately. The human approach has a long processing time per unit, and its cost grows exponentially with the amount of video content. The human approach to detecting video piracy also does not scale well. These problems (error, cost, and scalability) make the human solution unfavorable in the long run.

A digital watermark is a signal that is embedded into digital data (audio, video, images, and text) and can be detected or extracted later. It is mostly used to embed copyright information in the data. Using watermarking and/or forensic marking requires modification of the original content or files. Video identification based on digital watermarks can be used only in some simple cases because a digital watermark can be altered or removed. An identification system based on something that content users can easily alter is not very reliable.

Keyword search requires attaching keywords to image frames in a video. It is generally accepted that it takes 3 human hours to process one hour of video for about 10 keywords. If what you want is not included in the keywords, the search will fail. Also, this ratio of 3 labor hours to 1 hour of video requires a large amount of labor. Once again, this approach does not scale.

This leads to a natural conclusion: an automatic content-based identification system (audio, video, images, and text) is required to identify content.

 

The leading audio technology providers are limited by their approach: most cannot identify clips shorter than 5 seconds. Audio fingerprinting also does not work for modified video. For example, an NFL video clip whose soundtrack has been replaced by a music track will not be detected by audio alone. Therefore, while audio technology is an important part of the solution, it is not complete on its own.

 

Attrasoft issues this report for potential vendors that require video content recognition systems to identify video content by analysis of digital files. This document systematically describes the ability of Attrasoft’s content recognition systems to identify video clips, movies, and television content by analyzing digital video files. Attrasoft has robust content recognition technologies that use video fingerprinting techniques, which are currently in large-scale production and have been proven in a real-world environment.

The Attrasoft solution is an automated content identification platform based on internally developed image and pattern recognition algorithms. Attrasoft converts digital video content to fingerprints and performs fast, accurate, and scalable video content recognition using those fingerprints.

 

Attrasoft video recognition technology is rooted in Attrasoft image recognition technology. Video identification is easier than image identification because a video provides many image frames to match rather than a single image. To download the demo software used in this white paper, go to: http://attrasoft.com.

 

This white paper intends to provide sufficient information for a full assessment of the Attrasoft Technology against the anticipated requirements for video content matching.

 

This white paper also provides the Attrasoft Video Content Matching product road map, so the reader can anticipate the next generation of Attrasoft products. Attrasoft will adopt a phased approach to systematically improve the fingerprint technology to meet specific marketplace requirements.


 

 

2. Attrasoft Technology

 

2.1 System Description

 

First of all, digital videos are converted into digital fingerprints; see the following Figure:

 

 

 

 

 

 

 

 

 


[Figure: Digital Contents → Fingerprints]

 

Figure 2.1. Digital videos are converted into digital fingerprints.

 

A fingerprint consists of a set of digital attributes computed from a video clip. Attrasoft’s competitive advantage is its proprietary algorithms that will both:

 

(1)   Create fingerprints, and

(2)   Match fingerprints.

 

Later, when an unknown video is to be matched against the library, it is first converted into a fingerprint and then matched against the fingerprint library.

 

 

[Figure: match result, Yes/No?]

 

Figure 2.2. An unknown video is converted into a fingerprint and then matched against the fingerprint library.
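
The two-step flow in Figures 2.1 and 2.2 can be sketched in Python as follows. This is an illustration only: Attrasoft's fingerprinting and matching algorithms are proprietary and are not described in this paper, so the fingerprint representation (a list of short attribute vectors), the distance measure, and the names `attribute_distance`, `match_against_library`, and `threshold` are all assumptions made for the example.

```python
from math import sqrt

# Assumed representation: a fingerprint is a list of attribute vectors,
# each a list of floats. The real Attrasoft format is not public.

def attribute_distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_against_library(unknown, library, threshold=0.1):
    """Return the names of library videos matched by the unknown fingerprint.

    A library video counts as a match when every vector of the unknown
    fingerprint lies within `threshold` of some vector of that video.
    """
    matches = []
    for name, fingerprint in library.items():
        if all(
            min(attribute_distance(vec, ref) for ref in fingerprint) <= threshold
            for vec in unknown
        ):
            matches.append(name)
    return matches

# Hypothetical fingerprint library with two videos.
library = {
    "movie_a": [[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]],
    "movie_b": [[0.0, 0.9], [0.3, 0.7]],
}

# Hypothetical fingerprint of an unknown clip.
unknown_clip = [[0.5, 0.52], [0.88, 0.8]]

result = match_against_library(unknown_clip, library)
print("Yes" if result else "No", result)
```

The library is fingerprinted once in advance, while each unknown video is fingerprinted at query time, which is why only `unknown_clip` would need to be computed per request in a setup like this.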

 

2.2 Terminology

 

 

Attrasoft video recognition technology is rooted in Attrasoft image recognition technology. An image is converted into a set of attributes, called an image signature. A video is decomposed into a set of images, which are in turn converted into image signatures. The collection of these image signatures forms a video fingerprint.

 

 Image Signature

 An Image Signature is a collection of digital attributes for an image.

 

Video Fingerprint

A Video fingerprint is a collection of digital attributes for a video.

 

The Attrasoft video fingerprint consists of a set of image signatures computed from individual image frames. The computation depends on the time interval at which image signatures are computed and added to the video fingerprint.

 

Sampling Interval/Sampling Rate

Sampling Interval is the time between two consecutive image signatures that are computed and added to the video fingerprint. Sampling Rate is the number of image signatures per unit time that are computed and added to the video fingerprint.

 

The Sampling Intervals are different for the video library and the unknown video. In general, an unknown video does not require as many image signatures as the library video.
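
As a rough sketch of how these terms relate, the Python below builds a fingerprint by sampling frames at a fixed interval. The sampling rate is simply the reciprocal of the sampling interval. Everything concrete here is an assumption for illustration: the 30 frames-per-second clip, the 0.5-second and 2-second intervals, the helper names, and the placeholder `mean_intensity` signature, which stands in for Attrasoft's proprietary image signature.

```python
def sampling_rate(interval_seconds):
    """Image signatures per second for a given sampling interval."""
    return 1.0 / interval_seconds

def sampled_frame_indices(frame_count, fps, interval_seconds):
    """Indices of the frames that contribute a signature to the fingerprint."""
    step = max(1, round(fps * interval_seconds))
    return list(range(0, frame_count, step))

def mean_intensity(frame):
    """Placeholder signature: the mean pixel value of one frame."""
    return sum(frame) / len(frame)

def build_fingerprint(frames, fps, interval_seconds, signature_fn=mean_intensity):
    """A video fingerprint as the list of signatures of the sampled frames."""
    indices = sampled_frame_indices(len(frames), fps, interval_seconds)
    return [signature_fn(frames[i]) for i in indices]

# Synthetic 10-second clip at 30 frames per second; each frame has 4 pixels.
fps = 30
frames = [[i % 256] * 4 for i in range(10 * fps)]

# Assumed intervals: the library is sampled more densely than the unknown video.
library_fp = build_fingerprint(frames, fps, interval_seconds=0.5)
unknown_fp = build_fingerprint(frames, fps, interval_seconds=2.0)

print(len(library_fp), "library signatures at", sampling_rate(0.5), "per second")
print(len(unknown_fp), "unknown signatures at", sampling_rate(2.0), "per second")
```

With these assumed values, the same clip yields four times as many signatures for the library as for the unknown video, which illustrates why the two sampling intervals can differ.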

 

2.3 Product Description

 

Attrasoft provides products including software, complete systems, and developer tools.

 

For those who are interested in the Attrasoft core technology, ImageFinder for Windows is an off-the-shelf application that enables system integrators, solution developers, and individuals to quickly test their own image recognition ideas.

 

Attrasoft VideoFinder is stand-alone software that provides video content recognition. Attrasoft VideoFinder is currently available and can be purchased from Attrasoft.com.

 

 

 

Figure 2.3. Attrasoft VideoFinder is currently available and can be obtained via email from Attrasoft.com.

 

 

Attrasoft KeepWatch is a stand-alone system, which consists of both software and hardware for content recognition.

 

Developers interested in obtaining the video fingerprint component can take a look at the Attrasoft TransApplet:

 

 

2.4 Additional Information

 

Information on the VideoFinder, the off-the-shelf product:

 

http://attrasoft.com/videofinder70/help/

 

Information on the VideoFinder demo:

 

http://attrasoft.com/videofinder70/index.htm

 

 

Information on the ImageFinder, the off-the-shelf product:

 

http://attrasoft.com/oldsite/

http://www.imagequery.net/imagefinder70/

 

Information on the TransApplet, the off-the-shelf product:

 

http://attrasoft.com/oldsite/transapplet70/

 

To order, go to:

http://attrasoft.com/oldsite/gs.html

 


 

3. System Functionality

 

The Attrasoft Technology provides highly accurate identification of video files, with low rates of both misidentified content (false acceptance) and unidentified content (false rejection).

 

The foundation of video identification is image identification: a video is broken into image frames, and each frame is identified as an image. Attrasoft Image Identification technology is currently in large-scale production.

 

 

3.1 Accuracy

 

a. False-Positives (content not in the identifier database incorrectly identified).

 

The False-Positive rate of the Attrasoft Technology for large libraries of movie and television content is expected to be less than 0.1%, if no variant versions are involved.

 

b. False-Negatives (content in the identifier database not identified).

 

The False-Negative rate is expected to be less than 0.1%, if no variant versions are involved.

 

c. The accuracy of the Attrasoft solution is not affected by:

 

(i) The size of the content library: for example, 10,000 video clips are just as good as 1,000 video clips;

(ii) The resolution and quality of the digital video file: for example, high compression is just as good as low compression; and

(iii) The length of video available for analysis: for example, a 5-second clip is just as good as a 30-second clip.
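
To make the two rates in items a and b concrete, the Python below computes them from a set of test outcomes, following the definitions given above. The trial counts are invented for illustration and are not Attrasoft measurements; the function name `error_rates` and the 0.1% target constant are likewise assumptions for this sketch.

```python
def error_rates(trials):
    """Compute (false_positive_rate, false_negative_rate).

    Each trial is a pair (in_library, identified):
      in_library  - True if the query content is in the identifier database
      identified  - True if the system reported a match
    """
    outside = [identified for in_library, identified in trials if not in_library]
    inside = [identified for in_library, identified in trials if in_library]
    # False positive: content not in the database, but reported as identified.
    false_positive_rate = sum(outside) / len(outside)
    # False negative: content in the database, but not identified.
    false_negative_rate = sum(not hit for hit in inside) / len(inside)
    return false_positive_rate, false_negative_rate

# Invented outcome counts, for illustration only.
trials = (
    [(False, False)] * 1999   # not in library, correctly rejected
    + [(False, True)] * 1     # not in library, wrongly identified
    + [(True, True)] * 1999   # in library, correctly identified
    + [(True, False)] * 1     # in library, missed
)

TARGET = 0.001  # the 0.1% level quoted above
fp_rate, fn_rate = error_rates(trials)
print(f"false-positive rate {fp_rate:.2%}, within target: {fp_rate < TARGET}")
print(f"false-negative rate {fn_rate:.2%}, within target: {fn_rate < TARGET}")
```

At the 0.1% level, a test set of 2,000 queries per category can contain at most one error of each kind and still stay under the target, so a meaningful evaluation needs thousands of queries.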

 

3.2 Variation

 

When a video has multiple versions to start with, it may have multiple fingerprints. If a variant version can be identified, then no new fingerprint will be required. If a variant version cannot be identified, then a new fingerprint will be required. Generally speaking, adding the fingerprint of the variant version to the master library will increase the accuracy.

 

 

3.3 Time Section

 

The Attrasoft solution identifies the time section of the library video that the recognized video segment matches. By default, the identified time section is specified to a resolution of one second.
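
One plausible way to derive such a time section, assuming the matcher reports the position of the matching image signature within the library fingerprint, is to multiply that position by the library sampling interval and round down to the reporting resolution. The Python below sketches this; the function names, the 0.5-second interval, and the index value are hypothetical and do not describe the actual Attrasoft implementation.

```python
def time_section(signature_index, interval_seconds, resolution_seconds=1.0):
    """Return (start, end) in seconds of the section containing a matched signature.

    signature_index    - position of the matched signature in the library fingerprint
    interval_seconds   - library sampling interval (assumed known)
    resolution_seconds - size of the reported section; one second by default
    """
    offset = signature_index * interval_seconds
    start = (offset // resolution_seconds) * resolution_seconds
    return start, start + resolution_seconds

def format_hms(seconds):
    """Format a whole number of seconds as HH:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

# Hypothetical match: signature number 185 in a library sampled every 0.5 s.
start, end = time_section(185, interval_seconds=0.5)
print(f"matched section: {format_hms(start)} to {format_hms(end)}")
```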

 

3.4 Parameters

 

The Attrasoft solution offers options for tuning the Technology to decrease the rate of False-Positives, potentially at the expense of a higher rate of False-Negatives, and vice versa. In some applications, a False-Negative is more acceptable than a False-Positive; in others, a False-Negative is less acceptable than a False-Positive.
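
The trade-off can be illustrated with a single decision threshold on a match score. The Python below is a generic sketch, not the Attrasoft parameter set: the similarity scores are synthetic, and the name `rates_at_threshold` and the rule "accept when the score is at or above the threshold" are assumptions for the example.

```python
def rates_at_threshold(genuine_scores, impostor_scores, threshold):
    """Return (false_positive_rate, false_negative_rate) for one threshold.

    genuine_scores  - match scores for queries that are in the library
    impostor_scores - best match scores for queries that are not in the library
    A query is accepted as a match when its score is >= threshold.
    """
    false_positives = sum(score >= threshold for score in impostor_scores)
    false_negatives = sum(score < threshold for score in genuine_scores)
    return (
        false_positives / len(impostor_scores),
        false_negatives / len(genuine_scores),
    )

# Synthetic scores, for illustration only.
genuine = [0.95, 0.90, 0.85, 0.70, 0.60]
impostor = [0.10, 0.30, 0.50, 0.65, 0.75]

for threshold in (0.4, 0.6, 0.8):
    fp, fn = rates_at_threshold(genuine, impostor, threshold)
    print(f"threshold {threshold}: false positives {fp:.0%}, false negatives {fn:.0%}")
```

Raising the threshold in this sketch removes false positives but starts to reject genuine matches, so the appropriate setting depends on which error the application tolerates better.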

 

3.5 Passive

 

Attrasoft Technology does not require changes to digital video files.

 

3.6 Codecs

 

While Attrasoft intends to support all common video codecs across a wide range of resolutions and bit rates, the current version of Attrasoft VideoFinder supports:

 

*.avi,

*.wmv, and

limited *.mpeg (*.mpg).

 

Attrasoft will adopt a phased approach to gradually support all common video codecs.

 

The unsupported files include:

 

*.mp4

*.mp3

*.mov

*.wma

*.flv

*.ac3

some *.mpg, *.mpeg
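
A pre-check based on the file lists above might look like the following Python sketch. It classifies a file by extension only, which is an assumption: the extension does not guarantee which codec is inside the file, and VideoFinder's own checks are not described here. The Windows Media Video extension is written as `.wmv`, and the names `support_status`, `SUPPORTED`, `LIMITED`, and `UNSUPPORTED` are hypothetical.

```python
from pathlib import Path

# Extension groups taken from the lists in this section.
SUPPORTED = {".avi", ".wmv"}
LIMITED = {".mpg", ".mpeg"}          # only some of these files are supported
UNSUPPORTED = {".mp4", ".mp3", ".mov", ".wma", ".flv", ".ac3"}

def support_status(filename):
    """Classify a file as 'supported', 'limited', 'unsupported', or 'unknown'."""
    extension = Path(filename).suffix.lower()
    if extension in SUPPORTED:
        return "supported"
    if extension in LIMITED:
        return "limited"
    if extension in UNSUPPORTED:
        return "unsupported"
    return "unknown"

for name in ("trailer.AVI", "game.mpg", "clip.flv", "notes.txt"):
    print(name, "->", support_status(name))
```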