Image Tampering Detection

Summary

Document tampering detection is the task of identifying whether a digital document image has been tampered with. Previously, we looked into various techniques used to identify document tampering.

In the last meeting, we surveyed various datasets to study the distribution of the tampered pixel area. We also looked into the threshold pixel area below which the detection methods fail completely. See the sections below for further details.

Work done between: [02/04/19 -> 16/04/19]

Tampered Area Analysis

Effect of decreasing patch area on detection

We studied how reducing the size of the tampered region affects tampering detection, and report the effect on the methods currently used to detect tampering.

In the bar plot above, we show the number of pixels below which each method starts to fail. If the area of the tampered patch is smaller than the threshold, that method fails to detect the tampering. For comparison, images generally contain between about 40,000 (200x200) and 1,048,576 (1024x1024) pixels.


1. In Yashas' architecture [1], the image is split into 64x64 patches, and each patch is checked for whether at least 20% of its area is tampered (20% of the 4096 pixels in a patch, i.e. about 820 pixels).
2. The JPEG artifacts method [2] fails when the tampered region is smaller than 1500 pixels.
3. CMFD [3] fails when the tampered region is smaller than 3000 pixels.
4. Splicebuster/Noiseprint [4] fail when the tampered region is smaller than 4096 pixels.
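
The thresholds listed above can be turned into a quick lookup. This is a minimal sketch, not code from any of the cited methods: the names FAILURE_THRESHOLDS and methods_likely_to_fail are hypothetical, and the pixel values are simply the ones reported in this list. For Yashas' architecture the value is treated as a lower bound, since a region that straddles several patches may need to be larger before any single patch reaches 20%.

```python
# Failure thresholds (in pixels) as reported in the list above.
# All names here are illustrative, not part of any cited implementation.
PATCH_SIDE = 64
MIN_TAMPERED_FRACTION = 0.20

FAILURE_THRESHOLDS = {
    # 20% of a 64x64 patch: 4096 * 0.20 = 819.2, rounded to 819 (about 820).
    "Yashas": round(PATCH_SIDE * PATCH_SIDE * MIN_TAMPERED_FRACTION),
    "JPEG artifacts": 1500,
    "CMFD": 3000,
    "Splicebuster/Noiseprint": 4096,
}


def methods_likely_to_fail(tampered_pixels):
    """Return the methods whose reported threshold exceeds the tampered area."""
    return [name for name, threshold in FAILURE_THRESHOLDS.items()
            if tampered_pixels < threshold]


if __name__ == "__main__":
    for area in (3, 1000, 2000, 5000):
        print(area, methods_likely_to_fail(area))
```

With these values, a 3-pixel region is below every threshold, while a 5000-pixel region is above all of them.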

Dataset Level Analysis
For each dataset below, we describe the dataset and show the distribution of patch sizes in it.

Below we show the distribution of patch sizes across datasets and whether they contain small tampered regions. The payslip dataset has the smallest tampered regions (as small as 3 pixels), followed by CASIA and the Find-it challenge (100 and 400 pixels respectively). The IEEE dataset has tampered regions larger than 1000 pixels.
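
One way to obtain such a distribution is to measure the area of each tampered region in a ground-truth mask. The sketch below assumes that a binary mask (1 = tampered, 0 = pristine) is available for each image, which is an assumption and not something stated for every dataset here. The function name region_areas is hypothetical, and 4-connectivity is an arbitrary choice.

```python
from collections import deque


def region_areas(mask):
    """Return the pixel area of each 4-connected tampered region.

    `mask` is a list of rows, each a list of 0 (pristine) or 1 (tampered).
    Areas are returned in the order the regions are met in a row-major scan.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    areas = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected region.
            queue = deque([(y, x)])
            seen[y][x] = True
            area = 0
            while queue:
                cy, cx = queue.popleft()
                area += 1
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


if __name__ == "__main__":
    toy_mask = [
        [1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 0, 1],
    ]
    print(region_areas(toy_mask))  # [3, 2]
```

Collecting these areas over all masks in a dataset gives the distribution, and taking the minimum gives the smallest tampered region.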



  • 1. IEEE

    Released in 2013. Contains 451 fake images and 1000 pristine images. Resolution 780x1024.

    Link to the dataset

  • 2. CASIA V2

    Contains 5000 fake images and 7000 pristine images. Resolution 256x400.

    Link to the dataset

  • 3. ICPR 2018 Find-it Challenge

    The dataset was released by Mickaël Coustaty, University of La Rochelle, France, as an ICPR challenge in 2018. Contains 200 fake images and 1800 pristine images. Image sizes vary, in HD resolution. Report on challenge

    Link to the dataset

  • 4. Payslip Dataset

    The dataset was released by Mickaël Coustaty, University of La Rochelle, France. It contains 3 types of tampering: Copy Paste Intra, Copy Paste Inter and Imitation.

    Link to the dataset

Work done between: [19/03/19 -> 26/03/19]

Results

The tables below compare the F1 score and accuracy of different methods across various datasets.

F1 Score                        ICPR Challenge   CASIA V2   IEEE Forensic
Yashas                          0.32             0.76       0.86
Fusion                          0.89             0.43       -
CMFD                            0.36             0.42       0.55
Splicebuster                    0.65             0.53       0.72
Fusion + CMFD + Splicebuster    0.954            0.67       0.87

Accuracy                        ICPR Challenge   CASIA V2   IEEE Forensic
Yashas                          0.84             0.75       0.87
Fusion                          0.88             0.67       -
CMFD                            0.84             0.69       0.80
Splicebuster                    0.803            0.57       0.86
Fusion + CMFD + Splicebuster    0.982            0.79       0.91
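
A method can score high on accuracy and low on F1 when fake images are rare, as in the ICPR challenge set (200 fake vs 1800 pristine). The sketch below shows how both metrics follow from confusion-matrix counts. The counts in the example are hypothetical and chosen only to illustrate the effect. They are not the counts behind the tables above.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all images classified correctly."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for the 'fake' class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts on a 200 fake / 1800 pristine split.
    tp, fn = 50, 150     # fake images: detected / missed
    fp, tn = 100, 1700   # pristine images: flagged / passed
    print(round(accuracy(tp, fp, fn, tn), 3))  # 0.875
    print(round(f1_score(tp, fp, fn), 3))      # 0.286
```

Because true negatives dominate the accuracy but do not enter the F1 score, the two tables should be read together.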

Work done between: [12/03/19 -> 19/03/19]

Completed experiments on the Findit dataset using Yashas' method.
Researched companies, labs, and people working in forensics. More on this can be found in the slides here




Work done between: [02/03/19 -> 12/03/19]

Thoroughly understood the concepts used in the ICPR challenge.




Work done between: [24/02/19 -> 12/03/19]

Created slides for the detailed research meeting.
Researched various techniques in document tampering detection and the available datasets. Researched print & scan based forgery, fake receipt templates and printer identification. More on this can be found in the slides here




Work done between: [17/02/19 -> 24/03/19]

Completed the poster and website for the R&D showcase. Link to website


Work done between: [01/01/19 -> 10/02/19]

Completed experiments on the Findit dataset using Yashas' method. Results can be found here

A detailed presentation of these methods on the Findit challenge can be found here: Link
Previous SRM meeting slides: Link
DRM meeting slides: Link