Python package for detecting copy-move attack on a digital image
This is a python package for detecting copy-move attack on a digital image.
This project is part of our paper that has been published at Springer. More detailed theories and steps are explained there.
To install the package, simply hit it with pip: pip3 install pimage
. Example script for using this package is also provided here.
The algorithm can be dynamically configured with Configuration
class. If omitted, the default value from both of the paper will be used.
The default value and description for each of the parameter is detailed on configuration.py.
from pimage.configuration import Configuration
conf = Configuration(
block_size=32,
nn=2,
nf=188,
nd=50,
p=(1.80, 1.80, 1.80, 0.0125, 0.0125, 0.0125, 0.0125),
t1=2.80,
t2=0.02
)
block_size
: The first algorithm use block size of 32
pixels so this package will use the same value by default. Increasing the size means faster run time at a reduced accuracy. Analogically, decreasing the size means longer run time with increased accuracy.The API for detection process is provided via copy_move.detect()
method. For example:
from pimage import copy_move
from pimage.configuration import Configuration
conf = Configuration(block_size=32)
fraud_list, ground_truth_image, result_image = copy_move.detect("dataset_example_blur.png", configuration=conf)
fraud_list
will be the list of (x_coordinate, y_coordinate)
of the blocks group and the total number of the blocks it is formed with. If this list is not empty, we can assume that the image is being tampered. For example, running the cattle dataset with 32 px of block size will result in:
((-57, -123), 2178)
((-11, 140), 2178)
((-280, 114), 2178)
((-34, -305), 2178)
((-37, 148), 2178)
the above output means there are 5 possible matched/identical region with 2178 overlapping blocks on each of it
ground_truth_image
contains the black and white ground truth of the detection result. This is useful for comparing accuracy, MSE, etc with the ground truth from the dataset
result_image
is the given image where the possible fraud region will be color-bordered (if any)
ground_truth_image
and result_image
will be formatted as numpy.ndarray
. It can further be processed as needed. For example, it can be programmatically modified and then exported later as image like so:
import imageio
imageio.imwrite("result_image.png", result_image)
imageio.imwrite("ground_truth_image.png", ground_truth_image)
To quickly run the detection command for your image, the copy_move.detect_and_export()
is also provided. The command is identical with .detect()
but it also save the result to desired output path.
from pimage import copy_move
copy_move.detect_and_export('dataset_example_blur.png', 'output')
this code will save the ground_truth_image
and result_image
inside output
folder.
When running copy_move.detect()
or copy_move.detect_and_export()
, you can pass verbose=True
to output
the status of each step. The default value will be False
so nothing will be printed.
Example output when verbose mode is being enabled:
Processing: dataset/multi_paste/cattle_gcs500_copy_rb5.png
Step 1 of 4: Object and variable initialization
Step 2 of 4: Computing characteristic features
100%|██████████| 609/609 [04:14<00:00, 2.39it/s]
Step 3 of 4:Pairing image blocks
100%|██████████| 241163/241163 [00:00<00:00, 816659.95it/s]
Step 4 of 4: Image reconstruction
Found pair(s) of possible fraud attack:
((-57, -123), 2178)
((-11, 140), 2178)
((-280, 114), 2178)
((-34, -305), 2178)
((-37, 148), 2178)
Computing time : 254.81 second
Sorting time : 0.89 second
Analyzing time : 0.3 second
Image creation : 1.4 second
Total time : 0:04:17 second
The implementation generally manipulates overlapping blocks, and are constructed based on two algorithms:
We know that the first algorithm use coordinate
and principal_component
features, while the second algorithm use coordinate
and seven_features
.
Knowing that, we then attempt to give a tolerance by merging all the features like so:
The attributes are saved as one object. A lexicographical sorting is then applied to the principal component and the seven features.
The principal component will bring similar block closer, while the seven features will back up the detection for a block that can't be detected by principal component due to being applied with post region duplication process (for example being blurred).
By doing so, the new algorithm will have a tolerance regarding variety of the input image. The detection result will be relatively smooth and accurate for any type of image, with a trade-off in run time as we basically run two algorithm.
All the result of the dataset should be inside output
directory of this repository.
The image shown is ordered as: original, attacked, and the resulting detection image.
The project is formerly written with Python 2 for our Undergraduate Thesis, which is now left unmaintained here. The original thesis is written in Indonesian that in any case can also be downloaded from here.