An R-CNN machine learning model for handling pop-up windows in mobile apps.
See also Vision-ui, a series of algorithms for mobile UI testing.
An R-CNN (Region-based Convolutional Neural Network) machine learning model for handling pop-up windows in mobile apps.
Vision-ml is a machine learning model that identifies the UI element that closes a pop-up window and returns its (x, y) coordinates on the screen.
A typical usage scenario would be:
In mobile testing, when using Appium or a similar framework for UI automation, it is usually tricky to locate components on a pop-up window rendered on top of the current screen.
Input a mobile app screenshot with the pop-up, and you will get the predicted result (shown as the blue box).
Python 3.6.x
# create venv before install requirements
pip install -r requirements.txt
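For example, an isolated environment can be created with Python's standard "venv" module before installing the requirements (the ".venv" directory name is just a common convention, not mandated by the repo):

```shell
# create an isolated environment (Python 3.6.x) and activate it
python3 -m venv .venv
source .venv/bin/activate
# then install the project's dependencies:
# pip install -r requirements.txt
```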
You can use Vision with the pre-trained model in "model/trained_model_1.h5". The number in the file name is for version control; you can update it in "all_config.py".
There are two ways of using Vision. Call the prediction function from Python:
model_predict("path/to/image.png", view=True)
Or run the prediction script directly:
python rcnn_predict.py
You can create a server with the Dockerfile:
python vision_server.py
curl http://localhost:9092/client/vision -F "file=@${IMAGE_PATH}.png"
{
"code": 0,
"data": {
"position": [
618,
1763
],
"score": 1.0
}
}
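As a sketch (not part of the repo), the JSON response above can be unpacked in Python like this; the "parse_vision_response" helper name is my own:

```python
def parse_vision_response(resp):
    """Extract the (x, y) tap position and confidence score from a
    Vision server response dict; raise if the prediction failed."""
    if resp.get("code") != 0:
        raise ValueError("prediction failed, code=%s" % resp.get("code"))
    data = resp["data"]
    x, y = data["position"]
    return (x, y), data["score"]

# the example response shown above
resp = {"code": 0, "data": {"position": [618, 1763], "score": 1.0}}
(x, y), score = parse_vision_response(resp)
print(x, y, score)  # 618 1763 1.0
```

The returned (x, y) pair can then be passed to your automation framework's tap action.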
You can train on your own images if your close button has different features from the given training set. Just take a screenshot and put it in the "image/" folder.
Rename the training images with prefix "1" for close buttons and "0" for background.
You can refer to the given training images in the repo for examples.
Button image named 1_1.png:
Background image named 0_3.png:
0_0.png 0_1.png 0_2.png 0_3.png 0_4.png 0_5.png 0_6.png 1_0.png 1_1.png 1_2.png 1_3.png 1_4.png 1_5.png 1_6.png
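A small helper (my own sketch, not shipped with the repo) can apply this prefix-and-index naming convention automatically; "add_training_image" and its parameters are hypothetical names:

```python
import os
import shutil

def add_training_image(src_path, image_dir, label):
    """Copy a screenshot into image_dir, renamed to the
    '<label>_<index>.png' convention used by the training set."""
    assert label in (0, 1), "1 = close button, 0 = background"
    existing = [f for f in os.listdir(image_dir)
                if f.startswith("%d_" % label) and f.endswith(".png")]
    name = "%d_%d.png" % (label, len(existing))
    shutil.copy(src_path, os.path.join(image_dir, name))
    return name
```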
Image().get_augmentation()
train_model()
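The repo's "Image().get_augmentation()" implementation is not shown here; purely as an illustration, simple augmentations such as flips and small shifts can be produced with plain NumPy:

```python
import numpy as np

def augment(img):
    """Yield simple variants of a grayscale image array:
    the original, a horizontal flip, and 2-pixel shifts."""
    yield img
    yield img[:, ::-1]            # horizontal flip
    for shift in (-2, 2):         # shift left/right by 2 px
        yield np.roll(img, shift, axis=1)

img = np.arange(16, dtype=np.float32).reshape(4, 4)
variants = list(augment(img))
print(len(variants))  # 4
```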
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_1 (Conv2D) (None, 48, 48, 32) 320
_________________________________________________________________
conv2d_2 (Conv2D) (None, 46, 46, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 23, 23, 32) 0
_________________________________________________________________
conv2d_3 (Conv2D) (None, 21, 21, 64) 18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 10, 10, 64) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 8, 8, 64) 36928
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 4, 4, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 1024) 0
_________________________________________________________________
dense_1 (Dense) (None, 128) 131200
_________________________________________________________________
dense_2 (Dense) (None, 2) 258
=================================================================
Total params: 196,450
Trainable params: 196,450
Non-trainable params: 0
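The parameter counts in the summary can be checked by hand: a Conv2D layer with a k×k kernel has (k·k·c_in + 1)·c_out parameters (weights plus one bias per filter), and a Dense layer has (n_in + 1)·n_out. This sketch (assuming 3×3 kernels and a single-channel input, consistent with the output shapes above) reproduces the totals:

```python
def conv2d_params(k, c_in, c_out):
    # k*k*c_in weights per filter, plus one bias per filter
    return (k * k * c_in + 1) * c_out

def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return (n_in + 1) * n_out

layers = [
    conv2d_params(3, 1, 32),        # conv2d_1 -> 320
    conv2d_params(3, 32, 32),       # conv2d_2 -> 9248
    conv2d_params(3, 32, 64),       # conv2d_3 -> 18496
    conv2d_params(3, 64, 64),       # conv2d_4 -> 36928
    dense_params(4 * 4 * 64, 128),  # dense_1  -> 131200
    dense_params(128, 2),           # dense_2  -> 258
]
print(sum(layers))  # 196450, matching "Total params: 196,450"
```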
The training parameters batch_size and epochs are defined in all_config.py.
With a desktop-class CPU:
The R-CNN model is based on this paper.