AutoRef implementation: Difference between revisions

Revision as of 16:55, 31 March 2021

This page describes the implementation of the proposed design of AutoRef, an autonomous referee for RoboCup Middle Size League (MSL) robot soccer.

In 2021, the contributions by MSD 2020 focused on detecting ball-to-player distance violations.

Introduction

Objective statement

The main objective of the implementation part of the project was to detect violations of the rules related to the distance between the ball and players during the following game states:

Free kick
Kick-off
Corner kick
Goal kick
Throw-in
Dropped-ball
Penalty kick

These rules are described in Laws number 8, 10, 13, 14, and 15 of the MSL Rulebook.

Motivation

This objective was chosen for several reasons:

- An analysis of past projects showed that this functionality had never been designed before

- Interviews with stakeholders (the MSL referees) led to the conclusion that rules of this kind are hard for a human referee to enforce

- A proof of concept for the developed functional specification was desirable

- The learning goals of the team members correspond to the technical solutions needed to develop the functionality

Scope of work

The following topics were included in the implementation scope:

Requirements formulation

Architectural decomposition development

Individual code blocks development

Individual code integration

Software testing on images and videos

These topics are explained in detail in the following sections.

Process model

Introduction

The development activities of the design team need support from process models. In this project, the V-model was chosen to guide the development procedure from requirements engineering to system validation. Because of the particular nature of the project, some details of the model were changed. In addition, an Agile approach was used during system development and combined with the traditional V-model, which made the project progress more flexible and efficient.

Use of V-model

The V-model has the following advantages for the development of the project:

- The design team's project is based on machine vision and software algorithms. The V-model was first proposed for software development and has matured in that field.

- The project team has five members, all of whom can participate in the development of subsystems, and the subsystems can be developed in parallel. Once the system architecture has been determined, the V-model can greatly improve development efficiency.

- System development started in the fourth week, which means that the team needed to complete it within five weeks. A mature, ready-made V-model process saves a lot of time that would otherwise be spent on project management.

Figure: (left) V-model used to guide the project process of the design team; (right) test plan derived from the V-model.

Based on the general V-model, a detailed test plan has been made for the verification of the system both from functional and performance perspectives.

Based on the requirements derivation results, functional and performance requirements were set up and related tests were planned as shown in the figure above. In this plan, the details of the V-model are supplemented, and more detailed test steps and iterations are added in the test and verification phases. The technical blocks were integrated in the first phase; then several images of typical use cases were created in the simulation environment (refer to Section 3) to verify the functional requirements. Videos were created to test the performance of the system in particular scenarios (refer to Section 4). The code was updated iteratively after several rounds of testing. After the code was verified, a simulated game video was created in the simulation environment to illustrate how the system works in the 'real' world.

Agile approach

Due to the limited project time and various uncertainties in the system development process, an Agile approach was applied, which is mainly reflected in the system architecture and the design choices.

There are two main difficulties in the development of this project:

- How to implement an efficient and fast detection algorithm?

- How to achieve an accurate image capture in reality?

Usually, the algorithm is executed after the system obtains the image. However, the design of the vision system involves the following two considerations, which greatly increase the complexity of the system design:

- Fixed camera OR moving camera

- One camera OR multiple cameras

After careful evaluation, we concluded that designing the vision system first would greatly reduce the time available for algorithm development, which was not the result we wanted. Therefore, we settled on an Agile development scheme: a single fixed camera is used initially, and a simulated game situation is created in the software simulation environment as a reference sample for algorithm development. The vision system can be optimized after the algorithm has been developed.

The main idea is to design the algorithm quickly and check its performance, which is in line with the fast decision-making promoted by the Agile approach.

Major design choices

Programming language

MATLAB was chosen as the programming language due to the availability of in-built functions and documentation, which is useful in a quick test of a proof-of-concept of the implemented algorithm.

Selection of test environment

It was decided to use a simulation environment to quickly test and verify the functionality of the ball-player distance violation check algorithm. It was desirable to verify the implementation in a simulation environment before committing to a specific choice of hardware. Other factors leading to the choice of a simulation environment included time limitations, limited access to the Tech United playing field (COVID-19 restrictions), and the limited availability of match video footage with the desired qualities (RGB top view).

A custom simulation environment was built using MATLAB, Simulink 3D Animation, and the Virtual Reality Modelling Language (VRML). Another option was to use GreenField, the visualization software used by Tech United to replay recorded match data. This option was rejected because the team was less familiar with Linux systems and because of a potential dependency on Tech United developers, both of which could lead to a time-consuming learning curve.

Vision system parameters

The main design choices to be made were regarding the selection of the vision system, involving the following aspects:

Mobility of the camera(s) (static vs moving)
Location(s) of the camera(s)
Camera resolution

Motivations for choosing a static camera were as follows:

Simplified localization requirements
Simplified implementation architecture
Less risk of invasiveness (spatial and audio)

Motivations for selecting a top camera view were as follows:

Using a top-view camera, projection errors can be minimized, which makes the implementation easier
Having a movable top view camera that stays right above the ball would help avoid perspective distortions to a greater extent, but:

We would need to account for localization uncertainty in the case of drones
Multiple cameras might still be needed to detect events that are not in the vicinity of the ball

It would be possible to view the entire field using a camera of sufficient height or field of view

Parameters to consider when using a top-view camera:

Mounting height
Field of View
Resolution or Image size

Parameters to consider for the ball-player distance violation decision-making algorithm:

Pixel to meter ratio or Resolution
Frequency rate—The frequency rate required for the decision-making algorithm is defined based on

The pixel-meter ratio,
Worst-case scenarios defined considering robot dimensions, speed, and acceleration limits

Maximum allowed perspective distortion

When using a single top camera, perspective distortions are unavoidable
High distortions can affect the visibility of the ball, separation of team players, and also lead to incorrect position estimate of the players.

Possible issues of using a single static top camera

Perspective distortions

Objects that are not directly below the camera would be seen at an oblique angle

Occlusions

The ball may not be completely visible and be occluded by players
Players that are too close to each other may not be detected separately using a simple camera

Limitations on accuracy

Placing the camera very high above the field, in such a way that the entire field is visible, could lead to objects being very small in the images, and affect the accuracy of detections
Using a higher-resolution camera would improve the accuracy of detections on the field

Final decision:

Considering time limitations, and the implementation complexity related to the use of multiple cameras and moving cameras, a decision was made to test the ball-player distance violation algorithm for a single top camera concept. The camera was considered to be at a height of 12 m above the center of the field and facing downwards. A choice was also made to keep the field of view to a manageable value of 1.2 radians, such that perspective distortions are minimized. Meanwhile, to have a manageable resolution, the images taken were considered to be Full HD (1920 x 1080) RGB images.
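To make the relationship between mounting height, field of view and pixel-to-meter ratio concrete, here is a minimal sketch. It is written in Python for illustration only (the project code is MATLAB) and assumes an ideal downward-facing pinhole camera without lens distortion. The function name `ground_footprint` is hypothetical, and it is an assumption that the 1.2 rad field of view spans the shorter image side; the text does not state which axis it applies to.

```python
import math

def ground_footprint(height_m, fov_rad, image_px=(1920, 1080)):
    """Ground area and pixel density seen by an ideal downward pinhole camera.

    Assumption: fov_rad spans the shorter image side. Returns
    (long_side_m, short_side_m, pixels_per_meter).
    """
    long_px, short_px = max(image_px), min(image_px)
    # Ground length covered along the shorter image side.
    short_m = 2.0 * height_m * math.tan(fov_rad / 2.0)
    pixels_per_meter = short_px / short_m
    long_m = long_px / pixels_per_meter
    return long_m, short_m, pixels_per_meter

# Values from the final decision: 12 m height, 1.2 rad, Full HD.
long_m, short_m, ppm = ground_footprint(12.0, 1.2)
print(f"footprint: {long_m:.2f} m x {short_m:.2f} m at {ppm:.1f} px/m")
print(f"a 5 cm tolerance spans about {0.05 * ppm:.1f} px")
```

Under these assumptions the sketch also shows how many pixels the 5 cm accuracy tolerance corresponds to, which is the quantity that limits how finely a violation can be resolved.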

Requirements

Functional requirements

Formal formulation:

The first and foremost sub-requirement is that the system must be modular. Other sub-requirements are as follows:

The system must detect the players and the ball inside the soccer field boundaries and identify the players’ team.
The system must detect the different zones inside the soccer field (corner area, penalty area, etc.)

And the main requirements are:

The system must check that the distance between the center of the ball and any part of the attacking team's members (except for the kicker) before a free kick, corner kick, kick-off, goal kick, and throw-in is not less than 2 m (with an acceptable inaccuracy of 5 cm). One of the robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 2 m.

The system must check that the distance between the center of the ball and any part of the defending team's members before a free kick, corner kick, kick-off, goal kick, and throw-in is not less than 3 m (with an acceptable inaccuracy of 5 cm). One of the defender robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 3 m.

The system must check that the distance between the center of the ball and any part of any player before a dropped ball is not less than 1 m (with an acceptable inaccuracy of 5 cm). One of the robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 1 m.

Why are the requirements formulated with these numbers?

The required distances between the center of the ball and the players before certain events are explicitly specified in the rulebook.

The players' length and width are 50 cm; 10% of the players' size (5 cm) was taken as the acceptable accuracy tolerance for the system.

How the requirements were formulated:

Ball-player zone violation detection is a task that enforces several laws. To derive functional requirements for this task, the first step was to identify the skills needed to implement it.

The skills are as follows:

ball detection (currently known as ball identify)
player detection (updated: player identify)
player team identification (updated: player team classify)
zone detection (updated: field area classify)

The second step was to identify the laws related to each skill. In this way, the other tasks that are part of those laws and rely on the same skill could be identified. This step is important because, when working out the details of a skill, one must keep in mind that the skill will be used in those other tasks in the future, so the implementation must be as general and adaptable as possible. Specifying requirements in this way is necessary to keep the system modular.

The identified laws are as follows:

Law number 8 (kick-off)
Law number 10 (the method of scoring)
Law number 13 (free kick)
Law number 14 (penalty kick)
Law number 15 (throw-in kick)

For example, the identified tasks for these laws that are related to the ball detection skill are:

Measure the ball’s velocity, for checking if it is stationary (from law number 13)
Detect which player has taken the ball (from law number 10)
Measure how far the ball has rolled before it is touched by other players (from law number 8)

For another example, the identified tasks for player detection skill are:

Control players’ location
Count the number of players
Detect the kicking player
Detect other players

And the process for other skills is the same.

The third step is to come up with the actual requirements and analyze them.

Performance requirements

Frequency

Formal formulation

The system must be able to realize the functional requirements (based on the system accuracy) at least once every 89 ms in order to avoid false-negative detections, which means it should have a detection frequency of at least 11.22 Hz.

Why ‘frequency’

Most vision systems run at 30 to 60 frames per second (FPS). However, the algorithm may run more slowly, so delays or dropped frames may occur during processing. To avoid losing information, a frequency requirement is needed that constrains the minimum speed of the system.

The frequency requirement is set based on the accuracy requirement formulated in Section 2.2, which means the system should be fast enough to obtain the necessary information at an accuracy of 0.02 m. In other words, the accuracy influences the time a player needs to escape from the area of interest (AOI). If the tolerance is set too small, the time a player needs to escape is shorter, so it is harder for the system to capture the second image that shows the player's violation.

Only false negatives (missed violations) are considered in the ‘frequency’ requirement, because delayed detection is acceptable in refereeing systems. If a violation has been captured, the foul still counts even if the player is no longer violating by the time the system reports it. Therefore, the focus of this requirement is on obtaining enough information, because insufficient information is more likely to lead to false-negative detections.

How these requirements are formulated

These requirements are derived from two extreme scenarios. Assuming a player is captured right at the border of the circular area with a 1-meter radius, the system should be fast enough to capture a second image of the player if it enters the circular area by more than 0.02 m (the tolerance). If the player enters the area by less than 0.02 m, the system is not required to detect the violation.

Scenario 1: it is common for a player to cut straight across the area, as shown above. In this case, the tolerance means that if the player cuts into the area by more than 0.02 m, it should be detected. If it cuts in by less than 0.02 m, the time that the player stays inside the area is shorter (imagine the player grazing the area with no tolerance: the time is infinitely short).

Because of the geometry of circles, when the two circles overlap slightly, the highest point of the small circle along the vertical axis lies above the line through the intersection points (shown by red line 'b' in the second figure). This means that for a tolerance of 0.02 m, the distance between that line and the lowest point of the big circle is less than 0.02 m (shown by red line 'a' in the second figure). Thus, the lengths of lines 'a' and 'b' should add up to 0.02 m.

Based on triangle similarity, the length of 'a' can be calculated as below.

a + b = 0.02 m

a = R - L, where R is the radius of the big circle and L is the vertical distance from its centre to the blue line

b = r - 0.25 L, where r is the radius of the small circle and 0.25 L is the vertical distance from its centre to the blue line

Combining these equations (with R = 1 m and r = 0.25 m, so that b = 0.25 a) gives:

1.25 * a = 0.02 m

a = 0.016 m

From the length of a, L = R - a = 0.984 m, and the distance between the two player positions in figure 1 is:

distance = 2 * (1.25 * sqrt(1 - L^2)) = 0.4455 m

time = distance / max_v = 0.4455 m / (5 m/s) = 0.0891 s

frequency = 1 / time = 11.22 Hz
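The scenario 1 derivation can be checked numerically with a short sketch. It is written in Python for illustration only (the project code is MATLAB); the function name is hypothetical, and R = 1 m and r = 0.25 m are taken from the 1-meter area radius and the 50 cm player size used in the text.

```python
import math

def scenario1_frequency(R=1.0, r=0.25, tolerance=0.02, v_max=5.0):
    """Capture frequency (Hz) needed for a player cutting straight across the area.

    R: radius of the area of interest (m), r: player radius (m),
    tolerance: overlap depth that must be detected (m), v_max: top speed (m/s).
    """
    # a + b = tolerance and, by similar triangles, b = (r / R) * a.
    a = tolerance / (1.0 + r / R)
    # Distance from the ball centre to the line through the intersection points.
    L = R - a
    # Path length of the player centre while the player overlaps the area.
    distance = 2.0 * (R + r) * math.sqrt(1.0 - (L / R) ** 2)
    time = distance / v_max
    return a, distance, time, 1.0 / time

a, distance, time, frequency = scenario1_frequency()
print(f"a = {a:.3f} m, distance = {distance:.4f} m, "
      f"time = {time:.4f} s, frequency = {frequency:.2f} Hz")
```

Small differences in the last digit compared with the hand calculation are due to rounding of intermediate values.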

Scenario 2: a player enters the area of interest at maximum speed, stops, and then accelerates to leave. Assume that a player who wants to cheat and push the system to its performance limit enters the area and stops at the tolerance boundary (0.02 m inside). It then needs to accelerate to escape the area before the second capture. Thus, the time between the first and the second capture consists of two phases: deceleration and acceleration. We assume that braking is instantaneous (infinite deceleration), so the time of the first phase is based on the maximum velocity (5 m/s).

Time used for deceleration is:

time_1 = 0.02 m / max_v = 0.004 s

Time used for acceleration to escape can be calculated as below:

distance (0.02 m) = initial velocity (0 m/s) * time_2 + 1/2 * acceleration (2 m/s^2) * time_2^2

time_2 = sqrt(0.02) s = 0.1414 s

Thus, the total time can be calculated by adding them:

time = 0.1454 s

The frequency required to capture the second image is:

frequency = 1 / time = 6.88 Hz

Scenario 1 is stricter than scenario 2. However, to ensure the accuracy and reliability of the system test results, both scenarios were tested. Please refer to Section 6 for details.
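The two phases of scenario 2 can be evaluated directly from the stated tolerance (0.02 m), maximum velocity (5 m/s) and acceleration (2 m/s^2). The sketch below is in Python for illustration only, and the function name is hypothetical.

```python
import math

def scenario2_frequency(tolerance=0.02, v_max=5.0, accel=2.0):
    """Capture frequency (Hz) needed for a player that stops inside and escapes."""
    # Phase 1: cover the tolerance depth at top speed (instantaneous braking assumed).
    t_in = tolerance / v_max
    # Phase 2: accelerate from rest over the same depth, d = 0.5 * accel * t^2.
    t_out = math.sqrt(2.0 * tolerance / accel)
    total = t_in + t_out
    return t_in, t_out, total, 1.0 / total

t_in, t_out, total, frequency = scenario2_frequency()
print(f"t_in = {t_in:.4f} s, t_out = {t_out:.4f} s, "
      f"total = {total:.4f} s, frequency = {frequency:.2f} Hz")
```

The resulting frequency is lower than the one required by scenario 1, which is why scenario 1 determines the formal requirement.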

Accuracy

Other context information

Colour detection requirement

Minimal distortion requirement

Architecture decomposition

The software architecture used for the implementation of the distance violation checking algorithm is shown in the figure below.

The algorithm is applicable during the following game situations: kick-off, free-kick, throw-in, goal-kick, corner-kick, drop-ball, and penalty-kick. It applies once the ball has been placed at the desired location by the referee and before the transition to the ‘ball-in-play’ situation. During these situations, players from the attacking and defending teams must keep a minimum distance from the ball. These distance requirements vary depending on the state of the game. For example, during kick-offs, throw-ins, free-kicks, corners and goal-kicks, the attacking team's players other than the kicker should be at least 2 meters away from the ball center, while during an in-game penalty kick, all players other than the kicker should be at least 3 meters away from the ball center. The scope of the implemented block can be visualized more clearly by referring to the state diagram for throw-in enforcement (refer to the diagram in the wiki). The block is applicable in

the throw-in positioning state, when the ball has been placed at the required location, and
the throw-in start state, when the players have positioned themselves correctly with respect to the thrower.
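The distance rules stated on this page can be collected in a small lookup table that maps a game state to the radii of the areas of interest. This is a hedged Python sketch for illustration only (the project code is MATLAB); the names `MIN_DISTANCE_M` and `aoi_radii` and the game-state strings are hypothetical.

```python
# Minimum ball-player distances in metres as (attacking team, defending team),
# as stated in the requirements and in the paragraph above.
SET_PIECES = ("kick-off", "free kick", "corner kick", "goal kick", "throw-in")
MIN_DISTANCE_M = {state: (2.0, 3.0) for state in SET_PIECES}
MIN_DISTANCE_M["dropped-ball"] = (1.0, 1.0)   # same distance for all players
MIN_DISTANCE_M["penalty kick"] = (3.0, 3.0)   # all players except the kicker

def aoi_radii(game_state):
    """Return (attacker_radius_m, defender_radius_m) for a game state."""
    try:
        return MIN_DISTANCE_M[game_state]
    except KeyError:
        raise ValueError(f"no distance rule for game state {game_state!r}") from None

print(aoi_radii("throw-in"))
```

Keeping the rules in one table means that the area-of-interest and decision-making blocks only need the game state as input, which fits the modularity requirement.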

Inputs

Game state: the algorithm requires information on the state of the game, which determines the radii of the areas of interest.
Attacking team ID: The team which takes the kick is the attacking team, and the algorithm requires information on whether team A (left-side) or team B (right-side) takes the kick. In the case of dropped ball however, no ID is needed, as neither team gets to take the ball first.
A 2-D image identifying penalty and goal areas on the field, obtained from a calibration image (RGB, Full HD) of the empty field.
Ball color values (RGB) and tolerance on the values.
Player marker color values (RGB) and the tolerance on the values.
Current camera image (RGB, Full HD).

Sub-blocks

The sub-blocks are briefly discussed below; further details are given in the subsequent sections.

Pre-processing blocks

Zone of field function and get penalty zones: these separate out the penalty and goal areas of the football field, given an input calibration image, and estimate the meters-to-pixels ratio.

Internal blocks

Blocks used only for the first iteration of ball-player-distance-violation check algorithm, given the reference location of the ball.

Area of interest function

Blocks used in each iteration of the ball-player distance violation check algorithm

Ball detection algorithm
Player detection algorithm
Player classification algorithm
Decision-making algorithm

Explanation of individual code blocks

Zone of field detection

Ball detection

For detecting violations, it is important to know the center of the ball. Hence, the objective of the ball detection function (function name from the code) is to find the ball center after detecting the ball using image processing techniques. As explained below, the ball is detected using color-based and shape-based detection algorithms.

Algorithm

1. Masking the image using a color filter: The filtering operation isolates the ball pixels from the background pixels. Each pixel value is checked against a minimum and a maximum filter value for each channel of the RGB image matrix (test_image). It is assumed that these filter values are provided as an input to the software system, depending on the ball color used in a match or tournament. For the current work, the ball filter values were estimated using the Color Thresholder app in MATLAB. The required inputs are ball_color and tolerance, and the output is a logical matrix.
2. Performing morphological operations on the filtered image: The filtered image from step 1 may contain pixels other than ball pixels; in other words, it may contain noise. Morphological operations such as closing and filling were used to remove this noise. A disk-shaped structuring element was used for these operations, because the object of interest, the ball, always appears circular. For the current scope, with a top camera at a fixed height, the radius of the disk element was fixed to a value of 1.
3. Cropping the filtered image: In the next step, circular object detection algorithms can be used to find the ball in the image. However, it was found that these detection algorithms run faster on smaller input images. Therefore, the logical image matrix output from step 2 is cropped in this step.
4. Finding the ball center: Finally, the center of the ball is found using the ‘imfindcircles’ function, a built-in MATLAB function that uses the Circular Hough Transform (CHT) to find circular objects within an image.
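The color-masking idea can be illustrated with a minimal, dependency-free Python sketch (the project code is MATLAB). It is not the team's implementation: the morphological clean-up and cropping steps are omitted, and the centroid of the mask is used as a simple stand-in for the Circular Hough Transform. The function names are hypothetical, and it is an assumption that the tolerance is a single value applied symmetrically to every channel.

```python
def color_mask(image, color, tolerance):
    """Boolean mask of pixels whose R, G and B all lie within tolerance of color."""
    return [[all(abs(c - t) <= tolerance for c, t in zip(pixel, color))
             for pixel in row] for row in image]

def mask_centroid(mask):
    """(row, col) centroid of the True pixels, or None if the mask is empty."""
    points = [(r, c) for r, row in enumerate(mask)
              for c, hit in enumerate(row) if hit]
    if not points:
        return None
    n = len(points)
    return (sum(r for r, _ in points) / n, sum(c for _, c in points) / n)

# Tiny synthetic frame: green field with a 3x3 orange "ball" centred at (4, 6).
GREEN, ORANGE = (30, 140, 40), (255, 128, 0)
frame = [[GREEN] * 10 for _ in range(8)]
for r in range(3, 6):
    for c in range(5, 8):
        frame[r][c] = ORANGE

center = mask_centroid(color_mask(frame, ORANGE, tolerance=30))
print(center)
```

A centroid is only reliable when the mask contains nothing but the ball, which is one reason a shape-based detector such as the CHT is the more robust choice on real images.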

Inputs

test_image – 3-channel RGB matrix of the image frame of the game video
ball_color – a 3-element array input for the color filter operation in step 1
tolerance – input for the color filter operation in step 1

Outputs

centers – a 2-element array containing the ball center in pixel coordinates

General remarks

For the current scope of work, the ball detection algorithm has worked well. However, the following points could be investigated further to optimize the algorithm.

The color mask filtering could be made robust enough to handle a wider range of ball colors. For the current scope, with an orange ball, the RGB color space was selected in the Color Thresholder app, partly because converting the image to another color space would add processing time. If the scope is extended to other ball colors, a more robust method for estimating the filter values could be investigated.
The structuring element used in the morphological operations may not be robust if the camera is mobile. Currently, the structuring element values are fixed for a particular camera height, so this was not a concern within the current scope.
The current ball detection algorithm uses the CHT for circular object detection. Machine learning algorithms are another possible implementation approach to investigate. However, a likely shortcoming of machine learning algorithms is an increase in processing time.

Player detection

Area of interest

The area of interest (AOI) is the area in which a violation may happen. This distinguishes the scheme from a distance-based detection scheme. The reason for using an AOI to detect violating players, rather than calculating the distance between each player and the ball, is as follows:

According to the rulebook, no part of a robot may be within a certain distance of the ball. Since robots may have various shapes, it is difficult to detect the point on a complex-shaped boundary that is closest to the ball.

Algorithm

Create a zero matrix with the same size as the image. First, an RGB image is input into the block and binarized into a 2D matrix. Then the numbers of rows and columns are measured and used to create a new 2D matrix of zeros.
Specify different AOIs based on game states. Since the AOI radii differ between laws, the radii of the two circles are specified as inputs of the block, and the meter-pixel ratio (MPratio) is used to calculate the radii in pixels for inserting the circles in the following step.
Insert circles and create the output matrix. First, a filled red circle is inserted in the first blank image and a green circle is inserted in the second blank image. Then an AOI matrix with two layers is created by comparing the pixel values of the red layer of the first image and of the green layer of the second image with 0.5 (if true, the pixel value becomes 1). The result is a 3D matrix with two layers that serves as the AOI for further processing. In both layers, the pixels inside the circles are labelled with 1.
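The result of these steps, two binary layers with filled circles around the ball, can be sketched as follows. This Python sketch is for illustration only (the project code is MATLAB): it fills the circles directly by testing each pixel's distance to the ball center instead of drawing colored circles and thresholding them. The function name is hypothetical, and it is an assumption that MPratio is expressed in pixels per meter.

```python
def area_of_interest(shape, center_ball, rad_meter_1, rad_meter_2, pixels_per_meter):
    """Return two binary layers (inner, outer) with filled circles around the ball.

    shape: (rows, cols) of the image; center_ball: (row, col) in pixels.
    """
    rows, cols = shape
    cy, cx = center_ball
    layers = []
    for radius_m in (rad_meter_1, rad_meter_2):
        radius_px = radius_m * pixels_per_meter
        # A pixel belongs to the circle if it lies within radius_px of the centre.
        layers.append([[1 if (y - cy) ** 2 + (x - cx) ** 2 <= radius_px ** 2 else 0
                        for x in range(cols)] for y in range(rows)])
    return layers

# Small example: 2 m and 3 m circles at 5 px/m around a ball at (20, 30).
inner, outer = area_of_interest((40, 60), (20, 30), 2.0, 3.0, pixels_per_meter=5)
print(sum(map(sum, inner)), sum(map(sum, outer)))
```

Because the layers are plain binary masks, any part of a player that overlaps a circle can later be found by element-wise comparison, regardless of the player's shape.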

Inputs

image_frame - 3 channel RGB matrix of the image frame of the game video
center_ball – The matrix coordinate of the center of the detected ball
rad_meter_1 – The radius of the inner circle for the given game state
rad_meter_2 – The radius of the outer circle for the given game state
MPratio – The ratio showing the number of pixels used to represent the distance in the real world.

Outputs

AOI – a 3D matrix showing the information of the inner circle and outer circle in separate layers.

Player classification

Decision making function

The objective of the decision-making function is to determine the number of players from teams A and B that violate the ball-player distance requirement. It takes into account the location of the penalty zones of the field, as well as information on which team is attacking (not applicable for a drop-ball situation). There is a risk of false positives, that is, players identified as violating the distance requirement when they are not, because ground projections are distorted when viewed from a top-central camera. For this reason, a confidence value is also given for each violation, along with the centroid of the region where the player overlaps with the area of interest.

Algorithm

The algorithm first distinguishes three situations: the first in which team A attacks, the second in which team B attacks, and the third, which is the drop-ball situation. In each situation, the following major steps are taken:

Identification of players violating the corresponding areas of interest by multiplication of the player detection matrices containing team information with the area of interest matrices. This generates the checkA and checkB matrices for teams A and B respectively.
Performing the regionprops operation on the check matrices to segment the violating players from each team.
For the defending team, identification of violating players present in their own penalty zone by multiplying the field zones matrix with the corresponding check matrix. One player is allowed to be within the minimum allowed distance while in the penalty zone.
Calculation of a confidence value for each player found to violate the area of interest, based on the ratio between the player's total projected area and its projected area within the area of interest.
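The overlap test and the confidence value can be illustrated with a small Python sketch that represents each player and each area as a set of pixel coordinates (the project code is MATLAB and works on matrices). All names are hypothetical. It is an assumption that the confidence is the overlapping area divided by the player's total projected area, and that the penalty-zone exemption applies to the first player found entirely inside the zone.

```python
def check_team(players, aoi, penalty_zone=frozenset(), min_confidence=0.0):
    """Report violators of one team.

    players: {player_id: set of (row, col) pixels}; aoi and penalty_zone: sets
    of (row, col) pixels. Returns {player_id: {"confidence", "centroid"}}.
    """
    report = {}
    exemption_used = False
    for player_id, pixels in sorted(players.items()):
        overlap = pixels & aoi
        if not overlap:
            continue
        confidence = len(overlap) / len(pixels)
        if confidence < min_confidence:
            continue  # treated as a likely projection artefact
        # One player may stay inside its own penalty area.
        if not exemption_used and pixels <= penalty_zone:
            exemption_used = True
            continue
        n = len(overlap)
        centroid = (sum(r for r, _ in overlap) / n, sum(c for _, c in overlap) / n)
        report[player_id] = {"confidence": confidence, "centroid": centroid}
    return report

aoi = {(r, c) for r in range(10) for c in range(10)}
players = {
    1: {(r, c) for r in range(8, 12) for c in range(0, 4)},     # half inside
    2: {(r, c) for r in range(20, 24) for c in range(20, 24)},  # fully outside
    3: {(r, c) for r in range(0, 4) for c in range(9, 13)},     # a quarter inside
}
report = check_team(players, aoi, min_confidence=0.3)
print(report)
```

With the threshold of 0.3 used in this example, only the player that is half inside the area is reported; lowering the threshold makes the check stricter.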

Inputs

Single-channel binary matrices of the areas of interest for the attacking and defending teams respectively
A single-channel matrix, labelling player detections on the entire field based on their teams (1 for team A, 2 for team B).
A single channel matrix labelling the penalty and goal areas of the field.
A scalar containing information on the attacking team (1 for team A, 2 for team B, 0 for a drop-ball case).
Single channel matrices for team A and B, labelling players based on their total projected areas in pixels.
Single channel matrices for team A and B, labelling players with unique IDs to help avoid recounting the same player, especially when the said player is near the edge of the corresponding area of interest.

Outputs

Struct variables for teams A and B, containing:

The number of violators from each team for the current frame, accounting for the penalty zone exemption for defending teams.
The confidence of violation for each detected player in the corresponding area of interest
The centroid of the overlapping area of each detected player with the corresponding area of interest.

General remarks

The decision-making function identifies the number of players from each team that violate the area of interest, and gives a confidence value and the approximate location of each violation. The confidence of violation, which is given to counter the effect of perspective distortion, can be used so that players whose violation falls below a chosen confidence value are not considered violators. Therefore, players who appear to violate the area of interest when they do not have a lower chance of being considered violators. However, using such a confidence measure has a trade-off: it could lead to actual violators also being ignored. The reasoning behind using such a confidence measure was that it is better to have a slightly lenient system that detects gross violations and lets minor violations pass than an overly strict system that causes unnecessary interruptions of the game.

