AutoRef implementation: Difference between revisions
20204919@TUE (talk | contribs) |
Revision as of 15:55, 31 March 2021
This page describes the implementation of the proposed design of AutoRef, an autonomous referee for RoboCup Middle Size League (MSL) robot soccer.
In 2021, contributions by MSD 2020 focused on detecting ball-to-player distance violations.
Introduction
Objective statement
The main objective of the implementation part of the project was to detect violations of the rules related to the distance between the ball and players during the following game states:
- Free kick
- Kick-off
- Corner kick
- Goal kick
- Throw-in
- Dropped-ball
- Penalty kick
These rules are described in Laws number 8, 10, 13, 14, and 15 of the MSL Rulebook.
Motivation
This objective was chosen for several reasons:
- Analysis of past projects showed that this functionality had not been designed before
- Interviews with stakeholders (the MSL referees) led to the conclusion that rules of this kind are hard for a human referee to monitor
- Proof of concept for the developed functional specification was desirable
- Learning goals of the team members correspond to the technical solutions necessary for the functionality development
Scope of work
The following topics were included in the implementation scope:
- Requirements formulation
- Architectural decomposition development
- Individual code blocks development
- Individual code integration
- Software testing on images and videos
These topics are explained in detail in the following sections.
Process model
Introduction
The development activities of the design team need the support of a process model. In this project, the V-model was chosen to guide the development procedure from requirements engineering to system validation. Because of the particular nature of the project, some details of the model were changed. In addition, an agile approach was used during system development and combined with the traditional V-model, which made the project progress more flexible and efficient.
Use of V-model
The V-model has the following advantages for the development of the project:
- The project is based on machine vision and software algorithms. The V-model was first proposed for software development and has matured in that field.
- The project team has five members, all of whom can work on subsystems, and the subsystems can be developed in parallel. Once the system architecture is fixed, the V-model can greatly improve development efficiency.
- System development started in the fourth week, which means that the team had to complete it in five weeks. A mature, ready-made V-model process saves a lot of time otherwise spent on project management.
Based on the general V-model, a detailed test plan has been made for the verification of the system both from functional and performance perspectives.
Based on the results of the requirements derivation, functional and performance requirements were set up and related tests were planned as shown in the figure above. This plan supplements the V-model with more detailed test steps and iterations in the test and verification phases. The technical blocks were integrated in the first phase. Several images of typical use cases were then created in the simulation environment (refer to Section 3) in order to verify the functional requirements. Videos were created to test the performance of the system in particular scenarios (refer to Section 4). The code was updated iteratively after several rounds of testing. After the code was verified, a simulated game video was created in the simulation environment to illustrate how the system works in the 'real' world.
Agile approach
Due to the limited project time and various uncertainties in the system development process, an Agile approach was applied during system development, which is mainly reflected in the system architecture and the design choices.
There are two main difficulties in the development of this project:
- How to implement an efficient and fast detection algorithm?
- How to achieve an accurate image capture in reality?
Usually, the algorithm is executed after the system obtains the image. However, the design of the vision system involves the following two considerations, which greatly increase the complexity of the system design:
- Fixed camera OR moving camera
- One camera OR multiple cameras
After careful evaluation, we concluded that designing the vision system first would greatly reduce the time available for algorithm development, which is not the result we wanted. Therefore, we chose a development scheme based on Agile: a single fixed camera is used initially, and a simulated game situation is created in the software simulation environment as a reference sample for algorithm development. The vision system can be optimized after the algorithm has been developed.
The main idea is to design the algorithm quickly and check its performance, which reflects the 'decide fast' principle of the Agile approach.
Major design choices
Programming language
MATLAB was chosen as the programming language due to the availability of built-in functions and documentation, which is useful for quickly testing a proof of concept of the implemented algorithm.
Selection of test environment
It was decided to use a simulation environment to quickly test and verify the functionality of the ball-player distance violation check algorithm. It was desirable to verify the implementation in a simulation environment before committing to a specific choice of hardware. Other factors leading to the choice of a simulation environment included time limitations, limited access to the Tech United playing field (COVID restrictions), and the limited availability of match video footage with the desirable qualities (RGB top view).
A custom simulation environment was built using MATLAB, Simulink 3D Animation, and the Virtual Reality Modelling Language (VRML). Another option was to use GreenField, the visualization software used by Tech United for replaying recorded match data. This option was not chosen because the team was less familiar with Linux systems and because of a potential dependency on the Tech United developers, which could lead to a time-consuming learning curve.
Vision system parameters
The main design choices to be made were regarding the selection of the vision system, involving the following aspects:
- Mobility of the camera(s) (static vs moving)
- Location(s) of the camera(s)
- Camera resolution
Motivations for choosing a static camera were as follows:
- Simplified localization requirements
- Simplified implementation architecture
- Less risk of invasiveness (spatial and audio)
Motivations for selecting a top camera view were as follows:
- Using a top-view camera, projection errors can be minimized, which makes the implementation easier
- Having a movable top view camera that stays right above the ball would help avoid perspective distortions to a greater extent, but:
- We would need to account for localization uncertainty in the case of drones
- Multiple cameras might still be needed to detect events that are not in the vicinity of the ball
- It would be possible to view the entire field using a camera of sufficient height or field of view
Parameters to consider when using a top-view camera:
- Mounting height
- Field of View
- Resolution or Image size
Parameters to consider for the ball-player distance violation decision-making algorithm:
- Pixel to meter ratio or Resolution
- Frequency rate—The frequency rate required for the decision-making algorithm is defined based on
- The pixel-meter ratio,
- Worst-case scenarios defined considering robot dimensions, speed, and acceleration limits
- Maximum allowed perspective distortion
- When using a single top camera, perspective distortions are unavoidable
- High distortions can affect the visibility of the ball, separation of team players, and also lead to incorrect position estimate of the players.
Possible issues of using a single static top camera
- Perspective distortions
- Objects that are not directly below the camera would be seen at an oblique angle
- Occlusions
- The ball may not be completely visible and be occluded by players
- Players that are too close to each other may not be detected separately using a simple camera
- Limitations on accuracy
- Placing the camera very high above the field, in such a way that the entire field is visible, could lead to objects being very small in the images, and affect the accuracy of detections
- Going with a higher-resolution camera would improve the accuracy of detections on the field
Final decision:
Considering time limitations, and the implementation complexity related to the use of multiple cameras and moving cameras, a decision was made to test the ball-player distance violation algorithm for a single top camera concept. The camera was considered to be at a height of 12 m above the center of the field and facing downwards. A choice was also made to keep the field of view to a manageable value of 1.2 radians, such that perspective distortions are minimized. Meanwhile, to have a manageable resolution, the images taken were considered to be Full HD (1920 x 1080) RGB images.
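As a rough illustration of what these camera parameters imply, the Python sketch below estimates the ground length covered along one image axis and the ground size of a single pixel. It is not part of the project's MATLAB code. It assumes an ideal pinhole camera looking straight down at a flat field, and it assumes that the 1.2 rad field of view applies to the 1080-pixel axis; the source does not state which axis the field of view refers to. The function name ground_footprint is hypothetical.

```python
import math

def ground_footprint(height_m, fov_rad, pixels):
    """Ground length covered along one image axis and the size of one pixel.

    Assumes an ideal pinhole camera looking straight down at a flat field,
    with fov_rad being the full viewing angle along the chosen image axis.
    """
    length_m = 2.0 * height_m * math.tan(fov_rad / 2.0)
    return length_m, length_m / pixels

# Camera height, field of view and image size from the final decision.
# Applying the field of view to the 1080-pixel axis is an assumption.
length_m, metres_per_pixel = ground_footprint(12.0, 1.2, 1080)
print(f"covered length: {length_m:.2f} m, pixel size: {metres_per_pixel * 100:.2f} cm")
```

Under these assumptions, the pixel size gives a first indication of how finely distances on the field can be resolved; lens distortion and perspective effects on tall objects are ignored.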
Requirements
Functional requirements
Formal formulation:
The first and foremost sub-requirement is that the system must be modular. Other sub-requirements are as follows:
- The system must detect the players and the ball inside the soccer field boundaries and identify the players’ team.
- The system must detect the different zones inside the soccer field (corner area, penalty area, etc.)
And the main requirements are:
- The system must check that the distance between the center of the ball and any part of any attacking team member (except for the kicker) before a free kick, corner kick, kick-off, goal kick, or throw-in is not less than 2 m (with an acceptable inaccuracy of 5 cm). One of the robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 2 m.
- The system must check that the distance between the center of the ball and any part of any defending team member before a free kick, corner kick, kick-off, goal kick, or throw-in is not less than 3 m (with an acceptable inaccuracy of 5 cm). One of the defender robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 3 m.
- The system must check that the distance between the center of the ball and any part of any player before a dropped ball is not less than 1 m (with an acceptable inaccuracy of 5 cm). One of the robots may stay anywhere inside the penalty area (except for the goal area) of its own team, even if the distance to the ball is shorter than 1 m.
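The distances in these requirements can be collected in a small lookup, sketched here in Python for illustration (the project itself uses MATLAB). The state names and the function name required_distances are hypothetical identifiers; only the game states listed in the three requirements above are covered.

```python
# Minimum ball-player distances in metres, taken from the requirements above.
# The state names used here are hypothetical identifiers.
SET_PIECE_STATES = {"free_kick", "corner_kick", "kick_off", "goal_kick", "throw_in"}
TOLERANCE_M = 0.05  # acceptable inaccuracy stated in the requirements

def required_distances(game_state):
    """Return (attacker_min_m, defender_min_m) for the given game state."""
    if game_state in SET_PIECE_STATES:
        return 2.0, 3.0
    if game_state == "dropped_ball":
        # No attacking team: the same distance applies to all players.
        return 1.0, 1.0
    raise ValueError(f"no distance requirement listed for {game_state!r}")

attacker_min, defender_min = required_distances("throw_in")
print(attacker_min, defender_min)
```

The penalty-area exemption for one robot is not modelled in this lookup; it depends on player positions and is handled at the decision stage.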
Why are the requirements formulated with these numbers?
The distance between the center of the ball and the players before certain events is specified explicitly in the rulebook.
The players' length and width are 50 cm, and 10% of the players' size was taken as the acceptable accuracy tolerance for the system.
How the requirements were formulated:
Ball-player zone violation detection is a task that enforces several laws. To come up with functional requirements for this task, the first step was to identify the skills required to implement it.
The skills are as follows:
- ball detection (currently known as ball identify)
- player detection (updated: player identify)
- player team identification (updated: player team classify)
- zone detection (updated: field area classify)
The second step was to identify the laws related to each skill. In this way, the tasks that are part of each law could be identified. This step is important because, when working out the details of a skill, it must be kept in mind that the skill will be used in those other tasks in the future, so its implementation must be as general and adaptable as possible. Specifying the requirements in this way is necessary to keep the system modular.
The identified laws are as follows:
- Law number 8 (kick-off)
- Law number 10 (the method of scoring)
- Law number 13 (free kick)
- Law number 14 (penalty kick)
- Law number 15 (throw-in kick)
For example, the identified tasks for these laws that are related to the ball detection skill are:
- Measure the ball’s velocity, for checking if it is stationary (from law number 13)
- Detect which player has taken the ball (from law number 10)
- Measure how far the ball has rolled before it is touched by other players (from law number 8)
For another example, the identified tasks for player detection skill are:
- Control players’ location
- Count the number of players
- Detect the kicking player
- Detect other players
And the process for other skills is the same.
The third step is to come up with the actual requirements and analyze them.
Performance requirements
Frequency
Formal formulation
The system must be able to evaluate the functional requirements (at the required system accuracy) at least once every 89 ms in order to avoid false-negative detections, which means it should have a detection frequency of at least 11.22 Hz.
Why ‘frequency’
The frame rate (frames per second, FPS) is 30 to 60 for most vision systems. However, the algorithm may run more slowly than that, so delays or dropped frames may occur during processing. To avoid losing information, a frequency requirement is needed to constrain the minimum speed of the system.
The frequency requirement is set based on the accuracy requirement formulated in Section 2.2, which means the system should be fast enough to obtain the necessary information at an accuracy of 0.02 m. In other words, the accuracy influences the time a player needs to escape from the area of interest (AOI). If the tolerance is set too small, the player needs less time to escape, so it is harder for the system to capture the second image that shows the player's violation.
Only false negatives (violations reported as non-violations) are considered in the ‘frequency’ requirement, because delayed detection is allowed in refereeing systems. If the violation has been captured, the foul cannot be ignored even if the player is no longer violating at the time the system reports it. Therefore, obtaining enough information is the focus of this requirement, because insufficient information is more likely to lead to false-negative detections.
How these requirements are formulated
These requirements are formulated based on two extreme scenarios. Assume a player is captured right at the border of the circular area with a 1-meter radius. The system should be fast enough to capture a second image of the player if it enters the circular area by more than 0.02 m (the tolerance). If the player enters the area by less than 0.02 m, the system does not need to detect the violation.
It is common for a player to cut straight across the area, as shown above. In this case, the tolerance means that a player cutting more than 0.02 m into the area should be detected. If it cuts in less than 0.02 m, the time the player stays inside the area is shorter (imagine the player grazing the area with no tolerance: the time inside is infinitely short).
Because of the geometry of circles, when the two circles overlap, the highest point of the small circle along the vertical axis lies above the intersection line (shown by red line 'b' in the second figure). This means that for a tolerance of 0.02 m, the distance between the intersection line and the lowest point of the big circle is less than 0.02 m (shown by red line 'a' in the second figure). Thus, the lengths of lines 'a' and 'b' should add up to 0.02 m.
Based on Triangle Similarity Theorems, the length of 'a' can be calculated as below.
a + b = 0.02 m
a = R (radius of the big circle) - L (vertical distance from centre to blue line)
b = r (radius of the small circle) - 0.25L (vertical distance from centre to blue line)
By combining these equations, we can get:
1.25*a = 0.02 m
a = 0.016 m
From the length of a, L can be found (L = R - a = 0.984 m), and the distance between the two player positions in figure 1 is:
distance = 2 * 1.25 * sqrt(1 - L^2) = 0.4455 m
time = distance / max_v (5 m/s) = 0.0891 s
frequency = 1 / time = 11.22 Hz
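The chain of equations for the first scenario can be evaluated step by step. The Python sketch below is only an illustration of the arithmetic (the project code is in MATLAB); the variable names are chosen here, while the radii, tolerance, maximum speed and the factor 1.25 are taken from the equations above.

```python
import math

R = 1.0           # radius of the area of interest (big circle), m
r = 0.25          # radius of the player (small circle), m
tolerance = 0.02  # a + b, m
max_v = 5.0       # maximum player speed, m/s

# a = R - L and b = r - 0.25 * L, so a + b = 1.25 * (1 - L) = 1.25 * a for R = 1
a = tolerance / 1.25
L = R - a                                    # vertical distance from centre to the line
distance = 2 * 1.25 * math.sqrt(1 - L ** 2)  # travel between the two captures, m
time_s = distance / max_v                    # time available for the second capture, s
frequency_hz = 1 / time_s                    # required detection frequency, Hz

print(f"a = {a:.3f} m, L = {L:.3f} m")
print(f"distance = {distance:.4f} m, time = {time_s:.4f} s, frequency = {frequency_hz:.2f} Hz")
```

Changing the tolerance or the maximum speed in this sketch shows how strongly the required frequency depends on both values.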
In the second scenario, a player enters the area of interest at maximum speed and then accelerates to leave. Assume a player who wants to cheat and push the system to its limits enters the area and stops at the tolerance boundary (0.02 m inside). It then needs to accelerate to escape the area before the second capture. Thus, the time between the first and the second capture consists of two phases: deceleration and acceleration. We assume that braking causes infinite deceleration, so the time of the first phase is determined by the maximum velocity (5 m/s).
Time used for deceleration is:
time_1 = 0.02 / max_v = 0.004 s
Time used for acceleration to escape can be calculated as below:
distance (0.02 m) = initial velocity (0 m/s) * time_2 + 1/2 * acceleration (2 m/s^2) * time_2^2
time_2 = sqrt(0.02) = 0.1414 s
Thus, the total time can be calculated by adding them:
time = time_1 + time_2 = 0.1454 s
The frequency required to capture the second image is:
frequency = 1 / time = 6.88 Hz
Scenario 1 is stricter than scenario 2. However, to ensure the accuracy and reliability of the system test results, we tested both scenarios. Please refer to Section 6 for details.
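The two phase durations of the second scenario follow directly from the stated tolerance, maximum speed and acceleration. The following Python sketch evaluates them for illustration; the variable names are chosen here and the constant-acceleration model is the one given in the text.

```python
import math

tolerance = 0.02    # distance travelled inside the area, m
max_v = 5.0         # maximum player speed, m/s
acceleration = 2.0  # maximum player acceleration, m/s^2

# Phase 1: the player covers the tolerance distance at maximum speed and
# then stops instantly (infinite deceleration is assumed in the text).
time_1 = tolerance / max_v

# Phase 2: starting from rest, distance = 1/2 * acceleration * time_2^2.
time_2 = math.sqrt(2 * tolerance / acceleration)

total_time = time_1 + time_2
frequency_hz = 1 / total_time

print(f"time_1 = {time_1:.4f} s, time_2 = {time_2:.4f} s")
print(f"total = {total_time:.4f} s, frequency = {frequency_hz:.2f} Hz")
```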
Accuracy
Other context information
Colour detection requirement
Minimal distortion requirement
Architecture decomposition
The software architecture used for the implementation of the distance violation checking algorithm is shown in the figure below.
The algorithm is applicable during the following game situations: kick-off, free kick, throw-in, goal kick, corner kick, dropped ball, and penalty kick, after the ball has been placed at the desired location by the referee and before the transition to the ‘ball-in-play’ situation. During these situations, players from the attacking and defending teams need to keep a minimum distance from the ball. These distance requirements vary with the state of the game (e.g., during kick-offs, throw-ins, free kicks, corner kicks and goal kicks, the attacking team players other than the kicker should be at least 2 meters away from the ball center, while during an in-game penalty kick, all players other than the kicker should be 3 meters away from the ball center). The scope of the implemented block can be visualized more clearly by referring to the state diagram for throw-in enforcement (REFER to DIAGRAM IN WIKI). The block is applicable in
- the throw-in positioning state, when the ball has been placed at the required location, and
- the throw-in start state when the players have positioned themselves correctly with respect to the thrower.
Inputs
- Game state: the algorithm requires information on the state of the game, which determines the radii of the areas of interest.
- Attacking team ID: The team which takes the kick is the attacking team, and the algorithm requires information on whether team A (left-side) or team B (right-side) takes the kick. In the case of dropped ball however, no ID is needed, as neither team gets to take the ball first.
- A 2-D image identifying penalty and goal areas on the field, obtained from a calibration image (RGB, Full HD) of the empty field.
- Ball color values (RGB) and tolerance on the values.
- Player marker color values (RGB) and the tolerance on the values.
- Current camera image (RGB, Full HD).
Sub-blocks
The sub-code blocks are briefly discussed below; further details are given in the subsequent sections.
- Pre-processing blocks
- (zone of field function + get penalty zones): separates out the penalty and goal areas of the football field, given an input calibration image, and estimates the meters-pixels ratio.
- Internal blocks
- Blocks used only for the first iteration of ball-player-distance-violation check algorithm, given the reference location of the ball.
- Area of interest function
- Blocks used in each iteration of the ball-player distance violation check algorithm
- Ball detection algorithm
- Player detection algorithm
- Player classification algorithm
- Decision-making algorithm
Explanation of individual code blocks
Zone of field detection
Ball detection
To detect violations, the center of the ball must be known. Hence, the objective of the ball detection function (function name from the code) is to find the ball center after detecting the ball using image processing techniques. As explained below, the ball is detected using color-based and shape-based detection algorithms.
Algorithm
- Masking the image using a color filter: The filtering operation is used to isolate the ball pixels from the background pixels. The filtering is achieved by a logical check of each pixel value against a minimum and a maximum filter value for each channel of the RGB image matrix (test_image). It is assumed that, depending on the ball color used in a match or tournament, these filter values are provided as an input to the software system. For the current work, the ball filters were estimated using the Color Thresholder app within MATLAB. The required inputs are ball_color and tolerance, and the output is a logical matrix.
- Performing morphological operations on the filtered image: The filtered image from step 1 may contain pixels other than the ball pixels; in other words, there may be noise in the filtered image. Morphological operations such as closing and filling were used to remove this noise. A disk-shaped structuring element was used for the operations, as the object of interest, the ball, always appears circular. Also, for the current scope using a top camera at a fixed height, the radius of the disk element was fixed to a value of 1.
- Cropping the filtered image: As the next step, circular object detection algorithms can be used to find the ball in the image. However, it was found that the detection algorithms run faster if the input image is smaller. Therefore, the logical image matrix output from step 2 is cropped during this operation.
- Finding the ball center: Finally, the center of the ball is found using the ‘imfindcircles’ function, a built-in MATLAB function which uses the Circular Hough Transform (CHT) to find circular objects within an image.
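The first and last steps above can be illustrated with a minimal, dependency-free Python sketch. It is not the project's MATLAB implementation: the morphological clean-up and cropping steps are omitted, and the ball center is taken as the centroid of the masked pixels, which stands in for the Circular Hough Transform used by imfindcircles. The function names color_mask and mask_centroid, the symmetric use of a single tolerance value per channel, and the synthetic frame are assumptions made for this example.

```python
def color_mask(image, ball_color, tolerance):
    """Logical mask: True where every RGB channel is within tolerance of ball_color."""
    return [[all(abs(c - b) <= tolerance for c, b in zip(pixel, ball_color))
             for pixel in row]
            for row in image]

def mask_centroid(mask):
    """Centre (row, col) of the True pixels, or None when the mask is empty.

    A plain centroid is used here instead of a Circular Hough Transform.
    """
    hits = [(i, j) for i, row in enumerate(mask) for j, v in enumerate(row) if v]
    if not hits:
        return None
    n = len(hits)
    return (sum(i for i, _ in hits) / n, sum(j for _, j in hits) / n)

# Tiny synthetic frame: green field with a 3x3 orange "ball" centred at (2, 3).
GREEN, ORANGE = (0, 150, 0), (255, 140, 0)
frame = [[GREEN] * 6 for _ in range(5)]
for i in range(1, 4):
    for j in range(2, 5):
        frame[i][j] = ORANGE

mask = color_mask(frame, ball_color=(250, 135, 5), tolerance=10)
center = mask_centroid(mask)
print(center)
```

A centroid is only reliable when the mask contains nothing but the ball, which is why the original pipeline cleans the mask and uses a shape-based detector.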
Inputs
- test_image – 3-channel RGB matrix of the image frame of the game video
- ball_color – a 3-element array input for the color filter operation in step 1
- tolerance – input for the color filter operation in step 1
Outputs
- centers – a 2-element array containing the ball center in pixel coordinates
General remarks
For the current scope of work, the ball detection algorithm works well. However, the following points could be researched further to optimize the algorithm.
- The color mask filtering can be made robust enough to be used for a wider range of ball colors. For the current scope, using an orange ball, the RGB color space was selected within the Color Thresholder app. This choice was also made because converting the image to another color space would increase the processing time. If the scope is extended to other ball colors, a more robust method for estimating the filter could be researched.
- The structuring element used in the morphological operations may not be robust if the camera is mobile. Currently, the structuring element values are fixed for a particular camera height, so this was not a concern within the current scope.
- The current ball detection algorithm uses the CHT for circular object detection. Machine learning algorithms are another possible implementation approach to research. However, a likely shortcoming of using a machine learning algorithm is an increase in processing time.
Player detection
Area of interest
The Area of Interest (AOI) denotes the area in which a violation may happen. This is the difference between this scheme and a distance detection scheme. The reason for using an AOI to detect violating players, rather than calculating the distance between each player and the ball, is as follows:
According to the rulebook, no part of a robot may be within a certain distance of the ball. Since robots may have various shapes, it is difficult to find the point on a complex-shaped boundary that is closest to the ball.
Algorithm
- Create a zero matrix with the same size as the image. First, an RGB image is input into the block and binarized into a 2D matrix. Then the numbers of rows and columns are measured and used to create a new 2D matrix of zeros.
- Specify different AOIs based on game states. Since the AOI radii differ between laws, the radii of the two circles are specified as inputs of the block, and the meter-pixel ratio is used to calculate the pixel radii for inserting the circles in the following step.
- Insert circles and create the output matrix. First, a red filled circle is inserted in the first blank image and a green filled circle is inserted in the second blank image. Then an AOI matrix with two layers is created by comparing the pixel values of the red layer of the first image and of the green layer of the second image with 0.5 (if the comparison is true, the pixel value becomes 1). Finally, a 3D matrix with two layers is obtained as the AOI for the following processing. In both layers, the pixels inside the circles are labelled with 1.
Inputs
- image_frame - 3 channel RGB matrix of the image frame of the game video
- center_ball – The matrix coordinate of the center of the detected ball
- rad_meter_1 – The radius of the inner circle based on the typical game state
- rad_meter_2 - The radius of the outer circle based on the typical game state
- MPratio – The ratio showing the number of pixels used to represent the distance in the real world.
Outputs
- AOI – a 3D matrix showing the information of the inner circle and outer circle in separate layers.
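The construction of the two AOI layers can be sketched in a few lines of Python. This is an illustration rather than the project's MATLAB code: instead of drawing colored circles and thresholding them, each layer is filled directly by testing the distance of every pixel to the ball center, which yields the same kind of binary mask. The function name area_of_interest and the parameter pixels_per_meter (playing the role of MPratio) are assumptions made for this example.

```python
def area_of_interest(shape, center_ball, rad_meter_1, rad_meter_2, pixels_per_meter):
    """Return two binary layers (inner and outer circle) as nested lists.

    shape is (rows, cols) of the image frame, center_ball is (row, col) in pixels.
    """
    rows, cols = shape
    center_row, center_col = center_ball
    layers = []
    for rad_m in (rad_meter_1, rad_meter_2):
        rad_px = rad_m * pixels_per_meter  # radius converted from metres to pixels
        layer = [[1 if (i - center_row) ** 2 + (j - center_col) ** 2 <= rad_px ** 2 else 0
                  for j in range(cols)]
                 for i in range(rows)]
        layers.append(layer)
    return layers

# Small example: 9x9 frame, ball in the middle, radii of 1 m and 2 m at 1.5 pixels per metre.
inner, outer = area_of_interest((9, 9), (4, 4), 1.0, 2.0, 1.5)
for row in outer:
    print("".join(str(v) for v in row))
```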
Player classification
Decision making function
The objective of the decision-making function is to determine the number of players from teams A and B that violate the ball-player distance requirement. It takes into account the location of the penalty zones of the field, as well as information on which team is attacking (not applicable for a drop-ball situation). Because of the risk of false positives (players identified as violating the distance requirement when they are not, due to distorted ground projections when viewed from a top-central camera), a confidence value is also given for each violation, along with the centroid of the region of the player that overlaps with the area of interest.
Algorithm
The algorithm first distinguishes three situations: the first in which team A attacks, the second in which team B attacks, and the third, which is the drop-ball situation. In each situation, the following major steps are taken:
- Identification of players violating the corresponding areas of interest by multiplication of the player detection matrices containing team information with the area of interest matrices. This generates the checkA and checkB matrices for teams A and B respectively.
- Performing the regionprops operation on the check matrices to segment the violating players from each team.
- For the defending team, identification of violating players located in their own penalty zone by multiplying the field zones matrix with the corresponding check matrix. One player is allowed to be within the minimum allowed distance while in the penalty zone.
- Based on the ratios of the total projected areas of each detected player and the projected areas within the areas of interest, calculate a confidence value for each player found to violate the area of interest.
Inputs
- Single-channel binary matrices of the areas of interest for the attacking and defending teams respectively
- A single-channel matrix, labelling player detections on the entire field based on their teams (1 for team A, 2 for team B).
- A single channel matrix labelling the penalty and goal areas of the field.
- A scalar containing information on the attacking team (1 for team A, 2 for team B, 0 for a drop-ball case).
- Single channel matrices for team A and B, labelling players based on their total projected areas in pixels.
- Single channel matrices for team A and B, labelling players with unique IDs to help avoid recounting the same player, especially when the said player is near the edge of the corresponding area of interest.
Outputs
Struct variables for teams A and B, containing:
- The number of violators from each team for the current frame, accounting for the penalty zone exemption for defending teams.
- The confidence of violation for each detected player in the corresponding area of interest
- The centroid of the overlapping area of each detected player with the corresponding area of interest.
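The core of the decision step for one team can be illustrated with the following Python sketch. It is a simplified stand-in for the MATLAB function: players are given as a matrix of unique IDs (0 for background), the overlap with the area of interest is counted per player, and the confidence is taken as the fraction of a player's projected area that lies inside the area of interest. Treating a player as being in the penalty zone as soon as one overlapping pixel lies in the zone, the function name check_team, and the flag exempt_one_in_zone are assumptions made for this example; the centroid output is omitted.

```python
def check_team(player_ids, aoi, penalty_zone, exempt_one_in_zone):
    """Count violators of one team and give a confidence value per violator.

    player_ids, aoi and penalty_zone are equally sized nested lists;
    player_ids holds a unique positive ID per player and 0 elsewhere.
    """
    total, inside, zone_hits = {}, {}, {}
    for i, row in enumerate(player_ids):
        for j, pid in enumerate(row):
            if pid == 0:
                continue
            total[pid] = total.get(pid, 0) + 1
            if aoi[i][j]:
                inside[pid] = inside.get(pid, 0) + 1
                if penalty_zone[i][j]:
                    zone_hits[pid] = zone_hits.get(pid, 0) + 1
    # Confidence: share of the player's projected area inside the area of interest.
    confidence = {pid: inside[pid] / total[pid] for pid in inside}
    count = len(confidence)
    if exempt_one_in_zone and zone_hits:
        count -= 1  # one player may stay in its own penalty zone
    return count, confidence

# 4x6 example: two 2x2 players, area of interest in columns 1..3,
# penalty zone in rows 2..3 and columns 3..5.
player_ids = [[1, 1, 0, 0, 0, 0],
              [1, 1, 0, 0, 0, 0],
              [0, 0, 0, 2, 2, 0],
              [0, 0, 0, 2, 2, 0]]
aoi = [[1 if 1 <= j <= 3 else 0 for j in range(6)] for _ in range(4)]
penalty_zone = [[1 if i >= 2 and j >= 3 else 0 for j in range(6)] for i in range(4)]

count, confidence = check_team(player_ids, aoi, penalty_zone, exempt_one_in_zone=True)
print(count, confidence)
```

Violators whose confidence falls below a chosen threshold could then be discarded, which is how the confidence value is meant to counter perspective distortion.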
General remarks
The decision making function identifies the number of players violating the area of interest from each team, as well as gives a confidence value and the approximate location of the violation. The confidence of violation, given to counter the effect of perspective distortion, can be used in such a way that players violating the area of interest below a confidence value are not considered as violators. Therefore, players who appear to violate the area of interest, while they do not, have a lower chance of being considered as violators. However, using such a confidence measure has a trade-off—it could lead to actual violators also being ignored. The reasoning behind using such a confidence measure was that it would be better to have a slightly lenient system that detects gross violations and lets off minor violations than an overly strict system that could lead to unnecessary interruptions in the game.