The main objective of the model is to determine whether an accident has taken place and to localize it on site. Our focus is on distant moving objects captured by CCTV cameras. Because these cameras are mounted at a height, objects appear small in the footage, which makes detecting and classifying them a challenging task. We train our model to detect accidents in such distant scenes and in CCTV footage, so that it reports accidents at run-time from the inputs fed to it. The accident detection model is an object detection model that works at run-time, and YOLO suits this task well, running at a speed of 25 fps. Vehicles on the road come in many types and shapes, and accidents can occur anywhere, at any time, and in different forms: for example, a head-on collision, or one vehicle ramming another from the side. The model is successful in detecting the accident site with low loss.
The main goal of an object detection model is to detect and classify objects, whether they are large or small. Typically only a small number of instances of an object are present in the image, but there is a very large number of possible locations and scales at which they can occur, and all of these need to be explored. Each detection is reported with some form of pose information. This could be as simple as the location of the object, or the extent of the object defined in terms of a bounding box. For example, an accident detector may compute the locations of the vehicles and of the accident in addition to their bounding boxes. Object detection systems construct a model for an object class from a set of training examples.
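The bounding-box form of pose information can be illustrated with a minimal Python sketch. The `Detection` record, its field names, the `(x_min, y_min, x_max, y_max)` pixel convention, and the sample values are assumptions made for illustration, not the data format used by our model. The `iou` helper computes the standard intersection-over-union overlap between two boxes, which is one common way to compare boxes.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    """One reported detection: class label, score, and bounding box."""
    label: str
    confidence: float
    box: Box


def box_area(box: Box) -> float:
    """Area of a box; degenerate (inverted) boxes count as zero area."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    intersection = (
        max(a[0], b[0]),
        max(a[1], b[1]),
        min(a[2], b[2]),
        min(a[3], b[3]),
    )
    inter_area = box_area(intersection)
    union_area = box_area(a) + box_area(b) - inter_area
    return inter_area / union_area if union_area > 0 else 0.0


# Hypothetical detections for a single frame.
detections = [
    Detection("vehicle", 0.91, (100.0, 120.0, 180.0, 170.0)),
    Detection("vehicle", 0.88, (160.0, 130.0, 240.0, 180.0)),
    Detection("accident", 0.76, (95.0, 115.0, 245.0, 185.0)),
]

for det in detections:
    print(det.label, det.confidence, det.box)
```

Keeping the label, the score, and the box together in one record makes it straightforward to filter detections by class or confidence before drawing them on a frame.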
DARKNET-53: YOLO uses convolutional layers. YOLO v3 is built on 53 convolutional layers, which are also called DARKNET-53. For detection, the original architecture is stacked with 53 more layers, giving YOLO v3 a total of 106 layers that perform the task efficiently with low loss. The essential elements of this model are residual blocks, skip connections, and up-sampling. The model takes as input a batch of images of shape (n, 416, 416, 3), where n is the number of images, 416 is the width and the height, and 3 is the number of RGB channels. We can load a version of the network pretrained on more than a million images from the ImageNet database [1]. The pretrained network can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. As a result, the network has learned rich feature representations for a wide range of images. The pretrained classification network has an image input size of 256-by-256, whereas the detection model described here takes 416-by-416 inputs.
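To make the input shape concrete, the sketch below works out the detection grid sizes for a 416-by-416 input. It assumes the standard YOLO v3 design, which this section does not spell out: predictions at three scales with strides 32, 16, and 8, and 3 anchor boxes per grid cell, each predicting 4 box coordinates, 1 objectness score, and one score per class. The function names and the two-class setting (vehicle, accident) are hypothetical.

```python
from typing import List, Tuple


def yolo_v3_output_shapes(
    input_size: int = 416,
    num_classes: int = 2,
    strides: Tuple[int, ...] = (32, 16, 8),
    anchors_per_cell: int = 3,
) -> List[Tuple[int, int, int]]:
    """Return (grid_h, grid_w, channels) for each detection scale."""
    shapes = []
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(f"input size {input_size} is not divisible by stride {stride}")
        grid = input_size // stride
        # Per anchor: 4 box coordinates + 1 objectness score + class scores.
        channels = anchors_per_cell * (5 + num_classes)
        shapes.append((grid, grid, channels))
    return shapes


def total_boxes(shapes: List[Tuple[int, int, int]], anchors_per_cell: int = 3) -> int:
    """Total number of candidate boxes predicted per image."""
    return sum(grid_h * grid_w * anchors_per_cell for grid_h, grid_w, _ in shapes)


# Batch of one RGB image: (n, width, height, channels).
batch_shape = (1, 416, 416, 3)
shapes = yolo_v3_output_shapes(batch_shape[1], num_classes=2)
print(shapes)               # [(13, 13, 21), (26, 26, 21), (52, 52, 21)]
print(total_boxes(shapes))  # 10647
```

Under these assumptions, the finest 52-by-52 grid is the one most relevant to the small, distant objects seen in CCTV footage, since each of its cells covers only 8-by-8 pixels of the input.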
The model labels the vehicles and detects the accident site, as it was designed to do. We fed it a video made up of several clips that contain accidents. The model works properly on this video, clearly classifying both the vehicles and the accidents.
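When the model runs on a video, per-frame results can be summarized as time intervals. The sketch below only illustrates that bookkeeping and is not part of our pipeline: the function name, the list of flagged frame indices, and the `max_gap` tolerance are hypothetical, and the frame rate of 25 fps is taken from the speed quoted earlier rather than from the test video.

```python
from typing import Iterable, List, Tuple


def accident_intervals(
    flagged_frames: Iterable[int],
    fps: float = 25.0,
    max_gap: int = 5,
) -> List[Tuple[float, float]]:
    """Group frame indices flagged as 'accident' into (start_s, end_s) intervals.

    Flagged frames separated by at most `max_gap` frames are merged into
    one interval, so a few missed frames do not split a single event.
    """
    frames = sorted(set(flagged_frames))
    if not frames:
        return []

    intervals = []
    start = prev = frames[0]
    for frame in frames[1:]:
        if frame - prev > max_gap:
            # Close the current interval at the end of the last flagged frame.
            intervals.append((start / fps, (prev + 1) / fps))
            start = frame
        prev = frame
    intervals.append((start / fps, (prev + 1) / fps))
    return intervals


# Hypothetical frame indices in which an accident was detected.
flagged = [50, 51, 52, 54, 300, 301, 302]
for start_s, end_s in accident_intervals(flagged):
    print(f"accident from {start_s:.2f} s to {end_s:.2f} s")
```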