Spiking Neural Network and Facial Expression Recognition

Introduction

The Spiking Neural Network (SNN) is considered one of the most promising neural networks today: a computational model aimed at understanding and reproducing human capabilities. Because it replicates a special class of artificial neural networks in which model neurons communicate through spike sequences, researchers believe this technique is well suited to facial recognition, facial expression recognition and emotion detection.

The work of C. Du, Y. Nan and R. Yan (2017) supports this. Their paper proposed a spiking neural network architecture for facial recognition consisting of three parts: feature extraction, encoding and classification. For feature extraction, they used the four-layer HMAX model to extract facial features, then encoded all the features into appropriate spike trains; the Tempotron learning rule was used to keep the computational cost low. They used four databases in their experiments: Yale, Extended Yale B, ORL and FERET.

The study by A. Taherkhani (2018) addressed the difficult task of training a population of spiking neurons in a multi-layer network to fire at precise times, noting that delay learning in SNNs had not previously been explored extensively. The paper proposes a biologically plausible supervised learning algorithm that learns accurately timed multiple spikes in a multi-layer spiking neural network, training the SNN through the synergy between delay learning and weight learning. The proposed method achieves higher accuracy than a single-spike neural network, although the results show that a high number of desired spikes can reduce the method's accuracy. The author also notes that the algorithm can be extended to more layers, though adding many layers can weaken the effect of earlier layers' training on the result; improving the algorithm's performance and computational cost is left as future work.

The article by Q. Fu et al. (2017) improves the performance of the spiking neural network learning algorithm. It proposes three methods: back-propagation with an inertia (momentum) term, adaptive learning rates, and changing the measurement (error) function. Comparing all four variants, including the original algorithm, the results show that adaptive learning achieved the highest accuracy rate at 90% while the original algorithm had the lowest, so all three methods proposed in the article outperformed the original algorithm. (A minimal sketch of the first two ideas appears below.)
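To make the first two of these ideas concrete, the sketch below shows plain gradient descent extended with an inertia (momentum) term and a simple adaptive learning rate. It is an illustration of the general technique under our own assumptions, not the algorithm from Fu et al.'s paper: the least-squares loss, the random data and every hyper-parameter are placeholders.

```python
import numpy as np

def train(w, grad_fn, lr=0.01, inertia=0.9, epochs=100):
    """Gradient descent with an inertia (momentum) term and an adaptive
    learning rate. grad_fn(w) must return (loss, gradient)."""
    velocity = np.zeros_like(w)
    prev_loss = np.inf
    for _ in range(epochs):
        loss, grad = grad_fn(w)
        # Adaptive learning rate: grow the step while the loss keeps
        # falling; shrink it and reset the inertia when the loss rises.
        if loss < prev_loss:
            lr *= 1.05
        else:
            lr *= 0.5
            velocity[:] = 0.0
        # Inertia term: blend the new gradient step with the previous one.
        velocity = inertia * velocity - lr * grad
        w = w + velocity
        prev_loss = loss
    return w

# Toy usage: fit w to minimise ||Xw - y||^2 on random placeholder data.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(50, 3)), rng.normal(size=50)
loss_grad = lambda w: (float(np.sum((X @ w - y) ** 2)), 2 * X.T @ (X @ w - y))
w = train(np.zeros(3), loss_grad)
```

Damping the velocity whenever the loss rises is just one common heuristic; other adaptive schedules would illustrate the same point.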
Facial expression recognition

When we say "facial expression" in the research field, most researchers think of P. Ekman and his books on emotion based on a person's facial expression. In his book "Unmasking the Face", written with W. V. Friesen, they study facial expression and how to identify emotions from it, showing photographs of each of six emotions: happiness, sadness, surprise, fear, anger and disgust. The question is: are there universal expressions of emotion? When someone is angry, do they show the same expression regardless of culture, race or language?

Paknikar (2008) describes a person's face as the mirror of the mind. Facial expressions and their changes provide important information about a person's state, true temperament and personality. He also adds that terrorist activities are growing all over the world today and that detecting potential troublemakers is a major problem, which is why body language, facial expression and tone of speech are among the best ways to judge a person's character.

According to Husak (2017), facial expressions are an important factor in observing human behavior. He also introduced the rapid facial movements called "micro-expressions" that appear in stressful situations, usually when a person is trying to hide an emotion.

In their study, Kabani, Khan, Khan and Tadvi (2015) classified facial expressions into five types: joy, anger, sadness, surprise and excitement. They also used an emotion model that matches a song to one of seven emotion types, namely joy-surprise, joy-excitement, joy-anger, sad, anger, joy and sad-anger.

Hu (2017) stated that efficiency and accuracy are the two major problems in facial expression recognition. Time complexity, computational complexity and space complexity are used to measure efficiency, but high accuracy usually comes at the price of high space or computational complexity. He also added that several other factors can affect accuracy, such as pose, low resolution, subjectivity, scale, and identification of the base frame.

Noroozi et al. (2018) studied another indicator for emotion detection: body language, which reflects a person's emotional state. They include facial expressions, body posture, gestures and eye movements in body language, which together constitute an important marker for detecting emotions.

The group of Yaofu, Yang and Kuai (2012) used a spiking neuron model for facial expression recognition in which information is represented as spike trains; the main advantage of this model, they note, is its low computational cost. In their experiment they showed graphical representations of the six universal expressions (happiness, sadness, surprise, fear, anger and disgust) plus a neutral expression. The subjects made similar facial expressions, but they were all racially different and each varied in expression intensity. They found that, of the six expressions, joy and surprise are the easiest to recognize while fear is the most difficult.

In the research of Wi Kiat Tay (2017), an emotion analysis solution based on computer vision was used to automatically recognize facial expressions from live video. They also studied anxiety and depression, considering both to be part of emotion, under the assumption that "anxiety" is a subset of the emotion "fear".

According to S. W. Chew (September 2013) and his study on facial expression recognition, an automatic facial expression recognition system contains three fundamental components: face detection and tracking, mapping of raw signals to more distinctive features, and classification of the unique feature patterns (a minimal sketch of such a pipeline follows below).

In their article "Facial Expression Recognition", N. Sarode and S. Bhatia (2010) study facial expressions because they are among the best cues for detecting emotion. They used a local 2D appearance-based approach for facial feature extraction, with the radial symmetry transform as the basis of the algorithm, and also created a dynamic spatio-temporal representation of the face. Overall, the algorithm achieves 81.0% recognition accuracy.
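As an illustration of the three components Chew lists, here is a minimal sketch of a detection, feature-extraction and classification pipeline. It assumes OpenCV and scikit-learn; the Haar cascade detector, the raw-pixel features and the dummy-trained SVM are stand-ins chosen for brevity, not the methods used in any of the cited studies.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

# Component 1: face detection (a stock Haar cascade shipped with OpenCV).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

# Component 2: feature extraction (here, just a normalised 48x48 crop).
def extract_features(gray_frame):
    """Detect the first face and flatten a resized crop into a vector."""
    faces = detector.detectMultiScale(gray_frame, scaleFactor=1.1,
                                      minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, w, h = faces[0]
    crop = cv2.resize(gray_frame[y:y + h, x:x + w], (48, 48))
    return crop.astype(np.float32).flatten() / 255.0

# Component 3: classification. The fit below uses random placeholder data
# only so the sketch runs end-to-end; a real system would train on a
# labelled expression dataset.
rng = np.random.default_rng(0)
clf = SVC(kernel="rbf").fit(rng.random((10, 48 * 48)),
                            rng.integers(0, 2, 10))

def recognise(frame_bgr):
    """Full pipeline on one video frame: detect, extract, classify."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    features = extract_features(gray)
    return None if features is None else clf.predict([features])[0]
```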
For facial images and databases, the work of J. L. Kiruba and A. D. Andrushia (2013), "Performance Analysis of Learning Algorithm with Various Facial Expressions on State-of-the-art Neural Network", uses and compares two facial image databases: the JAFFE database, which contains 213 images of 7 facial expressions posed by 10 Japanese women, and the MPI database, which contains 55 different emotional and conversational facial expressions. Ultimately, the JAFFE database yielded the highest overall recognition rate of the two.

Research by Y. Liu and Y. Chen (2012) states that automatic facial expression recognition is an interesting and challenging problem, and that deriving features from a raw facial image is the essential step of a successful approach. Their system combines two kinds of features, from a convolutional neural network (CNN) and a centralized binary pattern (CBP), and classifies them with a support vector machine. They evaluated two datasets: the extended Cohn-Kanade dataset, on which the CNN-CBP combination achieved 97.6% accuracy, and the JAFFE database, with an accuracy rate of 88.7%. (A minimal sketch of this combined-features idea closes this section.)

M. B. Mariappan, M. Suk and B. Prabhakaran (December 2012) created a multimedia content recommendation system based on users' facial expressions. The system, called "FaceFetch", understands the user's current emotional state (happiness, anger, sadness, disgust, fear or surprise) through facial expression recognition and recommends multimedia content such as music, movies and other videos that might interest the user, with near real-time performance from the cloud. They used the ProASM feature extractor, which proved more accurate, faster and more robust. The application received very good responses from all the users who tested the system.

T. Matlovic, P. Gaspar, R. Moro, J. Simko and M. Bielikova (October 2016) used facial expressions and electroencephalography for emotion detection. First, they analyzed existing tools that use facial expression recognition for emotion detection. Second, they proposed an emotion detection method using electroencephalography (EEG) built on existing machine learning approaches. They set up an experiment in which participants watched music videos that evoke emotions; classifying emotions from the participants' recorded brain activity achieved 53% accuracy. They also noted that the potential of emotion-based automatic music recommendation is considerable, as it allows a deeper understanding of human emotions.

Patel et al. (2012) described music as the "language of emotion". They give the example of an 80-year-old man and a 12-year-old girl: different generations, different musical tastes, yet the same emotional result after listening to music, since both could be happy after listening even though they listen to the music of different generations.
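To close, here is a minimal sketch of Liu and Chen's idea of combining two feature sources before a support vector machine. The cbp_histogram function follows one common formulation of the centralized binary pattern and may differ in detail from the paper's, and cnn_features is only a placeholder for a trained convolutional network; the random "images" stand in for a real dataset such as CK+ or JAFFE.

```python
import numpy as np
from sklearn.svm import SVC

def cbp_histogram(img):
    """5-bit CBP code per pixel: 4 opposing-neighbour comparisons plus
    centre vs. local mean, accumulated into a 32-bin histogram."""
    p = [img[0:-2, 0:-2], img[0:-2, 1:-1], img[0:-2, 2:],
         img[1:-1, 2:],   img[2:, 2:],     img[2:, 1:-1],
         img[2:, 0:-2],   img[1:-1, 0:-2]]          # 8 neighbours, clockwise
    c = img[1:-1, 1:-1]                             # centre pixels
    mean9 = (sum(p) + c) / 9.0
    code = np.zeros_like(c, dtype=np.int32)
    for i in range(4):                              # opposing neighbour pairs
        code += (p[i] >= p[i + 4]).astype(np.int32) << i
    code += (c >= mean9).astype(np.int32) << 4
    return np.bincount(code.ravel(), minlength=32) / code.size

def cnn_features(img):
    """Placeholder for a learned CNN embedding (here: mean-pooled patches)."""
    pooled = img.reshape(8, 6, 8, 6).mean(axis=(1, 3))  # 48x48 to 8x8
    return pooled.ravel()

def combined_features(img):
    """The combination step: concatenate both feature vectors."""
    return np.concatenate([cbp_histogram(img), cnn_features(img)])

# Toy usage on random placeholder "images" with 6 expression classes.
rng = np.random.default_rng(0)
imgs = rng.random((20, 48, 48))
labels = rng.integers(0, 6, 20)
clf = SVC().fit([combined_features(im) for im in imgs], labels)
```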