Unspoken science: exploring the significance of body language in science and academia

Mansi Patil, Vishal Patil, Unisha Katre, Unspoken science: exploring the significance of body language in science and academia, European Heart Journal, Volume 45, Issue 4, 21 January 2024, Pages 250–252, https://doi.org/10.1093/eurheartj/ehad598

Science and academia are domains deeply rooted in the pursuit of knowledge and the exchange of ideas, and they rely heavily on effective communication to share findings and foster collaboration. Scientific presentations, in particular, serve as a platform for researchers to share their work and engage with their peers. While the focus is often on the content of research papers, lectures, and presentations, another form of communication plays a significant role in these fields: body language. Non-verbal cues, such as facial expressions, gestures, posture, and eye contact, can convey a wealth of information, often subtly influencing interpersonal dynamics and the perception of scientific work. In this article, we delve into the unspoken science of body language, emphasizing its importance in scientific and academic settings and highlighting its impact on presentations, interactions, interviews, and collaborations. Cultural considerations and the implications for cross-cultural communication are also explored. By understanding the unspoken science of body language, researchers and academics can enhance their communication skills and promote a more inclusive and productive scientific community.

Communication is a multi-faceted process, and words are only one aspect of it. Research suggests that non-verbal communication constitutes a substantial portion of human interaction, often conveying information that words alone cannot. Body language has a direct impact on how people perceive and interpret scientific ideas and findings. 1 For example, a presenter who maintains confident eye contact, uses purposeful gestures, and exhibits an open posture is likely to be seen as more credible and persuasive compared with someone who fidgets, avoids eye contact, and displays closed-off body language ( Figure 1 ).

Figure 1. Types of non-verbal communication.2 Non-verbal communication comprises haptics, gestures, proxemics, facial expressions, paralinguistics, body language, appearance, eye contact, and artefacts.

In academia, body language plays a crucial role in various contexts. During lectures, professors who use engaging body language, such as animated gestures and expressive facial expressions, can captivate their students and enhance the learning experience. Similarly, students who exhibit attentive and respectful body language, such as maintaining eye contact and nodding, signal their interest and engagement in the subject matter. 3

Body language also influences interactions between colleagues and supervisors. For instance, in a laboratory setting, researchers who display confident and open body language are more likely to be perceived as competent and reliable by their peers. Conversely, individuals who exhibit closed-off or defensive body language may inadvertently create an environment that inhibits collaboration and knowledge sharing. The impact of haptics in research collaboration and networking lies in its potential to enhance interpersonal connections and convey emotions, thereby fostering a deeper sense of empathy and engagement among participants.

Interviews and evaluations are critical moments in academic and scientific careers. Body language can significantly impact the outcomes of these processes. Candidates who display confident body language, including good posture, firm handshakes, and appropriate gestures, are more likely to make positive impressions on interviewers or evaluators. Conversely, individuals who exhibit nervousness or closed-off body language may unwittingly convey a lack of confidence or competence, even if their qualifications are strong. Recognizing the power of body language in these situations allows individuals to present themselves more effectively and positively.

Non-verbal cues play a pivotal role during interviews and conferences, where researchers and academics showcase their work. When attending conferences or presenting research, scientists must be aware of their body language to effectively convey their expertise and credibility. Confident body language can inspire confidence in others, making it easier to establish professional connections, garner support for research projects, and secure collaborations.

Similarly, during job interviews, body language can significantly affect the outcome. The facial non-verbal cues of an interviewee can have a great effect on their chances of being hired. The face as a whole, the eyes, and the mouth are features that interviewers observe as they make judgements about a candidate's ability to work effectively. Applicants who smile genuinely, and whose eyes convey the same non-verbal message as their mouth, are more likely to be hired than those who do not. Research shows that a first impression can form within milliseconds; it is therefore crucial for an applicant to pass that first test, as it sets the tone for the rest of the interview process. 4

While body language is a universal form of communication, it is important to recognize that its interpretation can vary across cultures. Different cultures have distinct norms and expectations regarding body language, and what may be seen as confident in one culture may be interpreted differently in another. 5 Awareness of these cultural nuances is crucial for fostering effective cross-cultural communication and understanding. Scientists and academics engaged in international collaborations or interactions should familiarize themselves with cultural differences to avoid misunderstandings and promote respectful and inclusive communication.

Collaboration lies at the heart of scientific progress and academic success. Body language plays a significant role in building trust and establishing effective collaboration among researchers and academics. Open and inviting body language, along with active listening skills, can foster an environment where ideas can be freely exchanged, leading to innovative breakthroughs. In research collaboration and networking, proxemics can significantly affect the level of trust and rapport between researchers. Respecting each other’s personal space and maintaining appropriate distances during interactions can foster a more positive and productive working relationship, leading to better communication and idea exchange ( Figure 2 ). Furthermore, being aware of cultural variations in proxemics can help researchers navigate diverse networking contexts, promoting cross-cultural understanding and enabling more fruitful international collaborations.

Figure 2. Overcoming the barrier of communication. The following factors are important for overcoming the barriers in communication, namely, using culturally appropriate language, being observant, assuming positive intentions, avoiding being judgemental, identifying and controlling bias, slowing down responses, emphasizing relationships, seeking help from interpreters, being eager to learn and adapt, and being empathetic.

On the other hand, negative body language, such as crossed arms, lack of eye contact, or dismissive gestures, can signal disinterest or disagreement, hindering collaboration and stifling the flow of ideas. Recognizing and addressing such non-verbal cues can help create a more inclusive and productive scientific community.

Effective communication is paramount in science and academia, where the exchange of ideas and knowledge fuels progress. While much attention is given to verbal communication, the significance of non-verbal cues, and of body language in particular, cannot be overlooked, and speakers should take care not to send conflicting verbal and non-verbal signals. Body language encompasses facial expressions, gestures, posture, eye contact, and other non-verbal behaviours that convey information beyond words.

Disclosure of Interest

The authors declare no conflicts of interest.

References

1. Baugh AD, Vanderbilt AA, Baugh RF. Communication training is inadequate: the role of deception, non-verbal communication, and cultural proficiency. Med Educ Online 2020;25:1820228. https://doi.org/10.1080/10872981.2020.1820228

2. Aralia. 8 Nonverbal Tips for Public Speaking. Aralia Education Technology. https://www.aralia.com/helpful-information/nonverbal-tips-public-speaking/ (22 July 2023, date last accessed)

3. Danesi M. Nonverbal communication. In: Understanding Nonverbal Communication. Bloomsbury Academic, 2022;121–162. https://doi.org/10.5040/9781350152670.ch-001

4. Cortez R, Marshall D, Yang C, Luong L. First impressions, cultural assimilation, and hireability in job interviews: examining body language and facial expressions' impact on employer's perceptions of applicants. Concordia J Commun Res 2017;4. https://doi.org/10.54416/dgjn3336

5. Pozzer-Ardenghi L. Nonverbal aspects of communication and interaction and their role in teaching and learning science. In: The World of Science Education. Netherlands: Brill, 2009;259–271. https://doi.org/10.1163/9789087907471_019

Original Research Article

Body language in the brain: constructing meaning from expressive movement

  • 1 Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  • 2 Mental Health and Integrated Neurobehavioral Development Research Core, Child and Family Research Institute, Vancouver, BC, Canada
  • 3 Psychiatric Epidemiology and Evaluation Unit, Saint John of God Clinical Research Center, Brescia, Italy
  • 4 Department of Psychological and Brain Sciences, University of California, Santa Barbara, CA, USA

This fMRI study investigated neural systems that interpret body language—the meaningful emotive expressions conveyed by body movement. Participants watched videos of performers engaged in modern dance or pantomime that conveyed specific themes such as hope, agony, lust, or exhaustion. We tested whether the meaning of an affectively laden performance was decoded in localized brain substrates as a distinct property of action separable from other superficial features, such as choreography, kinematics, performer, and low-level visual stimuli. A repetition suppression (RS) procedure was used to identify brain regions that decoded the meaningful affective state of a performer, as evidenced by decreased activity when emotive themes were repeated in successive performances. Because the theme was the only feature repeated across video clips that were otherwise entirely different, the occurrence of RS identified brain substrates that differentially coded the specific meaning of expressive performances. RS was observed bilaterally, extending anteriorly along middle and superior temporal gyri into temporal pole, medially into insula, rostrally into inferior orbitofrontal cortex, and caudally into hippocampus and amygdala. Behavioral data on a separate task indicated that interpreting themes from modern dance was more difficult than interpreting pantomime; a result that was also reflected in the fMRI data. There was greater RS in left hemisphere, suggesting that the more abstract metaphors used to express themes in dance compared to pantomime posed a greater challenge to brain substrates directly involved in decoding those themes. We propose that the meaning-sensitive temporal-orbitofrontal regions observed here comprise a superordinate functional module of a known hierarchical action observation network (AON), which is critical to the construction of meaning from expressive movement. The findings are discussed with respect to a predictive coding model of action understanding.

Introduction

Body language is a powerful form of non-verbal communication providing important clues about the intentions, emotions, and motivations of others. In the course of our everyday lives, we pick up information about what people are thinking and feeling through their body posture, mannerisms, gestures, and the prosody of their movements. This intuitive social awareness is an impressive feat of neural integration; the cumulative result of activity in distributed brain systems specialized for coding a wide range of social information. Reading body language is more than just a matter of perception. It entails not only recognizing and coding socially relevant visual information, but also ascribing meaning to those representations.

We know a great deal about brain systems involved in the perception of facial expressions, eye movements, body movement, hand gestures, and goal directed actions, as well as those mediating affective, decision, and motor responses to social stimuli. What is still missing is an understanding of how the brain “reads” body language. Beyond the decoding of body motion, what are the brain substrates directly involved in extracting meaning from affectively laden body expressions? The brain has several functionally specialized structures and systems for processing socially relevant perceptual information. A subcortical pulvinar-superior colliculus-amygdala-striatal circuit mediates reflex-like perception of emotion from body posture, particularly fear, and activates commensurate reflexive motor responses ( Dean et al., 1989 ; Cardinal et al., 2002 ; Sah et al., 2003 ; de Gelder and Hadjikhani, 2006 ). A region of the occipital cortex known as the extrastriate body area (EBA) is sensitive to bodily form ( Bonda et al., 1996 ; Hadjikhani and de Gelder, 2003 ; Astafiev et al., 2004 ; Peelen and Downing, 2005 ; Urgesi et al., 2006 ). The fusiform gyrus of the ventral occipital and temporal lobes has a critical role in processing faces and facial expressions ( McCarthy et al., 1997 ; Hoffman and Haxby, 2000 ; Haxby et al., 2002 ). Posterior superior temporal sulcus is involved in perceiving the motion of biological forms in particular ( Allison et al., 2000 ; Pelphrey et al., 2005 ). Somatosensory, ventromedial prefrontal, premotor, and insular cortex contribute to one's own embodied awareness of perceived emotional states ( Adolphs et al., 2000 ; Damasio et al., 2000 ). Visuomotor processing in a functional brain network known as the action observation network (AON) codes observed action in distinct functional modules that together link the perception of action and emotional body language with ongoing behavioral goals and the formation of adaptive reflexes, decisions, and motor behaviors ( Grafton et al., 1996 ; Rizzolatti et al., 1996b , 2001 ; Hari et al., 1998 ; Fadiga et al., 2000 ; Buccino et al., 2001 ; Grézes et al., 2001 ; Grèzes et al., 2001 ; Ferrari et al., 2003 ; Zentgraf et al., 2005 ; Bertenthal et al., 2006 ; de Gelder, 2006 ; Frey and Gerry, 2006 ; Ulloa and Pineda, 2007 ). Given all we know about how bodies, faces, emotions, and actions are perceived, one might expect a clear consensus on how meaning is derived from these percepts. Perhaps surprisingly, while we know these systems are crucial to integrating perceptual information with affective and motor responses, how the brain deciphers meaning based on body movement remains unknown. The focus of this investigation was to identify brain substrates that decode meaning from body movement, as evidenced by meaning-specific neural processing that differentiates body movements conveying distinct expressions.

To identify brain substrates sensitive to the meaningful emotive state of an actor conveyed through body movement, we used repetition suppression (RS) fMRI. This technique identifies regions of the brain that code for a particular stimulus dimension (e.g., shape) by revealing substrates that have different patterns of neural activity in response to different attributes of that dimension (e.g., circle, square, triangle; Grill-Spector et al., 2006 ). When a particular attribute is repeated, synaptic activity and the associated blood oxygen level-dependent (BOLD) response decreases in voxels containing neuronal assemblies that code that attribute ( Wiggs and Martin, 1998 ; Grill-Spector and Malach, 2001 ). We have used this method previously to show that various properties of an action such as movement kinematics, object goal, outcome, and context-appropriateness of action mechanics are uniquely coded by different neural substrates within a parietal-frontal action observation network (AON; Hamilton and Grafton, 2006 , 2007 , 2008 ; Ortigue et al., 2010 ). Here, we applied RS-fMRI to identify brain areas in which activity decreased when the meaningful emotive theme of an expressive performance was repeated between trials. The results demonstrate a novel coding function of the AON—decoding meaning from body language.

Working with a group of professional dancers, we produced a set of video clips in which performers intentionally expressed a particular meaningful theme either through dance or pantomime. Typical themes consisted of expressions of hope, agony, lust, or exhaustion. The experimental manipulation of theme was studied independently of choreography, performer, or camera viewpoint, which allowed us to repeat the meaning of a movement sequence from one trial to another while varying physical movement characteristics and perceptual features. With this RS-fMRI design, a decrease in BOLD activity for repeated relative to novel themes (RS) could not be attributed to specific movements, characteristics of the performer, “low-level” visual features, or the general process of attending to body expressions. Rather, RS revealed brain areas in which specific voxel-wise neural population codes differentiated meaningful expressions based on body movement (Figure 1 ).

Figure 1. Manipulating trial sequence to induce RS in brain regions that decode body language . The order of video presentation was controlled such that themes depicted in consecutive videos were either novel or repeated. Each consecutive video clip was unique; repeated themes were always portrayed by different dancers, different camera angles, or both. Thus, RS for repeated themes was not the result of low-level visual features, but rather identified brain areas that were sensitive to the specific meaningful theme conveyed by a performance. In brain regions showing RS, a particular affective theme—hope, for example—will evoke a particular pattern of neural activity. A novel theme on the subsequent trial—illness, for instance—will trigger a different but equally strong pattern of neural activity in distinct cell assemblies, resulting in an equivalent BOLD response. In contrast, a repetition of the hopefulness theme on the subsequent trial will trigger activity in the same neural assemblies as the first trial, but to a lesser extent, resulting in a reduced BOLD response for repeated themes. In this way, regions showing RS reveal regions that support distinct patterns of neural activity in response to different themes.

Participants were scanned using fMRI while viewing a series of 10-s video clips depicting modern dance or pantomime performances that conveyed specific meaningful themes. Because each performer had a unique artistic style, the same theme could be portrayed using completely different physical movements. This allowed the repetition of meaning while all other aspects of the physical stimuli varied from trial to trial. We predicted that specific regions of the AON engaged by observing expressive whole body movement would show suppressed BOLD activation for repeated relative to novel themes (RS). Brain regions showing RS would reveal brain substrates directly involved in decoding meaning based on body movement.

The dance and pantomime performances used here conveyed expressive themes through movement, but did not rely on typified, canonical facial expressions to invoke particular affective responses. Rather, meaningful themes were expressed with unique artistic choreography while facial expressions were concealed with a classic white mime's mask. The result was a subtle stimulus set that promoted thoughtful, interpretive viewing that could not elicit reflex-like responses based on prototypical facial expressions. In so doing, the present study shifted the focus away from automatic affective resonance toward a more deliberate ascertainment of meaning from movement.

While dance and pantomime both expressed meaningful emotive themes, the quality of movement and the types of gestures used were different. Pantomime sequences used fairly mundane gestures and natural, everyday movements. Dance sequences used stylized gestures and interpretive, prosodic movements. The critical distinction between these two types of expressive movement is in the degree of abstraction in the metaphors that link movement with meaning (see Morris, 2002 for a detailed discussion of movement metaphors). Pantomime by definition uses gesture to mimic everyday objects, situations, and behavior, and thus relies on relatively concrete movement metaphors. In contrast, dance relies on more abstract movement metaphors that draw on indirect associations between qualities of movement and the emotions and thoughts it evokes in a viewer. We predicted that since dance expresses meaning more abstractly than pantomime, dance sequences would be more difficult to interpret than pantomimed sequences, and would likewise pose a greater challenge to brain processes involved in decoding meaning from movement. Thus, we predicted greater involvement of thematic decoding areas for danced than for pantomimed movement expressions. Greater RS for dance than pantomime could result from dance triggering greater activity upon a first presentation, a greater reduction in activity with a repeated presentation, or some combination of both. Given our prediction that greater RS for dance would be linked to interpretive difficulty, we hypothesized it would be manifested as an increased processing demand resulting in greater initial BOLD activity for novel danced themes.

Participants

Forty-six neurologically healthy, right-handed individuals (30 women, mean age = 24.22 years, range = 19–55 years) provided written informed consent and were paid for their participation. Performers also agreed in writing to allow the use of their images and videos for scientific purposes. The protocol was approved by the Office of Research Human Subjects Committee at the University of California Santa Barbara (UCSB).

Eight themes were depicted, including four danced themes (happy, hopeful, fearful, and in agony) and four pantomimed themes (in love, relaxed, ill, and exhausted). Performance sequences were choreographed and performed by four professional dancers recruited from the SonneBlauma Danscz Theatre Company (Santa Barbara, California; now called ArtBark International, http://www.artbark.org/ ). Performers wore expressionless white masks so body language was conveyed though gestural whole-body movement as opposed to facial expressions. To express each theme, performers adopted an affective stance and improvised a short sequence of modern dance choreography (two themes per performer) or pantomime gestures (two themes per performer). Each of the eight themes were performed by two different dancers and recorded from two different camera angles, resulting in four distinct videos representing each theme (32 distinct videos in total; clips available in Supplementary Materials online).
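
A quick enumeration makes the stimulus design above concrete. The sketch below (Python, purely illustrative; the performer and camera labels are hypothetical placeholders, since the paper states only that each theme was performed by two of the four dancers and filmed from two angles) reproduces the 8 themes × 2 performers × 2 camera angles = 32 clips arithmetic.

```python
# Illustrative enumeration of the stimulus set described above (labels are hypothetical).
from itertools import product

dance_themes = ["happy", "hopeful", "fearful", "in agony"]
mime_themes = ["in love", "relaxed", "ill", "exhausted"]

videos = []
for theme in dance_themes + mime_themes:
    movement = "dance" if theme in dance_themes else "pantomime"
    for performer, camera in product(("performer_A", "performer_B"), ("cam_1", "cam_2")):
        videos.append({"theme": theme, "movement": movement,
                       "performer": performer, "camera": camera})

assert len(videos) == 32  # 8 themes x 2 performers x 2 camera angles
```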

Behavioral Procedure

In a separate session outside the scanner either before or after fMRI data collection, an interpretation task measured observers' ability to discern the intended meaning of a performance (Figure 2 ). The interpretation task was carried out in a separate session to avoid confounding movement observation in the scanner with explicit decision-making and overt motor responses. Participants were asked to view each video clip and choose from a list of four options the theme that best corresponded with the movement sequence they had just watched. Responses were made by pressing one of four corresponding buttons on a keyboard. Two behavioral measures were collected to assess how well participants interpreted the intended meaning of expressive performances. Consistency scores reflected the proportion of observers' interpretations that matched the performer's intended expression. Response times indicated the time taken to make interpretive judgments. In order to encourage subjects to use their initial impressions and to avoid over-deliberating, the four response options were previewed briefly immediately prior to video presentation.

Figure 2. Experimental testing procedure . Participants completed a thematic interpretation task outside the scanner, either before or after the imaging session. Performance on this task allowed us to test whether there was a difference in how readily observers interpreted the intended meaning conveyed through dance or pantomime. Any performance differences on this explicit theme judgment task could help interpret the functional significance of observed differences in brain activity associated with passively viewing the two types of movement in the scanner.

For the interpretation task collected outside the scanner, videos were presented and responses collected on a Mac Powerbook G4 laptop programmed using the Psychtoolbox (v. 3.0.8) extension ( Brainard, 1997 ; Pelli and Brainard, 1997 ) for Mac OSX running under Matlab 7.5 R2007b (the MathWorks, Natick, MA). Each trial began with the visual presentation of a list of four theme options corresponding to four button press responses (“u,” “i,” “o,” or “p” keyboard buttons). This list remained on the screen for 3 s, the screen blanked for 750 ms, and then the movie played for 10 s. Following the presentation of the movie, the four response options were presented again, and remained on the screen until a response was made. Each unique video was presented twice, resulting in 64 trials total. Video order was randomized for each participant, and the response options for each trial included the intended theme and three randomly selected alternatives.
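
As a concrete illustration of the two behavioural measures defined above, the short sketch below (Python, assuming a simple per-trial data layout that the paper does not specify) computes a participant's consistency score and mean response time from their interpretation trials.

```python
# Minimal sketch of the behavioural scoring (assumed data layout, not the authors' code).
import numpy as np

def score_interpretation_task(trials):
    """trials: list of dicts with keys 'intended_theme', 'chosen_theme', and 'rt_ms'."""
    matches = np.array([t["chosen_theme"] == t["intended_theme"] for t in trials])
    rts = np.array([t["rt_ms"] for t in trials], dtype=float)
    consistency = matches.mean()   # proportion of choices matching the performer's intent
    mean_rt = rts.mean()           # mean response time in milliseconds
    return consistency, mean_rt

# Toy usage with two trials:
demo = [{"intended_theme": "hopeful", "chosen_theme": "hopeful", "rt_ms": 1450.0},
        {"intended_theme": "ill", "chosen_theme": "exhausted", "rt_ms": 2100.0}]
print(score_interpretation_task(demo))  # -> (0.5, 1775.0)
```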

Neuroimaging Procedure

fMRI data were collected with a Siemens 3.0 T Magnetom Tim Trio system using a 12-channel phased array head coil. Functional images were acquired with a T2* weighted single shot gradient echo, echo-planar sequence sensitive to Blood Oxygen Level Dependent (BOLD) contrast (TR = 2 s; TE = 30 ms; FA = 90°; FOV = 19.2 cm). Each volume consisted of 37 slices acquired parallel to the AC–PC plane (interleaved acquisition; 3 mm thick with 0.5 mm gap; 3 × 3 mm in-plane resolution; 64 × 64 matrix).

Each participant completed four functional scanning runs lasting approximately 7.5 min while viewing danced or acted expressive movement sequences. While there were a total of eight themes in the stimulus set for the study, each scanning run depicted only two of those eight themes. Over the course of all four scanning runs, all eight themes were depicted. Trial sequences were arranged such that theme of a movement sequence was either novel or repeated with respect to the previous trial. This allowed for the analysis of BOLD response RS for repeated vs. novel themes. Each run presented 24 video clips (3 presentations of 8 unique videos depicting 2 themes × 2 dancers × 2 camera angles). Novel and repeated themes were intermixed within each scanning run, with no more than three sequential repetitions of the same theme. Two scanning runs depicted dance and two runs depicted pantomime performances. The order of runs was randomized for each participant. The experiment was controlled using Presentation software (version 13.0, Neurobehavioral Systems Inc, CA). Participants were instructed to focus on the movement performance while viewing the videos. No specific information about the themes portrayed or types of movement used was provided, and no motor responses were required.
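
The run structure described above can be reconstructed in a few lines. The sketch below is a hedged reconstruction of the sequencing logic, not the authors' Presentation script: it shuffles the eight unique clips of a run (three presentations each) until no theme occupies more than three consecutive trials, then labels each trial as novel or repeated according to whether its theme matches the preceding trial's theme.

```python
# Hedged reconstruction of the trial-sequencing logic (not the authors' stimulus script).
import random

def max_run_length(themes):
    """Longest run of identical consecutive themes."""
    longest = run = 1
    for prev, cur in zip(themes, themes[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest

def build_run(clips, max_same_theme_in_a_row=3, seed=None):
    """clips: the 8 unique clips of a run (dicts with a 'theme' key), each shown 3 times."""
    rng = random.Random(seed)
    trials = list(clips) * 3
    while True:
        rng.shuffle(trials)
        if max_run_length([t["theme"] for t in trials]) <= max_same_theme_in_a_row:
            break
    labels = ["novel"]  # the first trial of a run has no predecessor, so its theme is novel
    for prev, cur in zip(trials, trials[1:]):
        labels.append("repeat" if cur["theme"] == prev["theme"] else "novel")
    return list(zip(labels, trials))
```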

For the behavioral data collected outside the scanner, mean consistency scores and mean response time (RT; ms) were computed for each participant. Consistency and RT were each submitted to an ANOVA with Movement Type (dance vs. pantomime) as a within-subjects factor using Stata/IC 10.0 for Macintosh.
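
Because the within-subjects factor above has only two levels, the repeated-measures ANOVA reduces to a paired contrast. The sketch below (Python standing in for the authors' Stata/IC 10.0 workflow) uses that equivalence, where F(1, n − 1) equals the squared paired t statistic.

```python
# Equivalent paired-contrast formulation of the two-level within-subjects ANOVA (a sketch,
# not the authors' Stata code).
import numpy as np
from scipy import stats

def movement_type_anova(scores_dance, scores_pantomime):
    """Each argument: one value per participant (e.g. consistency score or mean RT)."""
    t, p = stats.ttest_rel(scores_dance, scores_pantomime)
    n = len(scores_dance)
    return {"F": t**2, "df": (1, n - 1), "p": p}

# Toy usage with random numbers standing in for per-participant scores:
rng = np.random.default_rng(3)
print(movement_type_anova(rng.normal(0.55, 0.1, 43), rng.normal(0.75, 0.1, 43)))
```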

Statistical analysis of the neuroimaging data was organized to identify: (1) brain areas responsive to the observation of expressive movement sequences, defined by BOLD activity relative to an implicit baseline, (2) brain areas directly involved in decoding meaning from movement, defined by RS for repeated themes, (3) brain areas in which processes for decoding thematic meaning varied as a function of abstractness, defined by greater RS for danced than pantomimed themes, and (4) the specific pattern of BOLD activity differences for novel and repeated themes as a function of danced or pantomimed movements in regions showing greater RS for dance.

The fMRI data were analyzed using Statistical Parametric Mapping software (SPM5, Wellcome Department of Imaging Neuroscience, London; www.fil.ion.ucl.ac.uk/spm ) implemented in Matlab 7.5 R2007b (The MathWorks, Natick, MA). Individual scans were realigned, slice-time corrected and spatially normalized to the Montreal Neurological Institute (MNI) template in SPM5 with a resampled resolution of 3 × 3 × 3 mm. A smoothing kernel of 8 mm was applied to the functional images. A general linear model was created for each participant using SPM5. Parameter estimates of event-related BOLD activity were computed for novel and repeated themes depicted by danced and pantomimed movements, separately for each scanning run, for each participant.
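
To make the first-level modelling step concrete, here is a hedged sketch of an analogous single-run GLM in Python with nilearn; the study's actual pipeline used SPM5 under Matlab, and the filename and event timings below are hypothetical placeholders.

```python
# Analogous first-level GLM in nilearn (an assumption for illustration; the paper used SPM5).
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# Hypothetical event table for one run: each 10 s video is one event, labelled by whether
# its theme was novel or repeated relative to the preceding trial.
events = pd.DataFrame({
    "onset":      [12.0, 34.0, 56.0, 78.0],   # seconds; illustrative values only
    "duration":   [10.0, 10.0, 10.0, 10.0],
    "trial_type": ["novel", "repeat", "repeat", "novel"],
})

# Note: the paper used a custom HRF with a peak delayed by ~4 s (see the next paragraph);
# the standard 'spm' model is used here purely for illustration.
model = FirstLevelModel(t_r=2.0, hrf_model="spm", smoothing_fwhm=8.0)
model = model.fit("run1_preprocessed_bold.nii.gz", events=events)  # hypothetical filename
rs_zmap = model.compute_contrast("novel - repeat", output_type="z_score")
```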

Because the intended theme of each movement sequence was not expressed at a discrete time point but rather throughout the duration of the 10 s video clip, the most appropriate hemodynamic response function (HRF) with which to model the BOLD response at the individual level was determined empirically prior to parameter estimation. Of interest was whether the shape of the BOLD response to these relatively long video clips differed from the canonical HRF typically implemented in SPM. The shape of the BOLD response was estimated for each participant by modeling a finite impulse response function ( Ollinger et al., 2001 ). Each trial was represented by a sequence of 12 consecutive TRs, beginning at the onset of each video clip. Based on this deconvolution, a set of beta weights describing the shape of the response over a 24 s interval was obtained for both novel and repeated themes depicted by both danced and pantomimed movement sequences. To determine whether adjustments should be made to the canonical HRF implemented in SPM, the BOLD responses of a set of 45 brain regions within a known AON were evaluated (see Table 1 for a complete list). To find the most representative shape of the BOLD response within the AON, deconvolved beta weights for each condition were averaged across sessions and collapsed by singular value decomposition analysis ( Golub and Reinsch, 1970 ). This resulted in a characteristic signal shape that maximally described the actual BOLD response in AON regions for both novel and repeated themes, for both danced and pantomimed sequences. This examination of the BOLD response revealed that its time-to-peak was delayed 4 s compared to the canonical HRF response curve typically implemented in SPM. That is, the peak of the BOLD response was reached at 8–10 s following stimulus onset instead of the canonical 4–6 s. Given this result, parameter estimation for conditions of interest in our main analysis was based on a convolution of the design matrix for each participant with a custom HRF that accounted for the observed 4 s delay. Time-to-peak of the HRF was adjusted from 6 to 10 s while keeping the same overall width and height of the canonical function implemented in SPM. Using this custom HRF, the 10 s video duration was modeled as usual in SPM by convolving the HRF with a 10 s boxcar function.
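
The HRF adjustment described above can be illustrated with a generic double-gamma response function. The sketch below is not SPM's spm_hrf implementation; it simply shows a response whose time-to-peak is shifted from roughly 6 s to roughly 10 s and then convolved with a 10 s boxcar sampled at the 2 s TR, mirroring the custom HRF used for parameter estimation.

```python
# Generic double-gamma HRF with an adjustable peak (an illustration, not SPM's spm_hrf).
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=6.0, dispersion=1.0):
    """Double-gamma HRF whose positive lobe peaks at `peak` seconds."""
    pos = gamma.pdf(t, a=peak / dispersion + 1, scale=dispersion)
    neg = gamma.pdf(t, a=undershoot / dispersion + 1, scale=dispersion)
    h = pos - neg / ratio
    return h / h.max()

tr, video_duration = 2.0, 10.0
t = np.arange(0.0, 32.0 + tr, tr)
hrf_delayed = double_gamma_hrf(t, peak=10.0)      # time-to-peak moved from ~6 s to ~10 s
boxcar = np.zeros(60)
boxcar[5:5 + int(video_duration / tr)] = 1.0      # one 10 s video starting at scan 5
regressor = np.convolve(boxcar, hrf_delayed)[:boxcar.size]
```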

Table 1. The action observation network, as defined by previous investigations .

Second-level whole-brain analysis was conducted with SPM8 using a 2 × 2 random effects model with Movement Type and Repetition as within-subject factors using the weighted parameter estimates (contrast images) obtained at the individual level as data. A gray matter mask was applied to whole-brain contrast images prior to second-level analysis to remove white matter voxels from the analysis. Six second-level contrasts were computed, including (1) expressive movement observation (BOLD relative to baseline), (2) dance observation effect (danced sequences > pantomimed sequences), (3) pantomime observation effect (pantomimed sequences > danced sequences), (4) RS (novel themes > repeated themes), (5) dance × repetition interaction (RS for dance > RS for pantomime), and (6) pantomime x repetition interaction (RS for pantomime > RS for dance). Following the creation of T-map images in SPM8, FSL was used to create Z-map images (Version 4.1.1; Analysis Group, FMRIB, Oxford, UK; Smith et al., 2004 ; Jenkinson et al., 2012 ). The results were thresholded at p < 0.05, cluster-corrected using FSL subroutines based on Gaussian random field theory ( Poldrack et al., 2011 ; Nichols, 2012 ). To examine the nature of the differences in RS between dance and pantomime, a mask image was created based on the corresponding cluster-thresholded Z-map of regions showing greater RS for dance, and the mean BOLD activity (contrast image values) was computed for novel and repeated dance and pantomime contrasts from each participant's first-level analysis. Mean BOLD activity measures were submitted to a 2 × 2 ANOVA with Movement Type (dance vs. pantomime) and Repetition (novel vs. repeat) as within-subjects factors using Stata/IC 10.0 for Macintosh.
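
As a worked illustration of the ROI-level follow-up described above, the sketch below (Python rather than the authors' Stata code) uses the fact that, in a 2 × 2 fully within-subjects design, the Movement Type × Repetition interaction test is equivalent to a one-sample t-test on each participant's difference of differences.

```python
# ROI-level interaction check as a difference-of-differences test (a sketch, not the
# authors' Stata analysis).
import numpy as np
from scipy import stats

def interaction_test(novel_dance, rep_dance, novel_mime, rep_mime):
    """Each argument: per-participant mean BOLD (contrast values) within the ROI mask."""
    rs_dance = np.asarray(novel_dance) - np.asarray(rep_dance)   # RS for dance
    rs_mime = np.asarray(novel_mime) - np.asarray(rep_mime)      # RS for pantomime
    diff_of_diffs = rs_dance - rs_mime                           # interaction term
    t, p = stats.ttest_1samp(diff_of_diffs, 0.0)
    return {"F": t**2, "df": (1, len(diff_of_diffs) - 1), "p": p}

# Toy usage with random numbers standing in for per-participant ROI means:
rng = np.random.default_rng(0)
n = 43
print(interaction_test(rng.normal(1.0, 1, n), rng.normal(0.4, 1, n),
                       rng.normal(0.8, 1, n), rng.normal(0.6, 1, n)))
```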

In order to ensure that observed RS effects for repeated themes were not due to low-level kinematic effects, a motion tracking analysis of all 32 videos was performed using Tracker 4.87 software for Mac (written by Douglas Brown, distributed on the Open Source Physics platform, www.opensourcephysics.org ). A variety of motion parameters, including velocity, acceleration, momentum, and kinetic energy, were computed within the Tracker software based on semi-automated/supervised motion tracking of the top of the head, one hand, and one foot of each performer. The key question relevant to our results was whether there was a difference in motion between videos depicting novel and repeated themes. One-factor ANOVAs for each motion parameter revealed no significant differences in coarse kinematic profiles between “novel” and “repeated” theme trials (all p's > 0.05). This was not particularly surprising given that all videos were used for both novel and repeated themes, which were defined entirely based on trial sequence. In contrast, the comparison between danced and pantomimed themes did reveal significant differences in kinematic profiles. A 2 × 3 ANOVA with Movement Type (Dance, Pantomime) and Body Point (Hand, Head, Foot) as factors was conducted for each motion parameter (velocity, acceleration, momentum, and kinetic energy), and revealed greater motion energy on all parameters for the danced themes compared to the pantomimed themes (all p's < 0.05). Any differences in RS between danced and pantomimed themes may therefore be attributed to differences in kinematic properties of body movement. Importantly, however, because there were no systematic differences in motion kinematics between novel and repeated themes, any RS effects for repeated themes could not be attributed to the effect of motion kinematics.
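
The coarse kinematic parameters compared above can be derived from tracked coordinates with a few lines of code. The sketch below is a reconstruction under stated assumptions (a fixed frame rate and unit mass), not the output of the Tracker 4.87 workflow used in the study.

```python
# Illustrative derivation of coarse kinematic parameters from tracked (x, y) positions of a
# single body point (an assumption-laden sketch, not the Tracker 4.87 output).
import numpy as np

def kinematics(xy, fps=30.0, mass=1.0):
    """xy: array of shape (n_frames, 2) with tracked coordinates of one body point."""
    dt = 1.0 / fps
    vel = np.gradient(xy, dt, axis=0)                         # per-frame velocity vectors
    speed = np.linalg.norm(vel, axis=1)
    accel = np.linalg.norm(np.gradient(vel, dt, axis=0), axis=1)
    return {"velocity": speed.mean(),
            "acceleration": accel.mean(),
            "momentum": (mass * speed).mean(),
            "kinetic_energy": (0.5 * mass * speed**2).mean()}
```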

Figure 3 illustrates the behavioral results of the interpretation task completed outside the scanner. Participants had higher consistency scores for pantomimed movements than danced movements [ F (1, 42) = 42.06, p < 0.0001], indicating better transmission of the intended expressive meaning from performer to viewer. Pantomimed sequences were also interpreted more quickly than danced sequences [ F (1, 42) = 27.28, p < 0.0001], suggesting an overall performance advantage for pantomimed sequences.

Figure 3. Behavioral performance on the theme judgment task . Participants more readily interpreted pantomime than dance. This was evidenced by both greater consistency between the meaningful theme intended to be expressed by the performer and the interpretive judgments made by the observer (left), and faster response times (right). This pattern of results suggests that dance was more difficult to interpret than pantomime, perhaps owing to the use of more abstract metaphors to link movement with meaning. Pantomime, on the other hand, relied on more concrete, mundane sorts of movements that were more likely to carry meaningful associations based on observers' prior everyday experience. SEM, standard error of the mean.

Expressive Whole-body Movements Engage the Action Observation Network

Brain activity associated with the observation of expressive movement sequences was revealed by significant BOLD responses to observing both dance and pantomime movement sequences, relative to the inter-trial resting baseline. Figure 4 depicts significant activation ( p < 0.05, cluster corrected in FSL) rendered on an inflated cortical surface of the Human PALS-B12 Atlas ( Van Essen, 2005 ) using Caret (Version 5. 61; http://www.nitrc.org/projects/caret ; Van Essen et al., 2001 ). Table 2 presents the MNI coordinates for selected voxels within clusters active during movement observation, as labeled in Figure 4 . Region names were obtained from the Harvard-Oxford Cortical and Subcortical Structural Atlases ( Frazier et al., 2005 ; Desikan et al., 2006 ; Makris et al., 2006 ; Goldstein et al., 2007 ; Harvard Center for Morphometric Analysis; www.partners.org/researchcores/imaging/morphology_MGH.asp ), and Brodmann Area labels were obtained from the Juelich Histological Atlas ( Eickhoff et al., 2005 , 2006 , 2007 ), as implemented in FSL. Observation of body movement was associated with robust BOLD activation encompassing cortex typically associated with the AON, including fronto-parietal regions linked to the representation of action kinematics, goals, and outcomes ( Hamilton and Grafton, 2006 , 2007 ), as well as temporal, occipital, and insular cortex and subcortical regions including amygdala and hippocampus—regions typically associated with language comprehension ( Kirchhoff et al., 2000 ; Ni et al., 2000 ; Friederici et al., 2003 ) and socio-affective information processing and decision-making ( Anderson et al., 1999 ; Adolphs et al., 2003 ; Bechara et al., 2003 ; Bechara and Damasio, 2005 ).

Figure 4. Expressive performances engage the action observation network . Viewing expressive whole-body movement sequences engaged a distributed cortical action observation network ( p < 0.05, FWE corrected). Large areas of parietal, temporal, frontal, and insular cortex included somatosensory, motor, and premotor regions that have been considered previously to comprise a human “mirror neuron” system, as well as non-motor areas linked to comprehension, social perception, and affective decision-making. Number labels correspond to those listed in Table 2 , which provides anatomical names and voxel coordinates for areas of peak activation. Dotted line for regions 17/18 indicates medial temporal position not visible on the cortical surface.

Table 2. Brain regions showing a significant BOLD response while participants viewed expressive whole-body movement sequences .

The Action Observation Network “Reads” Body Language

To isolate brain areas that decipher meaning conveyed by expressive body movement, regions showing RS (reduced BOLD activity for repeated compared to novel themes) were identified. Since theme was the only stimulus dimension repeated systematically across trials for this comparison, decreased activation for repeated themes could not be attributed to physical features of the stimulus such as particular movements, performers, or camera viewpoints. Figure 5 illustrates brain areas showing significant suppression for repeated themes ( p < 0.05, cluster corrected in FSL). Table 3 presents the MNI coordinates for selected voxels within significant clusters. RS was found bilaterally on the rostral bank of the middle temporal gyrus extending into temporal pole and orbitofrontal cortex. There was also significant suppression in bilateral amygdala and insular cortex.

Figure 5. BOLD suppression (RS) reveals brain substrates for “reading” body language. Regions involved in decoding meaning in body language were isolated by testing for BOLD suppression when the intended theme of an expressive performance was repeated across trials. To identify regions showing RS, BOLD activity associated with novel themes was contrasted with BOLD activity associated with repeated themes ( p < 0.05, cluster corrected in FSL). Significantly greater activity for novel relative to repeated themes was evidence of RS. Given that the intended theme of a performance was the only element that was repeated between trials, regions showing RS revealed brain substrates that were sensitive to the specific meaning infused into a movement sequence by a performer. Number labels correspond to those listed in Table 3 , which provides anatomical names and voxel coordinates for key clusters showing significant RS. Blue shaded area indicates vertical extent of axial slices shown.

Table 3. Brain regions showing significant BOLD suppression for repeated themes ( p < 0.05, cluster corrected in FSL) .

Movement Abstractness Challenges Brain Substrates that Decode Meaning

The behavioral analysis indicated that interpreting danced themes was more difficult than interpreting pantomimed themes, as evidenced by lower consistency scores and greater RTs. Previous research indicates that greater difficulty discriminating a particular stimulus dimension is associated with greater BOLD suppression upon repetition of that dimension's attributes ( Hasson et al., 2006 ). To test whether greater difficulty decoding meaning from dance than pantomime would also be associated with greater RS in the present data, the magnitude of BOLD response suppression was compared between movement types. This was done with the Dance × Repetition interaction contrast in the second-level whole brain analysis, which revealed regions that had greater RS for dance than for pantomime. Figure 6 illustrates brain regions showing greater RS for themes portrayed through dance than pantomime ( p < 0.05, cluster corrected in FSL). Significant differences were entirely left-lateralized in superior and middle temporal gyri, extending into temporal pole and orbitofrontal cortex, and also present in laterobasal amygdala and the cornu ammonis of the hippocampus. Table 4 presents the MNI coordinates for selected voxels within significant clusters. The reverse Pantomime × Repetition interaction was also tested, but did not reveal any regions showing greater RS for pantomime than dance ( p > 0.05, cluster corrected in FSL).

Figure 6. Regions showing greater RS for dance than pantomime . RS effects were compared between movement types. This was implemented as an interaction contrast within our Movement Type × Repetition ANOVA design [(Novel Dance > Repeated Dance) > (Novel Pantomime > Repeated Pantomime)]. Greater RS for dance was lateralized to left hemisphere meaning-sensitive regions. The brain areas shown here have been linked previously to the comprehension of meaning in verbal language, suggesting the possibility they represent shared brain substrates for building meaning from both language and action. Number labels correspond to those listed in Table 4 , which provides anatomical names and voxel coordinates for key clusters showing significantly greater RS for dance. Blue shaded area indicates vertical extent of axial slices shown.

Table 4. Brain regions showing significantly greater RS for themes expressed through dance relative to themes expressed through pantomime ( p < 0.05, cluster corrected in FSL) .

In regions showing greater RS for dance than pantomime, mean BOLD responses for novel and repeated dance and pantomime conditions were computed across voxels for each participant based on their first-level contrast images. This was done to test whether the greater RS for dance was due to greater activity in the novel condition, lower activity in the repeated condition, or some combination of both. Figure 7 illustrates the pattern of BOLD activity across conditions, demonstrating that the greater RS for dance was the result of greater initial BOLD activation in response to novel themes. The ANOVA results showed a significant Movement Type × Repetition interaction [ F (1, 42) = 7.83, p < 0.01], indicating that BOLD activity in response to novel danced themes was greater than BOLD activity for all other conditions in these regions.

Figure 7. Novel danced themes challenge brain substrates that decode meaning from movement . To determine the specific pattern of BOLD activity that resulted in greater RS for dance, average BOLD activity in these areas was computed for each condition separately. Greater RS for dance was driven by a larger BOLD response to novel danced themes. Considered together with behavioral findings indicating that dance was more difficult to interpret, greater RS for dance seems to result from a greater processing “challenge” to brain substrates involved in decoding meaning from movement. SEM, standard error of the mean.

This study was designed to reveal brain regions involved in reading body language—the meaningful information we pick up about the affective states and intentions of others based on their body movement. Brain regions that decoded meaning from body movement were identified with a whole brain analysis of RS that compared BOLD activity for novel and repeated themes expressed through modern dance or pantomime. Significant RS for repeated themes was observed bilaterally, extending anteriorly along middle and superior temporal gyri into temporal pole, medially into insula, rostrally into inferior orbitofrontal cortex, and caudally into hippocampus and amygdala. Together, these brain substrates comprise a functional system within the larger AON. This suggests strongly that decoding meaning from expressive body movement constitutes a dimension of action representation not previously isolated in studies of action understanding. In the following we argue that this embedding is consistent with the hierarchical organization of the AON.

Body Language as Superordinate in a Hierarchical Action Observation Network

Previous investigations of action understanding have identified the AON as a key cognitive system for the organization of action in general, highlighting the fact that both performing and observing action rely on many of the same brain substrates ( Grafton, 2009 ; Ortigue et al., 2010 ; Kilner, 2011 ; Ogawa and Inui, 2011 ; Uithol et al., 2011 ; Grafton and Tipper, 2012 ). Shared brain substrates for controlling one's own action and understanding the actions of others are often taken as evidence of a “mirror neuron system” (MNS), following from physiological studies showing that cells in area F5 of the macaque monkey premotor cortex fired in response to both performing and observing goal-directed actions ( Pellegrino et al., 1992 ; Gallese et al., 1996 ; Rizzolatti et al., 1996a ). Since these initial observations were made regarding monkeys, there has been a tremendous effort to characterize a human analog of the MNS, and incorporate it into theories of not only action understanding, but also social cognition, language development, empathy, and neuropsychiatric disorders in which these faculties are compromised ( Gallese and Goldman, 1998 ; Rizzolatti and Arbib, 1998 ; Rizzolatti et al., 2001 ; Gallese, 2003 ; Gallese et al., 2004 ; Rizzolatti and Craighero, 2004 ; Iacoboni et al., 2005 ; Tettamanti et al., 2005 ; Dapretto et al., 2006 ; Iacoboni and Dapretto, 2006 ; Shapiro, 2008 ; Decety and Ickes, 2011 ). A fundamental assumption common to all such theories is that mirror neurons provide a direct neural mechanism for action understanding through “motor resonance,” or the simulation of one's own motor programs for an observed action ( Jacob, 2008 ; Oosterhof et al., 2013 ). One proposed mechanism for action understanding through motor resonance is the embodiment of sensorimotor associations between action goals and specific motor behaviors ( Mitz et al., 1991 ; Niedenthal et al., 2005 ; McCall et al., 2012 ). While the involvement of the motor system in a range of social, cognitive and affective domains is certainly worthy of focused investigation, and mirror neurons may well play an important role in supporting such “embodied cognition,” this by no means implies that mirror neurons alone can account for the ability to garner meaning from observed body movement.

Since the AON is a distributed cortical network that extends beyond motor-related brain substrates engaged during action observation, it is best characterized not as a homogeneous “mirroring” mechanism, but rather as a collection of functionally specific but interconnected modules that represent distinct properties of observed actions ( Grafton, 2009 ; Grafton and Tipper, 2012 ). The present results build on this functional-hierarchical model of the AON by incorporating meaningful expression as an inherent aspect of body movement that is decoded in distinct regions of the AON. In other words, the bilateral temporal-orbitofrontal regions that showed RS for repeated themes comprise a distinct functional module of the AON that supports an additional level of the action representation hierarchy. Such an interpretation is consistent with the idea that action representation is inherently nested, carried out within a hierarchy of part-whole processes for which higher levels depend on lower levels ( Cooper and Shallice, 2006 ; Botvinick, 2008 ; Grafton and Tipper, 2012 ). We propose that the meaning infused into the body movement of a person having a particular affective stance is decoded superordinately to more concrete properties of action, such as kinematics and object goals. Under this view, while decoding these representationally subordinate properties of action may involve motor-related brain substrates, decoding “body language” engages non-motor regions of the AON that link movement and meaning, relying on inputs from lower levels of the action representation hierarchy that provide information about movement kinematics, prosodic nuances, and dynamic inflections.

While the present results suggest that the temporal-orbitofrontal regions identified here as decoding meaning from emotive body movement constitute a distinct functional module within a hierarchically organized AON, it is important to note that these regions have not previously been included in anatomical descriptions of the AON. The present study, however, isolated a property of action representation that had not been previously investigated; so identifying regions of the AON not previously included in its functional-anatomic definition is perhaps not surprising. This underscores the important point that the AON is functionally defined, such that its apparent anatomical extent in a given experimental context depends upon the particular aspects of action representation that are engaged and isolable. Previous studies of another abstract property of action representation, namely intention understanding, also illustrate this point. Inferring the intentions of an actor engages medial prefrontal cortex, bilateral posterior superior temporal sulcus, and left temporo-parietal junction—non-motor regions of the brain typically associated with “mentalizing,” or thinking about the mental states of another agent ( Ansuini et al., 2015 ; Ciaramidaro et al., 2014 ). A key finding of this research is that intention understanding depends fundamentally on the integration of motor-related (“mirroring”) brain regions and non-motor (“mentalizing”) brain regions ( Becchio et al., 2012 ). The present results parallel this finding, and point to the idea that in the context of action representation, motor and non-motor brain areas are not two separate brain networks, but rather one integrated functional system.

Predictive Coding and the Construction of Meaning in the Action Observation Network

A critical question raised by the idea that the temporal-orbitofrontal brain regions in which RS was observed here constitute a superordinate, meaning-sensitive functional module of the AON is how activity in subordinate AON modules is integrated at this higher level to produce differential neural firing patterns in response to different meaningful body expressions. That is, what are the neural mechanisms underlying the observed sensitivity to meaning in body language, and furthermore, why are these mechanisms subject to adaptation through repetition (RS)? While the present results do not provide direct evidence to answer these questions, we propose that a “predictive coding” interpretation provides a coherent model of action representation ( Brass et al., 2007 ; Kilner and Frith, 2008 ; Brown and Brüne, 2012 ) that yields useful predictions about the neural processes by which meaning is decoded, predictions that would account for the observed RS effect. The primary mechanism invoked by a predictive coding framework of action understanding is recurrent feed-forward and feedback processing across the various levels of the AON, which supports a Bayesian system of predictive neural coding, feedback processes, and prediction error reduction at each level of action representation ( Friston et al., 2011 ). According to this model, each level of the action observation hierarchy generates predictions to anticipate neural activity at lower levels of the hierarchy. Predictions in the form of neural codes are sent to lower levels through feedback connections, and compared with actual subordinate neural representations. Any discrepancy between neural predictions and actual representations is coded as prediction error. Information regarding prediction error is sent through recurrent feed-forward projections to superordinate regions, and used to update predictive priors such that subsequent prediction error is minimized. Together, these Bayes-optimal neural ensemble operations converge on the most probable inference for representation at the superordinate level ( Friston et al., 2011 ) and, ultimately, action understanding based on the integration of representations at each level of the action observation hierarchy ( Chambon et al., 2011 ; Kilner, 2011 ).

A predictive coding account of the present results would suggest that initial feed-forward inputs from subordinate levels of the AON provided the superordinate temporal-orbitofrontal module with information regarding movement kinematics, prosody, gestural elements, and dynamic inflections, which, when integrated with other inputs based on prior experience, would provide a basis for an initial prediction about potential meanings of a body expression. This prediction would yield a generative neural model about the movement dynamics that would be expected given the predicted meaning of the observed body expression, which would be fed back to lower levels of the network that coded movement dynamics and sensorimotor associations. Predictive activity would be contrasted with actual representations as movement information was accrued throughout the performance, and the resulting prediction error would be utilized via feed-forward projections to temporal-orbitofrontal regions to update predictive codes regarding meaning and minimize subsequent prediction error. In this way, the meaningful affective theme being expressed by the performer would be converged upon through recurrent Bayes-optimal neural ensemble operations. Thus, meaning expressed through body language would be accrued iteratively in temporal-orbitofrontal regions by integrating neural representations of various facets of action decoded throughout the AON. Interestingly, and consistent with a model in which an iterative process accrued information over time, we observed that BOLD responses in AON regions peaked more slowly than expected based on SPM's canonical HRF as the videos were viewed over an extended (10 s) duration. Under an iterative predictive coding model, RS for repeated themes could be accounted for by reduced initial generative activity in temporal-orbitofrontal regions due to better constrained predictions about potential meanings conveyed by observed movement, more efficient convergence on an inference due to faster minimization of prediction error, or some combination of both of these mechanisms. The present results provide indirect evidence for the former account, in that more abstract, less constrained movement metaphors relied upon by expressive dance resulted in greater RS due to larger BOLD responses for novel themes relative to the more concrete, better-constrained associations conveyed by pantomime.
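Under the same toy scheme, repetition suppression falls out naturally if repetition tightens the prior at the superordinate level. The sketch below is purely illustrative, with invented numbers: it compares the total prediction error accumulated during inference for a "novel" theme, where the prior is uninformative, with a "repeated" theme, where the prior already sits close to the correct meaning, using the summed error as a crude stand-in for the BOLD response.

```python
import numpy as np

# Toy illustration of repetition suppression under predictive coding:
# a better-constrained prior (repeated theme) yields smaller initial prediction
# error and faster convergence, so the accumulated error ("BOLD proxy") is lower.

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 3))                      # hypothetical generative mapping
true_meaning = np.array([1.0, -0.5, 0.25])
observed = W @ true_meaning

def accumulated_error(prior, lr=0.05, steps=300):
    mu = prior.copy()
    total = 0.0
    for _ in range(steps):
        error = observed - W @ mu
        total += float(np.sum(error ** 2))       # accumulate squared error over iterations
        mu += lr * W.T @ error
    return total

novel_prior = np.zeros(3)                        # uninformative prior (novel theme)
repeated_prior = true_meaning + 0.1              # prior sharpened by a previous presentation

print("novel theme   :", round(accumulated_error(novel_prior), 2))
print("repeated theme:", round(accumulated_error(repeated_prior), 2))   # markedly smaller
```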

Shared Brain Substrates for Meaning in Action and Language

The middle temporal gyrus and superior temporal sulcus regions identified here as part of a functional module of the AON that "reads" body language have been linked previously to a variety of high-level linguistic domains related to understanding meaning. Among these are conceptual knowledge (Lambon Ralph et al., 2009), language comprehension (Hasson et al., 2006; Noppeney and Penny, 2006; Price, 2010), sensitivity to the congruency between intentions and actions, both verbal/conceptual (Deen and McCarthy, 2010) and perceptual/implicit (Wyk et al., 2009), as well as understanding abstract language and metaphorical descriptions of action (Desai et al., 2011). While together these studies demonstrate that high-level linguistic processing involves bilateral superior and middle temporal regions, there is evidence for a general predominance of the left hemisphere in comprehending semantics (Price, 2010), and a predominance of the right hemisphere in incorporating socio-emotional information and affective context (Wyk et al., 2009). For example, brain atrophy associated with a primary progressive aphasia characterized by profound disturbances in semantic comprehension occurs bilaterally in anterior middle temporal regions, but is more pronounced in the left hemisphere (Gorno-Tempini et al., 2004). In contrast, neural degeneration in right hemisphere orbitofrontal, insula, and anterior middle temporal regions is associated not only with semantic dementia but also with deficits in socio-emotional sensitivity and regulation (Rosen et al., 2005).

This hemispheric asymmetry in brain substrates associated with interpreting meaning in verbal language is paralleled in the present results, which not only link the same bilateral temporal-orbitofrontal brain substrates to comprehending meaning from affectively expressive body language, but also demonstrate a predominance of the left hemisphere in deciphering the particularly abstract movement metaphors conveyed by dance. This asymmetry was evident as greater RS for repeated themes in dance relative to pantomime, driven by greater initial activation for novel themes, suggesting that these left-hemisphere regions were engaged more vigorously when decoding more abstract movement metaphors. Together, these results illustrate a striking overlap in the brain substrates involved in processing meaning in verbal language and decoding meaning from expressive body movement. This overlap suggests that a long-hypothesized evolutionary link between gestural body movement and language (Hewes et al., 1973; Harnad et al., 1976; Rizzolatti and Arbib, 1998; Corballis, 2003) may be instantiated by a network of shared brain substrates for representing semiotic structure, which constitutes the informational scaffolding for building meaning in both language and gesture (Lemke, 1987; Freeman, 1997; McNeill, 2012; Lhommet and Marsella, 2013). While speculative, under this view the temporal-orbitofrontal AON module for coding meaning observed here may provide a neural basis for semiosis (the construction of meaning), which would lend support to the intriguing philosophical argument that meaning is fundamentally grounded in processes of the body, the brain, and the social environment within which they are immersed (Thibault, 2004).

Summary and Conclusions

The present results identify a system of temporal, orbitofrontal, insula, and amygdala brain regions that supports the meaningful interpretation of expressive body language. We propose that these areas reveal a previously undefined superordinate functional module within a known, stratified hierarchical brain network for action representation. The findings are consistent with a predictive coding model of action understanding, wherein the meaning that is imbued into expressive body movements through subtle kinematics and prosodic nuances is decoded as a distinct property of action via feed-forward and feedback processing across the levels of a hierarchical AON. Under this view, recurrent processing loops integrate lower-level representations of movement dynamics and socio-affective perceptual information to generate, evaluate, and update predictive inferences about expressive content that are mediated in a superordinate temporal-orbitofrontal module of the AON. Thus, while lower-level action representation in motor-related brain areas (sometimes referred to as a human “mirror neuron system”) may be a key step in the construction of meaning from movement, it is not these motor areas that code the specific meaning of an expressive body movement. Rather, we have demonstrated an additional level of the cortical action representation hierarchy in non-motor regions of the AON. The results highlight an important link between action representation and language, and point to the possibility of shared brain substrates for constructing meaning in both domains.

Author Contributions

CT, GS, and SG designed the experiment. CT and GS created stimuli, which included recruiting professional dancers and filming expressive movement sequences. GS carried out video editing. CT completed computer programming for experimental control and data analysis. GS and CT recruited participants and conducted behavioral and fMRI testing. CT and SG designed the data analysis and CT and GS carried it out. GS conducted a literature review, and CT wrote the paper with reviews and edits from SG.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Research supported by the James S. McDonnell Foundation.

Supplementary Material

The Supplementary Material for this article can be found online at: http://dx.doi.org/10.6084/m9.figshare.1508616

Adolphs, R., Damasio, H., Tranel, D., Cooper, G., and Damasio, A. R. (2000). A role for somatosensory cortices in the visual recognition of emotion as revealed by three-dimensional lesion mapping. J. Neurosci. 20, 2683–2690.


Adolphs, R., Tranel, D., and Damasio, A. R. (2003). Dissociable neural systems for recognizing emotions. Brain Cogn. 52, 61–69. doi: 10.1016/S0278-2626(03)00009-5


Allison, T., Puce, A., and McCarthy, G. (2000). Social perception from visual cues: role of the STS region. Trends Cogn. Sci. 4, 267–278. doi: 10.1016/S1364-6613(00)01501-1

Anderson, S. W., Bechara, A., Damasio, H., Tranel, D., and Damasio, A. R. (1999). Impairment of social and moral behavior related to early damage in human prefrontal cortex. Nat. Neurosci. 2, 1032–1037. doi: 10.1038/14833

Ansuini, C., Cavallo, A., Bertone, C., and Becchio, C. (2015). Intentions in the brain: the unveiling of Mister Hyde. Neuroscientist 21, 126–135. doi: 10.1177/1073858414533827

Astafiev, S. V., Stanley, C. M., Shulman, G. L., and Corbetta, M. (2004). Extrastriate body area in human occipital cortex responds to the performance of motor actions. Nat. Neurosci. 7, 542–548. doi: 10.1038/nn1241

Becchio, C., Cavallo, A., Begliomini, C., Sartori, L., Feltrin, G., and Castiello, U. (2012). Social grasping: from mirroring to mentalizing. Neuroimage 61, 240–248. doi: 10.1016/j.neuroimage.2012.03.013

Bechara, A., and Damasio, A. R. (2005). The somatic marker hypothesis: a neural theory of economic decision. Games Econ. Behav. 52, 336–372. doi: 10.1016/j.geb.2004.06.010


Bechara, A., Damasio, H., and Damasio, A. R. (2003). Role of the amygdala in decision making. Ann. N.Y. Acad. Sci. 985, 356–369. doi: 10.1111/j.1749-6632.2003.tb07094.x

Bertenthal, B. I., Longo, M. R., and Kosobud, A. (2006). Imitative response tendencies following observation of intransitive actions. J. Exp. Psychol. 32, 210–225. doi: 10.1037/0096-1523.32.2.210

Bonda, E., Petrides, M., Ostry, D., and Evans, A. (1996). Specific involvement of human parietal systems and the amygdala in the perception of biological motion. J. Neurosci. 16, 3737–3744.

Botvinick, M. M. (2008). Hierarchical models of behavior and prefrontal function. Trends Cogn. Sci. 12, 201–208. doi: 10.1016/j.tics.2008.02.009

Brainard, D. H. (1997). The psychophysics toolbox. Spat. Vis. 10, 433–436. doi: 10.1163/156856897X00357

Brass, M., Schmitt, R. M., Spengler, S., and Gergely, G. (2007). Investigating action understanding: inferential processes versus action simulation. Curr. Biol. 17, 2117–2121. doi: 10.1016/j.cub.2007.11.057

Brown, E. C., and Brüne, M. (2012). The role of prediction in social neuroscience. Front. Hum. Neurosci . 6:147. doi: 10.3389/fnhum.2012.00147

Buccino, G., Binkofski, F., Fink, G. R., Fadiga, L., Fogassi, L., Gallese, V., et al. (2001). Action observation activates premotor and parietal areas in a somatotopic manner: an fMRI study. Eur. J. Neurosci. 13, 400–404. doi: 10.1046/j.1460-9568.2001.01385.x

Calvo-Merino, B., Glaser, D. E., Grèzes, J., Passingham, R. E., and Haggard, P. (2005). Action observation and acquired motor skills: an FMRI study with expert dancers. Cereb. Cortex 15, 1243. doi: 10.1093/cercor/bhi007

Cardinal, R. N., Parkinson, J. A., Hall, J., and Everitt, B. J. (2002). Emotion and motivation: the role of the amygdala, ventral striatum, and prefrontal cortex. Neurosci. Biobehav. Rev. 26, 321–352. doi: 10.1016/S0149-7634(02)00007-6

Chambon, V., Domenech, P., Pacherie, E., Koechlin, E., Baraduc, P., and Farrer, C. (2011). What are they up to? The role of sensory evidence and prior knowledge in action understanding. PLoS ONE 6:e17133. doi: 10.1371/journal.pone.0017133

Ciaramidaro, A., Becchio, C., Colle, L., Bara, B. G., and Walter, H. (2014). Do you mean me? Communicative intentions recruit the mirror and the mentalizing system. Soc. Cogn. Affect. Neurosci . 9, 909–916. doi: 10.1093/scan/nst062

Cooper, R. P., and Shallice, T. (2006). Hierarchical schemas and goals in the control of sequential behavior. Psychol. Rev. 113, 887–916. discussion 917–931. doi: 10.1037/0033-295x.113.4.887

Corballis, M. C. (2003). “From hand to mouth: the gestural origins of language,” in Language Evolution: The States of the Art , eds M. H. Christiansen and S. Kirby (Oxford University Press). Available online at: http://groups.lis.illinois.edu/amag/langev/paper/corballis03fromHandToMouth.html


Cross, E. S., Hamilton, A. F. C., and Grafton, S. T. (2006). Building a motor simulation de novo : observation of dance by dancers. Neuroimage 31, 1257–1267. doi: 10.1016/j.neuroimage.2006.01.033

Cross, E. S., Kraemer, D. J. M., Hamilton, A. F. D. C., Kelley, W. M., and Grafton, S. T. (2009). Sensitivity of the action observation network to physical and observational learning. Cereb. Cortex 19, 315. doi: 10.1093/cercor/bhn083

Damasio, A. R., Grabowski, T. J., Bechara, A., Damasio, H., Ponto, L. L., Parvizi, J., et al. (2000). Subcortical and cortical brain activity during the feeling of self-generated emotions. Nat. Neurosci. 3, 1049–1056. doi: 10.1038/79871

Dapretto, M., Davies, M. S., Pfeifer, J. H., Scott, A. A., Sigman, M., Bookheimer, S. Y., et al. (2006). Understanding emotions in others: mirror neuron dysfunction in children with autism spectrum disorders. Nat. Neurosci. 9, 28–30. doi: 10.1038/nn1611

Dean, P., Redgrave, P., and Westby, G. W. M. (1989). Event or emergency? Two response systems in the mammalian superior colliculus. Trends Neurosci . 12, 137–147. doi: 10.1016/0166-2236(89)90052-0

Decety, J., and Ickes, W. (2011). The Social Neuroscience of Empathy . Cambridge, MA: MIT Press.


Deen, B., and McCarthy, G. (2010). Reading about the actions of others: biological motion imagery and action congruency influence brain activity. Neuropsychologia 48, 1607–1615. doi: 10.1016/j.neuropsychologia.2010.01.028

de Gelder, B. (2006). Towards the neurobiology of emotional body language. Nat. Rev. Neurosci. 7, 242–249. doi: 10.1038/nrn1872

de Gelder, B., and Hadjikhani, N. (2006). Non-conscious recognition of emotional body language. Neuroreport 17, 583. doi: 10.1097/00001756-200604240-00006

Desai, R. H., Binder, J. R., Conant, L. L., Mano, Q. R., and Seidenberg, M. S. (2011). The neural career of sensory-motor metaphors. J. Cogn. Neurosci. 23, 2376–2386. doi: 10.1162/jocn.2010.21596

Desikan, R. S., Ségonne, F., Fischl, B., Quinn, B. T., Dickerson, B. C., Blacker, D., et al. (2006). An automated labeling system for subdividing the human cerebral cortex on MRI scans into gyral based regions of interest. Neuroimage 31, 968–980. doi: 10.1016/j.neuroimage.2006.01.021

Eickhoff, S. B., Heim, S., Zilles, K., and Amunts, K. (2006). Testing anatomically specified hypotheses in functional imaging using cytoarchitectonic maps. Neuroimage 32, 570–582. doi: 10.1016/j.neuroimage.2006.04.204

Eickhoff, S. B., Paus, T., Caspers, S., Grosbras, M. H., Evans, A. C., Zilles, K., et al. (2007). Assignment of functional activations to probabilistic cytoarchitectonic areas revisited. Neuroimage 36, 511–521. doi: 10.1016/j.neuroimage.2007.03.060

Eickhoff, S. B., Stephan, K. E., Mohlberg, H., Grefkes, C., Fink, G. R., Amunts, K., et al. (2005). A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage 25, 1325–1335. doi: 10.1016/j.neuroimage.2004.12.034

Fadiga, L., Fogassi, L., Gallese, V., and Rizzolatti, G. (2000). Visuomotor neurons: ambiguity of the discharge or motor perception? Int. J. Psychophysiol. 35, 165–177. doi: 10.1016/S0167-8760(99)00051-3

Ferrari, P. F., Gallese, V., Rizzolatti, G., and Fogassi, L. (2003). Mirror neurons responding to the observation of ingestive and communicative mouth actions in the monkey ventral premotor cortex. Eur. J. Neurosci. 17, 1703–1714. doi: 10.1046/j.1460-9568.2003.02601.x

Frazier, J. A., Chiu, S., Breeze, J. L., Makris, N., Lange, N., Kennedy, D. N., et al. (2005). Structural brain magnetic resonance imaging of limbic and thalamic volumes in pediatric bipolar disorder. Am. J. Psychiatry 162, 1256–1265. doi: 10.1176/appi.ajp.162.7.1256

Freeman, W. J. (1997). A neurobiological interpretation of semiotics: meaning vs. representation. IEEE Int. Conf. Syst. Man Cybern. Comput. Cybern. Simul. 2, 93–102. doi: 10.1109/ICSMC.1997.638197

Frey, S. H., and Gerry, V. E. (2006). Modulation of neural activity during observational learning of actions and their sequential orders. J. Neurosci. 26, 13194–13201. doi: 10.1523/JNEUROSCI.3914-06.2006

Friederici, A. D., Rüschemeyer, S.-A., Hahne, A., and Fiebach, C. J. (2003). The role of left inferior frontal and superior temporal cortex in sentence comprehension: localizing syntactic and semantic processes. Cereb. Cortex 13, 170–177. doi: 10.1093/cercor/13.2.170

Friston, K., Mattout, J., and Kilner, J. (2011). Action understanding and active inference. Biol. Cybern. 104, 137–160. doi: 10.1007/s00422-011-0424-z

Gallese, V. (2003). The roots of empathy: the shared manifold hypothesis and the neural basis of intersubjectivity. Psychopathology 36, 171–180. doi: 10.1159/000072786

Gallese, V., Fadiga, L., Fogassi, L., and Rizzolatti, G. (1996). Action recognition in the premotor cortex. Brain 119, 593. doi: 10.1093/brain/119.2.593

Gallese, V., and Goldman, A. (1998). Mirror neurons and the simulation theory of mind-reading. Trends Cogn. Sci. 2, 493–501. doi: 10.1016/S1364-6613(98)01262-5

Gallese, V., Keysers, C., and Rizzolatti, G. (2004). A unifying view of the basis of social cognition. Trends Cogn. Sci. 8, 396–403. doi: 10.1016/j.tics.2004.07.002

Goldstein, J. M., Seidman, L. J., Makris, N., Ahern, T., O'Brien, L. M., Caviness, V. S., et al. (2007). Hypothalamic abnormalities in Schizophrenia: sex effects and genetic vulnerability. Biol. Psychiatry 61, 935–945. doi: 10.1016/j.biopsych.2006.06.027

Golub, G. H., and Reinsch, C. (1970). Singular value decomposition and least squares solutions. Numer. Math. 14, 403–420. doi: 10.1007/BF02163027

Gorno-Tempini, M. L., Dronkers, N. F., Rankin, K. P., Ogar, J. M., Phengrasamy, L., Rosen, H. J., et al. (2004). Cognition and anatomy in three variants of primary progressive aphasia. Ann. Neurol. 55, 335–346. doi: 10.1002/ana.10825

Grafton, S. T. (2009). Embodied cognition and the simulation of action to understand others. Ann. N.Y. Acad. Sci. 1156, 97–117. doi: 10.1111/j.1749-6632.2009.04425.x

Grafton, S. T., Arbib, M. A., Fadiga, L., and Rizzolatti, G. (1996). Localization of grasp representations in humans by positron emission tomography. Exp. Brain Res. 112, 103–111. doi: 10.1007/BF00227183

Grafton, S. T., and Tipper, C. M. (2012). Decoding intention: a neuroergonomic perspective. Neuroimage 59, 14–24. doi: 10.1016/j.neuroimage.2011.05.064

Grèzes, J., and Decety, J. (2001). Functional anatomy of execution, mental simulation, observation, and verb generation of actions: a meta-analysis. Hum. Brain Mapp. 12, 1–19. doi: 10.1002/1097-0193(200101)12:1<1::AID-HBM10>3.0.CO;2-V

Grezes, J., Fonlupt, P., Bertenthal, B., Delon-Martin, C., Segebarth, C., Decety, J., et al. (2001). Does perception of biological motion rely on specific brain regions? Neuroimage 13, 775–785. doi: 10.1006/nimg.2000.0740

Grill-Spector, K., Henson, R., and Martin, A. (2006). Repetition and the brain: neural models of stimulus-specific effects. Trends Cogn. Sci. 10, 14–23. doi: 10.1016/j.tics.2005.11.006

Grill-Spector, K., and Malach, R. (2001). fMR-adaptation: a tool for studying the functional properties of human cortical neurons. Acta Psychol. 107, 293–321. doi: 10.1016/S0001-6918(01)00019-1

Hadjikhani, N., and de Gelder, B. (2003). Seeing fearful body expressions activates the fusiform cortex and amygdala. Curr. Biol. 13, 2201–2205. doi: 10.1016/j.cub.2003.11.049

Hamilton, A. F. C., and Grafton, S. T. (2006). Goal representation in human anterior intraparietal sulcus. J. Neurosci. 26, 1133. doi: 10.1523/JNEUROSCI.4551-05.2006

Hamilton, A. F. D. C., and Grafton, S. T. (2008). Action outcomes are represented in human inferior frontoparietal cortex. Cereb. Cortex 18, 1160–1168. doi: 10.1093/cercor/bhm150

Hamilton, A. F., and Grafton, S. T. (2007). “The motor hierarchy: from kinematics to goals and intentions,” in Sensorimotor Foundations of Higher Cognition: Attention and Performance , Vol. 22, eds P. Haggard, Y. Rossetti, and M. Kawato (Oxford: Oxford University Press), 381–402.

Hari, R., Forss, N., Avikainen, S., Kirveskari, E., Salenius, S., and Rizzolatti, G. (1998). Activation of human primary motor cortex during action observation: a neuromagnetic study. Proc. Natl. Acad. Sci. U.S.A. 95, 15061–15065. doi: 10.1073/pnas.95.25.15061

Harnad, S. R., Steklis, H. D., and Lancaster, J. (eds.). (1976). “Origins and evolution of language and speech,” in Annals of the New York Academy of Sciences (New York, NY: New York Academy of Sciences), 280.

Hasson, U., Nusbaum, H. C., and Small, S. L. (2006). Repetition suppression for spoken sentences and the effect of task demands. J. Cogn. Neurosci. 18, 2013–2029. doi: 10.1162/jocn.2006.18.12.2013

Haxby, J. V., Hoffman, E. A., and Gobbini, M. I. (2002). Human neural systems for face recognition and social communication. Biol. Psychiatry 51, 59–67. doi: 10.1016/S0006-3223(01)01330-0

Hewes, G. W., Andrew, R. J., Carini, L., Choe, H., Gardner, R. A., Kortlandt, A., et al. (1973). Primate communication and the gestural origin of language [and comments and reply]. Curr. Anthropol. 14, 5–24. doi: 10.1086/201401

Hoffman, E. A., and Haxby, J. V. (2000). Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nat. Neurosci. 3, 80–84. doi: 10.1038/71152

Iacoboni, M., and Dapretto, M. (2006). The mirror neuron system and the consequences of its dysfunction. Nat. Rev. Neurosci. 7, 942–951. doi: 10.1038/nrn2024

Iacoboni, M., Molnar-Szakacs, I., Gallese, V., Buccino, G., Mazziotta, J. C., and Rizzolatti, G. (2005). Grasping the intentions of others with one's own mirror neuron system. PLoS Biol. 3:e79. doi: 10.1371/journal.pbio.0030079

Jacob, P. (2008). What do mirror neurons contribute to human social cognition? Mind Lang. 23, 190–223. doi: 10.1111/j.1468-0017.2007.00337.x

Jenkinson, M., Beckmann, C. F., Behrens, T. E. J., Woolrich, M. W., and Smith, S. M. (2012). FSL. Neuroimage 62, 782–790. doi: 10.1016/j.neuroimage.2011.09.015

Kilner, J. M. (2011). More than one pathway to action understanding. Trends Cogn. Sci. 15, 352–357. doi: 10.1016/j.tics.2011.06.005

Kilner, J. M., and Frith, C. D. (2008). Action observation: inferring intentions without mirror neurons. Curr. Biol. 18, R32–R33. doi: 10.1016/j.cub.2007.11.008

Kirchhoff, B. A., Wagner, A. D., Maril, A., and Stern, C. E. (2000). Prefrontal-temporal circuitry for episodic encoding and subsequent memory. J. Neurosci. 20, 6173–6180.

Lambon Ralph, M. A., Pobric, G., and Jefferies, E. (2009). Conceptual knowledge is underpinned by the temporal pole bilaterally: convergent evidence from rTMS. Cereb. Cortex 19, 832–838. doi: 10.1093/cercor/bhn131

Lemke, J. L. (1987). “Strategic deployment of speech and action: a sociosemiotic analysis,” in Semiotics 1983: Proceedings of the Semiotic Society of America ‘Snowbird’ Conference , eds J. Evans and J. Deely (Lanham, MD: University Press of America), 67–79.

Lhommet, M., and Marsella, S. C. (2013). “Gesture with meaning,” in Intelligent Virtual Agents , eds Y. Nakano, M. Neff, A. Paiva, and M. Walker (Berlin; Heidelberg: Springer), 303–312. doi: 10.1007/978-3-642-40415-3_27


Makris, N., Goldstein, J. M., Kennedy, D., Hodge, S. M., Caviness, V. S., Faraone, S. V., et al. (2006). Decreased volume of left and total anterior insular lobule in schizophrenia. Schizophr. Res. 83, 155–171. doi: 10.1016/j.schres.2005.11.020

McCall, C., Tipper, C. M., Blascovich, J., and Grafton, S. T. (2012). Attitudes trigger motor behavior through conditioned associations: neural and behavioral evidence. Soc. Cogn. Affect. Neurosci. 7, 841–889. doi: 10.1093/scan/nsr057

McCarthy, G., Puce, A., Gore, J. C., and Allison, T. (1997). Face-specific processing in the human fusiform gyrus. J. Cogn. Neurosci. 9, 605–610. doi: 10.1162/jocn.1997.9.5.605

McNeill, D. (2012). How Language Began: Gesture and Speech in Human Evolution . Cambridge: Cambridge University Press. Available online at: https://scholar.google.ca/scholar?q=How+Language+Began+Gesture+and+Speech+in+Human+Evolution&hl=en&as_sdt=0&as_vis=1&oi=scholart&sa=X&ei=-ezxVISFIdCboQS1q4KACQ&ved=0CBsQgQMwAA

Morris, D. (2002). Peoplewatching: The Desmond Morris Guide to Body Language . New York, NY: Vintage Books. Available online at: http://www.amazon.ca/Peoplewatching-Desmond-Morris-Guide-Language/dp/0099429780 (Accessed March 10, 2014).

Ni, W., Constable, R. T., Mencl, W. E., Pugh, K. R., Fulbright, R. K., Shaywitz, S. E., et al. (2000). An event-related neuroimaging study distinguishing form and content in sentence processing. J. Cogn. Neurosci. 12, 120–133. doi: 10.1162/08989290051137648

Nichols, T. E. (2012). Multiple testing corrections, nonparametric methods, and random field theory. Neuroimage 62, 811–815. doi: 10.1016/j.neuroimage.2012.04.014

Niedenthal, P. M., Barsalou, L. W., Winkielman, P., Krauth-Gruber, S., and Ric, F. (2005). Embodiment in attitudes, social perception, and emotion. Personal. Soc. Psychol. Rev. 9, 184–211. doi: 10.1207/s15327957pspr0903_1

Noppeney, U., and Penny, W. D. (2006). Two approaches to repetition suppression. Hum. Brain Mapp. 27, 411–416. doi: 10.1002/hbm.20242

Ogawa, K., and Inui, T. (2011). Neural representation of observed actions in the parietal and premotor cortex. Neuroimage 56, 728–735. doi: 10.1016/j.neuroimage.2010.10.043

Ollinger, J. M., Shulman, G. L., and Corbetta, M. (2001). Separating processes within a trial in event-related functional MRI: II. Analysis. Neuroimage 13, 218–229. doi: 10.1006/nimg.2000.0711

Oosterhof, N. N., Tipper, S. P., and Downing, P. E. (2013). Crossmodal and action-specific: neuroimaging the human mirror neuron system. Trends Cogn. Sci. 17, 311–338. doi: 10.1016/j.tics.2013.04.012

Ortigue, S., Sinigaglia, C., Rizzolatti, G., Grafton, S. T., and Rochelle, E. T. (2010). Understanding actions of others: the electrodynamics of the left and right hemispheres. A high-density EEG neuroimaging study. PLoS ONE 5:e12160. doi: 10.1371/journal.pone.0012160

Peelen, M. V., and Downing, P. E. (2005). Selectivity for the human body in the fusiform gyrus. J. Neurophysiol. 93, 603–608. doi: 10.1152/jn.00513.2004

di Pellegrino, G., Fadiga, L., Fogassi, L., Gallese, V., and Rizzolatti, G. (1992). Understanding motor events: a neurophysiological study. Exp. Brain Res. 91, 176–180. doi: 10.1007/BF00230027

Pelli, D. G., and Brainard, D. H. (1997). The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spat. Vis. 10, 433–436. doi: 10.1163/156856897X00366

Pelphrey, K. A., Morris, J. P., Michelich, C. R., Allison, T., and McCarthy, G. (2005). Functional anatomy of biological motion perception in posterior temporal cortex: an fMRI study of eye, mouth, and hand movements. Cereb. Cortex 15, 1866–1876. doi: 10.1093/cercor/bhi064

Poldrack, R. A., Mumford, J. A., and Nichols, T. E. (2011). Handbook of Functional MRI Data Analysis . New York, NY: Cambridge University Press. doi: 10.1017/cbo9780511895029

Price, C. J. (2010). The anatomy of language: a review of 100 fMRI studies published in 2009. Ann. N.Y. Acad. Sci. 1191, 62–88. doi: 10.1111/j.1749-6632.2010.05444.x

Mitz, A. R., Godschalk, M., and Wise, S. P. (1991). Learning-dependent neuronal activity in the premotor cortex: activity during the acquisition of conditional motor associations. J. Neurosci. 11, 1855–1872.

Rizzolatti, G., and Arbib, M. A. (1998). Language within our grasp. Trends Neurosci. 21, 188–194. doi: 10.1016/S0166-2236(98)01260-0

Rizzolatti, G., and Craighero, L. (2004). The mirror-neuron system. Annu. Rev. Neurosci. 27, 169–192. doi: 10.1146/annurev.neuro.27.070203.144230

Rizzolatti, G., Fadiga, L., Gallese, V., and Fogassi, L. (1996a). Premotor cortex and the recognition of motor actions. Cogn. brain Res. 3, 131–141. doi: 10.1016/0926-6410(95)00038-0

Rizzolatti, G., Fadiga, L., Matelli, M., Bettinardi, V., Paulesu, E., Perani, D., et al. (1996b). Localization of grasp representations in humans by PET: 1. Observation versus execution. Exp. Brain Res. 111, 246–252. doi: 10.1007/BF00227301

Rizzolatti, G., Fogassi, L., and Gallese, V. (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nat. Rev. Neurosci. 2, 661–670. doi: 10.1038/35090060

Rosen, H. J., Allison, S. C., Schauer, G. F., Gorno-Tempini, M. L., Weiner, M. W., and Miller, B. L. (2005). Neuroanatomical correlates of behavioural disorders in dementia. Brain 128, 2612–2625. doi: 10.1093/brain/awh628

Sah, P., Faber, E. S. L., De Armentia, M. L., and Power, J. (2003). The amygdaloid complex: anatomy and physiology. Physiol. Rev. 83, 803–834. doi: 10.1152/physrev.00002.2003

Shapiro, L. (2008). Making sense of mirror neurons. Synthese 167, 439–456. doi: 10.1007/s11229-008-9385-8

Smith, S. M., Jenkinson, M., Woolrich, M. W., Beckmann, C. F., Behrens, T. E. J., Johansen-Berg, H., et al. (2004). Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage 23(Suppl. 1), S208–S219. doi: 10.1016/j.neuroimage.2004.07.051

Tettamanti, M., Buccino, G., Saccuman, M. C., Gallese, V., Danna, M., Scifo, P., et al. (2005). Listening to action-related sentences activates fronto-parietal motor circuits. J. Cogn. Neurosci. 17, 273–281. doi: 10.1162/0898929053124965

Thibault, P. (2004). Brain, Mind and the Signifying Body: An Ecosocial Semiotic Theory . London: A&C Black. Available online at: https://scholar.google.ca/scholar?q=Brain,+Mind+and+the+Signifying+Body:+An+Ecosocial+Semiotic+Theory&hl=en&as_sdt=0&as_vis=1&oi=scholart&sa=X&ei=Lf3xVOayBMK0ogSniYLwCA&ved=0CB0QgQMwAA

Tunik, E., Rice, N. J., Hamilton, A. F., and Grafton, S. T. (2007). Beyond grasping: representation of action in human anterior intraparietal sulcus. Neuroimage 36, T77–T86. doi: 10.1016/j.neuroimage.2007.03.026

Uithol, S., van Rooij, I., Bekkering, H., and Haselager, P. (2011). Understanding motor resonance. Soc. Neurosci. 6, 388–397. doi: 10.1080/17470919.2011.559129

Ulloa, E. R., and Pineda, J. A. (2007). Recognition of point-light biological motion: Mu rhythms and mirror neuron activity. Behav. Brain Res. 183, 188–194. doi: 10.1016/j.bbr.2007.06.007

Urgesi, C., Candidi, M., Ionta, S., and Aglioti, S. M. (2006). Representation of body identity and body actions in extrastriate body area and ventral premotor cortex. Nat. Neurosci. 10, 30–31. doi: 10.1038/nn1815

Van Essen, D. C. (2005). A Population-Average, Landmark- and Surface-based (PALS) atlas of human cerebral cortex. Neuroimage 28, 635–662. doi: 10.1016/j.neuroimage.2005.06.058

Van Essen, D. C., Drury, H. A., Dickson, J., Harwell, J., Hanlon, D., and Anderson, C. H. (2001). An integrated software suite for surface-based analyses of cerebral cortex. J. Am. Med. Inform. Assoc. 8, 443–459. doi: 10.1136/jamia.2001.0080443

Wiggs, C. L., and Martin, A. (1998). Properties and mechanisms of perceptual priming. Curr. Opin. Neurobiol. 8, 227–233. doi: 10.1016/S0959-4388(98)80144-X

Wyk, B. C. V., Hudac, C. M., Carter, E. J., Sobel, D. M., and Pelphrey, K. A. (2009). Action understanding in the superior temporal sulcus region. Psychol. Sci. 20, 771. doi: 10.1111/j.1467-9280.2009.02359.x

Zentgraf, K., Stark, R., Reiser, M., Künzell, S., Schienle, A., Kirsch, P., et al. (2005). Differential activation of pre-SMA and SMA proper during action observation: effects of instructions. Neuroimage 26, 662–672. doi: 10.1016/j.neuroimage.2005.02.015

Keywords: action observation, dance, social neuroscience, fMRI, repetition suppression, predictive coding

Citation: Tipper CM, Signorini G and Grafton ST (2015) Body language in the brain: constructing meaning from expressive movement. Front. Hum. Neurosci. 9:450. doi: 10.3389/fnhum.2015.00450

Received: 28 March 2015; Accepted: 28 July 2015; Published: 21 August 2015.


Copyright © 2015 Tipper, Signorini and Grafton. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Christine M. Tipper, Mental Health and Integrated Neurobehavioral Development Research Core, Child and Family Research Institute, 3rd Floor - 938 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada, [email protected]


An Analysis of Body Language of Patients Using Artificial Intelligence

Rawad Abdulghafor

1 Department of Computer Science, Faculty of Information and Communication Technology, International Islamic University Malaysia, Kuala Lumpur 53100, Malaysia

Abdelrahman Abdelmohsen

Sherzod Turaev

2 Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates

Mohammed A. H. Ali

3 Department of Mechanical Engineering, Faculty of Engineering, University of Malaya, Kuala Lumpur 50603, Malaysia

Sharyar Wani

Associated Data

Not Applicable.

In recent decades, epidemic and pandemic illnesses have grown prevalent and are a regular source of concern throughout the world. The extent to which the globe has been affected by the COVID-19 pandemic is well documented. Smart technology is now widely used in medical applications, with the automated detection of status and feelings becoming a significant study area. As a result, a variety of studies have begun to focus on the automated detection of symptoms in individuals infected with a pandemic or epidemic disease by studying their body language. The recognition and interpretation of arm and leg motions, facial expressions, and body postures remains a developing field, and there is a dearth of comprehensive studies that might aid in illness diagnosis using artificial intelligence techniques and technologies. This literature review is a meta-review of past papers that used AI for body language classification through full-body tracking or facial expression detection in tasks such as fall detection and COVID-19 detection. It examines the methods proposed by each paper, their significance, and their results.

1. Introduction

One of the languages of communication is body language. Languages are divided into two categories: verbal and nonverbal. Body language is a type of nonverbal communication in which the body’s movements and actions are utilized instead of words to communicate and transmit information. According to [ 1 , 2 ], nonverbal cues such as gestures, body posture, eye movement, facial expressions, touch, and personal space utilization are all examples of body language.

Gaze direction, pupil dilation, hand and leg position, the manner of sitting, walking, standing, or lying, body posture, and movement are all examples of how a person's inner state is portrayed. Hands are arguably the richest wellspring of body language information after the face [3]. For example, one may tell whether a person is being honest (turning the hands inward, palms open towards the interlocutor) or disingenuous (turning the hands away or hiding them behind the back). During a conversation, using open-handed gestures can convey the image of a more trustworthy individual, a tactic that is frequently employed in debates and political conversations. It has been demonstrated that people who make open-handed gestures are better liked [4]. The posture of one's head may also indicate a lot about one's emotional state: people are more likely to keep talking when the listener supports them by nodding, and the rate of nodding can signal patience or impatience. In a neutral stance, the head is held still, facing the speaker. When a person's chin is elevated, it might indicate dominance or even arrogance. Revealing the neck could be interpreted as a gesture of surrender.

In the last few years, automatic body language analysis has gained popularity. This is due in part to the large number of application domains for this technology, which range from human–computer interaction (e.g., affective robotics [5]) to security (e.g., video surveillance [6]), e-Health (e.g., therapy [7] or automated diagnosis [8]), language and communication (e.g., sign language recognition [9]), and entertainment (e.g., interactive gaming [10]). As a result, we can find research papers on a variety of topics related to human behavior analysis, such as action/gesture recognition [11,12], social interaction modeling [13,14], facial emotion analysis [15], and personality trait identification [16], to name a few. Ray Birdwhistell conducted research on using body language for emotion identification and estimated that the final message of a speech is conveyed only 35 percent by the actual words and 65 percent by nonverbal signals [17]. In addition, according to psychological research, facial expression conveys 55 percent of the total information in communication and intonation conveys 38 percent [4].

We will provide a new, thorough survey in this study to help advance research in this area. First, we provide a description and explanation of the many sorts of gestures, as well as an argument for the necessity of instinctive body language detection in determining people's moods and sentiments. Then we look at broad studies in the realm of body language processing. After that, we concentrate on body language analysis studies in health care. Furthermore, we will define an automated recognition framework for numerous body language characteristics using artificial intelligence. Finally, we will describe an automated gesture recognition model that aids in the better identification of the external signs of epidemic and pandemic illnesses.

2. Body Language Analysis

2.1. Overview of Body Language Analysis

Body language interpretations fluctuate from nation to nation and from culture to culture. There is substantial debate about whether body language can be considered a universal language for all humans. Some academics believe that most interpersonal communication is based on physical symbols or gestures, because the interplay of body language facilitates rapid information transfer and comprehension [18].

Body language analysis is also necessary to avoid misunderstandings about the meanings and objectives of a single movement that has several interpretations. A person's expressive movement, for example, may be caused by a physical limitation or a compulsive movement rather than being deliberate. Furthermore, one person's bodily movement may not signify the same thing to another. For example, a person may rub her eyes because of itching rather than weariness. Cultural differences also require thorough examination: there are certain common body language motions, but there are also movements unique to each culture, varying by nation, region, and even social category. In the remainder of this section, we discuss the various aspects of body language analysis.

2.2. Body Language Analysis in Communication

In research from [ 19 ], body language is a kind of nonverbal communication. Humans nearly exclusively transmit and interpret such messages subconsciously. Body language may provide information about a person’s mood or mental condition. Aggression, concentration, boredom, relaxed mood, joy, amusement, and drunkenness are just a few of the messages it might convey. Body language is a science that influences all aspects of our lives. Body language is a technique through which a person may not only learn about other people by observing their body motions but also improve himself properly and become a successful person. Body language is a form of art that allows a person to acquire a new level of fame.

If language is a way of social connection, then body language is unquestionably a reflection of personality development. It can allow for reading other people’s minds, allowing a person to effortlessly mold himself to fit the thinking of others and make decisions for effective and impactful planning. The person’s mental mood, physical fitness, and physical ability are all expressed through body language. It allows you to have a deeper knowledge of individuals and their motives. It builds a stronger bond than a lengthy discussion or dispute. Reading body language is crucial for appropriate social interaction and nonverbal communication.

In human social contact, nonverbal communication is very significant. Every speaking act we perform is accompanied by our body language, and even if we do not talk, our nonverbal behavior continually communicates information that might be relevant. The research in [20] therefore seeks to provide a summary of the many components of nonverbal communication. Nonverbal communication is usually characterized in contrast to verbal communication: any phenomenon with communicative value that is not part of verbal communication is grouped under the umbrella term nonverbal communication, including auditory factors such as speaking style and voice quality. On the one hand, there are paralinguistic (i.e., vocal) phenomena such as individual voice characteristics, speech melody, temporal features, articulation, and incidental noise.

Nonvocal phenomena in conversation, on the other hand, include a speaker's external traits, bodily reactions, and a variety of kinesic phenomena that can be split into macro-kinesic and micro-kinesic categories. Figure 1 provides a comprehensive overview of the main types of nonverbal communication.

Figure 1. Overview of the main forms of nonverbal communication (figure taken from [20]).

In the study in [21], nonverbal conduct encompasses all forms of communication other than speaking. The term "communication" refers to the act of sending and receiving messages. Even though language use is a uniquely human trait, differing perspectives revolve around nonverbal behaviors and the surrounding context. We employ body language without realizing it, and likewise perceive and interpret the body language of others. Nonverbal conduct is divided into three categories: verbal–vocal, nonverbal vocal, and nonverbal nonvocal. The link between verbal and nonverbal conduct is demonstrated through several gestures. Nonverbal events play a crucial role in the structure and occurrence of interpersonal communication, as well as in the moment-to-moment control of the interaction. Nonverbal cues help govern the interaction by signaling hierarchy and priority among communicators, regulating the flow of interaction, and providing meta-communication and feedback.

As shown in [22], body language is one of the most crucial aspects of communication; communication that is not supported by appropriate body language remains incomplete. Our physical appearance also has a significant impact on how well we deliver our message. Our thoughts, expressions, postures, and gestures all have a significant impact on the weight of meaning and emotion carried by our phrases and words. Understanding and conveying emotions and thoughts rely heavily on body language. It is important for the proper expression and comprehension of messages during the communication process. It also supports oral communication and establishes communication integrity. Body language accounts for 55% of how we impress people when speaking, tone of voice accounts for 38%, and words account for 7%.

It is critical to concentrate on this distinction if you want to be an effective speaker. Because body language is swiftly registered in the subconscious, the audience focuses on it. Even if the audience does not comprehend the spoken language, the audience may grasp the message through body language.

2.3. Body Language in Public Speaking

Although our face is the indicator of our thinking, we cannot deny that words are also quite powerful. We may look to the French and Russian Revolutions for instances of great speeches delivered by leaders. However, we cannot afford to overlook the reality that actions speak louder than words, i.e., body language is more potent than words. We use words to disguise our emotions and sentiments much of the time, but our body language makes them quite evident. Our formal and professional lives are completely reliant on nonverbal communication, which we engage in through our behaviors and body language. People in the office do not speak much yet transmit everything through their body language. Whenever they communicate, they consciously or unconsciously use their body language more than their words. In any conversation, body language is important. The image of Lord Krishna speaking to Arjuna on the fields of Kurukshetra will be read, described, and analyzed in this research study [ 23 ]. It will discuss the significance of body language when speaking in public.

2.4. Body Language Analysis in Teaching

The main goal of the study in [24] was to assess the influence of teachers' nonverbal communication on teaching success, based on research into the link between teaching quality and teachers' nonverbal communication. The results demonstrated a substantial link between teaching quality and the quantity and technique of the nonverbal communication instructors used while teaching. According to the findings of the research evaluated, the more teachers used verbal and nonverbal communication, the more effective their instruction was and the greater their pupils' academic achievement.

Other research asks why certain teachers exude a mystical charisma and charm that sets them apart from their colleagues. The Classroom X-Factor investigates the idea of possessing what the public has come to refer to as the "X-Factor" from the perspective of the teacher, providing unique insights into the use of nonverbal communication in the classroom. Using classroom and curricular examples, this study shows how both trainee and practicing teachers may find their own X-Factor to help shift their perspectives and perceptions of themselves during the live act of teaching. It also shows how instructors may change the way they engage with their students while simultaneously providing them with significant and powerful learning opportunities. Teachers may develop their own X-Factor by adopting simple strategies derived from psychology and cognitive science, and thereby boost their satisfaction and efficacy as professionals. Facial and vocal expression, gesture and body language, eye contact and smiling, teacher apparel, color and the use of space, nonverbal communication, and educational approaches are among the tactics outlined. Furthermore, the study includes a section with fictitious anecdotes that serve to contextualize the facts presented throughout the text [25].

2.5. Body Language Analysis in Sport

The literature reviewed shows that nonverbal behavior (NVB) changes as a result of situational variables, either because a person shows a nonverbal response to internally or externally provoking circumstances (as is theorized for some basic emotions conveyed in the face) or because a person intentionally wishes to convey certain information to observers through nonverbal cues in a given situation. Certain NVBs, when displayed and observed, have been demonstrated to have a range of consequences on subsequent interpersonal outcomes, including cognition, emotion, and behavior (e.g., [26] for reviews).

2.6. Body Language Analysis in Leadership

The authors of [ 27 ] examined the possibility of gender disparities in leaders’ nonverbal actions, as well as the impact these differences may have on their relative effectiveness. Nonverbal communication may reveal a leader’s emotions and increase followers’ involvement. Once the leader is aware of his or her gestures and body motions, he or she may compare them to those of more effective leaders. On certain levels, gender inequalities in nonverbal behavior occur. Women are linked to transformative traits such as compassion, love, and concern for others. Men, on the other hand, relate to traits such as aggressiveness, dominance, and mastery.

This demonstrated that effective women do not always exhibit the same nonverbal behaviors as effective men. Nonverbal hesitations, tag questions, hedges, and intensifiers are more likely to be used by fluent speakers. This suggests that leaders who shake their heads are more likely to exhibit higher counts of speaker-fluency behaviors, and this is not tied to gender in any way. Another intriguing finding is that the head movement of nodding is linked to the behaviors of upper grin, broad smile, and leaning forward, which demonstrates that these affirming, positive behaviors are linked in a major way. Furthermore, the observed leaders' speech fluency was substantially associated with head shaking.

2.7. Body Language Analysis in Culture

In [28], the authors discussed a range of body language used in many cultures throughout the world. The meanings that may be conveyed through body language are numerous. For example, people from all cultures use the same kinds of body language, such as gaze and eye contact, facial expressions, gestures, and body movements, to communicate shared meanings. Distinct cultures have different ways of communicating nonverbally, and different people have different ways of expressing themselves via gestures. Nonverbal communication, in the same way as traffic, has a purpose and follows a set of norms to ensure that it flows smoothly among people from many diverse cultures.

On the other hand, cultures can use the same body language to communicate diverse meanings. There are three sides to it:

  • Eye contact differs by culture.
  • Other nonverbal signals vary by culture.
  • The appropriate distance between two individuals reveals distinct attitudes in different cultures.

Our culture is expressed as much through body language as through verbal discourse. Learning the basic norms of body language in other cultures might help us better understand one another. People from many cultures are able to converse with one another. However, cultural exchanges and culture shocks caused by our body language are becoming increasingly harsh and unavoidable.

As a result, while communicating in a certain language, it is best to utilize the nonverbal behavior that corresponds to that language. When a person is fully bilingual, he changes his body language at the same time as he changes his language. This facilitates and improves communication.

Lingua franca is a linguistic bridge that connects two persons who speak different native languages.

In this regard, it has been observed in [29] that although we speak with our vocal organs, our body language can serve as a lingua franca for multilingual interlocutors.

The findings indicate that the listener was attempting to comprehend the speaker's gestures. Because the speaker could not speak English fluently, he had difficulty achieving precise diction. Towards the end of the video, the speaker ultimately succeeded in expressing his views with gestures. Furthermore, the interlocutors were engaged in the delivery and reception of implied meaning via gestures and body language. Even though a lingua franca (e.g., English) already existed, body language added meaning to the message.

Furthermore, according to the data collected in this study, the Korean model and a client had a tumultuous history while shooting certain photoshoots. The customer was not pleased with the model’s attitude, which he felt insulted him. Nonetheless, the Korean model apologized in a traditional Korean manner by kneeling to the customer and the judges.

The judges and the client were both impressed by her formal and courteous demeanor. To complete the analysis, this research employed multimodal transcription analysis with Jefferson and Mondada transcript notation, as well as YouTube data clips. Some errors may remain, which might be an excellent starting point for additional study in the fields of lingua franca and body language to gain a more comprehensive understanding.

2.8. Body Language in Body Motions

Both cognitive-based social interactions and emotion-based nonverbal communication rely heavily on body movements and words. Nodding, head position, hand gestures, eye movements, facial expressions, and upper/lower-body posture, as well as speaking, are recognized to communicate emotion and purpose in human communication.

2.8.1. Facial Expressions

According to recent research, facial expressions are changes in the appearance of the face caused by the movement of facial muscles; they are a channel of nonverbal communication. Emotional facial expressions are both symptoms of and communication cues about an underlying emotional state. People seldom convey their feelings by using characteristic expressions connected with certain emotions that are also widely recognized across countries and settings. Furthermore, environmental circumstances have a significant impact on both the expression of emotional responses and their detection by observers [30].

In recent years, as [31] notes, there has been a surge of interest in both emotions and their regulation, notably in the neurosciences and, more specifically, in psychiatry. Researchers have attempted to uncover patterns of expression in experimental investigations analyzing facial expressions. A large amount of data is available; some of it has been validated, while other findings have been refuted, depending on the emotion studied and the method employed to assess it. A key issue is interpreting data that have not always been fully proven and that are based on Paul Ekman's hypothesis of six main types of expression (happiness, anger, disgust, fear, sadness, and surprise).

The emotion of happiness, with its expressive element of the "smile," is the only one of Ekman's "basic emotions" that is observably linked to an underlying physiological and facial pattern of expression. Regarding Ekman's other basic patterns of expression, there is much scholarly debate. A better understanding of how emotions are regulated, and of how the dynamics of emotional facial expression can be described, could inform more basic research in social contexts. Even more crucially, it has the potential to increase knowledge of the interactional and social repercussions of deficits in emotional expression in people with mental illness, as well as to inform therapeutic intervention. Innovative research in the realm of emotional facial expression might provide thorough answers to open questions in the field of emotion research.

2.8.2. Gestures

Gestures are generally hand movements (though they can also include head and facial movements) that serve two purposes: to illustrate speech and to convey verbal meaning. Gestures are fascinating because they reflect cognitive processes; that is, they are motions that express an idea or a mental process [32].

Gesturing relieves cognitive load while a person is pondering what to say. For example, when people are given a memory task while also explaining how to solve an arithmetic problem, they recall more items if they gesture while describing the arithmetic. When counting objects, being able to point allows for greater precision and speed; when people are not permitted to point, even nodding improves precision [ 33 ]. Gestures help smooth interactions and facilitate some components of memory. As a result, gestures can provide valuable insight into speakers’ states of mind and mental representations. Gestures may be divided into two types: those that occur in conjunction with speech and those that occur independently of speech [ 26 ].

3. Body Language Analysis and AI

3.1. Overview

In face-to-face conversation, humans show a remarkable capacity to infer emotions, and much of this inference is based on body language. Among individuals of similar cultures, touching one’s nose conveys incredulity, whereas holding one’s head in the hands expresses distress. Understanding the meaning of body language appears to be a natural talent for humans. In [ 34 ], the authors presented a two-stage system that predicts emotions related to body language from standard RGB video inputs, to help robots develop comparable skills. In the first stage, the system predicts body language from the input videos based on estimated human poses; the predicted body language is then passed to the second stage, which interprets the emotions.
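A minimal PyTorch sketch of this two-stage idea is given below. The module names, feature sizes, and label counts are illustrative assumptions, not the architecture of [ 34 ]; random tensors stand in for the output of a pose-estimation front end.

```python
# Sketch of a two-stage pipeline: estimated poses -> body-language class -> emotion.
# Dimensions and class counts are illustrative assumptions, not the model of [ 34 ].
import torch
import torch.nn as nn

class BodyLanguageClassifier(nn.Module):
    """Stage 1: map a sequence of estimated 2D joint positions to a body-language class."""
    def __init__(self, num_joints=17, num_frames=16, num_classes=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                                   # (B, T, J, 2) -> (B, T*J*2)
            nn.Linear(num_frames * num_joints * 2, 256),
            nn.ReLU(),
            nn.Linear(256, num_classes),
        )
    def forward(self, poses):
        return self.net(poses)

class EmotionClassifier(nn.Module):
    """Stage 2: interpret the predicted body-language distribution as an emotion."""
    def __init__(self, num_body_classes=8, num_emotions=6):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(num_body_classes, 32), nn.ReLU(), nn.Linear(32, num_emotions))
    def forward(self, body_language_logits):
        return self.net(body_language_logits.softmax(dim=-1))

# Toy forward pass: 4 clips, 16 frames, 17 joints with (x, y) coordinates each.
poses = torch.randn(4, 16, 17, 2)
stage1, stage2 = BodyLanguageClassifier(), EmotionClassifier()
emotion_logits = stage2(stage1(poses))
print(emotion_logits.shape)                                  # torch.Size([4, 6])
```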

Automated emotion identification based on body language is beneficial in a variety of applications, including health care, online chatting, and computer-mediated communication [ 35 ]. Although automated body-language and emotion recognition algorithms are used in a variety of applications, the body language and emotions of interest vary. Online chat systems, for example, focus on detecting whether people are happy or unhappy, whereas health care applications are concerned with spotting possible indicators of mental illnesses such as depression or severe anxiety. Because a given emotion can only be expressed through the associated body language, many applications necessitate the annotation of different kinds of body language and emotions.

3.2. Recognition of Facial Expressions

Facial expressions (FE) are important affect signaling systems that provide information about a person’s emotional state. They form a basic communication mechanism between people in social circumstances, along with voice, language, hands, and body position. AFER (automated FE recognition) is a multidisciplinary field that straddles behavioral science, neuroscience, and artificial intelligence.

Face recognition is a prominent and well-established topic in computer vision. Deep face recognition has advanced significantly in recent years, thanks to the rapid development of machine learning models and large-scale datasets, and it is now widely employed in a variety of real-world applications. Given a natural picture or video frame as input, an end-to-end deep face recognition system produces the face feature used for recognition [ 36 ]. Face detection, feature extraction, and face recognition (seen in Figure 2) are the three main phases in developing a strong face recognition system [ 37 , 38 ]. The face detection stage is used to detect and locate human faces in the input image. The feature extraction stage is used to extract a feature vector for every human face found in the previous step. Finally, the face recognition stage compares the retrieved characteristics of the human face with all template face databases to determine the person’s identity.

Figure 2. Face recognition structure. The figure has been taken from [ 39 ].

3.2.1. Face Detection

The face recognition system starts with the identification of human faces in each picture. The goal of this phase is to determine whether there are any human faces in the supplied image. Face detection can be hampered by fluctuations in lighting and facial expression, so pre-processing steps are carried out to enable a more robust face recognition system. Many approaches, such as the one in [ 40 ] and the histogram of oriented gradients (HOG) [ 41 ], are utilized to identify and locate human faces in the image. Face detection may also be utilized for video and image categorization, among other tasks.
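As a hedged illustration of this stage, the sketch below runs OpenCV’s pretrained Haar-cascade face detector with a simple lighting-normalization step; a HOG-based detector (e.g., dlib’s) would be a drop-in alternative. The synthetic gray image only keeps the example self-contained and would be replaced by a real photograph.

```python
# Sketch of the face-detection stage with OpenCV's pretrained Haar cascade.
# The synthetic image is a placeholder; substitute cv2.imread("photo.jpg") in practice.
import cv2
import numpy as np

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

image = np.full((480, 640, 3), 127, dtype=np.uint8)    # placeholder input frame
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.equalizeHist(gray)                          # crude compensation for lighting changes

faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(40, 40))
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

print(f"{len(faces)} face(s) detected")                # 0 on the placeholder image
```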

3.2.2. Feature Extraction

The major purpose of this phase is to extract the characteristics of the face images that were discovered in the detection stage. This stage describes a face using a “signature,” a set of characteristic vectors that captures the major aspects of the face image, such as the mouth, nose, and eyes, as well as their geometric distribution [ 42 ]. Each face has a unique structure, size, and form that allows it to be recognized. Some methods extract the contour of the lips, eyes, or nose to recognize the face using size and distance [ 37 ]. To extract facial characteristics, approaches such as HOG [ 43 ], independent component analysis (ICA), linear discriminant analysis (LDA) [ 44 ], scale-invariant feature transform (SIFT) [ 38 ], and local binary patterns (LBP) [ 42 ] are commonly utilized.
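The following sketch shows two of the descriptors named above (LBP and HOG) computed with scikit-image and concatenated into one “signature” vector. The random 64×64 array is only a stand-in for a detected, cropped, grayscale face.

```python
# Sketch of the feature-extraction stage: LBP histogram + HOG descriptor for one face crop.
import numpy as np
from skimage.feature import local_binary_pattern, hog

face = (np.random.rand(64, 64) * 255).astype(np.uint8)   # placeholder grayscale face crop

# Local binary pattern: encode each pixel's neighbourhood, then summarize as a histogram.
P, R = 8, 1
lbp = local_binary_pattern(face, P, R, method="uniform")
lbp_hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2), density=True)

# Histogram of oriented gradients: captures edge/orientation structure (mouth, nose, eyes).
hog_vec = hog(face, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

signature = np.concatenate([lbp_hist, hog_vec])           # one feature vector per face
print(lbp_hist.shape, hog_vec.shape, signature.shape)
```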

3.2.3. Face Recognition

This phase compares the features derived in the feature extraction stage with known faces recorded in a database. Face recognition may be used for two different purposes: identification and verification. During identification, a test face is compared with a set of faces to discover the most likely match. During verification, a test face is compared with a known face in the database to determine an approval or rejection decision [ 45 ]. This challenge has been successfully addressed by correlation filters (CFs) [ 46 ], convolutional neural networks (CNNs) [ 47 ], and k-nearest neighbors (k-NN) [ 48 ].
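To make the identification/verification distinction concrete, the sketch below matches a probe feature vector against a small random “gallery”: a 1-nearest-neighbor search for identification, and a thresholded cosine distance to a claimed identity for verification. All data and the threshold are illustrative assumptions.

```python
# Sketch of the matching stage: identification (1:N search) and verification (1:1 check).
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
gallery = rng.normal(size=(50, 128))                  # 50 enrolled faces, 128-D features
labels = np.arange(50)                                # one identity per enrolled template

# Identification: compare the probe against all templates and return the closest identity.
knn = KNeighborsClassifier(n_neighbors=1, metric="cosine").fit(gallery, labels)
probe = rng.normal(size=(1, 128))
predicted_identity = knn.predict(probe)[0]

# Verification: accept or reject a claimed identity by thresholding the cosine distance.
claimed = 7
distance = 1.0 - np.dot(probe[0], gallery[claimed]) / (
    np.linalg.norm(probe[0]) * np.linalg.norm(gallery[claimed])
)
accepted = distance < 0.4                             # threshold is an illustrative choice
print(predicted_identity, accepted)
```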

3.3. Face Recognition Techniques

Considering the data reported thus far, the authors of [ 44 ] argue that three techniques stand out as particularly promising for future development in this area: (i) the development of 3D face recognition, (ii) the use of multimodal fusion techniques that combine complementary data types, particularly those based on visible and near-infrared images, and (iii) the use of deep learning methods.

3.3.1. 3D Facial Recognition

Due to the 3D structure of the face, some characteristics are lost in 2D image-based approaches. Two key unsolved issues in 2D face recognition are lighting and pose variability. The scientific community has recently focused on 3D facial recognition to tackle these unsolved challenges and obtain considerably greater accuracy by assessing the geometry of rigid features on the face. As a result, various contemporary techniques based on 3D datasets have been created [ 49 , 50 ].

3.3.2. Multimodal Facial Recognition

Sensors with the demonstrated capacity to capture not only 2D texture information but also face shape, that is, three-dimensional information, have been created in recent years. As a result, several recent studies have combined the 2D and 3D forms of information to take advantage of each and create a hybrid system that improves recognition relative to a single modality [ 51 ].

3.3.3. Deep Learning Facial Recognition

DL is a broad notion with no precise definition; nonetheless, researchers [ 52 , 53 ] agree that DL refers to a collection of algorithms that aim to model high-level abstractions through several levels of processing. This field of study, which dates to the 1980s, is a branch of machine learning in which algorithms are employed to create deep neural networks (DNNs) that are more accurate than traditional procedures. Recently, progress has reached the point that DL outperforms humans in several tasks, such as object recognition in photos.

3.4. Recognition of Gestures

In this part, we review contemporary deep-learning-based algorithms for gesture recognition in videos, primarily driven by the fields of human–computer, human–machine, and robot interaction.

3.4.1. Convolutional Neural Networks in 2D

Applying 2D CNNs to individual frames and then averaging the results for categorization is the first approach that springs to mind for classifying a sequence of pictures. In [ 54 ], a CNN framework for human posture estimation is described that constructs a spatial component to make joint predictions by considering the locations of related joints. The authors train numerous convnets to perform binary body-part classification independently (i.e., presence or absence of that body part). These nets are applied to overlapping portions of the input as sliding windows, resulting in smaller networks with greater performance. A CNN-based model for human posture estimation has also been presented in [ 55 ]: the authors extract characteristics from the input picture using a CNN, and these characteristics are subsequently fed into joint-point regression and body-part identification tasks. For gesture identification (finger spelling of ASL) using depth pictures, Kang et al. (2015) use a CNN and extract the features from the fully connected layers. Moreover, a deep learning model for estimating hand posture that uses both unlabeled and synthetically created data is offered in [ 56 ]. The key to the developed framework is that, instead of embedding structure in the model architecture, the authors incorporate information about the structure into the training approach by segmenting hands into parts. For identifying 24 American Sign Language (ASL) hand movements, a CNN and a stacked de-noising autoencoder (SDAE) were employed in [ 57 ]. A multiview system for point-cloud hand posture identification has been shown in [ 58 ]: view images are created by projecting the hand point cloud onto several view planes, and features are then extracted from these views using a CNN. A CNN that uses a GMM skin detector to recognize hands and then align them to the major axes has been presented in [ 59 ]. After that, the authors used a CNN with pooling and sampling layers, as well as a typical feed-forward NN as a classifier.
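A minimal PyTorch sketch of the “2D CNN per frame, then average” baseline mentioned at the start of this subsection is given below. The backbone and class count are illustrative assumptions; any image classifier could be substituted for the small convolutional stack.

```python
# Sketch of per-frame 2D CNN classification with temporal averaging of the predictions.
import torch
import torch.nn as nn

class FrameCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, clip):                       # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        frames = clip.flatten(0, 1)                # run the 2D CNN on every frame independently
        logits = self.classifier(self.features(frames).flatten(1))
        return logits.view(b, t, -1).mean(dim=1)   # average per-frame predictions over time

clip = torch.randn(2, 8, 3, 64, 64)                # 2 clips of 8 RGB frames each
print(FrameCNN()(clip).shape)                      # torch.Size([2, 10])
```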

Meanwhile, a CNN that retrieves 3D joints based on synthetic training examples for hand pose prediction has been presented in [ 60 ]. On top of the final layer, the network turns the output of the convolution layers into heat maps (one for each joint) displaying the likelihood of each joint location. An optimization problem is then solved to recover poses from the series of heat maps.

3.4.2. Features That Are Dependent on Motion

Gesture recognition has widely utilized neural networks and CNNs based on body posture and hand estimation as well as motion data. To achieve better results, temporal data must be incorporated into the models in addition to spatial data. Two-stream (spatiotemporal) CNNs that learn from a set of training gestures for gesture-style detection in biometrics have been studied in [ 61 ]: the spatial network is fed with raw depth data, while the temporal network is fed with optical flow. Color and motion information were used in [ 62 ] to estimate articulated human pose in videos. With an RGB picture and a collection of motion characteristics as input data, the authors present a convolutional network (ConvNet) framework for predicting the 2D positions of human joints in video; the perspective projections of the 3D velocity of moving surfaces are among the motion characteristics employed in this technique. For gesture identification from depth data, three forms of representation, the dynamic depth image (DDI), the dynamic depth normal image (DDNI), and the dynamic depth motion normal image (DDMNI), were employed as the input data of 2D networks in [ 54 ]. The authors used bidirectional rank pooling to create these dynamic pictures from a series of depth photos; such representations successfully capture spatiotemporal information. A comparable concept for gesture recognition in continuous depth video is proposed in [ 41 ]: an improved depth motion map (IDMM) is built by calculating the absolute depth difference between the current frame and the beginning frame for each gesture segment, and this serves as a motion characteristic used as the input data of a deep learning network.
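The rough numpy sketch below illustrates the general idea of an IDMM-style motion feature, accumulating the absolute depth difference between each frame and the first frame of a gesture segment into a single 2D “motion image”. It is only an illustration of the concept, not the exact construction used in [ 41 ].

```python
# Illustrative IDMM-style motion feature from a depth sequence (not the exact method of [ 41 ]).
import numpy as np

def improved_depth_motion_map(depth_frames):
    """depth_frames: (T, H, W) depth sequence for one gesture segment."""
    reference = depth_frames[0].astype(np.float32)
    diffs = np.abs(depth_frames.astype(np.float32) - reference)  # |frame_t - frame_0|
    idmm = diffs.sum(axis=0)                                     # accumulate motion over time
    return idmm / (idmm.max() + 1e-8)                            # normalize before feeding a 2D CNN

segment = np.random.rand(30, 120, 160)              # placeholder 30-frame depth clip
motion_image = improved_depth_motion_map(segment)   # single 2D image summarizing the motion
print(motion_image.shape)                           # (120, 160)
```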

3.4.3. Convolutional Neural Networks in 3D

Many 3D CNNs for gesture recognition have been presented in [ 3 , 49 , 63 ]. A three-dimensional convolutional neural network (CNN) for recognizing driver hand gestures based on depth and intensity data has been presented in [ 3 ]: for the final prediction, the authors combine data from several spatial scales, and they also make use of spatiotemporal data augmentation for more effective training and to avoid overfitting. A recurrent mechanism has been added to the 3D CNN in [ 55 ] to recognize and classify dynamic hand movements: a 3D CNN is used to extract spatiotemporal features, a recurrent layer is used for global temporal modeling, and a SoftMax layer is used to forecast class-conditional gesture probabilities. A 3D CNN for sign language identification that extracts discriminative spatiotemporal characteristics from a raw video stream has been presented in [ 63 ]. In [ 64 ], multimodal streaming video input (RGB-D and skeleton data) containing color information, depth cues, and body joint locations is fed to a 3D CNN to improve performance, merging depth and RGB video into a 3D CNN model for large-scale gesture detection. In a similar vein, an end-to-end 3D CNN based on the model of [ 65 ] and used for large-scale gesture detection has been presented in [ 66 ]. The wide range of use cases of CNNs for various gesture recognition tasks over the years demonstrates their effectiveness. The presence of an extra dimension makes 3D CNNs unique in that the third dimension can be mapped to a time dimension to process videos, or to a depth dimension to acquire more useful data for a task, as seen in [ 67 ]. Previous literature supports this finding by indicating that combining 3D CNNs with temporal models such as an RNN yields desirable results and allows the use of continuous streams such as videos. Currently, CNNs are widely utilized for 2D- and 3D-based image and gesture recognition and detection tasks.
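A minimal 3D CNN sketch in PyTorch follows, with the extra convolution dimension used as time so that spatiotemporal features are learned directly from short RGB clips. Layer sizes and the class count are illustrative assumptions, not any of the cited architectures.

```python
# Minimal 3D CNN for clip-level gesture classification (illustrative sizes only).
import torch
import torch.nn as nn

class Gesture3DCNN(nn.Module):
    def __init__(self, num_classes=20):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(32, num_classes),
        )
    def forward(self, clip):                       # clip: (B, 3, T, H, W); dim 2 is time
        return self.net(clip)

clip = torch.randn(2, 3, 16, 64, 64)               # 2 clips, 16 frames each
print(Gesture3DCNN()(clip).shape)                  # torch.Size([2, 20])
```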

3.4.4. RNN and LSTM Models for Temporal Deep Learning

Interestingly, despite being a promising area of study, recurrent deep learning models have not yet been frequently employed for gesture identification. In [ 68 ], a multimodal (depth, skeleton, and voice) gesture recognition system based on RNNs has been offered. Each modality is initially processed in small spatiotemporal blocks, wherein discriminative data-specific characteristics are either manually extracted or learned. After that, an RNN is used to model large-scale temporal relationships, data fusion, and gesture categorization. Furthermore, a multi-stream RNN for large-scale gesture detection has been studied in [ 69 ]. [ 70 ] proposes a convolutional long short-term memory recurrent neural network (CNN-LSTM) capable of learning gestures of various lengths and complexity. Faced with the same challenge, MRNN, a multi-stream model that combines RNN capabilities with LSTM cells to help handle variable-length gestures, has been suggested in [ 71 ]. In [ 51 ], sequentially supervised long short-term memory (SS-LSTM) has been suggested, wherein auxiliary information is employed as sequential supervision at each time step instead of providing a class label to the output layer of the RNN. To identify sample frames from the video sequence and categorize the gesture, the authors of [ 49 ] employed a deep learning architecture: a tiled image formed by sampling the whole video is used as the network input to build the tiled binary pattern, and the trained long-term recurrent convolutional network then receives these representative frames as input. Finally, an EM-based approach for weak supervision that integrates CNNs with hidden Markov models (HMMs) has been presented in [ 71 ].
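The sketch below shows the generic CNN-LSTM pattern referred to above: a small 2D CNN encodes each frame, an LSTM models the temporal dynamics, and the last hidden state is classified. Sizes and class counts are illustrative assumptions, not the model of [ 70 ].

```python
# Generic CNN-LSTM sketch for variable-length gesture clips (illustrative sizes only).
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, num_classes=20, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, clip):                       # clip: (B, T, 3, H, W); T may differ per batch
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1)).view(b, t, -1)  # per-frame feature vectors
        _, (h_n, _) = self.lstm(feats)                           # temporal modeling
        return self.head(h_n[-1])                                # classify the last hidden state

print(CNNLSTM()(torch.randn(2, 12, 3, 64, 64)).shape)   # torch.Size([2, 20])
```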

4. Body Language Analysis of Patients and AI

4.1. Overview

Different artificial intelligence (AI) methods and techniques have been used to analyze the body language of patients. Machine learning methods have shown a high level of flexibility across a variety of medical conditions. We briefly discuss some studies conducted so far in this area.

4.2. Facial Recognition

More specifically, focusing on facial recognition, a simple system called the facial action coding system (FACS) was introduced in [ 71 ] to analyze facial muscles and thus identify different emotions. The proposed system automatically tracks faces in video and extracts geometric shapes for facial features. The study was conducted on eight patients with schizophrenia and collected dynamic information on facial muscle movements by going through the specifics of the automated FACS system and how it may be used for video analysis. There are three steps. The first stage (image processing) explains how face images are processed automatically for feature extraction. Next is action unit detection, which explains how action unit classes are trained and evaluated. The process finishes (application to video analysis) by demonstrating how to utilize the classifiers to analyze videos and gather qualitative and quantitative data on affective problems in neuropsychiatric patients. This study showed the possibility of identifying engineering measurements for individual faces and determining their exact differences for recognition purposes. According to the automated evaluation, controls 2, 3, and 4 were quite expressive, whereas patients 1, 2, and 4 were relatively flat. Control 1 and patient 3 were both in the middle of the spectrum. Patients 3 and 4 had the highest levels of inappropriate expressiveness, whereas patient 1 and controls 1–4 had moderate levels.

Three methods were used in [ 31 ] to measure facial expression to determine emotions and identify persons with mental illness. The study’s proposed facial action coding system enabled the interpretation of emotional facial expressions and thus contributed to the knowledge of therapeutic intervention for patients with mental illnesses. This can range from observing a person interacting in a group in real life to filmed encounters in which facial expressions are recorded under laboratory circumstances after the emotion is elicited experimentally. Using the picture of a filmed face for image processing and capturing precise expression changes (called action units), this technology permits the detection of fundamental emotions over time. By utilizing surface electrodes, an electromyography (EMG) approach was created to distinguish the activation of facial muscles as correctly and clearly as feasible. This technological advancement enabled the detection and independent recording of the actions of the small visible facial muscles. Regarding automatic face recognition, the quality of commercially available systems has significantly increased. The SHORE™ technology, a leading face detection system, is the result of years of research and development in the field of intelligent systems and led to a high-performance real-time C++ software library. A significant percentage of people suffer from a nervous system imbalance, which causes paralysis of the patient’s movement and unexpected falls. A better understanding of how emotions are regulated, and of how the dynamics of facial expression of emotion can be explained, could therefore lead to a better understanding of the interactive and social consequences of emotional expression deficits in people with mental illness, as well as to therapeutic intervention.

Most patients with a neurological condition have ambulatory disruption at some stage of the disease, which can lead to falls without warning signs, and each patient is unique. As a result, a technique to identify unsteady motion is required.

4.3. Fall Detection

The thesis in [ 72 ] assesses the real-time gait of a Parkinson’s disease patient in order to respond actively to unstable motions. The authors devised a real-time gait analysis algorithm using SHIMMER wireless sensors worn on the waist, chest, and hip, based on real-world data, to determine which placement is best suited to identify gait deviation. This approach is efficient, sensitive to minor deviations, and user-configurable, allowing the user to adjust the sampling rate and threshold settings for motion analysis. Researchers can utilize this technique without having to develop it themselves. The initial sampling rate is set to 100 Hz, and the algorithm operates with precalculated threshold values. Accelerometers worn on the chest register excessive acceleration during falls, and thus it is best to wear them on the waist. Additionally, as the gait analysis illustrates, if a patient takes steps with vigor, his or her gait may become steadier; nevertheless, the patient still has postural instability and falls following DBS treatment. As a result, even after surgery, such people may have impaired cognition. Another discovery is that people with this condition may tilt left or right when turning.

Because the suggested approach is sensitive to detecting falls, it may be used objectively to estimate fall risk. The same algorithm, with small tweaks, may be used to identify seizures in different conditions, primarily epileptic seizures, and inform health care personnel in an emergency.

In the medical field, fall detection is a major issue. Elderly people are more likely than others to fall, and people over the age of 65 account for more than half of all injury-related hospitalizations. Commercial fall detection devices are costly and require a monthly subscription to operate. For retirement homes and clinics to establish a smart city powered by AI and IoT, a more affordable and customizable solution is required. A reliable fall-detection system would detect a fall and notify the necessary authorities.

In [ 73 ], an edge-computing architecture was used to monitor real-time patient behavior and detect falls with an LSTM fall detection model. To track human activity, the authors employed MbientLab’s MetaMotionR wireless wearable sensors, which relayed real-time streaming data to an edge device. To analyze the streaming sensor data, they used a laptop as the edge device and built a data analysis pipeline utilizing bespoke APIs from Apache Flink, TensorFlow, and MbientLab. The model was trained on the publicly released “MobiAct” dataset. The models were shown to be efficient and may be used to analyze appropriate sampling rates, sensor placement, and multistream data correction by training them on public datasets and then improving them. Experiments demonstrated that the architecture correctly identified falls 95.8% of the time using real-time sensor data. The authors found that the optimal location for the sensors is at the waist and that the best data-gathering frequency is 50 Hz, and they showed that combining several sensors to collect multistream data improves performance.
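In the spirit of [ 73 ], the hedged sketch below shows an LSTM classifier over fixed windows of triaxial accelerometer data. The window length, the 50 Hz sampling assumption, and the layer sizes are illustrative choices, not the published model.

```python
# Hedged LSTM fall-detector sketch over wearable accelerometer windows (illustrative sizes).
import torch
import torch.nn as nn

class FallLSTM(nn.Module):
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)           # two classes: fall / no fall

    def forward(self, window):                     # window: (B, T, 3) with ax, ay, az
        _, (h_n, _) = self.lstm(window)
        return self.head(h_n[-1])

model = FallLSTM()
window = torch.randn(8, 100, 3)                    # 8 two-second windows at an assumed 50 Hz
fall_probability = model(window).softmax(dim=-1)[:, 1]
print(fall_probability.shape)                      # torch.Size([8])
```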

The authors would like to expand the framework in the future to include several types of cloud platforms, sensors, and parallel data-processing pipelines to provide a system for monitoring patients in clinics, hospitals, and retirement homes. They plan to use the MbientLab MetaTracker to construct ML models that identify additional activities and analyze biometrics such as the subject’s heartbeat before and after a fall, sleep pattern, and mobility pattern, as well as to track patients’ activity.

4.4. Smart Homes in Health Care

Many individuals, particularly the elderly and the ill, can live alone and keep their freedom and comfort in smart houses. This aim can only be achieved if smart homes monitor all activities in the house and any anomalies are quickly reported to family or nurses. As shown in Figure 3, smart houses feature a multilayered design with four levels: the physical layer (environment, objects, and inhabitants), the communication layer (wired and wireless sensor networks), the data processing layer (data storage and machine learning techniques), and the interface layer (software such as a mobile phone application). Sensors collect data about inhabitants’ activities and the status of the environment, then send it to a server’s data processing layer, where it is evaluated. Users receive the results (such as alarms) and interact with the smart home through a software interface. Edge sensors make it easier to monitor various metrics over time; the data are then sent to another device for processing and prediction, which shifts the processing burden from the sensors to more capable devices. Ref. [ 74 ] proposes an architecture for smart cameras that allows them to perform high-level inference directly within the sensor without sending the data to another device.

Figure 3. Multilayered architecture of a smart home. The figure was taken from [ 75 ].

The most common uses of smart homes in health care are automation tasks aimed at activity recognition for a range of objectives, such as activity reminders for Alzheimer’s patients and remote monitoring of people’s health via their vital signs.

4.4.1. Anomaly Detection Using Deep Learning

The authors of [ 76 ] used raw outputs from binary sensors, such as motion and door sensors, to train a recurrent network to anticipate which sensor would be turned on/off in the next event and how long it would remain in that state. They then expanded this event into k sequences of successive occurrences using beam search to discover the likely range of forthcoming actions. Several novel approaches for assessing the similarity of spatio-temporal sequences were used to evaluate the inaccuracy of this prediction, i.e., the distance between these potential sequences and the true string of events. An anomaly score can be determined by modeling this inaccuracy as a Gaussian distribution and computing its likelihood. Input sequences that score higher than a specific threshold are regarded as abnormal activities. The trials showed that this approach can detect aberrant behaviors with a high level of accuracy.

The suggested method’s general scheme is depicted in Figure 4. The raw sensor events are first preprocessed, which comprises the steps below (a minimal code sketch of these steps follows the list):

  • The SA value is derived by combining the sensor value S and the action value A.
  • The SA character string is then encoded, using either one-hot encoding or word embedding.
  • D is determined as the difference between the current and previous event timestamps.
  • Timestamps are converted in a way that captures recurrence, periodicity, and cycles.
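The sketch below walks through these steps for a single event. The raw event format (sensor id, action, ISO timestamp) is an assumption made only for illustration.

```python
# Minimal sketch of the listed pre-processing steps for one smart-home sensor event.
import math
from datetime import datetime

vocabulary = {}                                     # SA token -> integer index for encoding

def preprocess(event, previous_timestamp):
    """event: (sensor_id, action, iso_timestamp), e.g. ('M003', 'ON', '2022-01-09T07:15:02')."""
    sensor, action, ts = event
    sa = f"{sensor}_{action}"                                    # 1) combine S and A into SA
    index = vocabulary.setdefault(sa, len(vocabulary))           # 2) integer code for SA

    timestamp = datetime.fromisoformat(ts)
    delta = (timestamp - previous_timestamp).total_seconds()     # 3) D = current - previous timestamp

    seconds = timestamp.hour * 3600 + timestamp.minute * 60 + timestamp.second
    angle = 2 * math.pi * seconds / 86400                        # 4) encode time of day cyclically
    return index, delta, math.sin(angle), math.cos(angle)

prev = datetime.fromisoformat("2022-01-09T07:14:40")
print(preprocess(("M003", "ON", "2022-01-09T07:15:02"), prev))
```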

Figure 4. The overall scheme of the proposed method [ 76 ].

4.4.2. Anomaly Detection Using Bayesian Networks

A Bayesian network is a representation of a joint probability distribution of a set of random variables with a possible mutual causal relationship. The network consists of nodes representing the random variables, edges between pairs of nodes representing the causal relationship of these nodes, and a conditional probability distribution in each of the nodes. The main objective of the method is to model the posterior conditional probability distribution of outcome (often causal) variable(s) after observing new evidence [ 77 ].
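To make this definition concrete, here is a tiny worked example with a two-node network (Awake → KitchenMotion). The probabilities are hand-picked illustrative values, not data from any cited study; the posterior is obtained with Bayes’ rule after observing new evidence.

```python
# Tiny Bayesian-network update: P(Awake | kitchen motion observed) with illustrative numbers.
def posterior_awake_given_motion(p_awake=0.6, p_motion_given_awake=0.8, p_motion_given_asleep=0.05):
    # Joint probability of the evidence under each state of the parent node.
    joint_awake = p_awake * p_motion_given_awake
    joint_asleep = (1 - p_awake) * p_motion_given_asleep
    # Bayes' rule: condition on the observed evidence (the kitchen motion sensor fired).
    return joint_awake / (joint_awake + joint_asleep)

print(round(posterior_awake_given_motion(), 3))     # 0.96
```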

The goal of [ 75 ] is to identify abnormalities at the proper moment so that harmful situations can be avoided when a person interacts with household products. It aims to improve anomaly detection in smart homes by extending functionality to evaluate raw sensory data and generate suitable directed probabilistic graphical models (Bayesian networks). The idea is to determine the probability of the current sensor turning on and then have the model raise an alarm if the probability falls below a specific threshold. To do this, the authors create many Bayesian network models of various sizes and analyze them to find the optimal network with adequate causal links between random variables. The study is unique in that it uses Bayesian networks to model and train sensory data to detect abnormalities in smart homes. Furthermore, by providing an approach for removing unneeded random variables, identifying the ideal structure of the Bayesian networks leads to better assessment metrics and smaller size. (The authors consider the first-order Markov property as well as training and evaluating Bayesian networks with various subsets of random variables.)

Bayesian network models are used in [ 75 ] to analyze sensory data in smart homes, detect abnormalities, and improve occupant safety and health. Pre-processing, model learning, model assessment, and anomaly detection are the four primary steps of the proposed technique (Figure 5).

Figure 5. The proposed architecture for anomaly detection in smart homes. The figure has been taken from [ 75 ].

4.4.3. Anomaly Detection Using a Graph-Based Approach

Another data-analysis approach was presented in [ 78 ] for sensor-based smart home settings, which have been effectively deployed in the past several years to help elderly persons live more independently. Smart homes are designed not to interfere with inhabitants’ routine activities and to lower the expense of the health care connected with their care. Because senior inhabitants are more prone to cognitive health difficulties, analyzing their daily activity with an automated tool based on sensor data might offer valuable information about their health state. One way to achieve this is to apply a graph-based approach to data collected from residents’ activities. The work also presents case studies for cognitively impaired participants and discusses how to link these anomalies to the decline in their cognitive abilities, providing clinicians and caregivers with important information about their patients. An unsupervised graph technique was employed to discover temporal, spatial, and behavioral abnormalities in senior residents’ everyday activities using activity data from smart home sensors. The authors further hypothesized that these unusual actions may indicate a participant’s cognitive deterioration. Smart home activity data may be created in real time, as a data stream. Three cognitively challenged participants were recruited at random for the trial. The authors would like to change the sample and conduct several trials in the future to see whether comparable anomalies can be found. They would also like to examine the graph topology’s resilience to see how much a change in graph topology affects the outcome of anomaly detection. Furthermore, they intend to enlist the help of a doctor as a domain expert to confirm the theory that these abnormalities are true signs of cognitive deterioration (or MCI).

It is also planned to expand the tests to a real-time data stream in the future, converting real-time sensor logs into graph streams and searching for abnormalities in those streams, which could enable a real-time health-monitoring tool for residents and assist doctors and nurses.

4.5. AI for Localizing Neural Posture

The elderly and their struggle to live independently without relying on others were the subject of a study under assessment. The goal of the study [ 79 ] was to compare automated learning algorithms used to track their biological functions and motions. Using reference features, the support vector algorithm earned the highest accuracy rate, 95 percent, among the eight machine learning algorithms evaluated. Long periods of sitting are required in several vocations, which can lead to long-term spine injuries and nervous system illnesses. Some surveys aided in the development of sitting posture monitoring systems (SPMS), which use sensors attached to the chair to measure the position of the seated individual. The suggested technique had the disadvantage of requiring too many sensors.

This problem was resolved by designing sitting posture monitoring systems (SPMSs) to help assess the posture of a seated person in real time and improve sitting posture. To date, SPMS studies have required many sensors mounted on the backrest plate and seat plate of a chair. The study in [ 80 ] therefore developed a system that measures a total of six sitting postures, including a posture that applies a load to the backrest plate, with four load cells mounted only on the seat plate. Various machine learning algorithms were applied to the body-weight ratios measured by the developed SPMS to identify the method that most accurately classified the actual sitting posture of the seated person. After classifying the sitting postures with several classifiers, a support vector machine using the radial basis function kernel obtained average and maximum classification rates of 97.20 percent and 97.94 percent, respectively, from nine subjects. The suggested SPMS was able to categorize six sitting postures, including one with backrest loading, and demonstrated that sitting posture can be classified even when the number of sensors is reduced.
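The sketch below shows the classification step described in [ 80 ] in generic form: an RBF-kernel SVM over the four seat-plate load-cell ratios. The synthetic data and hyperparameters are placeholders that only illustrate the interface, so the resulting score is near chance.

```python
# RBF-SVM posture classification over four load-cell weight ratios (synthetic placeholder data).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((540, 4))                            # body-weight ratio on the 4 load cells
X /= X.sum(axis=1, keepdims=True)                   # ratios sum to 1 per sample
y = rng.integers(0, 6, size=540)                    # six sitting-posture classes

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())                                # near chance here, since the data are random
```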

Another posture problem concerns patients who are in the hospital for an extended period, for whom pressure ulcer prevention is critical. A human body lying posture (HBLP) monitoring system is required to schedule posture changes for such patients. The traditional technique of HBLP monitoring, video surveillance, has several drawbacks, including subject privacy and field-of-view occlusion. With no sensors or wires attached to the body and no limits imposed on the subject, the paper [ 81 ] presented an autonomous technique for identifying the four state-of-the-art HBLPs in healthy adult subjects: supine, prone, left, and right lateral. Experiments using a collection of textile pressure sensors embedded in a cover placed beneath the bedsheet were conducted on 12 healthy persons (aged 27.35 ± 5.39 years). A histogram of oriented gradients and local binary patterns were fed to a supervised artificial neural network classification model, and scaled conjugate gradient back-propagation was used to train it. To evaluate the classification’s generalization performance, nested cross-validation with an exhaustive outer validation loop was used. A high testing prediction accuracy of 97.9% was found, with a Cohen’s kappa coefficient of 97.2 percent. In contrast to most previous similar studies, the classification successfully separated prone and supine postures. The authors discovered that combining body-weight distribution information with shape and edge information improves classification performance and the capacity to distinguish between supine and prone positions. The findings are encouraging in terms of unobtrusively monitoring posture for ulcer prevention. Sleep studies, post-surgical treatments, and other applications that need HBLP identification can all benefit from the approach.

In patients with myopathy, peripheral neuropathy, plexopathy, or cervical/lumbar radiculopathy, needle electromyography (EMG) is utilized to diagnose neurological injury. Because needle EMG is an intrusive exam, it is critical to keep discomfort to a minimum during inspections. The Electrodiagnosis Support System (ESS), a clinical decision support system specialized for the diagnosis of upper-limb neurological damage, has been described in [ 82 ]. ESS can guide users through the diagnostic process, help them make the best choices for eliminating unnecessary examinations, and serve as a teaching tool for medical students. Users may input the results of needle EMG testing and obtain diagnostic findings through ESS’s graphical user interface, which depicts the neurological anatomy of the upper limb. The diagnostic data of 133 real patients were used to test the system’s accuracy.

4.6. AI for Monitoring Patients

In the recent decade, automated patient monitoring in hospital settings has received considerable attention. An essential issue is the behavior analysis of psychiatric patients, where good monitoring can reduce the risk of injury to hospital workers, property, and the patients themselves.

For this task, a computer vision system was created to monitor patients in safe rooms in hospitals, evaluating their movements and determining the danger of hazardous behavior by extracting visual data from cameras mounted in their rooms. To identify harmful behavior, the proposed technique leverages statistics of optical flow vectors computed from patient motions. Additionally, the approach uses foreground segmentation and blob tracking to extract the shape and temporal properties of blobs corresponding to activities such as arriving and leaving the room, sleeping, fighting, conversing, and attempting to escape, as shown in Figure 6. Preliminary findings suggest that the technology might be used in a real hospital setting to help avoid harm to patients and employees. A more advanced classification framework for merging the characteristics might be used to increase system performance and attain a practically low error rate.

Figure 6. Example of activities to be detected. These images have been taken from [ 83 ].

Intelligent sensors and wireless communication networks are used in smart health care equipment and applications. The goal of this integration is to improve patient monitoring and make the detection of minor illnesses easier.

The study conducted in [ 84 ] presents a multilevel decision system (MDS) for recognizing and monitoring patient behavior based on sensed data. Wearable sensing devices are worn on the body to detect physiological changes at set intervals. The data collected by these sensors are utilized by the health care system (HS) to diagnose and predict illnesses. In the proposed MDS, there are two layers of decision-making: the first is designed to speed up the data collection and fusion process, while the second uses data correlation to detect certain behaviors. Inter-level optimization reduces errors by fusing multi-window sensor data, allowing for correlation; this optimization acts as a bridge between the first and second decision-making stages. The wearable sensor and health care system are depicted as part of the decision-making process. Using multi-window fusion decision-making, the health care system (HS) in Figure 7 performs activity/behavior extraction, data fusion, and feature extraction. It has data-streaming characteristics that make it easier to make decisions, even with nonlinear sensor results. Storage, updating, analysis, and correlation of sensor data are carried out in the second decision-making phase. The data from the body-worn wearable sensors are compiled on a smart handheld device (e.g., cellphones, digital gadgets) and sent to the HS over the Internet.

Figure 7. Wearable sensors (WS) to the health care system. The figure has been taken from [ 84 ].

The patient’s behavior and the type of ailment are recognized based on this information for use in future diagnosis and prediction. MDS also uses flexible information analysis to match patient behavioral analysis and produce improved recommendations. MDS’s dependability is demonstrated by experimental analysis, which shows improvements in the true positive rate, F-measure score, accuracy, and fusion latency.

4.7. AI and Patient’s Lower Limb Movement

Human lower limb motion analysis refers to the qualitative and quantitative study of climbing, running, and walking. It is based on kinematic notions as well as human anatomy and physiology, and it is frequently used in augmented and virtual reality, foot navigation, and medical rehabilitation, among other applications [ 85 ].

4.7.1. Evaluation of Paraplegics’ Legged Mobility

The inability to walk and stand, along with a general reduction in movement, is one of the most important disabilities caused by paraplegia. This research [ 86 ] examined a lower limb exoskeleton for paraplegics who require leg movement. The research offers a single-subject case study on a patient with a T10 motor and sensory complete injury, comparing legged movement using an exoskeleton versus locomotion using knee–ankle–foot orthoses (KAFOs). The timed up-and-go test, the Ten-Meter Walk Test (10 MWT), and the six-minute walk test (6 MWT) were used to measure the subject’s capacity to stand, walk, turn, and sit. The Physiological Cost Index was used to determine the level of effort associated with each assessment tool. Results indicate that the subject was able to perform the respective assessment instruments 25%, 70%, and 80% faster with the exoskeleton relative to the KAFOs for the timed up-and-go test, the 10 MWT, and the 6 MWT, respectively. Measurements of exertion indicate that the exoskeleton requires 1.6, 5.2, and 3.2 times less exertion than the KAFOs for each respective assessment instrument. The results indicate that the enhancement in speed and the reduction in exertion are more significant during walking than during gait transitions.

4.7.2. Estimating Clinically of Strokes in Gait Speed Changing

In persons who have had a stroke, gait speed is routinely used to determine walking capacity, but it is unclear how much of a change in gait speed corresponds to a significant change in walking capacity. The goal of the study [ 87 ] was to quantify clinically significant changes in gait speed using two distinct anchors for “significant”: stroke survivors’ and physical therapists’ perceptions of improvement in walking capacity. After a first-time stroke, the participants received outpatient physical therapy (mean 56 days post-stroke). At admission and discharge, self-selected walking speed was assessed. On a 15-point ordinal global rating of change (GROC) scale, subjects and their physical therapists rated the perceived change in walking ability at discharge. Using receiver operating characteristic curves with the participants’ and physical therapists’ GROC scores as anchors, estimated clinically relevant change values for gait speed were determined. The subjects’ initial gait speeds averaged 0.56 (0.22) m/s. Depending on the anchor, the estimated significant change in gait speed was between 0.175 m/s (participants’ perceived change in walking ability) and 0.190 m/s (physical therapists’ perceived change in walking ability). Individuals who increase their gait speed by 0.175 m/s or more during the subacute period of rehabilitation are more likely to experience a considerable improvement in walking ability. Clinicians and researchers can utilize the estimated clinically relevant change value of 0.175 m/s to set goals and analyze change in individual patients, as well as to compare important changes between groups.

4.7.3. Measuring Parkinson’s Gait Quality

Wearable sensors that monitor gait quality in daily activities have the potential to improve the medical evaluation of Parkinson’s disease (PD). Four gait partitioning strategies were examined in [ 88 ], two based on machine learning and two based on thresholds, all using the four-phase model. The approaches were evaluated during walking tasks on 26 PD patients in both ON and OFF levodopa conditions, as well as on 11 healthy volunteers. All participants wore inertial sensors on their feet. The reference time sequence of gait phases was assessed using force-resistive sensors. The goodness index (G) was used to determine the accuracy of gait phase estimation. For gait quality evaluation, a new synthetic index termed the gait phase quality index (GPQI) was developed. The results indicated that three of the examined techniques had optimal performance (G < 0.25) and one threshold approach had acceptable performance (0.25 < G < 0.70). The GPQI was shown to be considerably higher in PD patients than in healthy controls, with a moderate correlation with clinical scale scores. Furthermore, GPQI was shown to be greater in the OFF state than in the ON state in individuals with significant gait impairment. These findings show that real-time gait segmentation based on wearable sensors may be used to assess gait quality in people with Parkinson’s disease.

4.8. Remark

Recent advancements in low-cost smart home devices and wireless sensor technology have resulted in an explosion of small, portable sensors that can measure body motion rapidly and precisely. Practical and beneficial movement-tracking technologies are now available, and therapists need to be aware of their possible benefits and drawbacks. As noted in [ 89 ], therapists may be able to undertake telerehabilitation in the future using body-worn sensors to assess compliance with home exercise regimens and the quality of patients’ natural movement in the community. Therapists want technology tools that are simple to use and give actionable data and reports to their patients and referring doctors, and they should look for systems that have been evaluated against gold-standard accuracy as well as clinically relevant outcomes such as fall risk and impairment severity.

5. AI and COVID-19

5.1. Overview

The medical sector is seeking innovative tools to monitor and manage the spread of COVID-19 in this global health disaster. Artificial intelligence (AI), the Internet of Things (IoT), big data, and machine learning are technologies that can readily track the transmission of this virus, identify high-risk individuals, anticipate new illnesses, and aid in real-time infection management. These technologies might also forecast mortality risk by thoroughly evaluating patients’ historical data.

The study by [ 90 ] examined the role of artificial intelligence (AI) as a critical tool for analyzing, preparing for, and combating COVID-19 (Coronavirus) and other pandemics. AI can aid in the fight against the virus by providing population screening, medical assistance, notification, and infection control recommendations. As an evidence-based medical tool, this technology has the potential to enhance the COVID-19 patient’s planning, treatment, and reported outcomes.

Artificial Intelligence (AI) is an emerging and promising technology for detecting early coronavirus infections and monitoring the state of affected individuals. It can monitor the COVID-19 outbreak at many scales, including medical, molecular, and epidemiological applications. It is also beneficial to aid viral research by evaluating the existing data. Artificial intelligence can aid in the creation of effective treatment regimens, preventative initiatives, and medication and vaccine development.

The basic approach of AI and non-AI-based programs that assist general physicians in identifying COVID-19 symptoms is shown in Figure 8. The flow diagram illustrates and contrasts a minimal non-AI workflow with an AI-based one, and demonstrates how AI is used in key aspects of high-accuracy therapy, reducing the complexity and time required. With the AI application, the physician is focused not only on the patient’s therapy but also on disease control. AI is used to analyze major symptoms and test results with a high level of accuracy. It also reduces the overall number of steps in the entire process, making it more readily accessible.

Figure 8. The general procedure by which AI and non-AI-based applications help general physicians to identify COVID-19 symptoms. This figure has been taken from [ 90 ].

5.2. AI Training Techniques

The medical field makes use of two different AI paradigms. In supervised learning, the data are labeled and the model learns to map features to an outcome that is known beforehand, which makes it easier to score the model and track its performance. The other technique is unsupervised learning, which uses unlabeled and unstructured data fed to a model, giving it the opportunity to learn and extract useful information from the data as it sees fit. These techniques are utilized for various tasks such as early warning systems and faster discovery of cures.

5.2.1. Supervised Learning

Supervised Learning is one of the most often used techniques in the health care system, and it is well-established. This learning approach makes use of labeled data X with a provided target Y to learn how to predict the correct value of Y given input X.

Supervised learning can help provide a solid foundation for planned COVID-19 observation and forecasting. A neural system might also be developed to extract the visual features of this disease, which would aid in the proper diagnosis and treatment of those who are affected. An Xception-style CNN technique based on depth-wise separable convolution layers has been presented in [ 91 ]: two convolution layers are at the top, followed by a related layer, four convolution layers, and depth-wise separable convolution layers. In [ 80 ], supervised learning was used to identify bed positions using a variety of bed pressure sensors. Because of its capabilities and high-efficiency outcomes, supervised learning can be a beneficial weapon in battling COVID-19.

5.2.2. Unsupervised Learning

Instead of using labeled data as in the previous learning strategy, this technique employs data without supervisory signals. It is widely used to discover hidden structure in data and to divide the data into small groups. Its primary purpose is to construct a clear differentiation. This is a promising type of estimation for meeting the general AI requirement, although it lags far behind the previously stated learning approach. The autoencoder [ 92 ] and K-means [ 93 ] are the most well-known unsupervised techniques. Anomaly detection [ 94 ] is one of the most common uses of this learning strategy in the medical field. Normal data follow a similar distribution; any data point that deviates from it, as an outlier, can be flagged or observed without difficulty. There are many solutions that are relatively cheap and allow deploying AI models quickly, such as Nvidia’s Jetson Nano kit, a Raspberry Pi, or Google’s Coral, for example. Therefore, this concept may be applied to CT scan pictures as well as other medical applications, such as for COVID-19.
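As a hedged illustration of this unsupervised idea, the sketch below clusters unlabeled feature vectors with K-means and flags the points furthest from every cluster centre as potential anomalies. The random descriptors and the percentile threshold are illustrative assumptions.

```python
# K-means clustering of unlabeled descriptors with a simple distance-based anomaly flag.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 16))               # placeholder descriptors (e.g., from CT images)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
distances = np.min(kmeans.transform(features), axis=1)   # distance to the closest cluster centre
threshold = np.percentile(distances, 97.5)                # illustrative cut-off
anomalies = np.where(distances > threshold)[0]
print(kmeans.labels_[:10], anomalies)
```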

The authors of [ 95 ] proposed a new framework for this learning model for opportunistic cameras that record motion data from a stream. A neural network is then used to predict how the event stream will move, and this motion is used to attempt to remove any movement concealed in the streaming images. To further explain this concept, the applied learning approach is depicted in Figure 9, where the training and testing attributes collected from the patient are denoted by the symbol X. In this case, accuracy is not a priority; instead, the approach’s purpose is to uncover any interesting patterns among the available data. Furthermore, additional information can be used to corroborate or disprove the samples it detects.

Figure 9. Network architecture for both the optical flow and the egomotion and depth networks. This figure has been taken from [ 95 ].

5.3. Real World Use Cases

The contribution of AI to the fight against COVID-19, as well as the present restrictions on these efforts, is discussed in [ 96 ]. Six areas where AI may help in the battle against COVID-19 are: (i) early warnings and alerts, (ii) prediction and tracking, (iii) data dashboards, (iv) diagnosis, (v) cures and treatment, and (vi) reducing health care workers’ workloads. The conclusion is that AI has yet to make a decisive impact on COVID-19. Its utilization is restricted by a lack of data as well as by an abundance of noisy data. To overcome these limitations, a careful balance between data privacy and public health, as well as rigorous human-AI interaction, will be required. These are unlikely to be addressed in time to be of much use during the current pandemic. Meanwhile, large-scale collection of diagnostic data on who is infectious will be required to save lives, train AI, and reduce economic losses. In [ 93 ], different AI techniques are utilized for COVID-19 detection; despite their major differences, these techniques provide admirable results that helped make it easier and faster to detect the spread of COVID-19. The discussed techniques are explained in detail below.

5.3.1. Early Warnings and Alerts

AI can swiftly identify unusual symptoms and other red flags, alerting patients and health care providers [ 97 ]. It supports faster, more cost-effective decision making. Through relevant algorithms, it aids in the development of novel diagnosis and management strategies for COVID-19 patients. With the use of medical imaging technologies such as computed tomography (CT) and magnetic resonance imaging (MRI) scans of human body parts, AI can assist in the identification of infected patients.

For example, BlueDot, a Canadian AI model, demonstrates how a low-cost AI tool (BlueDot was supported by a startup investment of roughly US$ 9 million) may outperform humans at detecting infectious disease epidemics, as shown in [ 98 ]. According to reports, BlueDot foresaw the epidemic at the end of 2019, giving a warning to its clients on 31 December 2019, ahead of the World Health Organization’s announcement on 9 January 2020. In [ 99 ], a group of academics worked with BlueDot and compiled a list of the top 20 destinations for travelers flying from Wuhan after the epidemic, cautioning that these cities might be at the forefront of the global spread of the disease.

While BlueDot is unquestionably a strong tool, much of the press around it has been exaggerated and undervalues the contribution of human scientists. For instance, while BlueDot raised an alarm on 31 December 2019, another AI-based model at Boston Children’s Hospital (USA), which reads the HealthMap [ 100 ], raised a warning on 30 December 2019.

5.3.2. Prediction and Tracking

AI may be used to track and forecast the spread of COVID-19 over time and space. A neural network may be built to extract visual aspects of this condition, which would aid in adequate monitoring [ 101 ]. It has the potential to offer daily information on patients as well as remedies to be implemented during the COVID-19 pandemic.

For example, during the Zika outbreak in 2015, a dynamic neural network was constructed to anticipate the spread of the virus. Models such as these, however, will need to be retrained using data from the COVID-19 pandemic. Various projects are underway to collect training data from the present epidemic, as detailed below.

Accurate pandemic predictions are plagued by various issues; see, for example, [ 102 ]. These include a dearth of historical data on which to train the AI, panic behavior that causes “noise” on social media, and the fact that COVID-19 infections have different features from prior pandemics. Not only is there a paucity of historical data, but there are also issues with employing “big data”, such as information gleaned from social media. The risks of big data and AI in the context of infectious illnesses, demonstrated by Google Flu Trends’ notable failure, remain valid: “big data hubris and algorithm dynamics”, as [ 103 ] put it. For example, as the virus spreads and the quantity of social media traffic around it grows, so does the amount of noise that must be filtered out before important patterns can be recognized.

AI estimates of COVID-19 spread are not yet particularly accurate or dependable because of the lack of data, big data hubris, algorithmic dynamics, and noisy social media data.

As a result, most tracking and forecasting models do not employ AI technologies. Instead, most forecasters choose well-established epidemiological models, often known as SIR models, which stand for the susceptible, infected, and removed populations in a certain area. The Future of Humanity Institute at Oxford University, for example, uses the GLEAMviz epidemiological model to anticipate the virus’s spread; see [104].

An Epidemic Tracker model of illness propagation is available from Metabiota, a San Francisco-based startup [105]. In a YouTube video, Crawford, an Oxford University mathematician, gives a simple and concise explanation of SIR models [106].

The Robert Koch Institute in Berlin employs an epidemiological SIR model that incorporates government containment measures, including quarantines, lockdowns, and social distancing; the model is explained in [107]. Recently, an enhanced SIR model that takes public health interventions against the pandemic into consideration, and uses data from China, was pre-published and made accessible in R format [108].

The Robert Koch Institute’s model has already been utilized in the instance of China to show that containment can be effective in slowing the spread to less than exponential rates [107].
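
For readers unfamiliar with SIR models, the sketch below integrates the standard susceptible–infected–removed equations and adds a crude containment factor that reduces the contact rate after an intervention date. It is a generic illustration with assumed parameter values, not the Robert Koch Institute or GLEAMviz model.

```python
import numpy as np
from scipy.integrate import odeint

N = 1_000_000           # population size (assumed)
beta, gamma = 0.4, 0.1  # contact and recovery rates (illustrative)
t_intervention = 40     # day on which containment starts (assumed)
containment = 0.5       # fraction to which contacts are reduced

def sir(y, t):
    S, I, R = y
    # Contact rate drops once containment measures take effect.
    b = beta * (containment if t >= t_intervention else 1.0)
    dS = -b * S * I / N
    dI = b * S * I / N - gamma * I
    dR = gamma * I
    return dS, dI, dR

t = np.linspace(0, 180, 181)             # simulate 180 days
S, I, R = odeint(sir, (N - 1, 1, 0), t).T
print(f"Peak infections: {I.max():.0f} on day {t[I.argmax()]:.0f}")
```

In practice, the contact and recovery rates are fitted to observed case counts, and the intervention date and strength are varied to compare containment scenarios, which is exactly how such models are used to show that containment can bend the curve below exponential growth.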

5.3.3. Data Dashboards

COVID-19 tracking and forecasting has spawned a cottage industry of data dashboards for visualizing the actual and predicted spread. The MIT Technology Review [109] has ranked these dashboards for tracking and forecasting. According to them, HealthMap, UpCode, Thebaselab, NextStrain, the BBC, Johns Hopkins’ CSSE, and the New York Times have the best dashboards. Microsoft Bing’s COVID-19 Tracker is another important dashboard; see Figure 10.

Figure 10. Microsoft Bing’s COVID-19 Tracker. Note(s): screenshot of Bing’s COVID-19 Tracker, 9 February 2022.

While these dashboards provide a global overview, an increasing number of countries and cities have their own dashboards in place. For example, South Africa established the COVID-19 ZA South Africa Dashboard, which is maintained by the University of Pretoria’s Data Science for Social Impact Research Group [110].

Tableau has produced a COVID-19 data hub with a COVID-19 Starter Workbook to help with the creation of data visualizations and dashboards for the epidemic [ 111 ].
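
At their core, these dashboards visualize time-series case counts. The sketch below is a minimal, self-built example (not the code behind any of the dashboards named above) that assumes a local CSV with date, country, and confirmed columns and plots cumulative cases for a few countries.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical input file with columns: date, country, confirmed
df = pd.read_csv("covid_cases.csv", parse_dates=["date"])

countries = ["South Africa", "Germany", "Brazil"]  # any countries present in the file
fig, ax = plt.subplots(figsize=(8, 4))
for country in countries:
    sub = df[df["country"] == country].sort_values("date")
    ax.plot(sub["date"], sub["confirmed"], label=country)

ax.set_xlabel("Date")
ax.set_ylabel("Cumulative confirmed cases")
ax.set_title("Minimal COVID-19 dashboard panel")
ax.legend()
fig.tight_layout()
fig.savefig("dashboard_panel.png")  # one static panel of a dashboard
```

Production dashboards add interactivity, per-capita normalization, and forecast overlays, but the data plumbing shown here is the common starting point.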

5.3.4. Diagnosis

COVID-19 diagnosis that is quick and accurate can save lives, prevent disease transmission, and produce data on which AI models can learn. In this case, AI might be helpful, especially when establishing a diagnosis based on chest radiography pictures. In a recent assessment of artificial intelligence applications against the coronavirus, studies have demonstrated that AI can be as accurate as humans, save radiologists’ time, and provide a diagnosis faster and more cheaply than normal COVID-19 tests [112].

For COVID-19, AI can thus save radiologists time and help them diagnose the disease faster and more affordably than current diagnostics. Both X-rays and computed tomography (CT) scans are options. A lesson on how to diagnose COVID-19 from X-ray pictures using deep learning has been provided in [113]; its author points out that COVID-19 tests are “in low supply and costly”, but “all hospitals have X-ray machines”. A method for scanning CT scans with mobile phones has been presented in [114].

In this context, several projects are in the works. COVID-Net, created by [115], is a deep convolutional neural network (see, for example, [116]) that can diagnose coronavirus from chest X-ray pictures. It was trained using data from roughly 13,000 individuals with diverse lung diseases, including COVID-19, from an open repository. However, as the authors point out, it is “far from a production-ready solution”, and they urge the scientific community to continue working on it, especially to “increase sensitivity” (Ibid, p. 6). A deep learning model to diagnose COVID-19 from CT scans (not yet peer-reviewed) has been presented in [117], concluding that “The deep learning model showed comparable performance with an expert radiologist, and greatly improve the efficiency of radiologists in clinical practice. It holds great potential to relieve the pressure off frontline radiologists, improve early diagnosis, isolation, and treatment, and thus contribute to the control of the epidemic” (Ibid, p. 1).
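
The diagnostic systems described above are, at heart, image classifiers. The sketch below is not COVID-Net; it is a generic transfer-learning baseline that fine-tunes a pretrained ResNet-18 on a hypothetical folder of chest X-rays sorted into covid, pneumonia, and normal classes.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Chest X-rays arranged as xray_data/train/<class>/*.png with three assumed
# classes: covid, pneumonia, normal.
tfm = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # X-rays are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_ds = datasets.ImageFolder("xray_data/train", transform=tfm)
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)

# Pretrained backbone with a new classification head for our classes.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # short demo run
    for images, labels in train_dl:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: last-batch loss {loss.item():.3f}")
```

In practice, class imbalance, external validation, and sensitivity analysis, the very issues raised by the COVID-Net authors, matter far more than the choice of backbone architecture.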

Researchers in Delft, the Netherlands, for example, developed an AI model for detecting coronavirus from X-rays at the end of March 2020. On their website [111], this model, dubbed CAD4COVID, is touted as an “artificial intelligence program that triages COVID-19 suspicions on chest X-ray pictures”. It is based on prior AI models for TB diagnosis created by the same institution.

Although it has been claimed that a handful of Chinese hospitals have installed “AI-assisted” radiology technologies (see, for example, the report in [118]), the promise has yet to be realized. Radiologists in other countries have voiced worry that there is not enough data to train AI models, that most COVID-19 pictures come from Chinese hospitals and may be biased, and that utilizing CT scans and X-rays might contaminate equipment and spread the disease further.

Finally, once someone has been diagnosed with the disease, the question arises of whether, and how severely, they will be affected. COVID-19 does not always need rigorous treatment. Being able to predict who will be impacted more severely can aid in the targeting of assistance and the allocation and utilization of medical resources. Using data from only 29 patients at Tongji Hospital in Wuhan, China, the authors of [119] developed a prognostic prediction algorithm to forecast the mortality risk of an infected person. Moreover, the authors of [120] have offered an AI that can predict, with 80% accuracy, who will suffer acute respiratory distress syndrome (ARDS) after contracting COVID-19.
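
As a generic illustration of how such prognostic models are built (not the exact algorithm of [119] or [120]), the sketch below trains a gradient-boosted classifier on a hypothetical table of clinical features to predict a binary severe-outcome label.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical tabular dataset: one row per patient, a binary "severe" label,
# and clinical features such as age, lymphocyte count, CRP, LDH, etc.
df = pd.read_csv("patients.csv")
X = df.drop(columns=["severe"])
y = df["severe"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)[:, 1]
print("ROC AUC:", round(roc_auc_score(y_test, probs), 3))
# Feature importances hint at which clinical markers drive the risk estimate.
print(sorted(zip(clf.feature_importances_, X.columns), reverse=True)[:5])
```

With cohorts as small as the 29 patients mentioned above, cross-validation and external validation are essential before any clinical use.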

5.3.5. Faster Cure Discovery

Long before the coronavirus epidemic, AI was praised for its potential to aid in the development of novel drugs; see, for example, [121]. In the instance of the coronavirus, several research institutes and data centers have already said that AI will be used to find therapies and a vaccine for the virus. The hope is that artificial intelligence will speed up both the discovery of new medications and the repurposing of current ones. By assessing the existing data on COVID-19, AI is employed for medication research. It may be used to design and develop medication delivery systems. It is also utilized to speed up drug testing in real time, where normal testing takes a long time, and so helps to considerably accelerate a procedure that would be impossible for a human to carry out at that pace. For example, Google’s DeepMind, best known for its AlphaGo game-playing algorithm, has used AI to predict the structure of viral proteins [122], which might aid in the development of novel treatments. DeepMind, on the other hand, makes it explicit on its COVID-19 website (2020) that “we emphasize that these structure predictions have not been experimentally verified…we can’t be certain of the accuracy of the structures we are providing”.

5.3.6. Repurposing Existing Drugs

Beck et al. [123] provide findings from a study that used machine learning to determine whether an existing medicine, atazanavir, may be repurposed for coronavirus treatment. In collaboration with Benevolent AI, a UK AI business, the authors of [101] identified baricitinib, a drug used to treat myelofibrosis and rheumatoid arthritis, as a viable COVID-19 therapy. AI can assist in the discovery of effective medications to treat coronavirus patients and has evolved into a useful tool for developing diagnostic tests and vaccines [124]. According to [125], AI aids in the creation of vaccines and therapies at a much faster rate than before, as well as in the clinical trials conducted during vaccine development.
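
One common ingredient of computational repurposing pipelines is chemical-similarity screening. The sketch below, a generic illustration rather than the method of [123] or [101], ranks a few candidate molecules by Tanimoto similarity of Morgan fingerprints to a reference compound using RDKit; the listed drugs and SMILES strings are placeholders chosen purely for illustration.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

# Placeholder molecules for illustration only (not actual COVID-19 candidates).
reference_smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"           # aspirin, as a stand-in reference
candidates = {
    "drug_a": "CC(=O)NC1=CC=C(O)C=C1",                  # paracetamol
    "drug_b": "CN1C=NC2=C1C(=O)N(C)C(=O)N2C",           # caffeine
    "drug_c": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",          # ibuprofen
}

def fingerprint(smiles):
    """Morgan (circular) fingerprint of a molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

ref_fp = fingerprint(reference_smiles)
scores = {
    name: DataStructs.TanimotoSimilarity(ref_fp, fingerprint(smi))
    for name, smi in candidates.items()
}
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: Tanimoto similarity {score:.2f}")
```

Similarity screening only shortlists candidates; knowledge-graph reasoning (as used in the baricitinib work) and, ultimately, clinical trials decide whether a repurposed drug is actually effective.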

5.4. AI and the Reduction of Health Care Workers’ Workloads

Health care workers are overworked because of the sudden and significant increase in the number of patients during the COVID-19 epidemic. In this case, artificial intelligence (AI) is employed to lessen the burden on health care staff. Hence, the research in [85] studied a severe problem in the sustainable development process by classifying confirmed cases of the coronavirus (the new version, COVID-19) as one of the pandemic illnesses. Binary classification modeling was employed as the artificial intelligence approach, using a group method of data handling (GMDH) type of neural network. The suggested model was built using the Hubei province of China as a case study, with significant characteristics such as minimum, average, and maximum city density, relative humidity, daily temperature, and wind speed as the input dataset, and the number of verified cases over 30 days as the output dataset.

The suggested binary classification model outperforms competing approaches in predicting confirmed cases. In addition, regression analysis was performed, and the trend of confirmed cases was compared to daily changes in weather parameters (humidity, average temperature, and wind).

According to the findings, the maximum daily temperature and relative humidity had the greatest influence on the confirmed cases. In the primary case study, the confirmed cases were impacted positively by the relative humidity, which averaged 77.9%, and adversely by the maximum daily temperature, which averaged 15.4 °C.
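
To show the shape of this binary classification task, the sketch below substitutes a small multilayer perceptron for the GMDH-type network of [85] (a deliberate simplification) and assumes a 30-row CSV of daily weather and density features together with a confirmed-case count.

```python
import pandas as pd
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Hypothetical 30-day dataset: one row per day with weather/density features
# and the number of confirmed cases recorded on that day.
df = pd.read_csv("hubei_daily.csv")
features = ["city_density", "relative_humidity", "daily_temperature", "wind_speed"]
X = df[features]
# Binary target: is the day a high-case day (above the median count)?
y = (df["confirmed_cases"] > df["confirmed_cases"].median()).astype(int)

# An MLP stands in here for the GMDH-type network of the original study.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
)
scores = cross_val_score(model, X, y, cv=5)
print("Cross-validated accuracy:", round(scores.mean(), 2))
```

The point of the exercise is the feature-to-outcome mapping; whichever model family is used, the reported influence of temperature and humidity can then be read off from its fitted coefficients or feature importances.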

Digital techniques and decision science can also be used to offer the best possible training to students and clinicians on this emerging illness [126]. AI can improve future patient care and handle more potential difficulties, reducing doctors’ burden.

5.5. Remark

From an epidemiological, diagnostic, and pharmacological standpoint, AI has yet to play a substantial part in the fight against the coronavirus. Its application is limited by a shortage of data, outlier data, and an abundance of noise. It is vital to create unbiased time-series data for artificial intelligence training. While the expanding number of worldwide activities in this area is promising, more diagnostic testing is required, not just for supplying training data to AI models, but also for better controlling the epidemic and lowering the cost in terms of human lives and economic harm.

Data are crucial in determining whether AI can be used to combat future diseases and pandemics. As previously stated in [96], the risk is that public health reasons will override data privacy concerns. Long after the epidemic has passed, governments may choose to continue the unparalleled surveillance of their populations. As a result, worries regarding data privacy are reasonable.

6. Significance of the Study (Body Language Symptoms for COVID-19)

Communication is one of the most crucial skills a physician should have, according to patient surveys. However, communication encompasses more than just what is spoken. From the time a patient first visits a physician, his or her nonverbal communication, or body language, shapes the course of therapy. Body language encompasses all nonverbal forms of communication, including posture, facial expression, and body movements. Being aware of such cues can help doctors gain better access to their patients. Patient involvement, compliance, and outcomes can all be influenced by effective nonverbal communication [127].

Pandemic and epidemic illnesses are a worldwide threat that can kill millions of people, and doctors have limited means to recognize and treat victims. Human and technological resources are still in short supply when it comes to epidemic and pandemic conditions. To improve the treatment process, and when the patient is unable to travel to the treatment location, remote diagnosis is necessary and the patient’s status should be examined automatically. Altered facial wrinkles, movements of the eyes and eyebrows, some protrusion of the nose, changes of the lips, and the appearance of certain motions of the hands, shoulders, chest, head, and other areas of the body are all characteristics of pandemic and epidemic illnesses. Artificial intelligence technology has shown promise in understanding these motions and cues in some cases. As a result, the idea arose of applying body language analysis to identifying epidemic diseases in patients early, treating them early, and assisting doctors in recognizing them, owing to the speed with which these diseases spread and kill. It should be emphasized that COVID-19, which horrified the entire world and changed everyday life, was the major and crucial motivator for this study, after we examined body language analysis research in health care and defined an automatic recognition frame using artificial intelligence to recognize various body language elements.

As researchers in information technology and computer science, we must contribute by discussing an automatic gesture recognition model that helps better identify the external symptoms of epidemic and pandemic diseases, for the benefit of mankind.
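
As one possible starting point for such a model (our own sketch, not the recognition frame proposed in the surveyed works), the code below classifies short sequences of body keypoints, such as those produced by an off-the-shelf pose estimator, with a small LSTM; the number of keypoints, the class labels, and the random data standing in for real clips are all assumptions.

```python
import torch
import torch.nn as nn

NUM_KEYPOINTS = 33    # e.g., output size of a full-body pose estimator (assumed)
SEQ_LEN = 60          # frames per clip (assumed)
NUM_CLASSES = 3       # e.g., "coughing motion", "face touching", "other" (assumed)

class GestureClassifier(nn.Module):
    """LSTM over per-frame (x, y) keypoint coordinates."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=NUM_KEYPOINTS * 2, hidden_size=64, batch_first=True)
        self.head = nn.Linear(64, NUM_CLASSES)

    def forward(self, x):                # x: (batch, SEQ_LEN, NUM_KEYPOINTS * 2)
        _, (h, _) = self.lstm(x)
        return self.head(h[-1])          # class logits from the final hidden state

# Quick smoke test on random tensors standing in for real pose sequences.
model = GestureClassifier()
clips = torch.randn(4, SEQ_LEN, NUM_KEYPOINTS * 2)
labels = torch.randint(0, NUM_CLASSES, (4,))
loss = nn.CrossEntropyLoss()(model(clips), labels)
loss.backward()
print("logits shape:", model(clips).shape, "| loss:", float(loss))
```

A deployable system would feed the classifier with landmarks from a real pose-estimation front end and would need clinically labeled video data, which is precisely the data gap this survey highlights.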

7. Conclusions

In this paper, we reviewed the recent literature analyzing patients’ body language using deep learning techniques. Since most of this research is ongoing, we focused on body language analysis research in health care. In these recent works, most of the health care research aims to define an automatic recognition frame using artificial intelligence to recognize various body language elements. It will be interesting to discuss an automatic gesture recognition model that helps better identify the external symptoms of epidemic and pandemic diseases.

The body language analysis of patients using artificial intelligence to identify the external symptoms of epidemic and pandemic diseases is a motivating issue for future research to improve the treatment process, including cases where the patient cannot reach the place of treatment, remote diagnosis is required, and the patient’s condition should be analyzed automatically.

Acknowledgments

The authors would like to thank the Research Management Center, Malaysia International Islamic University for funding this work by Grant RMCG20-023-0023. Also, the authors would like to thank the United Arab Emirates University for funding this work under UAEU Strategic Research Grant G00003676 (Fund No.: 12R136) through Big Data Analytics Center.

Funding Statement

This research was funded by Malaysia International Islamic University Research Management Center Grant RMCG20-023-0023, and United Arab Emirates University Strategic Research Grant G00003676 (Fund No.: 12R136) through Big Data Analytics Center.

Author Contributions

Conceptualization, R.A. and A.A.; Methodology, R.A., S.T. and S.W.; Validation, R.A. and M.A.H.A.; Formal analysis, R.A. and S.T.; Resources, S.T.; Data curation, R.A. and A.A.; Writing – original draft, A.A.; Writing – review & editing, R.A., S.T., M.A.H.A. and S.W.; Visualization, M.A.H.A.; Supervision, R.A. and S.W.; Project administration, R.A. and S.T.; Funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Institutional Review Board Statement, Informed Consent Statement, and Data Availability Statement

Conflicts of Interest

The authors declare no conflict of interest.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
