A method to quantify autonomic nervous system function in healthy, able-bodied individuals

The autonomic nervous system (ANS) maintains physiological homeostasis in various organ systems via parasympathetic and sympathetic branches. ANS function is altered in common diffuse and focal conditions and heralds the beginning of environmental and disease stresses. Reliable, sensitive, and quantitative biomarkers, first defined in healthy participants, could discriminate among clinically useful changes in ANS function. This framework combines controlled autonomic testing with feature extraction during physiological responses. Twenty-one individuals were assessed in two morning and two afternoon sessions over two weeks. Each session included five standard clinical tests probing autonomic function: squat test, cold pressor test, diving reflex test, deep breathing, and Valsalva maneuver. Noninvasive sensors captured continuous electrocardiography, blood pressure, breathing, electrodermal activity, and pupil diameter. Heart rate, heart rate variability, mean arterial pressure, electrodermal activity, and pupil diameter responses to the perturbations were extracted, and averages across participants were computed. A template matching algorithm calculated scaling and stretching features that optimally fit the average to an individual response. These features were grouped based on test and modality to derive sympathetic and parasympathetic indices for this healthy population. A significant positive correlation (p = 0.000377) was found between sympathetic amplitude response and body mass index. Additionally, longer duration and larger amplitude sympathetic and longer duration parasympathetic responses occurred in afternoon testing sessions; larger amplitude parasympathetic responses occurred in morning sessions. These results demonstrate the robustness and sensitivity of an algorithmic approach to extract multimodal responses from standard tests. This novel method of quantifying ANS function can be used for early diagnosis, measurement of disease progression, or treatment evaluation. This study registered with Clinicaltrials.gov, identifier NCT04100486. Registered September 24, 2019, https://www.clinicaltrials.gov/ct2/show/NCT04100486.

The potential data acquired from this range of tests, however, will fast exceed the size and homogeneity requirement for standard statistical evaluation. One early study developed a composite scoring scale that was able to detect generalized autonomic dysfunction, but it failed on discriminating by disorder or diagnosis (Low, 1993). While much attention has focused on measuring heart rate variability (HRV) as a proxy for vagal tone (Akselrod et al., 1981;Beckers et al., 2001;Ducla-Soares et al., 2007;Mainardi, 2009;Pagani et al., 1984), there is less evidence of the usefulness of this relationship (Billman, 2013;Ernst, 2017). HRV measures quantify fluctuations between inter-beat intervals (IBI); different time-and frequency-domain indices have been linked to short term (~five minutes) and 24-h metrics of sympathetic or parasympathetic activity (Shaffer & Ginsberg, 2017;Stavrakis et al., 2020). Although ultra-short term (less than five minutes) HRV measures are not common, it is reported that the root mean square of successive R-R interval differences (RMSSD), in as low as 30 s samples, may be a reliable surrogate for parasympathetic activity (Baek et al., 2015;Kang et al., 2016;Munoz et al., 2015;Salahuddin et al., 2007). Perhaps because HRV depends on underlying heart and respiratory rates that can be easily altered by diet, exercise, psychological stress, diurnal factors, and medication, as well as age and gender, its reliability as a surrogate marker has remained controversial (Kyriakou et al., 2019;Perini & Veicsteinas, 2003;Stein et al., 1997;Umetani et al., 1998;Yamamoto et al., 1991;Young & Benton, 2018;Zhang, 2007). Besides HRV, other ANS assessment techniques require drug induced response (baroreflex sensitivity), additional equipment (imaging or tilt tables), or invasive intraneural microelectrodes (direct muscle and sympathetic nerve activity measurement), making these assessments more difficult to administer (Goldberger et al., 2019;Stavrakis et al., 2020).
Thus, the dearth of effective, reliable, and reproducible data-driven approaches to quantify non-invasive recording modalities while administering a full battery of tests provokes an application of signal processing, machine learning techniques, and decoding algorithms on a set of commonly used clinical measurement of physiological signals in an attempt to derive a better understanding of autonomic function and dysregulation. Herein, we test whether these metrics of autonomic function and responses are sensitive and can reliably identify subtle ANS deviations in health, able bodied control participants. The findings demonstrated significant deviations of ANS responses correlated with body mass index (BMI), and also showed trends related to circadian rhythm. While BMI is not a gold standard index for autonomic activity, it is related to hemodynamic alterations that are linked with the ANS dysfunction (Guarino et al., 2017). If these noninvasive, quantitative autonomic function metrics can be used to retrieve or predict BMI from deviations in cardiovascular activity in a small sample size, this method might well enable objective measures of other disease states linked with ANS dysfunction and provide a useful tool for diagnosis and disease management.

Human participants
This study recruited and enrolled 21 healthy, ablebodied participants between the ages of 18-60 years and a BMI less than 30. The mean age (±SD) of the participants was 29.9 (±6.5) years, with sixteen males and five females. The mean BMI (±SD) was 24.4 (±2.9). Exclusion criteria were: history of cardiac arrhythmia, coronary artery disease, autoimmune disease, chronic inflammatory disease, anemia, malignancy, depression, neurologic disease, diabetes mellitus, renal disease, dementia, psychiatric illness including active psychosis, or any other chronic medical condition, treatment with anti-cholinergic medication, current tobacco, nicotine, or other recreational drug use, pre-existing neurological disease, pregnancy, and implantable electronic devices. Participants were asked to fast and refrain from caffeine for at least four hours prior to testing. This study was approved by the Northwell Health Institutional Review Board, IRB #19-0461 and registered with Clinicaltrials. gov, identifier NCT04100486.

Autonomic testing sessions
Testing sessions occurred in a laboratory setting with shielded from external noise, light, or distractions, with an average humidity of 22% and average temperature of 22.1°C. Lighting was set such that significant pupil changes were detected, averaging approximately 22 lm (Lux Light Meter Pro, developed by Marina Polyanskaya). Each participant attended four one-hour testing sessions: two in the morning (starting between 7 and 11 am) and two in the afternoon (starting between, 1 and 6 pm) over the course of two weeks. Five autonomic tests were performed during each session, with the order randomized for each participant and for each testing session (Fig. 1a); at least 1 min elapsed between tests, and if a test was incorrectly performed or there was a potential loss of data or signal, that test was repeated. In the squat test, the participant actively stood still and calm for one minute, sat in a single deep squat for one minute, and then stood again for the final minute, all in succession. For the cold pressor test, the participant immersed his or her right hand in ice water (< 5°C) for at least 30 s and up to three minutes; after the first 30 s, the participant was allowed to remove their hand at their own will. During deep breathing, the participant followed a visual cue on a dimly lit computer monitor to time their respiratory rate to six breaths per minute for seven minutes. The diving reflex test was administered by placing a refrigerated gel-filled compress on the participant's face for one minute, followed by one minute of recovery. Lastly, the Valsalva maneuver was performed by a forceful attempted exhalation, by bearing down and "expelling" air while keeping the mouth and nose shut for 15 s, followed by one minute of recovery. Each session also included a five-minute period of continuous transcutaneous auricular vagus nerve stimulation (taVNS), a noninvasive method of applying stimulation at the cymba conchae of the ear to target the auricular branch of the vagus nerve, applied during rest randomly within the series of the five autonomic tests; while physiological The tests include a set of sympathetic tests (standing-squatting-standing [one minute of each, in succession] and cold pressor test [immersion of hand in ice water for up to three minutes]), a set of parasympathetic tests (deep breathing [respiratory rate of six breaths per minute for seven minutes] and diving reflex test [refrigerated gel-filled compresses on the face for one minute with one minute of recovery]), and Valsalva maneuver (restricted and forced exhalation for 15 s with one minute of recovery). (b) Physiological signals for each participant were recorded by a six lead electrocardiogram (in red, wires attached to four foam adhesive electrodes placed at each shoulder and each ankle) and a respiratory belt (in green, around the torso). Recorded from the left hand were noninvasive blood pressure (in blue, small inflatable cuff on middle phalanx of middle finger) and electrodermal activity (in gold, dry, metal electrodes on distal phalanx of index and ring finger). Eye tracking glasses (in purple) were placed to record pupil diameter and gaze location signals were collected for this stimulation period, the results are not reported as a part of this work (Debnath et al., n.d.). No patients experienced any vagal syncope events during testing sessions.

Physiological recording
In each session, the participant's cardiovascular data were continuously captured by noninvasive sensors transmitted through a data acquisition system (Power-Lab 16/35, ADInstruments, Sydney, Australia) ( Fig. 1b): six lead electrocardiography (ECG) (Lead Wires and Octal BioAmp, ADInstruments, Sydney, Australia), respiration (Respiratory Belt Transducer, ADInstruments, Sydney, Australia), and blood pressure (Human NIBP Nano System, Finapres Medical Systems, Enschede, The Netherlands). The six lead ECG wires were attached to four foam adhesive electrodes placed at each shoulder and each ankle, while the respiratory belt was wrapped and tightened around the torso. BP was recorded by a small non-invasive and inflatable cuff around the middle phalanx of the middle finger on the left hand. Additional sensors were attached to capture electrodermal activity (EDA) (dry, metal electrodes on two fingers) and pupil diameter (Tobii Pro Glasses 2, Tobii Pro AB, Stockholm, Sweden). All of the recordings were transmitted through that data acquisition system and software (LabChart, ADInstruments, Sydney, Australia) at a sampling rate of 1 kHz. All signals were marked and synchronized with experimental cues aligned with each test. Each participant's BMI was recorded and monitored as the main characteristic to be retrieved from autonomic signals. Continuous autonomic data from each participant and multiple non-invasive sensors, including ECG, BP, respiration, EDA, and pupil diameter, were measured at multiple time points relative to rest and onset of the autonomic tests. Individual responses were characterized by an approach based on extracting unique features from recorded data to identify significant responses and compute physiologically relevant and plausible indices that are representative of deviations from resting ANS function; these methods were not linked with previously developed indices, tools, or features.

Signal processing
All signal processing, analysis, and statistical evaluation were performed using MATLAB R2019a (MathWorks, Natick, MA). A 60 Hz notch filter was applied to remove line noise from raw cardiovascular data signals. Pupil diameter data was calibrated based on gaze location ( Supplementary Fig. 1); for all tests, participants were reminded to keep their eyes open and continue looking forward. Since illumination and perceived brightness can influence pupil size, measured pupil diameter was normalized to extract the autonomic response. Participants were not focused on any near objects, so pupil reactions due to accommodation were expected to be minimal and removed through averaging. The RGB values of the gaze location were converted to relative luminance values, and this was linearly fit to pupil diameter. By dividing by the slope of the line, the pupil sizes were normalized to account for effects of luminance.
Only modalities with a significant response to an autonomic test were considered for analysis. The mean μ and standard deviation σ were calculated for the baseline, taken in the same sitting posture before each autonomic test. A threshold for significance was determined as 1σ over baseline, a common way to define a threshold in signal detection theory (Merfeld, 2011); only peaks of the average response for a modality and test above this threshold were considered for analysis.

Feature extraction
Each participant's HR, mean arterial pressure (MAP), and RMSSD responses to the squat test, cold pressor test, diving reflex test, and Valsalva maneuver were characterized by a template matching method (Fig. 2). The comparisons between individual responses and the average response template in each epoch were quantified to reflect the autonomic response. An individual response may be delayed or occur sooner than the average response. Individual responses can also be shorter or longer in duration and smaller or larger in amplitude than an average response. To quantify these deviations, the parameters that minimize the following objective function were estimated: In equation (1), the normalized sum of the squared error is minimized. f(x) is an average response template centered at time x 0 and y is the trimmed individual response. The parameter a scales the duration of the response, and the parameter b delays or advances the response. The parameter c scales the amplitude of the template and will be reported as Vs, and d vertically offsets the template and will be reported as Vo. Since the average response of the template is normalized by the baseline period before each autonomic test, Vs and Vo can be combined and reported as a single parameter V = Vs + Vo that describes the net scaling of the individual response with respect to the template for each physiological signal for each test. The parameter H is reported to represent the parameter a to define the duration scaling for each individual response. Simulations were run based on the feature extraction process to ensure robustness to both initial conditions and additive noise and confirm curve fit performance.
For the deep breathing test, a robust parasympathetic test (Shields, 2009), only HR changes were considered; this is due to limited parasympathetic innervation to blood vessels that affect total resistance, and any BP response is primarily due to changes in HR and stroke volume. The differences between maximum and minimum HR for each breathing cycle in the first two minutes of the test were averaged for each individual. This value was divided by the average peak-to-peak HR over all individuals to calculate a scale-like feature.
Once features were calculated, they were separated by the type of autonomic response: sympathetic or parasympathetic. Based on literature (Kinoshita et al., 2006;Marfella et al., 1994;Sandroni et al., 1991;Shields, 2009;Victor et al., 1987) and observed data, responses were categorized by modality for each test, shown in Table 1.
While both branches of the autonomic nervous system are active during each autonomic test, as reflected by changes in HR and BP, features were sorted as primarily driven by one branch. For example, responses that led to an increase in HR were considered as driven by a sympathetic nervous system increase (as opposed to a parasympathetic withdrawal), while slowed HRs were attributed as primarily induced by the parasympathetic nervous system. In the squat test, during the phase from squat to stand, only scale (V) was used to represent sympathetic activity while only stretch (H) was used to represent parasympathetic activity (Droguett & Santos, 2015;Du et al., 2005). During deep breathing, the calculated scale is used to represent parasympathetic activity. For all other tests and modalities, both scale and stretch features were used in tandem as surrogates for sympathetic or parasympathetic responses.

Modeling and validation
Classified signals were correlated with measured characteristics of the patient cohort, particularly participant BMI. Linear regression models were applied to fit the average of collected features from all modalities and testing epochs for each patient. Significant trends were validated by 10 repeats of 7-fold cross-validation; data from three patients were left out of the modeling for each fold, and samples were reshuffled for every repetition. The p-value was determined by a t-statistic; values less than 0.05 were considered significant.

Results
Monitoring physiological signals to compute average responses Cardiovascular, pupil dilation, and EDA signals were synchronously recorded for each participant, with raw recordings for ECG and BP used to calculate HR, HRV (RMSSD), and MAP (Fig. 3). The individual responses of calculated signals during each test were averaged to determine a response template (Fig. 4, Supplementary Fig.  2). For each test and each modality, baseline-normalized responses were averaged across all participants and sessions; taVNS did not have any significant effect on physiological responses (Debnath et al., n.d.). For HRV, we calculated the RMSSD feature for the three tests with intervals longer than 60 s, since this HRV measure requires 60 s of activity to be calculated accurately (Baek et al., 2015;Salahuddin et al., 2007)

Extracted features from individual responses
From all the physiological modalities we monitored, participants' cardiovascular measures (HR, MAP, and RMSS D) registered average responses with a peak above 1σ of their baseline, thus deemed significant responses (as detailed in Methods), while the pupil dilation and EDA measures did not show a consistent, significant response and were discarded from further processing and feature extraction ( Supplementary Fig. 2). Figure 5 shows examples of individual responses, the calculated features, and the corresponding transformation of the average signal based on those features. The blue traces in each panel represent an individual's phasic response within a single Fig. 3 Monitoring and calculating physiological signals. 15 s of raw signals from these sensors are shown in the upper panel, with six channels of the electrocardiogram, one channel of respiration, one channel of finger pressure, one channel of electrodermal activity, and two channels corresponding to the left and right pupil diameter. Calculated signals in the lower panel include heart rate from the electrocardiogram, heart rate variability (RMSSD) from interbeat intervals, and the mean arterial pressure Fig. 4 Average heart rate, mean arterial pressure, and RMSSD during autonomic testing (76 sessions). The individual calculated responses (gray lines) were accumulated and averaged (black line) to extract an average response for each modality during each test. Each column represents a different calculated signal (heart rate, mean arterial pressure, and RMSSD). The dotted black traces correspond to a 95% confidence interval. RMSS D was not calculated for the cold pressor test and Valsalva maneuver due to time constraints necessary to accurately convey heart rate variability. Squat Test: vertical lines reflect changes in posture from standing to squatting and then squatting to standing. Cold Pressor Test: the first vertical line reflects when the participant immersed their hand into the ice water. The second vertical line represents the maximum of three minutes. The average trace only represents the individual traces available at that time point, as participants removed their hand at their own discretion. All participants kept their hand in the ice water for at least 30 s. Deep Breathing: vertical lines reflect when the deep breathing rate (6 breaths/ minute) began and ended. In the third column for the RMSSD, only the first two minutes were analyzed. Diving Reflex: vertical lines reflect when the refrigerated gel-mask was placed on and removed from the participant's face. A five second removal period is designated before the one minute of recovery. Valsalva Maneuver: vertical lines reflect the phases of the effort, from baseline, five seconds designated for inhalation and preparation, 15 s of the Valsalva maneuver, 10 s at the end of maneuver, and a final minute of recovery  Closer inspection of individual responses can provide insight into physiological meaning of the vertical and horizontal scaling, as well as the vertical offset. As evident in the examples plotted in Fig. 5a and b, the features extracted during the squat test reveal certain response properties, specifically in the transition from squatting to standing (at t = 0), for HR and MAP, respectively. In the first example (Fig. 5a), HR response was faster (H < 1) and larger (Vs > 1) than the average response during this squatting-standing transition. Meanwhile, in the second example (Fig. 5b), the MAP response was also faster (H < 1) and larger (Vs > 1) than the average response.
Similar interpretations can be made from the responses during the diving reflex test (Fig. 5c-d), where t = 0 indicates the time that the refrigerated mask was placed on the participant's face; 5c shows an example of a HR response, and 5d shows an example of a response in MAP. In both cases, the individual's responses were faster (H < 1) and larger (Vs > 1) than the average response.
For the deep breathing test, the vertical scaling feature of the HR response scales the average peak-to-peak HR over all individuals based on the maximum and minimum HRs in each breathing cycle within the first two minutes of the test. The response in the specific example of Fig. 5e was more than twice as large as the average response (Vs = 2.27). As mentioned in the methods, since deep breathing was visually guided, all peaks were entrained to the respiratory cycle, and there was no need for horizontal scaling feature extraction.
Similarly, the features and fit of the RMSSD measure of HRV during the first two-minute interval of deep breathing reveal the vertical and horizontal scaling using the template matching algorithm; the RMSSD response features, as calculated for the specific example in Fig. 5f, reveal a faster (H < 1) and larger (Vs > 1) than the average RMSSD response.

Sympathetic and parasympathetic features and their correlations to BMI and session time
The strength of this analysis is that all of the features from all of the tests could be interpreted as vertical and horizontal scaling measurements. The data features were grouped by vertical and horizontal scaling and by sympathetic and parasympathetic drive. Next, the features were averaged for each participant over all four testing sessions (see Table 1). These averages (and corresponding standard deviations) were correlated with physiological characteristics (BMI and age). Figure 6 represents sympathetic features (top row), and parasympathetic features (bottom row), left and right columns represent H (duration scale) and V (amplitude scale), respectively. Each data point is the average (±SD) calculated feature for a single participant. A regression demonstrated a significant increase in sympathetic amplitude scaling with increasing BMI (p = 0.000377). Furthermore, this result was validated by 10 repeats of 7-fold cross-validation; the data left out was used to reproduce BMI based on the trend line produced. The mean absolute percentage error was approximately 11.7% with a standard deviation of 0.579.
To test for an effect of circadian rhythm, the grouped features were compared across morning and afternoon sessions. The average feature for each individual was calculated over two AM sessions and two PM sessions; in Fig. 7a, each trace represents a single individual, where blue and orange traces represents higher values in AM or PM sessions, respectively. While the differences were not significant, these trends are summarized in Fig. 7b; most (14 and 12 participants, respectively) participants had longer (H > 1) sympathetic and parasympathetic responses in PM sessions. For vertical scaling features, larger (V > 1) sympathetic responses were observed in PM sessions for 14 participants; parasympathetic vertical scaling features were larger in AM sessions for a higher number of participants (11 participants). Two

Discussion
Recent studies have been focused on developing quantitative standards based on biomarkers to aid with diagnosis, prognosis, and estimates of treatment efficacy (Lötsch & Ultsch, 2018). To this end, this work sought to develop a reproducible and sensitive metric to quantify significant changes in ANS function by determining features that represented duration and amplitude scales compared to the average response. Instead of a priori developed indices, a template matching method was used to estimate indices that characterized physiologicallyrelevant indices that represent slight deviations in sympathetic and parasympathetic individual responses. These data show that sympathetic amplitude responses significantly increased and parasympathetic responses decreased with increasing BMI, and as such are consistent with past work that actually demonstrated significant sympathetic overactivity and decreased parasympathetic activity in patients with increased BMI greater than 25 (da Silva et al., 2009;Guarino et al., 2017;Molfino et al., 2009). While higher BMI may yield increased energy expenditure necessary for body weight, it has occurred to others that the activation of the sympathetic ANS may be an initial or primary driving force in weight maintenance regulation (Molfino et al., 2009). From a therapeutic standpoint, the links between high BMI and baroreceptor dysfunction, hypertension, organ damage, and cardiovascular disease suggest consideration of sympathetic inhibition.
Our data also show trends in ANS responses in relation with circadian rhythm. Sympathetic responses were both longer in duration and larger in amplitude during afternoon testing. For the parasympathetic features, longer duration responses occurred in afternoon testing, while larger amplitude responses were observed in morning testing sessions. Most past studies relating diurnal variations of autonomic function compare sympathetic and parasympathetic markers between day and night, and have demonstrated, especially in hypertensive patients, sympathetic activity spikes present in early morning (6:00 AM to 8:00 AM), with little change for the rest of the day and nighttime (Kario et al., 2004;Marfella et al., 2003;Middlekauff & Sontz, 1995;Panza et al., 1991). Although we did not test night time conditions and the recorded differences are not significant, our data suggest systematic variability cycles within the typical 24-h period, and as such, may also offer therapeutic opportunities with novel temporal characteristics.
Past studies of the ANS report HR and BP changes in response to stressor tests (Bellenger et al., 2016;Borresen & Lambert, 2008;Castiglioni et al., 2011;Freeman, 2006;Gonzaga et al., 2017;Martinez et al., 2010;McGuire et al., 2005;Michael et al., 2017;Pagani & Lucini, 2001;Radtke et al., 2016;Rossi et al., 2015;Taylor et al., 2003;Thayer et al., 2010;Van de Borne et al., 1994;Ziegler et al., 1992), but there is a dearth of studies that measure many of the remaining signals simultaneously and during controlled autonomic perturbations. The approach to ANS measurement taken in this work shows that parallel expansion of the modalities of raw physiological signals measured broadens the analysis from simple vitals to numerous measures that include temporal measures of HRV, pupillometry, respiration, and EDA.
This initial application of signal processing and machine learning on a set of standard physiological measures presents some challenges. First, this work depended only on cardiovascular changes during all autonomic tests. Other recordings, especially pupil diameter and EDA, were measured and analyzed in preliminary studies not discussed, but the peak average response did not meet the rigorous threshold of 1σ over baseline. Additionally, other autonomic measurements like photoplethysmogram (PPG), seismocardiogram (SCG), respiratory effort, pre-ejection period (PEP) of the heart, and other time-domain and frequency-domain HRV measures were not recorded. The reliability of our HRV measure, therefore, suggest that recording and analyzing these other responses might increase the reliability in defining a normal ANS index. The focus on a small healthy cohort between the ages of 18 and 45 and BMI less than 30 provided a baseline that reflects the significant differences and supports the plausibility that future recording efforts with larger and more diverse sample populations will be fruitful. Specifically, significant responses were not always measured for this healthy population in sympathetic or parasympathetic duration or amplitude scales, but there may be observable changes in patient populations or individuals outside of these age and BMI constraints. Relative to the average template from a healthy cohort, any patterns of significantly elevated, diminished, accelerated, or slowed responses to the same autonomic tests in a patient cohort could be used to diagnose specific dysautonomias. The limitations of our sample extend to the limitation of the recording modalities collected but not analyzed. For example, an impaired EDA has been linked to early stages of diabetic neuropathy (Khalfallah et al., 2012;Petrofsky & McLellan, 2009), and impaired HR, EDA, and pupil dilation have been linked with post-traumatic stress disorder (PTSD) (Mckinnon et al., 2020;Pole, 2007), responses that would not have been observed here. Nevertheless, the methods that focus on a healthy cohort and cardiovascular signals generated a reasonable set of healthy responses that may be used to construct a normal template. Such a range of normal responses might well be used to quantify autonomic variations in actual patient populations.
A third limitation is that this method relies on phasic response to a specific battery of autonomic tests, as opposed to continuous changes in recording modalities. Because the method is not ongoing, it cannot provide a continuously updating biomarker to estimate ANS function and balance. Additionally, this battery focuses on cardiovascular responses due to peripheral signaling in the body, and may explain the lack of robust and significant responses in EDA and pupil diameter. Future work may also include other tests, such as a cognitive aptitude assessment, social stress test or mental arithmetic (Bauerly et al., 2019;Duschek et al., 2009;Gurel et al., 2020a;Tornatzky & Miczek, 1993), that can induce more central nervous system mediated responses and, therefore, increase the utility of EDA, pupil diameter, and other recorded modalities. Recent work has also shown that biomarkers, like HR and PPG amplitude, can be used to predict responses to transcutaneous cervical vagus nerve stimulation (tcVNS) and model dynamic characteristics of an adapting ANS (Gazi et al., 2020). While it is unclear how this may scale for other conditions or interventions, modeling biomarker responses can be applied to continuously monitoring vital-sensing devices to calculate real-time risk scores and further comprehensive index values related to autoimmune health.
This method was developed toward creating datadriven approaches to comprehensively and objectively quantify the ANS. Modern methods of computational science, including machine learning and artificial intelligence techniques, have been used to decode complex clinical and experimental data by detecting patterns, classifying signals, and extracting information to inform diagnostic and treatment actions (Debnath et al., 2020;Norgeot et al., 2019;Wiens & Shenoy, 2018). Continuous data from many sensors, including those in this study and adding electroencephalography (EEG) or other neural recording devices, can be used to train such a model on recordings from healthy, able-bodied individuals to characterize ANS balance. With so many features extracted from each patient, advanced classification strategies can be built to explore non-linear relationships between features and calculate weighted indices to quantify the sympathetic and parasympathetic branches in patient populations. Since the battery of autonomic tests can be completed within 30 min and all sensors are noninvasively placed, clinical translation can be simplified. Future studies could focus on varying patient populations, as disturbances in autonomic regulation have been described in a variety of diseases, including those resulting from focal injury, such as spinal cord injuries (Krassioukov et al., 2012) and stroke (Dütsch et al., 2007), and diffuse disorders, such as sepsis and infection (Badke et al., 2018;Ferreira & Bissell, 2018), rheumatoid arthritis (Koopman et al., 2016a;Koopman et al., 2016b), Crohn's disease (Engel et al., 2015), and diabetes mellitus (Serhiyenko & Serhiyenko, 2018;Verrotti et al., 2014). Additionally, dysautonomias have been described in numerous cardiovascular conditions (Broadstone et al., 1991;Carthy, 2013;Kishi, 2012;Shen & Zipes, 2014;Vinik et al., 2013) and central nervous system disorders, including Alzheimer's disease (Femminella et al., 2014), Parkinson's disease (Goldstein, 2014), Huntington's disease (Diago et al., 2017;Kobal et al., 2010), and psychiatric conditions including depression, schizophrenia, and PTSD (Jung et al., 2019), among others. Another possible application could be evaluation of targeted neuromodulation; specifically, there is clinical interest in stimulating the vagus nerve, which is involved with responses in cardiovascular, pulmonary, gastrointestinal, renal, hepatic, and endocrine systems (Chavan et al., 2017;Pavlov et al., 2018). Vagus nerve stimulation (VNS) has been used in previous studies for multiple conditions, including refractory epilepsy (Rong et al., 2014;Stefan et al., 2012), depression (Kong et al., 2018;Rong et al., 2016), PTSD (Bremner et al., 2020), prediabetes (Huang et al., 2014), tinnitus (Shim et al., 2015), stroke (Redgrave et al., 2018), and others, including oromotor dysfunction, rheumatoid arthritis, and obesity (Guiraud et al., 2016). These studies have used a range of electrical stimulation settings and sites, and there is no optimal dose or set of parameters (Badran et al., 2018); the mechanism of VNS and responses are not well understood. While there have been studies that report mixed or no reported significant effects of VNS on HRV, pupil diameter, and evoked potentials, the preliminary effects on clinical populations are clear (Burger et al., 2020;Gurel et al., 2020b;Libbus et al., 2017). The taVNS protocol used in this study (pulse width of 300 μs at a continuous rate of 30 Hz for 5 min) did not produce any significant effects (Debnath et al., n.d.), but shortterm taVNS (3.4 s ON, 26-27 s OFF) elicited robust pupil dilation and alpha oscillations compared to sham stimulation, showing that significant pupillary and EEG markers can be observed (Sharon et al., 2020). A set of biomarkers or calculated features to accurately and consistently quantify these changes and relate them to the ANS in a patient-centered approach can be extremely helpful in a number of clinical applications.

Conclusions
The sympathetic and parasympathetic parameters determined in this study will be valuable to diagnose autonomic function and underlying disorders, as well as predict responses to targeted modulated therapies. While the use of autonomic modulation has shown promise in treating cardiovascular, autoimmune, and nervous system disorders, the template matching method in this work can offer additional insight towards the effects of stimulation and medication in patient populations.
Additional file 1 Supplementary Fig. 1. Normalizing pupillometry data. Gaze location was used to calibrate the pupil diameter data. The RGB values of the fixation point in each frame were converted to relative luminance (L = 0.2126R + 0.7152G + 0.0722B, based on the luminosity function) and then linearly fit to raw pupil diameter measurements. By dividing by the slope of the line, the effects of brightness were minimized. Additionally, all sessions were completed in a quiet room with lowered ambient light.
The individual calculated responses (gray lines) were accumulated and averaged (black line) to extract an average response for each modality during each test. Each column represents a different calculated signal (electrodermal activity and pupil dilation). The dotted black traces correspond to a 95% confidence interval. RMSSD was not calculated for the cold pressor test and Valsalva maneuver due to time constraints necessary to accurately convey heart rate variability. Squat Test: vertical lines reflect changes in posture from standing to squatting and then squatting to standing. Cold Pressor Test: the first vertical line reflects when the participant immersed their hand into the ice water. The second vertical line represents the maximum of three minutes. The average trace only represents the individual traces available at that time point, as participants removed their hand at their own discretion. All participants kept their hand in the ice water for at least 30 s. Deep Breathing: vertical lines reflect when the deep breathing rate (6 breaths/ minute) began and ended. In the third column for the RMSSD, only the first two minutes were analyzed. Diving Reflex: vertical lines reflect when the refrigerated gel-mask was placed on and removed from the participant's face. A five second removal period is designated before the one minute of recovery. Valsalva Maneuver: vertical lines reflect the phases of the effort, from baseline, five seconds designated for inhalation and preparation, 15 s of the Valsalva maneuver, 10 s at the end of maneuver, and a final minute of recovery.