[["We have read the commentary by Tariot et al. \\[[@B1]\\] with interest and are delighted that our paper has provoked additional discussion.\n\nWith regard to the presentation of the multiple metagraphs, our method of labeling did not include the term 'favors'. In figure 5 for example, the labels along the x-axis are either 'control' or 'experimental', unlike the new figure provided by Tariot et al., where the word 'favors' precedes each term. This word was omitted from the graphs in order to avoid indicating that there was any favorable response towards a particular group. The graph instead indicates that the control group showed higher Clinician\\'s Interview-Based Impression of Change Plus Caregiver Input (CIBIC-Plus) values, and this was correctly interpreted in our discussion \\[[@B2]\\] as a nonsignificant trend to favor combination therapy. A revised graph now contains a clarification in the legend (fig. [4](#F4){ref-type=\"fig\"}). The confidence interval plots and labels were set to be consistent throughout the paper for better appreciation of the results between the various assessments used.\n\nThe observation regarding the standard deviation value is valid and a correction has been made. The corrected p value however still does not reach statistical significance (\\<0.05) after this adjustment. Lastly, we closely inspected the remaining calculations and found similar deviations involving values obtained from the study by Tariot et al. \\[[@B3]\\], and these have also been corrected (fig. [1](#F1){ref-type=\"fig\"}, [2](#F2){ref-type=\"fig\"}, [3](#F3){ref-type=\"fig\"}, [4](#F4){ref-type=\"fig\"}). While these corrections do not lead to a change in the paper\\'s final conclusions or recommendations, it shows that in the patients in the mild-to-severe group of Alzheimer\\'s disease (AD), combination therapy does not reveal any benefit in cognitive, behavioral or functional assessments. The authors regret this error and are thankful to Tariot et al. 
\\[[@B1]\\] for contributing to the accuracy of the study.\n\nIn our approach, we chose to include all levels of dementia severity in a single analysis as a first step. In addition to the explanation provided in the discussion section of the article \\[[@B2]\\] and the concern about heterogeneity that might result from the inclusion of all severity levels in the analysis, we add that our systematic review assessed many study types such as cohorts and open-label studies, some of which did use combination therapy in mild-to-moderate cases. As their results were not suitable for meta-analyses, it became necessary to analyze randomized controlled trials that included mild-to-severe cases. However, due to the broad clinical spectrum and the high I^2^ scores obtained, it is more important to look at the subgroup analyses. A recent notable study also assessed mild-to-moderate AD patients that were already on cholinesterase inhibitor (ChEI). Patients were randomized to vitamin E, memantine or the two treatments in combination, and the results showed that only the vitamin E arm had slower functional decline compared with the placebo group (ChEI therapy) \\[[@B4]\\].\n\nThe noteworthy meta-analysis by Atri et al. \\[[@B5]\\] mentioned in the commentary assessed patients with moderate-to-severe AD, and among the methodological differences was the inclusion of patients only on donepezil as a ChEI and the exclusion of those on \\<10 mg/day. Our study\\'s goal was to assess for a class effect arising from ChEI or memantine and, if enough studies were available, to conduct subgroup analyses with each individual ChEI. Despite the differences, both studies find a statistically significant effect in cognition and functional outcome in the moderate-to-severe groups, favoring combination therapy. A separate study included data from Tariot et al. \\[[@B3]\\], Porsteinsson et al. 
\\[[@B6]\\], and a third unpublished trial MEM-MD-50 \\[[@B7]\\] in a meta-analysis of moderate-to-severe cases of AD. While the authors also found statistical significance in favor of a combination therapy on cognition, the study similarly came to the conclusion that more evidence was required \\[[@B8]\\].\n\nWhile there were limitations in the study by Howard et al. \\[[@B9]\\] and a concern about the longer follow-up duration, the study provided a comparison of combination therapy with a memantine monotherapy arm. Thus, it helped in determining which treatment arm the benefits can be attributed to and was also in line with our a priori research objective. Even though the I^2^ scores were low in the moderate-to-severe subgroup analyses, this does not exclude clinical or methodological heterogeneity. Including the study by Howard et al. in the meta-analysis was helpful, since patients are not treated for only 24 weeks in clinical practice. Given the broad spectrum from mild-to-severe stages of AD when donepezil can be used, it is likely that clinicians encounter patients who have been treated for much longer periods of time. Analyzing the two studies together provided us with a broader answer about the potential application of combination therapy; starting with a more generalized comparison followed by subgroup analyses to explore the causes of heterogeneity is an advantageous strategy \\[[@B10]\\] when conducting meta-analyses. Further subgroup analysis based on the duration of therapy was not done due to the small number of studies, which is a main reason for our conservative recommendation.\n\nAll these issues were taken into account when we drew our initial conclusions. 
Although there is no strong evidence against combining ChEI with memantine, we would like to take this opportunity to reiterate that original research exploring combination therapy is still required before confident recommendations can be made.\n\n![Metagraphs of cognitive outcomes of mild-to-severe (3 studies) and moderate-to-severe (2 studies) subgroups. DMvsD = Combination therapy with donepezil and memantine versus monotherapy with donepezil, denoted by Roman numeral I; DMvsM = combination therapy with donepezil and memantine versus monotherapy with memantine, denoted by Roman numeral II. In Porsteinsson et al. \\[[@B6]\\], Mini-Mental State Exam (MMSE) scores were pooled in the results, denoted as lower case 'a' and Alzheimer\\'s Disease Assessment Scale-Cognitive Subscale (ADAS-cog) scores were pooled in the analysis, denoted as lower case 'b'. Howard et al. \\[[@B9]\\] used MMSE, and Tariot et al. \\[[@B3]\\] used Severe Impairment Battery (SIB). No change of significance occurred after correction.](dee-0004-0125-g01){#F1}\n\n![Metagraphs of functional outcomes of mild-to-severe (3 studies) and moderate-to-severe (2 studies) subgroups. DMvsD = Combination therapy with donepezil and memantine versus monotherapy with donepezil, denoted by Roman numeral I; DMvsM = combination therapy with donepezil and memantine versus monotherapy with memantine, denoted by Roman numeral II. Scales used in each study: 23-item Alzheimer Disease Cooperative Study-Activities of Daily Living Scale (ADCS-ADL~23~) in Porsteinsson et al. \\[[@B6]\\], 19-item Alzheimer Disease Cooperative Study-Activities of Daily Living Scale (ADCS-ADL~19~) in Tariot et al. \\[[@B3]\\], and Bristol Activities of Daily Living Scale (BADLS) in Howard et al. \\[[@B9]\\]. Standardized mean differences were used to calculate effect sizes. 
A change in significance occurred in mild to severe I (p value changed from 0.01 to 0.08) and moderate to severe I (p value changed from 0.008 to 0.05).](dee-0004-0125-g02){#F2}\n\n![Metagraphs of behavioral outcomes of mild-to-severe (3 studies) and moderate-to-severe (2 studies) subgroups. DMvsD = Combination therapy with donepezil and memantine versus monotherapy with donepezil, denoted by Roman numeral I; DMvsM = combination therapy with donepezil and memantine versus monotherapy with memantine, denoted by Roman numeral II. Neuropsychiatric Inventory (NPI) scale was used in each study, and mean differences were used in determining effect sizes. A change in significance occurred in mild to severe I (p value changed from 0.03 to 0.08). No change of significance occurred in the moderate-to-severe subgroup analysis.](dee-0004-0125-g03){#F3}\n\n![Metagraph of performance on CIBIC-Plus, available from 2 studies. No change of significance occurred after correction. The scores were lower in the control group with combination therapy, suggesting therapy favors the experimental group if p values reached significance.](dee-0004-0125-g04){#F4}\n"], ["1. Introduction {#sec1}\n===============\n\nRepair of long segmental bone defects is one of the challenging problems in orthopaedic surgery. Although allogenic bone grafts are currently a major option \\[[@B1]--[@B3]\\], this technique is associated with significant failure rates, poor mechanical properties, and immunological rejection \\[[@B2]\\]. 
Porous materials are of significant importance for bone tissue engineering applications because of the good biological fixation to surrounding tissue through bone tissue ingrowth \\[[@B4]\\].\n\nPorous titanium and titanium alloys have been investigated as they provide favourable mechanical properties with an elastic modulus close to that of natural bone under load-bearing conditions \\[[@B5]\\].\n\nSurface characteristics of porous titanium are important determinants of its scaffold properties since the surface condition of titanium has been reported to play a critical role in bone formation associated with superior osteoblast adhesion and subsequent cell behaviors \\[[@B6]--[@B10]\\].\n\nRecent studies have raised a concern that biological ability degrades as the adsorption of organic impurities on titanium-based biomaterials increases \\[[@B6], [@B7]\\]. This reduces hydrophilicity, adsorption of cell-binding proteins, and subsequent cell functions. Titanium-based biomaterials therefore need to be fabricated with a clean surface, free of contaminants such as titanium carbide.\n\nThere are a number of approaches to the fabrication of porous titanium and titanium alloys, such as sintering loose titanium powder or fibers, slurry sintering, and rapid prototyping \\[[@B3], [@B4], [@B11], [@B12]\\]. Initial surface contamination of sintered porous titanium is generally unavoidable, as it is usually processed within carbon graphite molds under high thermal pressure \\[[@B3], [@B4]\\].\n\nThis study aimed to develop a contamination-free porous titanium scaffold by plasma-activated sintering within an originally developed TiN-coated graphite mold. The surfaces of porous titanium sheets processed with or without a coated graphite mold were characterized. The cell adhesion property of the porous titanium sheet was also evaluated in this study.\n\n2. Materials and Methods {#sec2}\n========================\n\n2.1. 
Specimen Preparation {#sec2.1}\n-------------------------\n\nA JIS grade II titanium block (KS-50, Kobe Steel, Tokyo, Japan) was used as the starting material. Narrow titanium fibers with a diameter of approximately 0.4-0.5\u2009mm were made by turning. The surfaces of the graphite molds were coated with a 1.0\u2009mm thick TiN layer by presintering under vacuum with 1\u2009MPa predisplacement, followed by plasma-activated sintering under vacuum at a pressure of 20\u2009MPa and 3800\u2009A for 15\u2009sec.\n\nElemental titanium fibers were filled into the *\u03d5* 30\u2009mm \u00d7 1.0\u2009mm graphite mold with or without a TiN coating. The titanium fibers were subjected to predisplacement at 1\u2009MPa pressure in the graphite molds. The porous titanium sheet (*\u03d5* 30\u2009mm \u00d7 0.5\u2009mm) was processed by plasma-activated sintering under vacuum at a pressure of 20\u2009MPa and 3800\u2009A for 15\u2009sec.\n\n2.2. Surface Characterization {#sec2.2}\n-----------------------------\n\n### 2.2.1. Scanning Electron Microscopy {#sec2.2.1}\n\nThe surface topographies of the coatings on the specimens were then observed by SEM (S-2360N, Hitachi, Tokyo, Japan).\n\n### 2.2.2. Thin-Film X-Ray Diffraction (TF-XRD) {#sec2.2.2}\n\nThe crystalline phases of the titanium samples before the tests were detected by TF-XRD (XRD-6100, Shimadzu, Kyoto, Japan) with CuK*\u03b1* radiation. The XRD was operated at 40\u2009kV and 40\u2009mA with a scanning speed of 0.02\u00b0/4\u2009s and a scanning range of 20\u00b0--60\u00b0.\n\n### 2.2.3. Number of Cells on Titanium Samples {#sec2.2.3}\n\nPolished titanium plates and titanium sheets (10 \u00d7 10 \u00d7 0.5\u2009mm) were subjected to cell culture. An osteoblastic cell line, MC3T3-E1, was obtained from the RIKEN Cell Bank (Tsukuba, Japan). 
Cells were cultured in *\u03b1* minimal essential medium (Gibco) containing 10% fetal bovine serum (Gibco) and 1% antibiotic (penicillin, Gibco) under a 5% CO~2~ atmosphere at 37\u00b0C. Cells were suspended in the medium at 1 \u00d7 10^5^\u2009cells/mL and used for experiments. A 1-mL quantity of floating cells was cultured onto titanium samples at 37\u00b0C under 5% CO~2~ for 1\u2009d. A cell-counting kit (Dojindo, Kumamoto, Japan) was used for the measurement of cell adhesion. After incubation, each specimen was moved to another well and washed 3 times with PBS (Gibco) to remove nonadherent cells. Adherent cells were mixed with 1\u2009mL of medium and 100\u2009*\u03bc*L of reagent solution. After 1\u2009hr of incubation, the absorbance at 450\u2009nm was measured. The number of adherent cells was calculated from the activity of the original cell suspension.\n\n2.3. Stress Fiber Formation and Cell Morphology {#sec2.3}\n-----------------------------------------------\n\nSpecimens were placed in 24-well culture plates with 1\u2009mL floating cells each. Subsequently, the specimens were incubated at 37\u00b0C in 5% CO~2~ for 1 hr. Adherent cells on each specimen after 1\u2009hr of cultivation were dehydrated after being washed with PBS. The cells were fixed with 3.7% formaldehyde in PBS and permeabilized by treatment with 0.1% Triton X-100 (Sigma, Tokyo, Japan) in PBS for 1\u2009min. The cells were then incubated for 3\u2009hrs in a rhodamine-conjugated phalloidin solution. After the cells were washed with water, stress fiber formation and cell morphology were observed with the use of a fluorescence microscope (E-600, Nikon, Tokyo, Japan).\n\n2.4. Statistical Analysis {#sec2.4}\n-------------------------\n\nResults are expressed as mean \u00b1 SD (*n* = 6) within each sample. The normal distribution of each value was confirmed using the Kolmogorov-Smirnov test. 
The appropriateness of the hypothesis of homogeneous variances was investigated by means of Bartlett\\'s test. Data were statistically analysed by ANOVA followed by a post hoc Tukey test. A *P* value of less than .01 was considered significant.\n\n3. Results {#sec3}\n==========\n\n3.1. Porous Titanium Sheet {#sec3.1}\n--------------------------\n\nAs shown in [Figure 1](#fig1){ref-type=\"fig\"}, the porous titanium sheet was formed under high thermal pressure. The titanium sheet processed with TiN coating was observed to be clean, whereas contamination was observed on the titanium sheet processed without TiN coating.\n\n3.2. Surface Characterization {#sec3.2}\n-----------------------------\n\nThe XRD analysis revealed that the titanium sheet prepared without the TiN-coated graphite mold showed distinctive TiC peaks, while the sheet prepared with the coated graphite mold showed no TiC peak ([Figure 2](#fig2){ref-type=\"fig\"}). The peaks attributable to rutile TiO~2~ were detectable on the titanium sheet prepared with the TiN-coated graphite mold. The peaks attributable to TiO~2~ were not detectable on the titanium sheet processed without TiN coating.\n\n3.3. Adherent Cells on Titanium Samples {#sec3.3}\n---------------------------------------\n\nThe number of adherent cells on the titanium sheet processed with TiN coating was significantly (*P* = .002) higher than that without TiN coating after 1\u2009d ([Figure 3](#fig3){ref-type=\"fig\"}). Adherent cells on the titanium sheet with TiN coating had begun to show stress fibers and to extend widely, while stress fiber formation and cell extension on the titanium sheet without TiN coating were not distinctive ([Figure 4](#fig4){ref-type=\"fig\"}).\n\n4. Discussion {#sec4}\n=============\n\nPlasma-activated sintering is a rapid sintering method associated with self-heating phenomena within the powder. 
It is capable of sintering metal or ceramic powders rapidly to full density at a relatively lower temperature than conventional furnace sintering methods. The carbon graphite mold has been employed in plasma-activated sintering due to its electroconductive property and thermostability \\[[@B3], [@B4], [@B11], [@B13]\\]. The direct heating of the graphite mold and the large spark pulse current provide a very high thermal efficiency \\[[@B14]\\].\n\nThe peak of TiC was detected on the titanium sheet processed with the graphite mold without TiN coating. Since the titanium fiber elements were directly in contact with the carbon graphite mold during processing, contamination such as TiC is unavoidable in this condition.\n\nIn contrast, the TiC peak was not detectable on the titanium sheet processed within the TiN-coated carbon graphite mold. This modified plasma-activated sintering with the TiN-coated graphite mold would be useful for fabricating a contamination-free titanium sheet.\n\nRecent studies suggested that adsorption of organic impurities on the titanium surface is responsible for reducing initial cell adhesion, subsequent proliferation, and differentiation \\[[@B6]--[@B8]\\]. The amount of carbon adsorbed on the titanium sheet seems to play an important part in determining the initial affinity level for osteoblasts and new bone formation.\n\nThe present study demonstrated that the number of adherent cells on the modified titanium sheet was much greater than that on the bare titanium plate. Additionally, stress fiber formation and the extension of the cells were observed on the titanium sheets. The initial adhesion of cells induces stress fiber formation, phosphorylation of focal adhesion kinase, and activation of other intracellular signal transduction molecules, thereby affecting cell proliferation, differentiation and new bone formation \\[[@B15]\\]. 
Thus, the contamination-free surface of modified titanium would be useful for new bone generation at a segmental bone defect in comparison with the unmodified sintered porous titanium.\n\nIn conclusion, sintering within the TiN-coated carbon graphite mold is a new method for processing a contamination-free porous titanium sheet. This modified titanium sheet is expected to be a new tissue engineering material in orthopedic bone repair.\n\nThis work was supported by MEXT, Haiteku (2009), a grant-in-aid for Scientific Research (B) from the Japan Society for the Promotion of Science, and a grant-in-aid for the Encouragement of Young Scientists (B) from The Ministry of Education, Culture, Sports, Science and Technology of Japan.\n\n![A representative SEM picture of the titanium sheet processed without a TiN-coated graphite mold (a) and with TiN-coated graphite mold (b).](JTE2010-425402.001){#fig1}\n\n![XRD spectra of titanium sheets processed without (a) or with a TiN-coated graphite mold (b).](JTE2010-425402.002){#fig2}\n\n![Number of adherent cells on titanium sheet processed with or without TiN coated graphite mold after 1\u2009d.](JTE2010-425402.003){#fig3}\n\n![Fluorescence microscope images of adherent cells on titanium sheet processed with or without TiN coated graphite mold.](JTE2010-425402.004){#fig4}\n\n[^1]: Academic Editor: Wojciech Chrzanowski\n"], ["INTRODUCTION {#sec1-1}\n============\n\nAutism is a neurodevelopmental disorder with a complex pathophysiology. Its main characteristics are limited social interests, stereotypic behaviors, limited social communications, and impaired social interactions. Some studies suggest that neuroinflammation may play a causal role in autism.\\[[@ref1]\\] There is no curative therapeutic intervention for autism.\\[[@ref2][@ref3]\\] In addition, more evidence must be provided for many of the suggested interventional approaches in children with autism. N-acetylcysteine (NAC) is a precursor to glutathione (\u03b3-glutamylcysteinylglycine, GSH). 
NAC is a relatively safe and available medication for children and adolescents.\\[[@ref4]\\] Here, we report the case of a boy with autism whose autistic behaviors decreased after he took NAC for management of his nail-biting behavior.\n\nCASE REPORT {#sec1-2}\n===========\n\nThe patient is an eight-year-old boy who was referred to Hafez Hospital at Shiraz, Iran, in 2011. He was diagnosed with autism disorder according to the DSM-IV diagnostic criteria.\\[[@ref5]\\] There were markedly limited verbal communication and skills, stereotypic behaviors, restricted interest, and a significant impairment in social relationships and interactions. He also displayed hyperactivity and inattentiveness in preschool. For the preceding two years, oral risperidone 2 mg/day and oral thioridazine 10 mg/day had been administered to control his hyperactivity and inattentiveness. No neurological or significant medical problems were found. His laboratory examination was unremarkable.\n\nAfter a written informed consent form was provided by the parents for participation of the child in a clinical trial (IRCT registration number: IRCT201103023930N3), oral NAC 800 mg per day (Pharma Chimi, Iran) was added for management of his severe nail-biting behavior. There was a significant reduction in his nail-biting behavior. In addition, the parents noticed that there was a marked reduction in his autism symptoms 30 days after the onset of NAC administration. The visual analog scale showed that his social interaction was significantly increased and he responded to social interactions better than at the baseline. According to his parents' report, the score of social impairment on the visual analog scale decreased from 10 to 6 in the two-month trial. The parents also reported that his verbal skills and communications increased from 5 to 9 on the visual analog scale. The aggressive behaviors decreased from 10 to 3. His aggression toward his sister also significantly decreased. 
The parents had no marked complaint about their child's aggressive behavior at the last visit. In addition, his hyperactivity and limited interests were reduced after taking NAC. His preoccupation with a toy car decreased and he became interested in playing with different toys. Before administration of NAC, he had persuaded his parents to cut his hair every day. They even had to cut his hair more than once a day. Now, his interest in having his hair cut and his attempts to persuade his parents have markedly decreased. Moreover, the severity and frequency of his blinking tic decreased. The parents did not report any side effect, except mild abdominal pain. He had never experienced this type of significant improvement in the last few years, even after taking risperidone. They also reported that nothing worsened after the administration of NAC. He did not take any other antioxidant or glutathione prodrug during the period of study or for weeks before.\n\nDISCUSSION {#sec1-3}\n==========\n\nTo the best of the authors' knowledge, this is the first case report on the effect of NAC in treating autism. N-acetylcysteine (NAC) is a precursor of glutathione (GSH), which has antioxidant effects. NAC is administered for treating different psychiatric disorders.\\[[@ref6]\\] Autism is a neurodevelopmental disorder. Its etiology and neurobiology are not exactly known. Oxidative stress is increased and detoxification is decreased in autism.\\[[@ref7]\\]\n\nWhile the level of oxidized GSH is increased in autism, the levels of reduced glutathione (GSH), methionine, and cysteine are decreased.\\[[@ref8][@ref9]\\] Methylation capacity is also decreased in autism.\\[[@ref10]\\] Moreover, the transsulfuration abnormality is associated with autism symptoms.\\[[@ref11]\\] In addition, improvement of the transmethylation/transsulfuration pathways is associated with the reduction of autism symptoms.\\[[@ref12]\\] Cystine, glutamate, and glycine are required for the production of GSH. 
However, cystine has a rate-limiting role. A low cystine level decreases the production of GSH and may make cells prone to oxidative stress.\n\nOxidative stress plays a significant role in the pathophysiology of autism. Therefore, the demand for cystine is increased in autism. Hence, it is speculated that medications that increase the level of cystine may decrease some symptoms of autism.\\[[@ref13]\\] Meanwhile, NAC plays a significant role in restoring GSH levels.\\[[@ref6]\\]\n\nMoreover, NAC can decrease inflammation through lowering oxidative stress.\\[[@ref6]\\] This is one possible explanation for the improvement of our patient. It is notable that there is marked neuroinflammation in autism, and interventions directed at decreasing neuroinflammation have been suggested for treating autism.\\[[@ref14]--[@ref16]\\]\n\nThe effect of N-acetylcysteine on the glutamate level is another possible explanation for the improvement of this patient. N-acetylcysteine decreases high glutamate levels.\\[[@ref17]\\] The high levels of glutamate and the NMDA receptor are proposed as targets for treating autism.\\[[@ref18]\\] N-acetylcysteine may target the imbalance of oxidative stress in autism.\\[[@ref19][@ref20][@ref21]\\]\n\nIt should be remembered that this was a single case without any control group. Moreover, biochemical assessment was not conducted. Therefore, these results cannot be generalized to other children with autism. However, this case report encourages conducting further long-term controlled clinical trials with a larger sample size, in order to investigate the possible role of NAC in managing autism.\n\nCONCLUSION {#sec1-4}\n==========\n\nAlthough the anti-inflammatory and anti-oxidative roles of NAC are suggested for the reduction of symptoms, its exact mechanism is not clear. 
However, this case report suggests that NAC may play a potential role for treating autism.\n\n**Source of Support:** Nil\n\n**Conflict of Interest:** None declared.\n"], ["INTRODUCTION\n============\n\nRNA interference (RNAi) is induced by double-stranded RNA (dsRNA) and results in gene silencing through sequence-specific degradation of the target RNA ([@B1]). RNAi provides plants and animals a defense mechanism against viruses ([@B2; @B3; @B4]) and retrotransposons ([@B5],[@B6]). The ribonuclease Dicer processes the long dsRNA replication intermediates into small interfering RNAs (siRNAs) of \u223c22 nucleotides (nt) ([@B7; @B8; @B9]). These siRNAs are incorporated into the RNA-induced silencing complex (RISC) that finds complementary RNA sequences, resulting in cleavage of the target RNA ([@B10],[@B11]). The central catalytic component of RISC is an Argonaute protein, which contains the signature domains PAZ and PIWI responsible for binding the siRNA strand ([@B12]).\n\nTransfection of synthetic siRNAs into cells or intracellular expression of short hairpin RNAs (shRNAs), which are processed into siRNA duplexes by Dicer, are powerful tools to suppress gene expression ([@B13; @B14; @B15]). Randomly selected siRNAs against a target show a large variation in their efficacy ([@B16]). Empirical rules on siRNA duplex features have been reported and improve design of effective siRNAs. The asymmetry rule for siRNA duplex ends requires that the 5\u2032 end of the antisense strand forms a less stable end with its complement than the 5\u2032 end of the sense strand ([@B17],[@B18]). Related to this rule is the described requirement of high A/U content at the 5\u2032 end of the antisense strand and high G/C at the 5\u2032 end of the sense strand ([@B19],[@B20]). 
In addition, a number of position-specific nucleotides, an unstructured guide-RNA, and an accessible target site have been reported to positively affect siRNA efficiency ([@B19],[@B21; @B22; @B23]).\n\nRNAi can be used as a therapeutic strategy against human pathogenic viruses such as HIV-1 ([@B24]). HIV-1 replication can be inhibited transiently by transfection of synthetic siRNAs targeting viral RNA sequences or cellular co-factors ([@B25; @B26; @B27; @B28]). Furthermore, long-term inhibition of HIV-1 replication has been demonstrated in transduced cell lines stably expressing antiviral siRNAs or shRNAs ([@B29; @B30; @B31; @B32; @B33; @B34]). However, HIV-1 escape variants with nucleotide substitutions or deletions in the siRNA target sequence do emerge after prolonged culturing ([@B31],[@B35],[@B36]). The emergence of RNAi-resistant variants may be blocked by a combination-shRNA therapy, which simultaneously targets multiple conserved viral RNA sequences ([@B34],[@B37]).\n\nWe demonstrated that HIV-1 can also become resistant against RNAi by placing the target sequence in a stable RNA structure, which prevents binding of the siRNA ([@B36]). We also suggested that such structure-based target occlusion occurs in the RNA genomes of lentiviral vectors with a shRNA-cassette ([@B59]). By inserting these cassettes, the target sequence will automatically be present in the vector genome, and self-targeting by the shRNA should reduce the lentiviral production level. However, since the target sequence in the genome is also located in this perfect shRNA hairpin, it is protected against RNAi, ensuring a normal vector titer. Indeed, when the target in the lentiviral genome is unstructured, the titer is significantly reduced by the shRNA ([@B38]).\n\nThe inhibitory effect of target RNA structure on RNAi efficiency has been described in several studies ([@B23],[@B39],[@B40]). 
These studies compared the efficiency of different siRNAs on a fixed target, and found a correlation between target availability and RNAi efficiency. Schubert *et al*. suggested that the local free energy of base pairing in the target region determines RNAi efficiency ([@B41]). Ideally, one should test this concept by a mutational analysis of one target instead of comparing different siRNAs with intrinsically different RNAi efficacies. In this scenario, mutations that affect the RNA structure should not affect the target sequence itself, such that the same siRNA inhibitor can be used. In this study, we set out to determine the exact hairpin stability at which RNAi suppression occurs by systematically destabilizing a 21-base pair (bp) hairpin structure that occludes the complete target sequence. We monitored the effects on siRNA binding *in vitro* and RNAi efficiency *in vivo*. The 3\u2032 end of the mRNA target sequence is initially recognized by bases 2--5 of the antisense/guide strand siRNA, therefore named the 'seed' sequence ([@B42],[@B43]). Thus, one may expect a more prominent effect of an accessible target 3\u2032 end, which primed us to address positional effects when destabilizing the target hairpin. The results demonstrate a clear correlation between the overall stability of the target hairpin and RNAi efficiency, but positional effects were also apparent.\n\nMATERIALS AND METHODS\n=====================\n\nPlasmid constructs\n------------------\n\nThe luciferase plasmids pGL3-wt, pGL3-T1 to pGL3-T7 ([Figure 1](#F1){ref-type=\"fig\"}B) and pGL3-A to pGL3-G ([Figure 3](#F3){ref-type=\"fig\"}A) were constructed by annealing of forward (fwd) and reverse (rev) oligonucleotides (Supplementary Data, Table 1) and ligation into the EcoRI and PstI sites of the firefly luciferase expression vector pGL3-Nef ([@B36]). 
The pSUPER-shPol vector ([@B34]) encodes an effective shRNA against a conserved 19-nt HIV-1 region (Pol1; ACAGGAGCAGAUGAUACAG) under the control of an H1 polymerase III promoter ([@B13]). The plasmid pRL-CMV (Promega) expresses Renilla luciferase under control of the CMV promoter. Figure 1.Target RNA structure influences RNAi efficiency. (**A**) The HIV-based target sequences (54\u2009nt) were cloned downstream of the firefly luciferase gene in the pGL3 reporter plasmid. These reporter constructs were co-transfected into C33A cells with the shRNA expressing plasmid pSUPER-shRNA-Pol. The respective H1 (polymerase III) and SV40 (polymerase II) promoter units are indicated by a black box, the arrow marks the transcription initiation site. (**B**) The predicted RNA structures (Mfold program) of the wild-type (wt) shPol and mutated hairpins (T1--T7). The 19-nt target sequence is highlighted as a gray box and the mutated nucleotides are encircled. The thermodynamic stability (\u0394*G* in kcal/mol) of the target hairpins is indicated (54\u2009nt total; CCCC + indicated hairpin + UUU). (**C**) Luciferase expression upon transfection of the reporter constructs with increasing amounts of pSUPER-shRNA-Pol. The firefly luciferase activity was normalized to that of the Renilla luciferase to correct for variation in transfection efficiency. The level of expression observed in the absence of shRNA-Pol was set at 100% for each reporter construct. This level did not vary significantly for the different constructs. The mean values of six independent experiments are shown (\u00b1 SD). (**D**) The thermodynamic stability of the target hairpins is plotted against the level of luciferase expression as observed in [Figure 1](#F1){ref-type=\"fig\"}C with 10\u2009ng pSUPER-shRNA-Pol. (**E**) Semi-quantitative RT-PCR on RNA isolated from cells transfected with 100\u2009ng pGL3-target hairpin variants with or without 10\u2009ng pSUPER-shRNA-Pol. 
Actin levels serve as a control.\n\nCell culture and luciferase assays\n----------------------------------\n\nC33A cervix carcinoma cells were grown as a monolayer in Dulbecco\\'s modified Eagle\\'s medium supplemented with 10% FCS, minimal essential medium nonessential amino acids, 100\u2009units/ml penicillin, and 100\u2009units/ml streptomycin at 37\u00b0C and 5% CO~2~. C33A cells were grown in 1\u2009ml culture medium in 2\u2009cm^2^ wells to 60% confluence and transfected by the calcium phosphate method. The pGL3-variant (100\u2009ng) was mixed with 0.5\u2009ng pRL-CMV, 0.1--100\u2009ng pSUPER-shPol and pBluescriptII (KS^+^) (Stratagene) to have 1\u2009\u03bcg of DNA in 15\u2009\u03bcl water. The DNA was mixed with 25\u2009\u03bcl of 2\u00d7 HBS and 10\u2009\u03bcl of 0.6\u2009M\u2009CaCl~2~, incubated at room temperature for 20\u2009min and added to the culture medium. The culture medium was refreshed after 16\u2009h, and cells were lysed after another 24\u2009h. Firefly and Renilla luciferase activities were measured with the Dual-luciferase Reporter Assay System (Promega) as described previously ([@B36]).\n\nRT-PCR\n------\n\nC33A cells (2\u2009cm^2^) transfected with 100\u2009ng pGL3-variants and 0 or 10\u2009ng pSUPER-shPol were lysed 2 days after transfection. Total RNA was isolated with TRIZOL\u00ae reagent (Invitrogen) according to the manufacturer\\'s protocol. Contaminating genomic DNA was removed by DNase treatment using the TURBO DNA-free\u2122 kit (Ambion). First strand cDNA was synthesized using 1\u2009\u00b5g of total RNA, Thermoscript\u2122 reverse transcriptase (Invitrogen), and primers. 
The gene-specific primers used were EWr6 (5\u2032-GCCCCGACTCTAGACTGCAG-3\u2032) for Firefly luciferase and 3\u2032HC-b-ACTIN (5\u2032-TGTGTTGGCGTACAGGTCTTTG-3\u2032) for actin.\n\nPCR amplification (25 cycles) was performed on 2, 0.4, 0.08 or 0.016\u2009\u00b5l RT product with Firefly luciferase primers EWr6 and GL3pcr-RT (5\u2032-GCTGAATTGGAATCCATCTT-3\u2032) or actin primers 3\u2032HC-b-ACTIN and 5\u2032HC-b-ACTIN (5\u2032-GGGAAATCGTGCGTGACATTAAG-3\u2032). The PCR products, respectively 398 and 275\u2009bp, were run on a 1.5% agarose gel.\n\n*In vitro* transcription and electrophoretic mobility shift assay (EMSA)\n------------------------------------------------------------------------\n\nThe pGL3-variant plasmids were used as template for PCR amplification with primers EWr8 (5\u2032- TCC*TAATACGACTCACTATAGG*TTCCCC[ACAGGAGCAGATGA]{.ul}-3\u2032; T7 RNA-polymerase promoter in italics) and EWr9 (5\u2032- GACTCTAGACTGCAGAAA[AC]{.ul} -3\u2032). The resulting PCR product contains a T7 RNA-polymerase promoter upstream of the hairpin (hairpin nt underlined). DNA products were purified from agarose gel using QiaexII Gel extraction kit (Qiagen). RNA transcripts were produced by *in vitro* transcription with the Megashortscript T7 transcription kit (Ambion), and transcripts were checked for integrity and isolated from an 8% acrylamide gel. RNA concentrations were determined by spectrophotometry.\n\nThe siRNA-Pol antisense/guide oligonucleotide CUGUAUCAUCUGCUCCUGU (Eurogentec) was 5\u2032 end labeled with the kinaseMax kit (Ambion) and 1\u2009\u03bcl \\[\u03b3-^32^P\\] ATP (0.37\u2009MBq/\u03bcl, Amersham Biosciences). The target hairpin RNAs were denatured in 30\u2009\u03bcl water at 85\u00b0C for 3\u2009min followed by snap cooling on ice. After addition of 10\u2009\u03bcl 4\u00d7 MO buffer (final concentration: 125\u2009mM KAc, 2.5\u2009mM\u2009MgAc, 25\u2009mM HEPES, pH 7.0), the RNA was renatured at 37\u00b0C for 30\u2009min. 
The transcripts were diluted in 1\u00d7 MO buffer to a final concentration varying from 0 to 1.0\u2009\u03bcM in MO buffer. Unlabeled tRNA (1\u2009\u03bcg) was added as competitor to each reaction to minimize aspecific RNA interactions. The 5\u2032-labeled oligonucleotide (1.0\u2009nM) was added and the samples (20\u2009\u03bcl) were incubated for 30\u2009min at 37\u00b0C. After adding 4\u2009\u03bcl non-denaturing loading buffer (50% glycerol with bromophenol blue), the sample was analyzed on a non-denaturing 4% acrylamide gel. Electrophoresis was performed at 150\u2009V at room temperature and the gel was subsequently dried. Quantification of the free and bound oligonucleotide was performed with a Phosphor Imager (Molecular Dynamics).\n\n*In silico* RNA analysis\n------------------------\n\nThe structure and stability of the target hairpins cloned into the pGL3-variants was predicted with the RNA Mfold program ([@B44],[@B45]) at . The indicated \u0394*G* in [Figure 1](#F1){ref-type=\"fig\"}B and [Figure 3](#F3){ref-type=\"fig\"}A are derived by importing the hairpin sequences into the program (54\u2009nt total: 5\u2032 CCCC + hairpin sequences + UUU) and did not contain luciferase sequences. The presence of the predicted hairpin structures in the context of the luciferase reporter construct was verified by importing longer sequences (150\u2009nt total; 52\u2009nt + hairpin sequences + 51\u2009nt) into Mfold.\n\nRESULTS\n=======\n\nTarget hairpin destabilization triggers RNAi\n--------------------------------------------\n\nWe investigated the effect of target RNA structure on RNAi efficiency. As a model system we used a very potent shRNA inhibitor that is directed against the Pol gene of HIV-1 ([Figure 1](#F1){ref-type=\"fig\"}A; left) and that has been tested extensively against HIV-1 and appropriate reporter genes ([@B34]). Such a luciferase reporter with the HIV-1 Pol target sequence in the 3\u2032 UTR is shown in [Figure 1](#F1){ref-type=\"fig\"}A (right). 
Next, we made the target inaccessible by inclusion in a perfect hairpin of \u0394*G* = \u221236.6\u2009kcal/mol ([Figure 1](#F1){ref-type=\"fig\"}B; wild type (wt), the target sequence is marked in gray). In fact, this hairpin structure is identical to the shRNA itself. The top 2\u2009bp and the 5-nt loop are standard in the optimized pSUPER system ([@B13]). We systematically destabilized this target hairpin in mutants T1--T7 by introducing nt substitutions in the descending strand of the stem (encircled in [Figure 1](#F1){ref-type=\"fig\"}B), thus leaving the target sequence intact. The mutations were chosen such that the predicted thermodynamic stability (\u0394*G*) decreases gradually. We first destabilized the hairpin by replacing stable G-C by weak G-U base pairs (mutants T1--T3), followed by more gross destabilizations, e.g. by introducing mismatches (mutants T4--T7). The \u0394*G* value was reduced in a step-wise manner to \u22127.2\u2009kcal/mol for mutant T7.\n\nTo accurately quantify the RNAi efficiency against these differentially structured targets, we placed them downstream of the luciferase reporter gene ([Figure 1](#F1){ref-type=\"fig\"}A; right). These constructs were co-transfected into cells with increasing amounts of the shRNA-Pol expression vector and luciferase expression was measured after 48\u2009h ([Figure 1](#F1){ref-type=\"fig\"}C). Expression of the reporter construct with the target sequence embedded in the wt hairpin was completely resistant against shRNA-Pol. The same expression pattern was observed for the T1 construct, but T2 already showed some susceptibility for RNAi-mediated inhibition with higher amounts of shRNA-Pol, with a maximal inhibition of 34% (66% residual luciferase expression). The next reporter constructs (T3,T4) showed a significant drop in luciferase expression (64% inhibition). Inhibition of the remaining destabilized target hairpins (T5--T7) was very effective, showing more than 80% inhibition. 
This is similar to the maximal inhibition level that can be obtained with this potent shRNA inhibitor against a reporter with the 19-nt target sequence in an unstructured setting \\[([@B34]) and results not shown\\]. To verify that the reduction of luciferase expression is due to mRNA degradation, we performed a semi-quantitative RT-PCR on cellular RNA ([Figure 1](#F1){ref-type=\"fig\"}E). Consistent with the luciferase assays, the levels of mRNA are increasingly diminished for transfections with the constructs T2 to T7 and shRNA-Pol. The near absence of PCR product for the wt construct, with or without shRNA-Pol, indicates an inefficient RT reaction through a perfect hairpin. There were no PCR products when RNA was used as input for the PCRs (results not shown).\n\nWe plotted the measured level of luciferase expression against the predicted stability of the target hairpins ([Figure 1](#F1){ref-type=\"fig\"}D). The results suggest an inverse linear correlation between RNAi-susceptibility and target hairpin stability in the \u221230/\u221215\u2009kcal/mol range. The curve shows two plateaus. A reduction in hairpin stability from \u221236 to \u221230\u2009kcal/mol does not significantly induce RNAi-mediated inhibition (\\<20% inhibition), and further destabilization above \u221215\u2009kcal/mol shows no significant improvement of the already maximal inhibition of \u223c86%.\n\nTarget hairpin destabilization triggers siRNA binding\n-----------------------------------------------------\n\nTo demonstrate that the increased RNAi efficiency on destabilized target hairpins is due to more efficient binding of the siRNA, we performed *in vitro* binding experiments by means of electrophoretic mobility shift assays (EMSA). For this, we used short T7 transcripts with the complete hairpin and a 19-nt RNA oligonucleotide, which corresponds to the antisense/guide strand of the siRNA-Pol (complementary to boxed sequence in [Figure 1](#F1){ref-type=\"fig\"}B). 
The siRNA was radioactively labeled and incubated with increasing amounts of target transcript (wt, T1--T7), and subsequently analyzed on a non-denaturing acrylamide gel ([Figure 2](#F2){ref-type=\"fig\"}A). Binding of siRNA to the target RNA leads to duplex formation that results in a band shift on the gel. Unbound siRNA and the siRNA/target RNA duplex were quantified to calculate the percentage of binding ([Figure 2](#F2){ref-type=\"fig\"}B). Figure 2.Target RNA structure influences siRNA binding. (**A**) Radioactively labeled oligonucleotide simulating the siRNA antisense strand (processed from shRNA-Pol) was incubated with increasing amounts of hairpin target RNA variants (wt, T1--T7). SiRNA/target RNA duplex formation was analyzed by EMSA. (**B**) Free and bound siRNA oligonucleotide was quantified to calculate the level of duplex formation (bound siRNA/free + bound siRNA). (**C**) The thermodynamic stability of the target hairpins is plotted against the level of duplex formation with 0.2\u2009\u00b5M target RNA. Figure 3.Position-specific destabilization of target hairpin triggers RNAi differentially. (**A**) Predicted RNA structures of the wt and mutant hairpins A--G. The target sequence is highlighted in the gray box and the mutated nucleotides are encircled. The thermodynamic stability (\u0394*G* in kcal/mol) of the target hairpins is provided for each structure (54\u2009nt total; CCCC + indicated hairpin + UUU). (**B**) Luciferase expression observed after transfection of the reporter constructs with 10\u2009ng pSUPER-shRNA-Pol is plotted against the thermodynamic stability of the target hairpins. The mean values of five independent experiments are shown (\u00b1 SD). The gray dotted line represents the trend line observed for the initial set of mutants in Figure 1D. 
The right graph zooms in on a smaller \u0394*G* segment.\n\nWe performed the binding experiment multiple times with 0.2\u2009\u00b5M target RNA because efficient binding can be observed, yet most variants stay within the linear range of the binding assay. We plotted these binding percentages against the predicted stability of the target hairpins ([Figure 2](#F2){ref-type=\"fig\"}C). A general trend can be observed that is the opposite of the graph in [Figure 1](#F1){ref-type=\"fig\"}D: reduced hairpin stability results in more efficient binding of the siRNA to the target RNA. Thus, a decrease in the stability of the target hairpin increases RNAi efficiency due to more efficient binding of the siRNA. The largest improvement in RNA--RNA interaction and RNAi efficiency is observed for mutant T3 in comparison with T2, indicating that a threshold stability is passed by going from \u0394*G* = \u221227.1 to \u221221.7\u2009kcal/mol.\n\nAccessibility of the 3\u2032 end of the target sequence is beneficial for RNAi\n-------------------------------------------------------------------------\n\nWe globally determined the stability at which hairpin structures become inhibitory to the RNAi machinery. However, not all domains within the 19-nt target sequence may contribute equally to siRNA binding and the RNAi mechanism. For instance, it has previously been suggested that the 3\u2032 end of the target sequence is initially recognized by the siRNA within RISC ([@B43]). To test this, we made a second set of Luc-target constructs ([Figure 3](#F3){ref-type=\"fig\"}A, mutants A--G). By introducing clustered mutations in the target hairpin, we destabilized either the 3\u2032 end, the center or the 5\u2032 end of the target sequence. Modest G-U changes were introduced in mutants A (3\u2032), B (center) and C (5\u2032). More gross destabilizing mutations were introduced in mutants D (3\u2032), E (center) and F (5\u2032). 
However, it is apparent that the two mutations in D have a more modest effect on the \u0394*G* value because a realignment of the sequences triggers an alternative folding of the top of the hairpin. We therefore constructed the additional mutant G with three mutations to obtain a hairpin with a destabilized 3\u2032 target end that is comparable in \u0394*G* to hairpins E and F. Target hairpins A through G were cloned in the luciferase reporter and co-transfected into cells with increasing amounts of the shRNA-Pol expression vector to quantify the RNAi efficiency (results not shown).\n\nThe luciferase values obtained with 10\u2009ng shRNA-Pol were plotted against the predicted hairpin stability ([Figure 3](#F3){ref-type=\"fig\"}B, left) and we zoom in on a smaller \u0394*G* range ([Figure 3](#F3){ref-type=\"fig\"}B; right graph). The target hairpins A, B and C follow the general trend that we described previously (gray dotted trend line). Independent of where the hairpin is destabilized, the introduction of G-U base pairs is too modest a manipulation to trigger RNAi activity. The target hairpins E, F and G have more dramatic changes that reduce the overall hairpin stability to \u221225/\u221226\u2009kcal/mol, a level at which they should become susceptible to RNAi according to the previous results. However, mutants F (5\u2032) and E (center) remain largely insensitive, whereas mutant G with a free 3\u2032 end shows increased RNAi sensitivity when compared to the trend line. Even the D mutant with a more modest destabilization of the target 3\u2032 end shows reasonable RNAi activity and clearly drops below the trend line. 
These results confirm the importance of initial recognition of the 3\u2032 target end, which explains the deviations from the general trend.\n\n*In vitro* siRNA-target RNA binding does not accurately mimic the RNAi mechanism\n--------------------------------------------------------------------------------\n\nWe performed *in vitro* binding experiments to study the A--G mutants for their ability to bind the siRNA. The radioactively labeled siRNA was incubated with increasing amounts of the target transcripts A--G and analyzed on gel ([Figure 4](#F4){ref-type=\"fig\"}A). The shifts representing the siRNA/target RNA duplex and the free siRNA bands were quantified to calculate the percentage of binding ([Figure 4](#F4){ref-type=\"fig\"}B). Figure 4.*In vitro* siRNA binding does not prefer an accessible 3\u2032 end of the target. (**A**) Radioactively labeled oligonucleotide simulating the siRNA antisense strand (processed from shRNA-Pol) was incubated with increasing amounts of target variants A--G. SiRNA/target RNA duplex formation was analyzed by EMSA. (**B**) Free and bound siRNA oligonucleotide was quantified to calculate the level of duplex formation (bound siRNA/free + bound siRNA). (**C**) The thermodynamic stability of the target hairpins is plotted against the level of duplex formation with 0.2\u2009\u00b5M target RNA. The gray dotted line represents the trendline observed in [Figure 2](#F2){ref-type=\"fig\"}C in the experiments with the initial set of mutants.\n\nThe percentage of binding with 0.2\u2009\u00b5M target RNA was plotted against the predicted stabilities of the target hairpins ([Figure 4](#F4){ref-type=\"fig\"}C). Remarkably, these *in vitro* binding results differ significantly from the *in vivo* RNAi results. The target hairpins D and G (both 3\u2032), which are efficiently targeted by RNAi in the luciferase assay ([Figure 3](#F3){ref-type=\"fig\"}B), are inefficient in siRNA binding. 
In contrast, the target hairpins F (5\u2032) and E (center) showed a slightly increased binding efficiency, although these constructs were relatively more RNAi resistant in the luciferase assay. These results may reflect the oversimplification of the *in vitro* binding assay and point to a contribution of the RISC/siRNA complex in the recognition and binding of the target sequence *in vivo*.\n\nDISCUSSION\n==========\n\nIt has been proposed that RNAi efficiency is influenced by the local RNA structure of the targeted sequence. We investigated this phenomenon in detail by placement of the target sequence in a perfect hairpin structure (\u0394*G* = \u221236.6\u2009kcal/mol), which indeed resisted RNAi. Subsequently we destabilized this tight target structure resulting in a gradual exposure of the target sequence. Destabilization of the hairpin structure has little effect on RNAi activity until a threshold is reached (\u0394*G* \u2248 \u221230\u2009kcal/mol). Beyond this threshold we demonstrate an inverse correlation between hairpin stability and RNAi-mediated inhibition. Maximal RNAi efficiency was observed with hairpins of \u0394*G* \u2265 \u221215\u2009kcal/mol. *In vitro* binding experiments suggested that the increase of RNAi-mediated inhibition is due to efficient siRNA binding to the destabilized target RNA hairpins.\n\nWhen we introduced position-specific mutations in the target hairpin, we observed RNAi efficiencies that deviate from this trend. Hairpins with an opened 5\u2032 end or central part of the target sequence show less RNAi activity than predicted based on their overall stability. In contrast, hairpins with an opened 3\u2032 end are more susceptible to RNAi than expected. These results are consistent with the current notion that the 3\u2032 region of the target is initially recognized and bound by the RISC/siRNA complex ([@B43]). This model is supported by structural data on RISC bound to the siRNA strand. 
The 3\u2032 end of the siRNA is recognized and bound in a pocket by the PAZ domain of the Argonaute protein ([@B46]). The 5\u2032 end of the siRNA is anchored at the PIWI domain of Argonaute and these 5\u2032 nucleotides are readily accessible for base pairing to complementary 3\u2032 nucleotides of the target RNA ([@B47],[@B48]). The importance of the target 3\u2032 end was also revealed in experiments that selected for viruses that resist RNAi-mediated inhibition. We described a unique HIV-1 escape variant that acquired a mutation outside the 19-nt target, which forces the RNA into an alternative structure that occludes the 3\u2032 end of the target ([@B36]).\n\nBesides the *in vivo* RNAi measurements, we also tested the different RNA targets for their ability to interact with the siRNA *in vitro*. The overall \u0394*G* effect of stable target hairpins is confirmed in this simplified *in vitro* setting, demonstrating that RNAi resistance is due to the inability of the siRNA to interact with the base-paired stem of the hairpin. We realize that the siRNA does not act by itself *in vivo* as it is part of RISC, of which the helicase activity may affect local structure in the target RNA ([@B49]). In fact, we observed an interesting discrepancy between the *in vivo* and *in vitro* results for the 5\u2032/center/3\u2032-destabilized hairpins. We observed that an accessible 3\u2032 target is key for RNAi activity, but this effect was not seen *in vitro*. This result may indicate an important contribution of RISC in the siRNA-target RNA annealing step.\n\nThus, target RNA structure is an important factor when selecting a suitable target sequence, as it can have a negative effect on RNAi efficiency. For instance, it has been shown that the TAR hairpin of the HIV-1 genome is an unsuitable target because of its tight structure ([@B31],[@B40],[@B50]). 
On the other hand, it is obvious that an accessible sequence does not automatically make a good siRNA target ([@B31]), as the matching siRNA may not meet the criteria of an effective siRNA ([@B51]). It has been proposed to include a calculation of the number of hydrogen bonds within the target sequence as a parameter for efficient target sequences ([@B39]). We provide a \u0394*G* threshold at which a hairpin RNA structure becomes inaccessible, and we differentiate between different target positions. When designing antiviral siRNAs one may also consider ways to obstruct viral escape via folding of an alternative target RNA structure ([@B36]). The local RNA region should be screened for the absence of alternative foldings that occlude the 3\u2032 end of the target and that can be selected by one or two mutations. If not available, the genetic threshold for structure-based escape might prove too high, even for a fast evolving virus like HIV-1.\n\nRNA structure-mediated resistance against RNAi is in fact beneficial when expressing highly structured shRNAs or miRNAs in cells. For instance, the incorporation of shRNA cassettes in a lentiviral vector is potentially problematic, because the shRNA will target the viral RNA genome during vector production, thus reducing the titer. Such self-targeting has not been reported ([@B52],[@B53]), which we attribute to the target being inaccessible as part of the perfectly base-paired shRNA hairpin. The apparent absence of such self-targeting is particularly important for the development of multi-shRNA lentiviral vectors without titer reduction. However, placing many tight RNA structures in the vector genome may negatively influence the titer by other means. For instance, reverse transcription is very sensitive to excessively stable RNA structure ([@B54]) and RNA polymerase II transcription may pause at sites where the RNA product folds into stable hairpin structures ([@B55]). 
We did indeed observe that four shRNA cassettes reduce the lentiviral vector titer (ter Brake, unpublished data). Destabilizing the introduced shRNAs may avoid such vector problems, and provide additional benefits for cloning and sequencing of inverted repeat sequences ([@B56]). In our target model system, we mutated the antisense strand of the shRNA hairpin, leaving the sense target sequence intact. In the case of a true shRNA expression cassette, modifications will be made in the sense (target) strand to leave the guide/antisense siRNA strand unaltered. The obvious advantage will be reduced complementarity between the target and the siRNA inhibitor. The impact of such mutations on self-targeting is likely to depend on the position and type of mismatches that are introduced ([@B57],[@B58]). It is therefore impossible to make general rules for shRNA design and destabilization as each hairpin RNA structure will have its unique characteristics as target and effector in the RNAi mechanism. Here we demonstrate a \u0394*G* window for shRNA-Pol destabilization without activating RNAi self-targeting, which may provide a guideline for other shRNAs. Positional effects should be considered, and hairpins may be destabilized to \u0394*G* = \u221225\u2009kcal/mol as long as the target 3\u2032 end remains base-paired. It is too early to define more general guidelines for structured RNA motifs other than the man-made, perfectly base-paired shRNA hairpins, as natural RNA structures differ significantly in their topology and architecture.\n\nSUPPLEMENTARY DATA\n==================\n\nSupplementary Data are available at NAR Online.\n\n###### \\[Supplementary Material\\]\n\nThis research was sponsored by The Netherlands Organisation for Health Research and Development (ZonMw; VICI grant) and The Netherlands Organization for Scientific Research (NWO-CW; TOP grant). 
Funding to pay the Open Access publication charges for this article was provided by the VICI grant.\n\n*Conflict of Interest statement*. None declared.\n"], ["Introduction {#s1}\n============\n\nSepsis is a serious systemic inflammatory response caused by bacterial, viral and fungal infection, and is one of the major causes of death in critical care patients ([@B34]). Data from developed countries estimate more than 30 million cases of sepsis and about 20 million cases of severe sepsis globally every year, which may result in more than 5 million deaths. Sepsis detection mainly depends on clinical diagnosis, but early clinical characteristics are not specific, and the lack of timely and reliable early warning diagnosis indicators is an important reason behind the high mortality among sepsis patients ([@B30]).\n\nDuring the pathophysiological process of sepsis, pathogens and their toxins invade the vascular circulation, activate the host's cellular immune system, generate cytokines and cause systemic inflammatory response syndrome ([@B35]). Sepsis can affect various organs and systems of the body, causing necrosis and dysfunction of tissues and cells, and even multiple organ dysfunction syndrome ([@B17]). Vascular endothelial cell damage, platelet function and immune dysfunction are key factors affecting disease progression ([@B18]; [@B16]). Thus, resolving knowledge of pathogen invasion in sepsis will benefit diagnosis and treatment to reduce mortality of patients.\n\nTreatment of sepsis is mostly aimed at pathogen control, including the early use of broad-spectrum antibiotics ([@B50]). Removing the pathogen toxin is also very important. Lipopolysaccharide (LPS) (Gram-negative bacteria), lipoteichoic acid (LTA) (Gram-positive bacteria) and mannan (PLM) (fungi) all bind to toll-like receptors (TLRs) of innate immune cells ([@B45]). 
The lipid part of LPS can bind to TLR4 on many cells, LTA can bind to TLR2 and TLR6, and PLM can bind to TLR2, TLR4 and TLR6, activating the nuclear factor-\u03baB (NF-\u03baB) signal pathway and the subsequent pro-inflammatory/anti-inflammatory cytokine response ([@B36]). Blocking the pathogen toxin is therefore a critical step for preventing disease progression in sepsis patients.\n\nIn addition, studies have shown that the susceptibility and prognosis of sepsis are genetically related to the host ([@B6]; [@B33]). Heterogeneous host genome variation plays an important role in sepsis patients, suggesting another critical factor for diagnosis and treatment of sepsis. Genomic variation can produce gene expression differences among individuals. The location of expression quantitative trait loci (eQTLs) has been analyzed to resolve the relationship between sepsis and the host immune response and various signaling pathways ([@B8]). However, most reported single nucleotide polymorphisms (SNPs) are in non-coding regions and have no obvious biological significance. The development of genetics and epigenetics research has revealed that some SNPs can influence RNA modification to control RNA secondary structure or regulate RNA protein interaction, modifying the function of enhancers, silencers and exon splicing ([@B32]; [@B24]; [@B42]). These studies show that some SNPs as well as eQTLs affect the stability and function of RNA, which is the basic mechanism of host genome variation associated with the disease.\n\nMethylation at the N^6^ position of adenosine (N^6^-methyladenosine, m^6^A) is part of an important regulatory mechanism affecting RNA stability and functionality ([@B47]). In eukaryotes, m^6^A is the most common post-transcriptional modification of RNA; it can affect the specific recognition between protein complexes and RNA and thus affect the intracellular transport, splicing, degradation and translation of mRNA, finally regulating gene expression ([@B11]; [@B1]). 
Key regulatory factors of m^6^A RNA modification, such as METTL3, METTL14 and FTO, can control the occurrence and development of many diseases by regulating the level of m^6^A RNA modification ([@B21]; [@B20]; [@B41]; [@B48]).\n\nm^6^A-SNPs can be considered an important functional genetic variation, providing new clues for understanding the molecular mechanism of genetic variation. A large number of m^6^A-SNPs related to bone mineral density, periodontitis, rheumatoid arthritis and coronary heart disease have been identified through integration analysis of the m^6^A-SNP list ([@B26]; [@B27]; [@B28]; [@B22]). These studies integrated GWAS data with m^6^A methylation data to assess the role of m^6^A in disease. However, there is no report on the role of m^6^A modification in the development of sepsis. Identification of m^6^A-eQTLs associated with sepsis will extend our knowledge of the genetic mechanism of m^6^A modification in sepsis. We therefore used an open eQTL data set to explore the potential role of m^6^A-eQTLs in the pathogenesis of sepsis.\n\nMaterials and Methods {#s2}\n=====================\n\nList of m^6^A-SNPs {#s2_1}\n------------------\n\nA list of m^6^A-SNPs was obtained from the m^6^Avar database (), with high (13,703 SNPs), medium (54,222) and low (245,076) confidence levels ([@B49]). SNPs were obtained from dbSNP and TCGA. Other molecular interaction databases including starBase2 and CLIPdb were used during construction of this database.\n\nSepsis-Related eQTLs {#s2_2}\n--------------------\n\neQTL data were obtained from a published article ([@B8]). In that study, 265 peripheral blood leukocyte samples from patients with sepsis were collected and their transcriptional profiles were detected. Genomic SNPs determining gene expression differences between cases were mapped as eQTLs, divided into cis- and trans-eQTLs. cis-eQTLs were restricted to 1 Mb regions between the SNP and associated probe. 
There were 644,390 SNPs associated with 17,347 mapping probes in sepsis patients.\n\nIdentification of m^6^A-eQTLs in Sepsis and Gene Ontology (GO) Enrichment Analysis {#s2_3}\n----------------------------------------------------------------------------------\n\nm^6^A-SNPs obtained from the m^6^Avar database were compared with previously described cis-eQTLs and trans-eQTLs to obtain m^6^A-cis-eQTLs and m^6^A-trans-eQTLs. A Manhattan plot was constructed according to p-values from statistical analysis of the original eQTLs. Since there were far fewer m^6^A-trans-eQTLs, corresponding genes were only analyzed for m^6^A-cis-eQTL locations. The number of m^6^A-cis-eQTLs possessed by these genes was determined. GO enrichment analysis was performed using the DAVID tool () with the whole human genome as background. The GAD-DISEASE CLASS, GO biological process, GO molecular function, GO cellular component and KEGG pathways were used to classify these genes according to their functional annotation.\n\nResults {#s3}\n=======\n\nSepsis-Related m^6^A-EQTLs Distribution Pattern {#s3_1}\n-----------------------------------------------\n\nThe database defined an m^6^A-SNP as one in which the typical sequence motif for m^6^A modification, such as DRACH, was changed ([@B49]). High-confidence m^6^A-SNPs were identified from seven miCLIP and two PA-m^6^A-seq experiments, suggesting SNPs located close to the m^6^A site that may destroy the DRACH motif, such as D(A/G/U) to C, R(G/A) to C/T, A to C/G/U, C to G/A/U and H(A/C/U) to G. Medium-confidence m^6^A-SNPs were selected from 244 MeRIP-Seq experiments, which identify the intersection between SNP and m^6^A sites. Whether or not a SNP changes the DRACH motif or other characteristic sequences modified by m^6^A methylation was predicted by the random forest model. Low-confidence m^6^A-SNPs were predicted from the genome by random forest algorithms. 
We found 15,720 m^6^A-cis-eQTLs and 381 m^6^A-trans-eQTLs by integrated analysis of eQTLs associated with sepsis and more than 300,000 SNPs in the m^6^Avar database ([**Figures 1A, B**](#f1){ref-type=\"fig\"}, [**Table S1**](#SM1){ref-type=\"supplementary-material\"}). According to the original sepsis eQTL study, a threshold of 1.00E-04 was used to test the association between m^6^A-eQTLs and gene expression in sepsis. As shown in [**Figure 1A**](#f1){ref-type=\"fig\"}, the p-values of m^6^A-cis-eQTLs ranged from 2.00E-04 to 5.19E-065, whereas p-values of m^6^A-trans-eQTLs ranged from 2.00E-09 to 5.93E-058. The m^6^A-cis-eQTLs were distributed on each chromosome with a similar pattern, displaying a gap close to the centromere. Among all m^6^A-cis-eQTLs, rs10239340 on chromosome 7 had the highest significance at p = 5.19E-065, followed by rs9849087 on chromosome 3 (p = 6.86E-062) ([**Table 1**](#T1){ref-type=\"table\"}). Among the m^6^A-trans-eQTLs, rs10876864 on chromosome 12 had the highest significance at p = 5.93E-058, followed by rs11171739 on chromosome 12 (p = 1.56E-041) ([**Table 2**](#T2){ref-type=\"table\"}).\n\n![Manhattan plot of genome\u2010wide identified m^6^A-eQTLs in sepsis patients. The Manhattan plot displayed \u2212log10 (p values) for each of m^6^A-eQTLs associated with sepsis. 
**(A)** m^6^A-cis-eQTLs **(B)** m^6^A-trans-eQTLs.](fgene-11-00007-g001){#f1}\n\n###### \n\nThe top 20 most significant m^6^A-cis-eQTLs in sepsis.\n\n SNP Chr Position Gene p value FDR\n ------------ ----- ---------- ------------------------------------------------------ ---------- ----------\n rs10239340 7 1.29E+08 Interferon regulatory factor 5(IRF5) 5.19E-65 4.22E-58\n rs9849087 3 1.21E+08 Golgin B1(GOLGB1) 6.86E-62 1.86E-55\n rs8070859 17 15887789 Zinc finger SWIM-type containing 7(ZSWIM7) 1.22E-60 2.28E-54\n rs9891938 17 15915072 Zinc finger SWIM-type containing 7(ZSWIM7) 3.52E-60 4.77E-54\n rs7313235 12 10132283 C-type lectin domain family 12 member A(CLEC12A) 1.14E-58 1.16E-52\n rs2285583 17 15968143 Zinc finger SWIM-type containing 7(ZSWIM7) 7.68E-58 6.93E-52\n rs7313235 12 10132283 C-type lectin domain family 12 member A(CLEC12A) 1.31E-57 1.07E-51\n rs1039320 5 73927752 Ectodermal-neural cortex 1(ENC1) 1.64E-56 1.02E-50\n rs4792717 17 15948430 Zinc finger SWIM-type containing 7(ZSWIM7) 4.00E-56 2.17E-50\n rs3785628 17 15970682 Zinc finger SWIM-type containing 7(ZSWIM7) 4.00E-56 2.17E-50\n rs7522860 1 1.56E+08 Progestin and adipoQ receptor family member 6(PAQR6) 4.88E-56 2.48E-50\n rs11150882 17 259648 Chromosome 17 open reading frame 97(C17orf97) 6.14E-56 2.93E-50\n rs2025577 1 1.56E+08 Progestin and adipoQ receptor family member 6(PAQR6) 1.17E-55 5.30E-50\n rs10474420 5 73934274 Ectodermal-neural cortex 1(ENC1) 1.84E-55 7.85E-50\n rs7313235 12 10132283 C-type lectin domain family 12 member A(CLEC12A) 2.15E-54 8.73E-49\n rs9884018 3 1.22E+08 Golgin B1(GOLGB1) 1.58E-52 5.85E-47\n rs10942714 5 73922463 Ectodermal-neural cortex 1(ENC1) 2.32E-52 8.20E-47\n rs10239340 7 1.29E+08 Interferon regulatory factor 5(IRF5) 2.93E-51 9.15E-46\n rs6864196 5 73944444 Ectodermal-neural cortex 1(ENC1) 5.98E-51 1.80E-45\n rs933489 1 1.56E+08 Progestin and adipoQ receptor family member 6(PAQR6) 5.72E-50 1.60E-44\n\n###### \n\nThe top 20 most significant m^6^A-trans-eQTLs in 
sepsis.\n\n SNP chr SNP position Probe chr Probe position Gene p value FDR\n ------------ ----- -------------- ----------- ---------------- ------------------------------------------- ---------- ----------\n rs10876864 12 56401085 17 8464669 LOC728823 5.93E-58 5.75E-48\n rs11171739 12 56470625 17 8464669 LOC728823 4.82E-51 1.56E-41\n rs705699 12 56384804 17 8464669 LOC728823 2.06E-48 5.00E-39\n rs1384 12 69747834 5 43175395 Zinc finger protein 131(ZNF131) 2.56E-34 2.07E-25\n rs10784774 12 69737879 5 43175395 Zinc finger protein 131(ZNF131) 2.31E-33 1.60E-24\n rs2168029 12 69734641 5 43175395 Zinc finger protein 131(ZNF131) 3.90E-33 2.52E-24\n rs6581889 12 69757429 5 43175395 Zinc finger protein 131(ZNF131) 6.04E-28 2.54E-19\n rs10024529 4 1363886 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 7.88E-18 1.63E-09\n rs1680032 4 1243573 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 8.09E-18 1.63E-09\n rs1732115 4 1244416 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 8.09E-18 1.63E-09\n rs1265923 4 1209174 19 44085725 LOC390940 1.40E-17 2.78E-09\n rs1265923 4 1209174 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 4.23E-17 7.73E-09\n rs730830 4 1240091 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 4.64E-17 8.06E-09\n rs3817604 4 1291337 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 4.65E-17 8.06E-09\n rs28429103 4 1320023 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 4.65E-17 8.06E-09\n rs10024529 4 1363886 19 44085725 LOC390940 1.92E-16 3.11E-08\n rs17164229 4 1078596 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 4.65E-16 7.40E-08\n rs1721 21 46349496 15 75890708 Snurportin 1 4.91E-16 7.67E-08\n rs1250116 4 1224587 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 9.58E-16 1.45E-07\n rs884421 4 1227469 19 41066160 Spectrin beta, non-erythrocytic 4(sptbn4) 9.58E-16 1.45E-07\n\nWe then surveyed the relationship between eQTLs associated with sepsis and controllers of m^6^A modification such as writers, 
erasers and readers (METTL3, METTL14, WTAP, FTO, ALKBH5, YTHDC1, YTHDC2, YTHDF1, YTHDF2, YTHDF3) ([**Table S2**](#SM2){ref-type=\"supplementary-material\"}). There was one eQTL in *YTHDF3* (rs7464, p = 2.42E-04), and 20 eQTLs in *YTHDC2* (p-values ranging from 2.64E-07 to 2.15E-04). Other m^6^A controllers were not associated with any sepsis-related eQTLs. However, none of the m^6^A controllers was identified as a significant gene for m^6^A-eQTLs.\n\nSepsis-Related m^6^A-cis-eQTLs Map to More Than 1,000 Genes {#s3_2}\n-----------------------------------------------------------\n\nTo investigate the biological function of m^6^A associated with sepsis, we analyzed genes corresponding to m^6^A-cis-eQTLs. There were 1321 genes with an average of 11.8 m^6^A-cis-eQTLs, ranging from 1 to 195 ([**Table S3**](#SM3){ref-type=\"supplementary-material\"}). There were 1038 genes with more than one m^6^A-cis-eQTL, and 52 genes had more than 50 m^6^A-cis-eQTLs. In particular, 10 genes had at least 100 m^6^A-cis-eQTLs: *RAD51C*, *LOC100129668*, *LY6G5C*, *TRIM27*, *LOC642073*, *RFP*, *ABCC5*, *CLEC12A*, *WDR6* and *CAT* ([**Table 3**](#T3){ref-type=\"table\"}). For these 10 genes, the p-value of the most significant m^6^A-cis-eQTL ranged from 2.91E-012 to 1.14E-058. 
*CLEC12A* had 121 m^6^A-cis-eQTLs with highly significant p-values.\n\n###### \n\nThe top 10 genes with the most number of m^6^A-cis-eQTLs in sepsis.\n\n Gene Freq SNP Chromosome Position p value FDR\n -------------- ------ ------------ ------------ ---------- ---------- ----------\n RAD51C 197 rs12935851 17 56600244 1.11E-31 2.19E-27\n LOC100129668 190 rs2523685 6 31426256 3.39E-19 1.43E-15\n LY6G5C 178 rs805290 6 31648403 6.27E-13 1.05E-09\n TRIM27 157 rs3132377 6 28885974 6.91E-15 1.58E-11\n LOC642073 155 rs6926737 6 32375745 6.75E-20 3.04E-16\n RFP 145 rs6912843 6 28904162 2.91E-12 4.38E-09\n ABCC5 128 rs7624838 3 1.84E+08 1.35E-13 2.54E-10\n CLEC12A 121 rs7313235 12 10132283 1.14E-58 1.16E-52\n WDR6 119 rs3212 3 49145741 6.25E-28 7.90E-24\n CAT 100 rs11032695 11 34447586 7.35E-36 2.66E-31\n\nGO and KEGG Enrichment Analysis of m^6^A-cis-eQTLs Genes {#s3_3}\n--------------------------------------------------------\n\nTo further survey the biological function of genes linked to m^6^A-cis-eQTLs associated with sepsis, we performed GO enrichment analysis in a whole-genome background ([**Figure 2**](#f2){ref-type=\"fig\"}, [**Table S3**](#SM3){ref-type=\"supplementary-material\"}). Among biological processes, genes containing m^6^A-cis-eQTLs were enriched in diverse pathways such as phagocytosis, negative regulation of gene expression, and platelet degranulation. Among cellular components, genes corresponding to m^6^A-cis-eQTLs were enriched in different compartments such as cytosol, membrane, lysosome and platelet alpha granule lumen.\n\n![GO enrichment analysis displayed genes respond to m^6^A-cis-eQTLs are enriched in various biological processes (BP), molecular function (MF) and cellular component (CC). 
Note that platelet degranulation (BP) and platelet alpha granule lumen (CC) are significant terms.](fgene-11-00007-g002){#f2}\n\nTo elucidate the molecular pathways involving genes corresponding to m^6^A-cis-eQTLs, we carried out KEGG pathway enrichment analysis ([**Figure 3**](#f3){ref-type=\"fig\"}, [**Table S3**](#SM3){ref-type=\"supplementary-material\"}). Results revealed that these genes were enriched in multiple pathways such as Lysosome, *Staphylococcus aureus* infection, Tuberculosis, and Platelet activation. Among these, Lysosome was the most significant with p-value = 1.05E-05, followed by *S. aureus* infection with p-value = 6.44E-04.\n\n![KEGG analysis revealed that genes corresponding to m^6^A-cis-eQTLs are enriched in diverse pathways. Note that the *Staphylococcus aureus* infection pathway is highly significant.](fgene-11-00007-g003){#f3}\n\nTo characterize the relationship between sepsis and genes corresponding to m^6^A-cis-eQTLs, we performed enrichment analysis using the disease database. Results showed that the sepsis-related genes *TNF*, *HSPA1A*, *HSPA1B*, *TREM1*, *SOD2* and *MIF* correspond to m^6^A-cis-eQTLs.\n\nm^6^A-cis-eQTL Genes Related to Platelet Degranulation {#s3_4}\n------------------------------------------------------\n\nWe identified 17 genes related to platelet degranulation, *SERPING1*, *CD63*, *CTSW*, *TIMP1*, *CD9*, *VWF*, *ORM1*, *LAMP2*, *APP*, *CLEC3B*, *ITIH4*, *ABCC4*, *SERPINA1*, *CFD*, *QSOX1*, *ORM2* and *SRGN*, corresponding to m^6^A-cis-eQTLs associated with sepsis ([**Table S3**](#SM3){ref-type=\"supplementary-material\"}).\n\nm^6^A-cis-eQTL Genes Related to *S. aureus* Infection {#s3_5}\n-----------------------------------------------------\n\nWe identified 12 genes related to *S. 
aureus* infection, *C3AR1*, *FCGR2B*, *FCAR*, *C5*, *ITGB2*, *FPR2*, *C2*, *HLA-DPB1*, *CFD*, *HLA-DOB*, *ITGAM* and *PTAFR*, corresponding to m^6^A-cis-eQTLs associated with sepsis ([**Table S3**](#SM3){ref-type=\"supplementary-material\"}).\n\nDiscussion {#s4}\n==========\n\nDisease-related genes show genetic variation in the population, which can affect the occurrence and development of diseases, including cancer, sepsis and infectious diseases. For a long time, research on the functional mechanism of these genetic variations has focused on changes in biological functional activity of the encoded proteins. However, a large number of SNPs found in genome-wide association analysis do not correspond to functional regions of proteins. Similarly, many eQTLs are not located in promoter regions that regulate gene expression. In recent years, a large number of m^6^A methylation modifications have been identified on mRNA of eukaryotes ([@B47]). These regulate a series of biological processes by affecting the splicing, translocation, degradation and translation of mRNA involved in the occurrence and development of many diseases. Some SNPs that do not change the coded amino acid can regulate mRNA function by affecting the level of m^6^A modification, thus changing protein abundance. m^6^A-SNPs can change m^6^A modification levels to affect mRNA degradation rate; for example, a change in mRNA secondary structure caused by altered m^6^A modification can affect mRNA binding with the translation machinery. Similarly, m^6^A-SNPs can also affect mRNA transport from nucleus to cytoplasm. When m^6^A-SNPs are located at splicing sites, they can affect production of mature mRNA with correct biological function. Thus, m^6^A-SNPs can cumulatively regulate the function of key regulatory genes in the process of disease occurrence and development through multiple effects.\n\nThe role of m^6^A modification on mRNA in the pathogenesis of sepsis has not previously been reported. 
Therefore, the study of m^6^A-eQTLs might aid in resolving the underlying mechanism of the pathogenesis of sepsis. By analyzing results of published research on eQTLs associated with sepsis and comparing them with overlapping m^6^A-SNPs in the m^6^Avar database, we identified a large number of m^6^A-eQTLs related to sepsis, including 15,720 cis-eQTLs and 381 trans-eQTLs. As we expected, these m^6^A-eQTLs were distributed across all chromosomes, suggesting that m^6^A methylation of mRNA plays an important role in sepsis as in other diseases.\n\nAmong genes corresponding to m^6^A-eQTLs, we found some suggested as key players in the pathogenesis of sepsis. *HSPA1B*, with 61 m^6^A-cis-eQTLs, also named *HSP70*, is involved in the inflammatory response, which is an important biological step in the development of sepsis ([**Table S2**](#SM2){ref-type=\"supplementary-material\"}). Research on different phases of sepsis found that HSPA1B can be considered a serum marker for the acute proinflammatory phase ([@B13]). A study of multiple organ dysfunction syndrome in sepsis found that increased HSPA1B has an anti-inflammatory effect ([@B40]). In rats, raloxifene prevents severe sepsis through induction of HSPA1B with an anti-inflammatory effect ([@B37]). Hesperidin can protect against lung injury induced by sepsis, also through induction of HSPA1B. Another protein, KNK43, also takes part in the HSPA1B pathway to coordinate the inflammatory response ([@B46]). There is evidence that heat stress induces HSPA1B to prevent lethal sepsis ([@B25]). These results indicate that m^6^A-eQTLs exist in the key regulatory genes of sepsis and m^6^A methylation is closely involved in the process of sepsis, while genetic variation may cause a large number of individual differences through the mechanism of m^6^A methylation of mRNA.\n\nWe used GO enrichment analysis to characterize critical biological processes associated with an abundance of sepsis m^6^A-eQTLs. 
The platelet degranulation process was associated with 17 genes identified in this study. It is well accepted that platelet degranulation is a typical biomarker of sepsis. In the development of sepsis, platelet dysfunction plays a critical role in tissue injury ([@B4]). Early onset neonatal sepsis can be predicted by the platelet/lymphocyte ratio ([@B3]). A high level of platelet-derived growth factor B (PDGFB) is found in survivors of sepsis ([@B12]), and PDGFB can reduce the mortality of sepsis by blocking inflammatory responses ([@B12]). In sepsis-associated liver injury, the AST/platelet ratio might be an early predictor ([@B12]). The underlying mechanism of platelet dysfunction in sepsis remains to be elucidated. Here, we revealed that genes corresponding to m^6^A-eQTLs are enriched in the platelet degranulation process, suggesting that m^6^A methylation is closely involved in the pathologies of sepsis. Published literature supports a relationship between sepsis and several of the 17 platelet degranulation-related genes corresponding to m^6^A-eQTLs. ORM2 interacts with TLR2 signaling involved in the pathobiology of sepsis ([@B39]). CFD and VWF are important biomarkers of acute mortality for severe sepsis ([@B5]; [@B15]). TIMP1 could be used as a prognostic marker for the onset of sepsis ([@B31]). Sepsis is attenuated by inhibition of ABCC4 in rats ([@B43]). Reduction of CD63 increases the mortality rate of sepsis in mice through its function in the immune response ([@B44]; [@B2]). The above evidence strongly supports our view that m^6^A methylation plays an extremely important role in the pathological basis of sepsis. We suggest that m^6^A-eQTLs may be potential predictors of the different clinical characteristics of sepsis caused by genetic variation in individuals.\n\nKEGG pathway enrichment analysis in this study revealed 12 genes corresponding to m^6^A-eQTLs associated with sepsis enriched in the *S. aureus* infection pathway. *S. 
aureus* is one of the major pathogenic bacteria in sepsis ([@B23]). Among the 12 genes corresponding to sepsis m^6^A-eQTLs in this pathway, platelet activating factor receptor (*PTAFR*) is induced in sepsis patients infected with Gram-negative bacteria ([@B10]). Inhibition of PTAFR can block severe sepsis ([@B29]). FPR2 is also induced in the early phase of sepsis and functions in cerebral inflammation ([@B14]; [@B38]). ITGB2 has been identified as an important factor in the inflammatory response to sepsis ([@B19]). FCAR, a typical innate receptor for bacteria, is suggested to be an important protector against sepsis through mediation of bacterial phagocytosis ([@B7]; [@B9]). Such a high proportion of m^6^A-eQTLs related to *S. aureus* response genes indicates that m^6^A methylation plays an extremely important role in the occurrence and development of sepsis. The m^6^A-eQTLs we identified will be of value in the prognosis and diagnosis of sepsis.\n\nIn conclusion, we identified a large number of m^6^A-eQTLs potentially having critical functions in the pathogenesis and development of sepsis. These SNPs may contribute to the effect of genetic variation on the different outcomes of the disease. By mining the published literature, we found that these m^6^A-eQTLs are enriched in the platelet degranulation process and *S. aureus* infection pathway. Both are critical biological processes controlling the pathogenesis and development of sepsis, suggesting that mRNA m^6^A methylation plays a crucial role in sepsis.\n\nData Availability Statement {#s5}\n===========================\n\nAll datasets generated for this study are included in the article/[**Supplementary Material**](#SM1){ref-type=\"supplementary-material\"}.\n\nAuthor Contributions {#s6}\n====================\n\nXS and NL designed the project and wrote the manuscript. XS performed bioinformatics analysis. 
YD, GT and YL assisted with data analysis and revised the manuscript.\n\nConflict of Interest {#s7}\n====================\n\nThe authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.\n\nFunding {#s8}\n=======\n\nThis work was supported by the Fujian Provincial Health and Family Planning Young and Middle Aged Talents Training Project (2018-ZQN-56) and the Quanzhou Science and Technology Plan Project (2018N025S).\n\nSupplementary Material {#s9}\n======================\n\nThe Supplementary Material for this article can be found online at: \n\n[^1]: Edited by: Baolei Jia, Chung-Ang University, South Korea\n\n[^2]: Reviewed by: Hui-Bin Huang, Tsinghua University, China; Feng Ren, Ningbo No. 2 Hospital, China; Micaela Daiana Garcia, CONICET Institute of Cell Biology and Neuroscience (IBCN), Argentina\n\n[^3]: This article was submitted to Evolutionary and Genomic Microbiology, a section of the journal Frontiers in Genetics\n"], ["1. Introduction {#sec1}\n===============\n\nSkeletal muscle injury can be caused by a variety of conditions such as direct trauma, disuse, ischemia, exercise, toxins, and genetic diseases. To face these challenges, skeletal muscle has developed a remarkable regenerative capacity, which relies on muscle stem cells, named satellite cells. Skeletal muscle regeneration is a tightly regulated process during which quiescent satellite cells are activated and become proliferating myoblasts, which will differentiate and fuse to form multinucleated myotubes (newly formed muscle fiber) \\[[@B1]\\]. The coordination of the myogenesis process (formation of new muscle tissue) involves the cooperation of numerous other cellular and molecular components \\[[@B2]\\]. 
In particular, the onset, development, and resolution of the inflammatory response play an instrumental role in the regulation of myogenesis.\n\nMonocytes and macrophages are predominant myeloid cells that chronologically accumulate in skeletal muscle at the onset of injury-induced inflammation \\[[@B3]\\]. Substantial evidence indicates that macrophages are key regulators of different biological processes involved in skeletal muscle regeneration, such as myogenesis, fibrosis, inflammation, and revascularization \\[[@B3]--[@B9]\\]. On the other hand, in chronic degenerative conditions, the excessive and disorganized influx of macrophages stimulates muscle necrosis, fibrosis, and defective muscle repair. Therefore, the spatiotemporal regulation of inflammation is vital for an effective regeneration of skeletal muscle.\n\nIn recent years, new discoveries have revealed that the plasticity, heterogeneity, and roles of macrophages in skeletal muscles are much more complex than anticipated. This review will discuss these novel insights into the role of macrophages in muscle homeostasis, regeneration, and diseases with a particular focus on Duchenne muscular dystrophy (DMD). Promising strategies targeting macrophage polarization in physiopathological conditions will also be discussed.\n\n2. Origin and Recruitment of Monocytes and Macrophages {#sec2}\n=====================================================\n\nNumerous tissues contain long-lived resident macrophages that originate from the yolk sac during development \\[[@B10]\\]. In steady state, these tissue-resident macrophages self-renew through in situ proliferation or are replenished by blood monocytes \\[[@B11]--[@B13]\\]. Resident macrophages are observed in healthy skeletal muscles where they regulate tissue homeostasis. In rats, resident macrophages are identified by the marker ED2, while infiltrating monocytes/macrophages are defined by the expression of the marker ED1. 
In humans, resident macrophages were shown to largely coexpress CD11b and CD206 \\[[@B14]\\]. Contrary to infiltrating macrophages, ED2^+^ resident macrophages do not contribute to phagocytosis \\[[@B15]\\]; instead, it is suggested that they act as sentinels that are readily activated by damage-associated molecular patterns (DAMPs) released during muscle injury to facilitate the invasion of circulating leukocytes. However, the literature on these resident cells is limited, and further research is needed to clearly understand their roles in healthy and regenerating skeletal muscle.\n\nAfter an injury, activated monocytes originating from the bone marrow adhere to the blood vessels, roll, and migrate to damaged sites, where they start differentiating into macrophages. In mice, two main monocyte subsets have been described according to their mechanism of extravasation and their level of expression of the protein Ly6C \\[[@B16], [@B17]\\]. The proinflammatory Ly6C^hi^ population recruited via the C-C motif chemokine receptor 2 axis (CCR2/CCL2) preferentially accumulates during the acute phase of inflammation, while the CX3C chemokine receptor-1- (CX3CR1-) dependent Ly6C^lo^ subset appears later and exhibits anti-inflammatory properties. Similar monocyte subsets have also been identified in humans using the markers CD14 and CD16. CD14^hi^CD16^lo^ monocytes correspond to the Ly6C^hi^ monocytes in mice, while CD14^lo^CD16^hi^ monocytes relate to the Ly6C^lo^ monocyte profile \\[[@B16]\\].\n\nThe mechanism of monocyte recruitment appears to be specific to the tissue and the nature of the insult. For instance, both Ly6C^hi^ and Ly6C^lo^ monocytes were shown to sequentially invade the injured tissue after myocardial infarction using their CCR2 or CX3CR1 receptor, respectively \\[[@B18]\\]. On the other hand, it has been shown that only the Ly6C^hi^ subtype is recruited during sterile skeletal muscle injury and thereafter switches to the Ly6C^lo^ phenotype \\[[@B17]\\]. 
The phagocytosis of apoptotic neutrophils by macrophages was shown to partially contribute to this switch \\[[@B17]\\]; however, it is likely that many other cellular and chemical interactions present in the dynamic regenerative microenvironment also contribute to this process. In addition to their transition from Ly6C^hi^ monocytes/macrophages, the Ly6C^lo^ cells also accumulate from local proliferation \\[[@B19]\\]. This finding was also observed in rats where the accumulation of ED1^+^ and ED2^+^ macrophages was shown to be partially mediated by local proliferation, especially when invasion of circulating monocytes is reduced by injection of liposome-encapsulated clodronate \\[[@B20]\\]. Notably, while the different subsets of macrophages were suggested to accumulate sequentially in the injured tissue, it is important to notice that both subsets of macrophages could be simultaneously present in acute regenerating muscles \\[[@B21]\\], a phenomenon which is exacerbated in chronic degenerative muscle diseases such as DMD \\[[@B22]\\].\n\n3. Macrophage Subsets and Polarization {#sec3}\n======================================\n\nA general classification suggests that macrophages can be immunologically classified into two main subsets according to their specific functions: the \"classically activated\" M1 macrophages, which are present in the inflammatory period and associated with phagocytosis, and the \"alternatively activated\" M2 macrophages, accumulating at the site of injury once necrotic tissue has been removed and participating in the regeneration and remodelling process. *In vitro*, the M2 phenotype has been further classified into three main subsets---M2a, M2b, and M2c---each of which requires specific polarization cues \\[[@B23], [@B24]\\]. 
The alternatively activated M2a macrophages arise from exposure to interleukin-4 (IL-4) and IL-13, the M2b subtype is polarized by IL-1 receptor ligands, and the M2c phenotype is promoted by IL-10 and glucocorticoids \\[[@B25], [@B26]\\]. In mice, the M2 macrophages are identified by the expression of the pan-macrophage marker F4/80 and alternative activation markers such as Fizz-1 and Ym1 \\[[@B27]\\]. Of note, Arginase-1 was considered a specific marker for M2 macrophages; however, it is also expressed in the spectrum of M1 macrophage polarization \\[[@B28]\\]. In humans, M2 macrophages express the pan-macrophage marker CD68 and alternative activation markers such as CD163 and/or CD206 \\[[@B27]\\].\n\nRecent insights suggest that this classification based on specific activating factors *in vitro* substantially underestimates the diversity of macrophage subsets. Accordingly, a study investigating the transcriptional program of macrophages showed that there is a wide spectrum of macrophage activation states \\[[@B29]\\]. The authors showed that while the bipolar activation state is maintained when the macrophages are stimulated with factors classically associated with M1 or M2 polarization (e.g., TNF-*\u03b1* vs. IL-4), it becomes much more complex when other factors such as fatty acids or a combination of molecules associated with chronic inflammation are used. From the 29 different conditions tested, the authors identified 10 major clusters of activation \\[[@B29]\\]. These different conditions *in vitro* only give a glimpse of the complexity of the microenvironment of macrophages during muscle regeneration *in vivo*. Indeed, macrophages interact with a fluctuating network of hundreds of different molecular, physical, and cellular components that affect their phenotype. For instance, after their extravasation into the injured tissue, monocytes attach to the extracellular matrix (ECM), which continuously evolves during muscle regeneration. 
Notably, components of the ECM, such as collagen and fibrinogen, were shown to stimulate macrophage phagocytosis and expression of proinflammatory factors, respectively \\[[@B30]\\]. Alternatively, attachment of macrophages through their *\u03b1*4*\u03b2*1 integrin receptor to ECM components (fibronectin and vascular cell adhesion molecule-1 (VCAM-1)) stimulates their transition toward the anti-inflammatory phenotype by activating Rac2 signalling \\[[@B31]\\]. Moreover, macrophage polarization is also sensitive to mechanical stress. Low-frequency mechanical stretch pushes macrophages toward the anti-inflammatory phenotype, while high-frequency strains maintain macrophages in their proinflammatory state \\[[@B32]\\]. These results suggest that macrophage activation and polarization *in vivo* are processes that are much more complex than what has been described so far. Accordingly, a recent analysis of the macrophage transcriptional signature after an *in vivo* muscle injury induced by cardiotoxin showed that Ly6C^hi^ and Ly6C^lo^ macrophages only partially overlap with M1 and M2 gene expression patterns, respectively \\[[@B19]\\]. Instead, the authors showed that the time elapsed after injury was the predominant driving force regulating the macrophage gene expression profile, suggesting that a global and coordinated change in the microenvironment components is required to regulate macrophage polarization. 
The authors identified four key features that account for the changes in gene expression in macrophages during the course of muscle regeneration: firstly, an early expression of genes involved in acute inflammation (e.g., S100A8/9, lipocalin-2, haptoglobin, formyl peptide receptor-1, and leukotriene B4 receptor-1); secondly, a metabolic shift from glycolysis to glutamine and oxidative metabolism-associated genes (e.g., glutamate synthase-1, glycerol-3-phosphate dehydrogenase 2, and superoxide dismutase 2); thirdly, a transient increase in genes associated with cell proliferation (e.g., cyclin-D1 and -A2, many members of the minichromosome maintenance (mcm-2, -3, -4, -5, -6, and -7), DNA ligase-1, replication factor C subunit-1, and ribonucleotide reductase catalytic subunit-M1 and -M2); and fourthly, an increase in the expression of ECM genes (e.g., fibrillin-1, decorin, periostin, lumican, osteonectin, and biglycan). These different clusters of genes could be used to identify new markers to characterize macrophage heterogeneity in skeletal muscle regeneration.\n\nOverall, the M1 and M2 macrophage nomenclature is oversimplistic to characterize macrophage polarization *in vivo*, which should rather be considered as a continuum of activation. Recent effort has been made to propose a common framework for the macrophage-activation nomenclature \\[[@B28]\\]. Here, for the sake of clarity, we propose to use a bipolar nomenclature (proinflammatory vs. anti-inflammatory) to describe the two opposite sides of the spectrum of macrophage activation; however, one should keep in mind that the actual activation state of macrophages is much more plastic, heterogeneous, and complex.\n\n4. Macrophages Regulate the Different Biological Processes Implicated in Acute Skeletal Muscle Healing {#sec4}\n======================================================================================================\n\n4.1. 
Macrophages Interact with Other Leukocytes to Regulate Inflammation {#sec4.1}\n------------------------------------------------------------------------\n\nThe inflammatory process is constituted of different types of leukocytes such as mast cells, neutrophils, eosinophils, monocytes/macrophages, and lymphocytes, which have all been shown to act on skeletal muscle regeneration \\[[@B6]\\]. Particularly, monocytes/macrophages emerged as the key cellular component orchestrating leukocyte accumulation and function during the different phases of the inflammatory process, i.e., the onset, development, and resolution stages ([Figure 1](#fig1){ref-type=\"fig\"}).\n\n### 4.1.1. Onset {#sec4.1.1}\n\nAs described previously, resident macrophages are important to sense damage to the tissue and initiate the recruitment of circulating leukocytes. Once activated, resident macrophages secrete chemokines such as cytokine-induced neutrophil chemoattractant 1 (CINC-1) and monocyte chemoattractant protein-1 (MCP-1) that promote the recruitment of neutrophils and monocytes \\[[@B6]\\]. Moreover, it was also observed that a subset of Ly6C^lo^ circulating monocytes was \"crawling\" inside the blood vessels independently of the blood flow, with the help of their receptors CX3CR1 and lymphocyte function-associated antigen-1 (LFA-1) \\[[@B33]\\]. These patrolling monocytes sense tissue damage or infection and transiently invade the tissue as soon as 1\u2009h after an insult (much faster than other circulating leukocytes). At this timepoint, patrolling monocytes are the principal source of TNF-*\u03b1*, which promotes the recruitment of other inflammatory cells. Moreover, patrolling monocytes were also shown to promote the recruitment of neutrophils through prolonged cell-cell contact in the microvasculature \\[[@B34]\\]. This direct physical interaction stimulates neutrophil retention and production of reactive oxygen species (ROS) at the site of the injury.\n\n### 4.1.2. 
Development {#sec4.1.2}\n\nStarting a few hours after the injury, the accumulation of neutrophils in the injured muscle remains elevated for a few days. Neutrophils stimulate host defense and the clearance of cell debris by phagocytosis and by the release of ROS and proteases \\[[@B3]\\]. Accordingly, depletion of neutrophils during acute muscle regeneration leads to persistence of necrotic tissue and delayed regeneration \\[[@B35]\\]. Moreover, a subset of neutrophils was also shown to promote angiogenesis \\[[@B36]\\]. Neutrophils also stimulate the development of the inflammatory process by expressing cytokines such as macrophage inflammatory protein-1*\u03b1* (MIP-1*\u03b1*) and MCP-1 that attract circulating monocytes to the damaged sites \\[[@B37]\\]. Ly6C^hi^ monocytes massively infiltrate the injured muscle, where they play a key role in the development of inflammation by secreting proinflammatory cytokines such as TNF-*\u03b1* that further promote the recruitment of neutrophils and monocytes \\[[@B17], [@B38]\\]. This proinflammatory environment peaks around 48\u2009h after the injury. Thereafter, Ly6C^lo^ monocytes become the predominant subset in the regenerating muscle, in which they play a key role in dampening inflammation.\n\n### 4.1.3. Resolution {#sec4.1.3}\n\nThe phase of resolution of inflammation is not a passive process caused by the decrease in proinflammatory signals; it is an active process that involves a variety of cell types and mediators \\[[@B39]\\]. Ly6C^lo^ anti-inflammatory macrophages actively promote the resolution of inflammation by expressing a wide array of anti-inflammatory cytokines (e.g., IL-4 and IL-13) and by switching their expression of proinflammatory lipids (e.g., prostaglandin-E~2~ (PGE~2~)) to proresolving lipids (e.g., 15*\u0394*-PGJ2) \\[[@B40]\\]. 
These mediators not only reduce proinflammatory signals and ROS production, but also actively stop the recruitment of neutrophils and promote their apoptosis and their nonphlogistic phagocytosis by macrophages \\[[@B39]\\]. Accordingly, the depletion of macrophages during muscle regeneration prolonged the presence of neutrophils in the injured muscle \\[[@B41]\\]. The role of macrophages in the resolution of inflammation is crucial considering that the chronic presence of inflammatory cells has been associated with impaired tissue regeneration. At the late stages of muscle regeneration, macrophages cease the expression of both pro- and anti-inflammatory cytokines and turn to a silenced mode, which precedes the return to homeostasis \\[[@B42]\\]. Overall, monocytes/macrophages play a central role in the regulation of inflammation from the beginning to the end.\n\n4.2. Macrophages Interact with Satellite Cells to Regulate Myogenesis {#sec4.2}\n---------------------------------------------------------------------\n\nProinflammatory macrophages are key regulators of the host defense, and they are typically associated with clearance of cell debris during skeletal muscle repair \\[[@B3], [@B8]\\]. Necrotic fibers may act either as atrophic factors to repress myoblast growth or as physical barriers to prevent myoblast contact, indicating that sufficient infiltration of macrophages might be required for proper regeneration. For instance, using a mouse model deficient in CCR2, which is essential for Ly6C^hi^ monocyte extravasation, it was shown that the drastic reduction of infiltrating monocytes following muscle injury induced by ischemia \\[[@B43]\\], notexin or cardiotoxin \\[[@B17], [@B44]\\], and barium chloride \\[[@B45]\\] is accompanied by altered muscle regeneration. This impaired regeneration was partially mediated by insufficient phagocytosis of necrotic fibers \\[[@B45]\\]. 
However, even after adequate phagocytosis, myofibers failed to efficiently recover when intramuscular macrophages were depleted in a model of notexin-induced injury in mice \\[[@B17]\\].\n\nMacrophages have multiple beneficial roles during muscle regeneration in addition to their participation in the clearance of cell debris. Particularly, the importance of macrophages in the regulation of satellite cells and myoblasts during the myogenesis process is now well defined ([Figure 1](#fig1){ref-type=\"fig\"}). The hypothesis that macrophages promote myogenesis was first supported by experiments showing that macrophage-conditioned medium triggers myoblast proliferation *in vitro* and improves muscle regeneration *in vivo* \\[[@B46], [@B47]\\]. The crucial role of macrophages in stimulating myogenesis is further illustrated in a model of 3D muscle construct *in vitro*, in which the addition of macrophages is necessary to allow the tissue to self-repair after an injury \\[[@B48]\\]. Pioneering work from Chazaud\\'s lab has shown that the release of proinflammatory cytokines by Ly6C^hi^ macrophages promotes myoblast proliferation and inhibits differentiation, while the release of anti-inflammatory cytokines by Ly6C^lo^ macrophages inhibits myoblast proliferation and stimulates their differentiation and fusion \\[[@B17]\\]. The exact cocktail of paracrine factors regulating satellite cell function has not been precisely characterized; however, many molecules secreted by macrophages have been shown to partially mediate these effects. For instance, cytokines highly expressed by proinflammatory macrophages such as interleukin-6 (IL-6) \\[[@B49]\\], TNF-*\u03b1*, and PGE~2~ \\[[@B50]\\] were shown to stimulate satellite cell proliferation. 
Moreover, Ly6C^hi^ macrophages secrete the enzyme ADAMTS1 (A Disintegrin-Like And Metalloproteinase With Thrombospondin Type 1 Motif) that reduces the Notch signalling pathway, leading to increased satellite cell activation and muscle regeneration \\[[@B51]\\]. On the other hand, cytokines and growth factors highly expressed by anti-inflammatory macrophages such as interleukin-4 (IL-4) \\[[@B52]\\] and insulin-like growth factor-1 (IGF-1) \\[[@B41]\\] were shown to stimulate myoblast differentiation/fusion and myofiber growth. In addition to paracrine factors, the direct physical contact of myogenic cells with macrophages is important to regulate their cell function and fate decision. Accordingly, *in vitro* coculture of macrophages and myogenic cells showed that macrophages have a proproliferative effect through the release of paracrine factors and an antiapoptotic effect by direct physical contact through a set of different adhesion molecules (VCAM1, intercellular adhesion molecule-1 (ICAM-1), platelet endothelial cell adhesion molecule-1 (PECAM-1), and CX3CR1) \\[[@B53], [@B54]\\].\n\nThe critical role of the different subsets of macrophages was also confirmed *in vivo*. It was first observed that in regenerating muscle, proinflammatory macrophages are in close proximity to proliferating satellite cells, while anti-inflammatory macrophages are near to the regenerating area containing differentiated myoblasts \\[[@B21]\\]. Depletion experiments were used to further characterize the role of the different subsets of macrophages *in vivo*. For instance, the depletion of infiltrating Ly6C^hi^ monocytes using genetic models or pharmacological compounds, prolonged the presence of necrotic cells, promoted the accumulation of muscle fat and fibrosis, and impaired the overall muscle regeneration \\[[@B17], [@B41], [@B55], [@B56]\\]. 
On the other hand, the suppression of the ability of macrophages to switch to their anti-inflammatory phenotype, induced by loss-of-function mutations in AMP-activated protein kinase-1 (AMPK*\u03b1*1) \\[[@B57]\\], IGF-1 \\[[@B58]\\], CCAAT/enhancer binding protein-*\u03b2* (CEBPB) \\[[@B59]\\], or peroxisome proliferator-activated receptor-*\u03b3* (PPAR-*\u03b3*) \\[[@B60]\\], was shown to reduce muscle fiber growth, without affecting the removal of necrotic tissue. In turn, models of satellite cell deletion were also shown to have a delayed macrophage transition to the anti-inflammatory phenotype, suggesting that there is a regulatory feedback by which myogenic cells contribute to the phenotypic switch of macrophages \\[[@B61]\\]. Altogether, these *in vitro* and *in vivo* experiments demonstrate that the different subsets of macrophages have complementary roles in the regulation of satellite cell/myoblast function, myogenesis progression, and optimal muscle regeneration. Furthermore, these findings also suggest that the temporal and spatial recruitment of macrophages is crucial to regulate the progression of satellite cells through the myogenesis process. Therefore, disorganized macrophage accumulation could send aberrant signals to satellite cells and impair their myogenesis capacity, which will be further discussed later in this manuscript.\n\n4.3. Macrophages Interact with FAPs to Regulate Muscle Fibrosis {#sec4.3}\n---------------------------------------------------------------\n\nAnother stem cell type, the fibroadipogenic progenitors (FAPs), plays a crucial role in skeletal muscle regeneration. These tissue-resident stem cells can differentiate into fibroblasts or adipocytes. In acute skeletal muscle injury, FAPs support satellite cell activation and differentiation and, retroactively, satellite cells inhibit FAP differentiation into adipocytes \\[[@B62]--[@B64]\\]. 
However, in chronic muscle disorders, FAPs can turn into direct contributors of ectopic fat deposition and formation of fibrotic scars that fail to support satellite cell activity \\[[@B65]\\]. Therefore, FAP activity and accumulation need to be closely regulated. It was shown that FAPs quickly and massively accumulate in the early phase of acute muscle injury, while their number quickly decreases after a few days \\[[@B5]\\]. Interestingly, this decrease in FAP accumulation correlates with the peak of macrophage accumulation \\[[@B2]\\]. It was demonstrated that the infiltration of proinflammatory macrophages is essential to control the accumulation of FAPs, via their secretion of TNF-*\u03b1* that directly stimulates FAP apoptosis \\[[@B5]\\]. Nitric oxide is another factor abundantly produced by proinflammatory macrophages that was shown to inhibit FAP differentiation toward adipocytes *in vitro* and to reduce the deposition of intramuscular fat and connective tissue *in vivo* \\[[@B66]\\]. The absence of monocyte recruitment to the site of injury in CCR2^\u2212/\u2212^ mice or following diphtheria toxin injection to ITGAM-DTR mice impairs FAP clearance and prolongs their presence in the injured muscle leading to abnormal collagen deposition \\[[@B5], [@B67]\\]. On the other hand, anti-inflammatory macrophages release transforming growth factor-*\u03b2* (TGF-*\u03b2*) that promotes FAP survival, which could be important for tissue remodelling during late muscle healing phases. Coculture of fibroblasts with the different subsets of macrophages confirmed that anti-inflammatory macrophages promote fibroblast proliferation and collagen synthesis, while proinflammatory macrophages reduce collagen synthesis and secrete enzymes such as MMP-1 and MMP-3 that degrade ECM \\[[@B68], [@B69]\\]. 
In turn, evidence suggests that a subset of FAPs could also contribute to the phenotypic switch of macrophages \\[[@B67]\\].\n\nOverall, macrophages play a crucial role to control fibrogenic cell accumulation and activity and to regulate muscle fibrosis. Particularly, the sequential accumulation of the different macrophage subsets is decisive in this process to find the delicate balance that not only limits the excessive activity of fibrotic cells and fibrosis deposition but also allows tissue remodelling needed for the return to homeostasis ([Figure 1](#fig1){ref-type=\"fig\"}).\n\n4.4. Macrophages Interact with Endothelial Cells to Regulate Neovascularization {#sec4.4}\n-------------------------------------------------------------------------------\n\nIn steady state, satellite cells reside in close proximity to the blood vessels \\[[@B70]\\]. There is a regulatory cross talk by which satellite cells secrete VEGFA to recruit endothelial cells, which in turn maintain satellite cell quiescence through the notch ligand Dll4 (Delta-like 4) \\[[@B70]\\]. Similarly, the interaction between angiopoietin-1 secreted by the smooth muscle cells and the Tie-2 receptor of the satellite cells was also demonstrated to promote quiescence \\[[@B71]\\]. Following an injury, cells from the blood vessels interact with satellite cells to promote revascularization, which is critical for muscle recovery. Particularly, endothelial cells directly regulate satellite cell growth by secreting various growth factors (IGF-1, hepatocyte growth factor (HGF), basic fibroblast growth factor (bFGF), platelet-derived growth factor (PDGF), and vascular endothelial growth factor (VEGF)), and through a retroactive loop, differentiated myoblasts stimulate angiogenesis \\[[@B72]\\]. 
Similarly, pericytes, which are juxtaposed to capillary endothelial cells, were shown to have myogenic capacities *in vitro* and to stimulate myoblast function and muscle regeneration *in vivo* \\[[@B73]\\].\n\nMacrophages play a central role during muscle regeneration to regulate the function of endothelial cells, which in turn promote the polarization of macrophages to their anti-inflammatory phenotype ([Figure 1](#fig1){ref-type=\"fig\"}) \\[[@B74]\\]. For instance, the depletion of infiltrating monocytes in CCR2^\u2212/\u2212^ mice impairs collateral arteriogenesis after ischemic hindlimb occlusion \\[[@B75]\\]. However, the role of the different subsets of macrophages on vascularization is still debated. The proangiogenic role of tumour-associated macrophages, a distinct subset of anti-inflammatory macrophages \\[[@B76]\\], is well defined; however, the role of the different macrophage subsets in a nontumourigenic environment is variable and dependent on various factors. An *in vitro* study indicated that anti-inflammatory macrophages promote the formation of new blood vessels to a higher level than proinflammatory macrophages \\[[@B77]\\]. Another *in vitro* model showed that proinflammatory macrophages (stimulated with lipopolysaccharide (LPS)\u2009+\u2009interferon-*\u03b3* (IFN-*\u03b3*)) increase the length and number of blood vessel sprouts to a higher level than anti-inflammatory M2a macrophages (induced by IL-4\u2009+\u2009IL-13), but to a lower level than anti-inflammatory M2c macrophages (induced by IL-10) \\[[@B78]\\]. Notably, the anti-inflammatory M2a macrophage subset in these experiments produced higher levels of paracrine factors recruiting pericytes \\[[@B78]\\]. A model of *in vitro* coculture between endothelial cells, myogenic progenitor cells, and macrophages stimulated with IL-4 or IL-10 showed that anti-inflammatory macrophages coordinate angiogenesis and myogenesis in part by the secretion of oncostatin M \\[[@B79]\\]. 
Overall, the subsets of macrophages play several roles that contribute to the different phases of angiogenesis. Accordingly, it was shown that the subsequent incubation of endothelial cells with proinflammatory followed by anti-inflammatory M2a macrophages *in vitro* (to mimic the macrophage phenotype switch observed *in vivo*) enhances the blood vessel network formation \\[[@B78]\\].\n\nNeovascularization was studied *in vivo* with different models of biomaterial implementation. Similar to *in vitro* experiments, the conclusion regarding the roles of the different subsets of macrophages on angiogenesis is variable depending on the experimental design and the outcomes measured. While some studies indicated that anti-inflammatory macrophages are primarily responsible for microvascular network growth and remodelling \\[[@B80]\\], others showed that the vascularization is related to a higher ratio of proinflammatory\u2009:\u2009anti-inflammatory macrophages \\[[@B81]\\]. This discrepancy might be related to the diversity of macrophage phenotypes *in vivo* and to the complex regulatory network between these subsets of macrophages and the numerous cell types involved in angiogenesis. Overall, macrophages play a crucial role in the regulation of muscle revascularization after an injury; however, further studies are needed to delineate the specific impacts of the different subsets of macrophages.\n\n5. Macrophages in Chronic Muscle Disorders {#sec5}\n==========================================\n\nTo mediate their beneficial effects on the different cellular processes involved in skeletal muscle regeneration, the accumulation of the different subsets of macrophages needs to be controlled, transient, and sequential. Disorganization or excessive macrophage activity is a common feature of many chronic conditions, which contributes to tissue degeneration. 
For instance, iron overloading caused by the excessive engulfment of erythrocytes by anti-inflammatory macrophages induces their switch to an unrestrained proinflammatory phenotype, which stimulates chronic inflammation and impairs wound healing \\[[@B82]\\]. Asynchronous muscle injuries (induced by two consecutive traumatic injuries separated by a few days) also perturb the proper course of inflammation leading to the concurrent (nonsequential) accumulation of proinflammatory and anti-inflammatory macrophages in the injured area that increases muscle fibrosis \\[[@B83]\\].\n\nMany muscular diseases are associated with chronic inflammation and impaired muscle regeneration. For instance, skeletal muscles from patients with Pompe disease, which is caused by acid-alpha glucosidase deficiency resulting in lysosomal glycogen accumulation, are subjected to an excessive invasion of proinflammatory macrophages that is correlated with impaired satellite cell differentiation \\[[@B84]\\]. Similarly, dysferlinopathy, another type of progressive myopathy caused by a mutation in the dysferlin gene, is associated with chronic accumulation of macrophages. These macrophages are maintained in a cytodestructive proinflammatory state that promotes myogenic cell apoptosis/necrosis \\[[@B85]\\].\n\nThe most studied form of muscular disorders is DMD, a frequent and severe debilitating disease characterized by progressive muscle weakness resulting in loss of ambulation, respiratory dysfunctions, and premature death. DMD is caused by a mutation in the gene that encodes for dystrophin, a protein important for muscle fiber stability and for satellite cell function \\[[@B86]\\]. Therefore, in the absence of dystrophin the muscles are subjected to repetitive and overlapping cycles of degeneration and regeneration. 
The microenvironment in these dystrophic muscles is characterized by the overactivation of inflammatory pathways such as NF-*\u03ba*B \\[[@B87]\\], increased cell membrane permeability, and abnormal intracellular calcium influx, as well as a deregulated nitric oxide signalling \\[[@B88]\\]. These abnormalities provoke changes in gene expression toward a chronic inflammatory molecular signature characterized by the high expression of molecules associated with cytokine and chemokine signalling, vascular adhesion and permeability, and lymphoid and myeloid markers \\[[@B89]\\]. Particularly, osteopontin is one of the most highly upregulated genes in muscles from mdx mice (mouse model of DMD) and in DMD patients \\[[@B89], [@B90]\\]. Osteopontin is an immunomodulator protein involved in immune cell migration and survival, and its ablation in dystrophic mdx mice was shown to promote the transition of proinflammatory macrophages toward their anti-inflammatory phenotype leading to reduced fibrosis and improved muscle function \\[[@B91]\\]. The chronic inflammatory environment in dystrophic muscles promotes the long-lasting recruitment of neutrophils and monocytes/macrophages, which instead of contributing to tissue clearance through phagocytosis of cell debris, rather stimulate muscle cell lysis \\[[@B92]\\] through their high levels of expression of cytotoxic molecules such as ROS. Accordingly, the depletion of neutrophils \\[[@B93]\\] or monocytes \\[[@B92]\\] reduces the number of necrotic fibers in mdx mice. Interestingly, the ablation of CCR2 in mdx mice not only reduces the number of infiltrating macrophages but also restores the macrophage polarization balance by skewing macrophages to their anti-inflammatory phenotype, which decreases muscle histopathology and increases muscle force \\[[@B94]\\]. 
This beneficial effect was not sustained at long term, potentially due to the local proliferation of resident macrophages that compensate for the lack of infiltrating monocytes \\[[@B94], [@B95]\\].\n\nIn contrast to the self-limited inflammation following acute sterile muscle injury, the conflicting signals sent simultaneously by degenerative and regenerative environments in chronic or excessive muscle injuries impair macrophage polarization. For instance, following a massive injury induced by muscle laceration, macrophages adopt an intermediary phenotype, which was associated with impaired muscle regeneration and persistent collagen deposition \\[[@B96]\\]. Interestingly, in this model, the exogenous transplantation of proinflammatory macrophages in the injured muscle reestablished the polarization state, which resulted in decreased fibrosis and improved muscle healing. Likewise, macrophages expressing high levels of both the proinflammatory macrophage marker iNOS (inducible nitric oxide synthase) and the anti-inflammatory macrophage marker CD206 have been observed in mdx mice \\[[@B94]\\]. In dystrophic muscles, hybrid macrophages expressing high levels of both proinflammatory cytokines (TNF-*\u03b1*) and anti-inflammatory cytokines (TGF-*\u03b2*) showed their inability to reduce the accumulation of FAPs in the injured muscle \\[[@B5]\\]. Similarly, it was shown that the binding of macrophages to excessive fibrinogen deposition in dystrophic muscle stimulates the production of the proinflammatory cytokine IL-1*\u03b2* together with TGF-*\u03b2* \\[[@B97]\\]. These hybrid macrophages, particularly Ly6C^hi^ macrophages expressing high levels of LTBP4 (latent TGF-*\u03b2* binding protein), promote the overexpression of the ECM component by FAPs and fibroblasts, leading to aberrant muscle fibrosis \\[[@B98]\\]. 
Interestingly, therapeutic strategies promoting the switch of macrophages toward the proinflammatory or anti-inflammatory phenotype were demonstrated to reduce fibrosis in dystrophic mice. For instance, blocking TGF-*\u03b2*-induced p38 kinase activation with the tyrosine kinase inhibitor Nilotinib restores the ability of proinflammatory macrophages to induce FAP apoptosis and promote the resolution of fibrosis in mdx mice \\[[@B5]\\]. On the other hand, skewing macrophages toward their anti-inflammatory phenotype by AMPK activation blocks their production of latent-TGF-b1 and reduces fibrosis deposition \\[[@B98]\\]. Overall, the chronic and deregulated macrophage accumulation and polarization observed in dystrophic muscles perturb the inflammatory process, enhance myofiber degeneration, impair myogenesis, and stimulate fibrosis deposition, which contribute to accelerate the progression of the disease ([Figure 1](#fig1){ref-type=\"fig\"}).\n\n6. Macrophages in Muscle Aging {#sec6}\n==============================\n\nAging is associated with progressive degeneration that affects multiple tissues, including skeletal muscles. Progressive loss of muscle mass of approximately 1% to 2% per year is observed beyond the age of 50 \\[[@B99]\\]. In some conditions, aging is also associated with sarcopenia, a phenomenon characterized by progressive and generalized loss of muscle mass and force/function leading to physical disability, poor quality of life, and death. Genome-wide transcription analysis revealed that the expression of inflammatory- and immunology-related genes is particularly affected in skeletal muscle during aging \\[[@B100]\\]. Evidence suggests that altered macrophages during aging impair satellite cell function and muscle regeneration. 
An *in vitro* model showed that conditioned medium collected from old bone marrow-derived macrophages (BMDM) decreased the number of Ki67^+^ myoblasts compared to conditioned medium generated from young BMDM, suggesting a reduction in the ability of macrophages to secrete proproliferative factors during aging \\[[@B101]\\]. In resting muscles of aged mice, an increase in M2a macrophages (CD68^+^CD163^+^) has been observed, which correlates with an increase in skeletal muscle fibrosis \\[[@B102]\\]. Furthermore, the transplantation of bone marrow cells isolated from young mice into aged mice prevented the increase of M2a macrophages and the accumulation of connective tissues in these muscles. In humans, one study comparing young (21-33 years) to elderly subjects (70-81 years) showed that total macrophage density (CD68^+^) is not different between the two groups, but that the gene expression of CD206 is higher in the elderly group, suggesting an increase in the proportion of anti-inflammatory macrophages in aging human skeletal muscle, similar to what has been observed in mice \\[[@B103]\\]. However, another study showed that in elderly subjects (average 71.4 years), there is a decrease in the number of both proinflammatory macrophages (CD11b^+^ cells) and anti-inflammatory macrophages (CD163^+^ cells) when compared to young individuals (average 31.9 years) \\[[@B104]\\]. Notably, both subpopulations of macrophages increase following acute resistance exercise in young adults but not in the elderly, indicating an impaired ability of aged muscle to develop a coordinated inflammatory response. 
Moreover, another study investigating the effect of aging on skeletal muscle macrophages in different conditions (healthy, bed rest, and rehabilitation exercise) showed that elderly individuals (average 66 years old) have fewer proinflammatory macrophages (CD11b^+^CD68^+^) than young individuals (average 23 years old), but a similar number of anti-inflammatory macrophages (CD68^+^CD163^+^), in each condition \\[[@B105]\\]. These studies indicate that the effect of aging on skeletal muscle macrophages in humans is variable depending on the marker used, the population examined, and the condition studied. Overall, we can conclude that the function of macrophages in skeletal muscle homeostasis and regeneration seems to be perturbed during aging; however, further high-quality research is needed to better define these dysfunctions and comprehend the physiopathological mechanisms.\n\n7. Promoting Muscle Regeneration by Modulating the Macrophage Phenotype {#sec7}\n=======================================================================\n\nConsidering the detrimental effect of inflammation in dystrophic muscles, anti-inflammatory drugs are a standard therapeutic approach for many muscle diseases. Accordingly, glucocorticoids are the only drugs that consistently demonstrated efficacy on the preservation of muscle force and ambulatory function in DMD patients \\[[@B106]\\]. Glucocorticoid treatment reduces macrophage accumulation and promotes their switch toward the anti-inflammatory phenotype, which is correlated with reduction of muscle necrosis and preservation of muscle force and function in DMD \\[[@B107], [@B108]\\]. However, glucocorticoids are nonspecific and have many detrimental side effects. Particularly, they stimulate signalling pathways involved in muscle catabolism and indirectly contribute to muscle wasting \\[[@B109]\\]. 
Therefore, novel therapeutic approaches specifically targeting macrophages in order to restore their polarization are a promising avenue for the treatment of DMD ([Figure 2](#fig2){ref-type=\"fig\"}) \\[[@B110]\\].\n\n7.1. Anti-Inflammatory Cytokines and Growth Factors {#sec7.1}\n---------------------------------------------------\n\n### 7.1.1. Interleukin-10 {#sec7.1.1}\n\nIL-10 has been used as an immune-based intervention because of its potential to deactivate proinflammatory macrophages and induce the anti-inflammatory phenotype *in vitro* \\[[@B111]--[@B116]\\]. To determine the role of IL-10 *in vivo*, the regenerative capacity of IL-10-null mice was investigated after hind limb muscle unloading and reloading \\[[@B117]\\]. The authors showed that IL-10 mutant mice exhibit high levels of proinflammatory markers (IL-6 and CCL2), persistent signs of muscle damage, and reduced accumulation of anti-inflammatory macrophages (expressing CD163 and arginase-1), leading to altered muscle regeneration \\[[@B117]\\]. Similarly, ablation of IL-10 expression in 12-week-old dystrophic mice reduces anti-inflammatory M2c macrophage polarization and muscle strength \\[[@B118]\\]. *In vitro* coculture assays revealed that IL-10 does not directly affect myoblast proliferation or differentiation, but rather affects myogenesis indirectly by promoting the transition of macrophages toward their anti-inflammatory M2c phenotype, which favours myoblast differentiation \\[[@B117], [@B118]\\]. Therefore, IL-10 has been considered as a therapeutic target to improve muscle regeneration; however, administration of IL-10 early in the regenerative process leads to the premature differentiation of myoblasts, which reduces fiber size at 7 days postcardiotoxin injury \\[[@B42]\\] and may promote tissue fibrosis \\[[@B118], [@B119]\\].\n\n### 7.1.2. Insulin-Like Growth Factor-1 {#sec7.1.2}\n\nIGF-1 is a key growth factor involved in numerous biological processes. 
During muscle regeneration, IGF-1 was shown to mediate myogenic cell proliferation, differentiation, and survival, and it also plays a crucial role in shaping the macrophage activation state \\[[@B58], [@B120]\\]. During muscle regeneration, IGF-1 is secreted by various cell types, including pro- and anti-inflammatory macrophages, which show a similar level of expression of this growth factor \\[[@B58]\\]. Conditional deletion of the IGF-1 gene in myeloid cells promotes the accumulation of the Ly6C^hi^ proinflammatory monocyte/macrophage phenotype and reduces CD206^+^ anti-inflammatory macrophages during muscle regeneration, leading to increased fat deposition and reduced fiber size at 10 days after cardiotoxin injury \\[[@B58]\\]. Analysis of the transcriptional profile showed that IGF-1 deletion skewed macrophages toward their proinflammatory profile, which indicates that IGF-1 is an autocrine factor regulating macrophage polarization \\[[@B58]\\]. Moreover, IGF-1 is necessary for IL-4-induced transition of macrophages toward their anti-inflammatory phenotype \\[[@B121]\\]. The therapeutic efficacy of IGF-1 injection has been observed by increased fiber size after sterile muscle injury in transgenic mice \\[[@B45], [@B122]\\] and improved muscle strength in old adult mice \\[[@B123]\\]. However, because myeloid cell-derived IGF-1 peaks by 3 days postinjury and then declines to baseline \\[[@B45]\\], the long-term beneficial effect of IGF-1 remains uncertain. Particularly, since IGF-1 is involved in a variety of cellular processes, its exogenous administration could have detrimental side effects. 
For instance, IGF-1 suppresses circulating insulin and growth hormone levels, causing hypoglycemia in humans \\[[@B124]\\], and stimulates human osteogenic sarcomas \\[[@B125]\\].\n\nOverall, anti-inflammatory cytokines and growth factors have a great potential to skew macrophages toward their anti-inflammatory profile, which would be beneficial in chronic degenerative muscle diseases; however, the mitigation of their potential side effects is technically challenging and remains a concern for the development of successful therapeutic approaches ([Table 1](#tab1){ref-type=\"table\"}).\n\n7.2. RNA Silencing {#sec7.2}\n------------------\n\n### 7.2.1. Small Interfering RNA {#sec7.2.1}\n\nThe potential of small interfering RNA (siRNA) has been investigated in many studies to silence proinflammatory markers and adhesion molecules such as TNF-*\u03b1*, VCAM-1, and P-selectins during inflammatory diseases \\[[@B126], [@B127]\\]. The ability of siRNA to promote macrophage skewing toward their anti-inflammatory phenotype has been evaluated in different conditions. For instance, the delivery of siRNA targeting collapsin response mediator protein-2 (CRMP2) through lipidoid nanoparticles resulted in a drastic switch toward the anti-inflammatory macrophage phenotype, which decreased inflammation, fibrosis, and heart failure postmyocardial infarction \\[[@B128]\\]. Similarly, silencing of TIMP-1 in proinflammatory macrophages was shown to raise their proangiogenic capacity to a level similar to that of the anti-inflammatory phenotype \\[[@B77]\\]. Receptors such as toll-like receptor-4 (TLR4) and CCR2 are also potential targets in inflammatory diseases and healthy muscle regeneration. The ablation of these receptors blunts the total macrophage accumulation, which results in reduced inflammation in acute injury and dystrophic mdx mice \\[[@B45], [@B129], [@B130]\\]. 
Genetic or pharmacologic blockage of TLR4, TLR2, or CCR2 in chronic degenerative muscles of mdx mice reduced total macrophage numbers and skewed them toward an anti-inflammatory profile (iNOS^\u2212^CD206^+^), leading to enhanced histopathology and muscle force generation \\[[@B94], [@B129], [@B131]\\]. On the other hand, loss-of-function of TLR2 or CCR2 decreases total monocytes/macrophages during acute muscle injury but fails to polarize macrophages toward an anti-inflammatory phenotype and causes abnormal persistence of necrotic fibers and impaired regeneration \\[[@B44], [@B45], [@B131]\\]. Therefore, the therapeutic strategies aiming at reprograming the macrophage phenotype must be carefully selected for the treatment of chronic degenerative conditions, since the controlled and coordinated accumulation of both pro- and anti-inflammatory phenotypes is essential for optimal muscle healing after an acute injury.\n\n### 7.2.2. Small Noncoding RNA Molecules {#sec7.2.2}\n\nMicroRNAs (miRNAs) are small noncoding RNA molecules containing about 22 nucleotides, which function as posttranscriptional regulators of many genes and cellular processes in an autocrine or paracrine manner. Multiple miRNAs were shown to be involved in macrophage polarization. For instance, miR-9, miR-127, miR-155, and miR-125b were classified as proinflammatory inducers, while miR-124, miR-223, miR-34a, let-7c, miR-132, miR-146a, and miR-125a-5p skewed macrophages toward an anti-inflammatory phenotype \\[[@B132]\\]. The way that miRNAs regulate proinflammatory macrophage phenotypes includes silencing of specific targets such as PPAR-*\u03b4*, B-cell lymphoma 6 (Bcl6), dual-specificity protein phosphatase 1 (Dusp1), signal transducer and activator of transcription-6 (STAT6), C/EBP, suppressors of cytokine signalling-1 (SOCS1), and interferon regulatory factor 4 (IRF4) and stimulating the c-Jun N-terminal kinase (JNK) pathway \\[[@B133]--[@B135]\\]. 
miRNAs that promote anti-inflammatory polarization mechanistically inhibit Notch1, signal-regulatory protein beta-1 (SIRPb1), STAT3, C/EBP-*\u03b4*, and interleukin 1 receptor-associated kinase-1/tumour-necrosis-factor-receptor-associated factor-6 (IRAK1-TRAF6) \\[[@B136]--[@B138]\\]. Transfection of either miR-34a, miR-146a, or miR-132 reduces the levels of proinflammatory-associated markers (iNOS, IL-12) upon LPS challenge \\[[@B136], [@B137]\\] and enhances anti-inflammatory markers \\[[@B137]\\]. Knockout gene strategies of either let-7c or miR-124 also showed an increase in proinflammatory markers (CD86, iNOS, TNF-*\u03b1*, and IL-12), in parallel with a decrease in anti-inflammatory-associated markers (FR-b, CD206, and Ym1) \\[[@B136], [@B139]\\]. Delivering a mixture of miR-1, -133, and -206 after laceration of rat tibialis anterior muscle enhances muscle regeneration and prevents fibrosis formation \\[[@B140]\\]. Although the effect of these myomiRNAs is likely to be mediated through a direct muscle-specific effect, rather than by acting on inflammation, it illustrates the potential of this therapeutic strategy for muscle disorders. However, the use of miRNA as a therapeutic approach is challenging because of inappropriate biodistribution, poor *in vivo* stability, and untoward side effects ([Table 1](#tab1){ref-type=\"table\"}) \\[[@B141]\\].\n\n7.3. NF-*\u03ba*B Inhibitors {#sec7.3}\n-----------------------\n\nNF-*\u03ba*B is a key transcription factor in macrophages that is required for the expression of numerous proinflammatory genes \\[[@B142]\\]. In DMD, the NF-*\u03ba*B pathway is persistently overexpressed in immune cells and skeletal muscle cells \\[[@B143]\\]. Inhibition of this pathway specifically in myeloid cells of dystrophic mdx mice reduced inflammation and muscle necrosis, while its specific deletion in muscle progenitor cells increased myogenesis \\[[@B143]\\]. 
Pharmacological inhibition of this pathway mitigated the disease and improved muscle function in dystrophic mdx mice and in the golden retriever muscular dystrophy dog model \\[[@B143], [@B144]\\]. Therefore, it is a promising therapeutic target for chronic muscle diseases.\n\nHowever, while the NF-*\u03ba*B pathway was initially described as an inflammatory pathway, accumulating evidence indicates that its effect on inflammation is more complex than anticipated \\[[@B145]\\]. A pioneering study showed that the inhibition of NF-*\u03ba*B during the onset of inflammation reduces the inflammatory response, while its inhibition during the resolution of inflammation results in the prolongation of the inflammatory response \\[[@B146]\\]. *In vitro* models have also shown that inhibition of NF-*\u03ba*B impairs the maturation of human monocytes into both pro- and anti-inflammatory macrophages \\[[@B147]\\]. Other models have shown that the effect of the NF-*\u03ba*B pathway varies depending on different factors such as the type of cell and insult. For instance, in a model of bacterial infection, the specific knockout of IKK*\u03b2* (factor involved in the NF-*\u03ba*B pathway) in airway epithelial cells inhibited inflammation; however, its inhibition in myeloid cells promoted the inflammatory response \\[[@B148]\\]. IKK*\u03b2*-deficient macrophages showed increased markers of inflammation and an impaired ability to skew to their anti-inflammatory phenotype, which suggests an important role of NF-*\u03ba*B in the macrophage phenotype switch \\[[@B148]\\]. Overall, while NF-*\u03ba*B inhibitors are attractive compounds for the treatment of chronic muscle disorders, the broad and complex roles of this pathway make it a difficult target for the development of a macrophage-centered therapeutic approach ([Table 1](#tab1){ref-type=\"table\"}) \\[[@B145]\\].\n\n7.4. Nutritional Compounds {#sec7.4}\n--------------------------\n\n### 7.4.1. 
Proteins and Amino Acids {#sec7.4.1}\n\nMany nutritional compounds were shown to regulate the inflammatory process and thus represent a simple therapeutic approach for the treatment of chronic muscle disorders. Cod and shrimp proteins were shown to decrease the density of neutrophils and proinflammatory macrophages, while increasing the anti-inflammatory subset in rat muscles following acute sterile injury \\[[@B149]--[@B151]\\]. The beneficial effects of cod protein on the resolution of inflammation and muscle regeneration after injury were attributable to its high content of arginine, glycine, taurine, and lysine \\[[@B150]\\]. These amino acids have been shown to decrease muscle cell damage in various rodent models of inflammation including endotoxin- and exercise-induced muscle damage by inhibiting the secretion of inflammatory markers, such as TNF-*\u03b1*, IL-1*\u03b2*, IL-6, and PGE~2~, and by reducing COX-2 expression and ROS generation \\[[@B152]--[@B158]\\]. The protective effects of L-arginine on muscle cell membrane integrity in mdx mice were reported to be mediated through a decrease in TNF-*\u03b1*, IL-1*\u03b2*, and IL-6 expression levels \\[[@B153]\\]. Therefore, dietary fish protein rich in arginine, glycine, and taurine represents a safe, inexpensive, and efficient approach for the treatment of inflammatory musculoskeletal diseases ([Table 1](#tab1){ref-type=\"table\"}).\n\n### 7.4.2. Long-Chain Polyunsaturated Fatty Acids {#sec7.4.2}\n\nOmega-3 polyunsaturated fatty acids (PUFA) were shown to have a variety of anti-inflammatory effects such as decreasing adhesion molecules and leukocyte chemotaxis in a variety of inflammatory conditions \\[[@B159]--[@B161]\\]. This effect is partly mediated by their ability to inhibit NF-*\u03ba*B-dependent inflammatory genes and blunt the production of eicosanoids, such as prostaglandins and leukotrienes \\[[@B162]\\]. 
In addition to reducing leukocyte accumulation, PUFA directly target macrophages to inhibit their activation and promote their switch toward their anti-inflammatory phenotype \\[[@B163]\\]. In skeletal muscle, the long-term therapy of omega-3 supplementation to mdx mice reduced proinflammatory markers (TNF-*\u03b1* and NF-*\u03ba*B levels) and improved muscle regeneration \\[[@B164]\\]. Similarly, a diet supplemented with fish oil diminished the signs of inflammation and reduced fibrosis in the diaphragm muscle of old mdx mice \\[[@B165]\\]. Therefore, a diet rich in PUFA represents a simple strategy for the treatment of chronic muscle disorders.\n\n### 7.4.3. Vitamins and Antioxidants {#sec7.4.3}\n\nDifferent studies looked at the role of vitamins to regulate inflammation and macrophage phenotype. So far, retinoic acid (active form of vitamin A), vitamin D3, and vitamin E have been shown to play a role in the functional polarization of macrophages. Using a microarray to scan over 40,000 genes in peritoneal macrophages, it was shown that retinoic acid acts through GATA-6 signalling to change the profile of macrophages, which acquire some markers of the anti-inflammatory profile (e.g., Arg1) but not others (e.g., CD206) \\[[@B166]\\]. These findings show that retinoic acid promotes an anti-inflammatory-oriented profile that is located in a broad spectrum of macrophage polarization states. Retinoic acid was also shown to potentiate the ability of IL-4 to skew macrophages toward their anti-inflammatory phenotype, indicating that macrophage polarization is a result of the complex interaction of various molecular components \\[[@B167]\\]. Skeletal muscle regeneration was delayed in mice deficient in retinoic acid receptor-*\u03b3*, while the treatment of injured wild-type mice with a retinoic acid receptor-*\u03b3* agonist reduced fibrotic/adipose tissue and improved muscle repair \\[[@B168]\\]. 
However, the exact contribution of macrophages in the positive effect of retinoic acid on skeletal muscle repair remains to be determined.\n\nVitamin D3 has an inhibitory role in a plethora of cellular immune processes, including in T cells, by reducing the inclination of Th0 toward Th1 cells, along with a selective reduction of Th1-related cytokines \\[[@B169], [@B170]\\]. Moreover, vitamin D3 was shown to promote Treg development, which plays an important role in driving the M2 macrophage phenotype \\[[@B171]\\]. Vitamin D3 deficiency was shown to impair the maturation of monocytes to macrophages, while vitamin D3 addition increases the expression of macrophage-specific surface antigens. Macrophages treated with vitamin D3 adopt an intermediary phenotype located on the broad spectrum of macrophage polarization, which is characterized by a controlled increase in oxidative burst, chemotaxis, and phagocytosis, together with a decrease in the expression of TLR2/4 and a reduced level of the proinflammatory cytokines TNF-*\u03b1*, IL-1, and IL-6 \\[[@B172]\\].\n\nAs the most abundant lipid soluble chain-breaking antioxidant in cell membranes, vitamin E has been shown to prevent mitochondrial oxidative damage and entrap peroxyl radicals and oxygen species, all of which are putative factors in several human diseases \\[[@B173]\\]. Besides its well-known antioxidant properties, accumulating evidence supports the immunostimulating effects of vitamin E in pathogen-infected subjects through different mechanisms that enhance the Th1-like pattern immune response \\[[@B174]\\]. In conditions with low-grade inflammation (e.g., obesity and aortic lesions), vitamin E appears to suppress infiltrating macrophage accumulation and related cytokines \\[[@B175], [@B176]\\]. Indeed, *\u03b3*-tocopherol, one of the active forms of vitamin E, substantially reduced the recruitment of adipose tissue macrophages in high-fat-fed mice. 
Moreover, LPS-mediated proinflammatory macrophage polarization was reduced in *\u03b3*-tocopherol-treated human adipose tissue with minimal influence on alternative polarization into anti-inflammatory macrophages \\[[@B175]\\].\n\nAltogether, these findings indicate that vitamins are not classical inducers of the anti-inflammatory phenotype, but they rather promote an intermediary phenotype located in the continuum of macrophage polarization. Thus, the contribution of vitamins to the promotion of the pro- or anti-inflammatory phenotype of macrophages is dependent on their combinatory effect with other molecular and cellular components.\n\n7.5. Biomaterials {#sec7.5}\n-----------------\n\nAdvances in bioengineering have led to the development of new implantable medical devices that can be used in regenerative medicine to modulate macrophage response in different tissues, such as skeletal muscles \\[[@B177]\\]. The interface between the biomaterial surface and the tissue initiates cellular events that activate a subsequent signalling cascade of paracrine and autocrine factors in the host tissue. These biomaterials can be either synthetic (biodegradable or nonbiodegradable) or biologic \\[[@B178], [@B179]\\].\n\n### 7.5.1. Biologic Materials {#sec7.5.1}\n\nThese biomaterials include human and porcine skin substitutes, porcine small intestine submucosa, dermal, and other natural substitutes (e.g., collagen, chitosan, silk, and keratin) \\[[@B178], [@B180]\\]. The nature and the age of the source animal have a significant impact on the effect of the transplanted biomaterial. For instance, porcine small intestine submucosa harvested from pigs at different ages revealed that a scaffold isolated from younger animals promotes a dominant anti-inflammatory macrophage response and better muscle regeneration than a scaffold derived from older animals \\[[@B181]\\]. 
The macrophage response also depends on whether the scaffold is implanted in its native form or its cross-linked form (which increases the protein cross-links to improve stability and durability), with the former enhancing the anti-inflammatory phenotype and the latter promoting the proinflammatory phenotype of macrophages \\[[@B182], [@B183]\\].\n\n### 7.5.2. Synthetic Biomaterials {#sec7.5.2}\n\nSynthetic biomaterials such as polyethylene, polyethylene terephthalate, polyacrylamide, perfluoropolyether, and polydioxanone elicit an anti-inflammatory response in macrophages *in vitro* \\[[@B178]\\]. Macrophage response to biomaterials is dependent on many factors including their composition, characteristics (dimension, pore size, and topography), and the quality of the sterilisation \\[[@B178]\\]. The pore size is a critical regulator of macrophage polarization, with a smaller pore size inducing the proinflammatory phenotype of macrophages cultured on perfluoropolyether \\[[@B184]\\], while larger pores induce an anti-inflammatory response \\[[@B185]\\]. In addition to pore size, other factors such as the nature of the material play a significant role in polarizing macrophages, since macrophages cultured on expanded polytetrafluoroethylene and chitosan with large pores show a proinflammatory cytokine profile \\[[@B186], [@B187]\\].\n\n### 7.5.3. Hybrid Biomaterials {#sec7.5.3}\n\nThese biomaterials are derived from both synthetic and biologic materials. For instance, the coating of polypropylene mesh with ECM components (isolated from decellularized porcine skin) was shown to increase the ratio of anti-inflammatory macrophages compared to uncoated polypropylene mesh \\[[@B188]\\]. Moreover, these biomaterials could be used as a carrier for biochemical cues (e.g., cytokines and growth factors) or pharmacological compounds. Therefore, biomaterials could be used as a mixed therapy with other anti-inflammatory-stimulating factors described previously. 
Moreover, biomaterials can also be used as a carrier in cellular transplantation experiments (e.g., for macrophages, satellite cells, or other stem cells). For instance, a tissue engineering strategy showed that a compound containing mesenchymal stem cells and a decellularized ECM scaffold synergistically promoted macrophage polarization toward the M2 phenotype and improved skeletal muscle regeneration in rats \\[[@B189]\\]. Biomaterials were also shown to improve the success of myoblast transplantation; however, the contribution of macrophage polarization in the beneficial impact of these biomaterials is still elusive \\[[@B190]\\].\n\nBiomaterials were also used as a scaffold to increase muscle regeneration. Acellular biological scaffolds were shown to elicit an anti-inflammatory macrophage response resulting in constructive remodelling, while scaffolds containing cellular components were associated with a proinflammatory macrophage response resulting in fibrosis and failed regeneration \\[[@B191]\\]. Biomaterials were also tested as a strategy to improve innervation, vascularization, and myofiber contractility in skeletal muscles \\[[@B192]\\]; however, the potential of biomaterials as a macrophage-centered approach for the treatment of DMD remains to be investigated. Nonetheless, the recent advances in bioengineering open an exciting new therapeutic avenue that could be used in combination with other factors regulating macrophage polarization for the treatment of chronic degenerative muscle disorders ([Table 1](#tab1){ref-type=\"table\"}).\n\n7.6. Macrophage Transplantation {#sec7.6}\n-------------------------------\n\nTransplantation of M2 macrophages is considered a new cell-based therapy for many diseases including Alzheimer disease, diabetes, and peripheral arterial disease \\[[@B193]--[@B195]\\]. 
In a rat model of Alzheimer disease, M2 macrophage transplantation greatly attenuated inflammation and cognitive impairment by skewing endogenous microglial cells toward the M2 phenotype \\[[@B193]\\]. Moreover, systemic administration of peritoneal M2 macrophages enhanced glucose tolerance, prevented rejection, and prolonged the survival time of islet allografts in diabetic mice \\[[@B194]\\]. With regard to skeletal muscle regeneration, it was shown that transplantation of M1-polarized macrophages (LPS/IFN-*\u03b3*) following ischemia-induced muscle injury enhanced the recovery of muscle function, while the administration of nonpolarized macrophages did not \\[[@B196]\\]. Similar results were observed in another model of muscle injury (laceration) \\[[@B96]\\]. Another study demonstrated that early administration of M1-polarized macrophages (IFN-*\u03b3*) reduced fibrosis and improved myofiber size and muscle function, while early administration of M2-polarized macrophages (IL-4/IL-13) improved myofiber size but not muscle force and fibrosis \\[[@B195]\\]. These results indicate that the transplantation of macrophages needs to be carefully timed to improve skeletal muscle regeneration. Notably, the safety and efficacy of macrophage transplantation have already been tested in two clinical studies, showing a significant improvement of motor and cognitive activities in patients with stroke and neurological disorders \\[[@B197], [@B198]\\].\n\nSatellite cell transplantation is also a promising therapeutic avenue to treat different muscle diseases; however, it faces many technical challenges such as poor cell survival and a lack of self-renewal and long-term engraftment. Macrophages represent an attractive approach to improve the success rate of satellite cell transplantation. 
For instance, coinjection of myoblasts with proinflammatory macrophages supported myoblast engraftment by extending their proliferative phase and delaying their differentiation, while coinjection with anti-inflammatory macrophages did not improve myoblast engraftment \\[[@B199]\\]. Altogether, these findings suggest that macrophages are an interesting therapeutic approach, either as a direct therapy or as a cofactor for the transplantation of other cell types. However, macrophage polarization needs to be tightly regulated to optimize muscle regeneration.\n\n8. Conclusion {#sec8}\n=============\n\nMuscle regeneration relies on different stem cell types, especially satellite cells and FAPs. While these cells are the ultimate executors of muscle repair, their activity is regulated and coordinated by neighbouring cells. Particularly, macrophage polarization toward their proinflammatory or anti-inflammatory phenotype has been shown to play key roles in myogenesis and skeletal muscle healing. The novel insights into the field of inflammation have revealed that macrophages span a continuum of polarization states, which evolves depending on intrinsic and extrinsic factors. In chronic degenerative muscle disorders, the abnormal phenotype adopted by macrophages was shown to contribute to this detrimental process. Therefore, new therapeutics targeting macrophage polarization such as cytokines and growth factors, nutritional compounds, RNA silencing, pharmacological drugs, and biomaterials are tested to improve skeletal muscle regeneration. Depending on the type of muscle injury and on the desired therapeutic effect, these strategies could be used to skew macrophage polarization toward the proinflammatory phenotype (e.g., to decrease excessive fibrosis) or toward the anti-inflammatory phenotype (e.g., to dampen inflammation and promote myogenesis). 
Despite some technical challenges, these new strategies have a strong therapeutic potential to mitigate different muscle disorders such as DMD. The recent technological advances combined with our improved comprehension of the role of macrophages in skeletal muscle regeneration and diseases will synergize to develop this promising field of research.\n\nJ.D., T.M., and P.F. are supported by the CHU Sainte-Justine Foundation. N.A.D. is supported by the Canadian Institutes of Health Research, Natural Sciences and Engineering Research Council (NSERC), CHU Sainte-Justine Foundation, Fonds de Recherche Qu\u00e9bec-Sant\u00e9, and Grand D\u00e9fi Pierre Lavoie Foundation. N.A.D. acknowledges the support of the Th\u00e9Cell network and the Stem Cell Network.\n\nConflicts of Interest\n=====================\n\nThe authors declare that they have no conflict of interest.\n\n![Macrophages are central regulators in skeletal muscle regeneration and diseases. In acute muscle injury (a), the inflammatory process is characterized by early accumulation of proinflammatory macrophages, which play a key role in various biological processes involved in muscle regeneration, by regulating fibrosis (FAP apoptosis), myogenesis (satellite cell proliferation), angiogenesis (sprouting), and inflammation (phagocytosis). Thereafter, macrophages switch toward the anti-inflammatory phenotype, which dampens inflammation, stimulates satellite cell/myoblast differentiation, and promotes tissue remodelling. This temporal and coordinated process is essential for optimal muscle healing. In a chronic degenerative muscle (b), the concurrent pro- and anti-inflammatory signals lead to the adoption of an abnormal hybrid phenotype by macrophages, which promote chronic inflammatory cell infiltration, excessive fibrosis, impaired myogenesis, and disorganized blood vessel network.](SCI2019-4761427.001){#fig1}\n\n![Macrophage-centered therapeutic approaches. 
Different strategies were developed to restore a balance in macrophage polarization in chronic degenerative muscle disorders. These strategies include cytokines (e.g., IL-10), nutritional compounds (e.g., PUFA and vitamins), RNA silencing (e.g., miRNA), pharmacological drugs (e.g., glucocorticoids), and biomaterials (synthetic, biological, or mixed). These strategies could be used to skew macrophage polarization toward their pro- or anti-inflammatory phenotype depending on the desired therapeutic effect.](SCI2019-4761427.002){#fig2}\n\n###### \n\nTable showing the pros and cons of the different therapeutic approaches targeting macrophages to improve muscle regeneration and/or mitigate muscle diseases.\n\n Therapeutic approaches Advantages Challenges References\n --------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------------------------- ----------------------------------------------------------------------------------------------------------------- -------------------------------------------------\n Anti-inflammatory cytokines (e.g., IL-10) Endogenous molecules; deactivate proinflammatory macrophages and induce the anti-inflammatory phenotype Short-term effect; nonspecific (could directly impair other cellular processes in skeletal muscle regeneration) \\[[@B115], [@B117]\\]\n Growth factors (e.g., IGF-1) Endogenous molecules; promote macrophage transition to their anti-inflammatory phenotype; promote muscle growth Short-term effect; systemic side effects \\[[@B121], [@B124], [@B125]\\]\n RNA silencing (e.g., miRNA, siRNA) Specifically target genes implicated in chronic inflammation; skewed macrophages toward pro- or anti-inflammatory phenotype Poor stability; inappropriate distribution; off-target side effects; delivery \\[[@B126], [@B127], [@B136], [@B138], [@B200]\\]\n NF-*\u03ba*B inhibitors Dampen inflammation; easy to 
deliver; good stability Nonspecific (could directly impair other cellular processes in skeletal muscle regeneration) \\[[@B143]\\]\n Nutritional compounds (proteins, amino acids, PUFA, vitamins, and antioxidants) Promote macrophage transition; potentiate the effect of other therapies; inexpensive; easy to administer Mild therapeutic effect \\[[@B166], [@B171], [@B172]\\]\n Biomaterials Skewed macrophages toward pro- or anti-inflammatory phenotype; local effects; long-term effects; combination with other therapies Invasive; biocompatibility; risk of contamination; degradation of the biomaterial \\[[@B201]\\]\n Macrophage transplantation Specifically deliver the desired macrophage subset; increase the success rate of satellite cell transplantation Invasive; systemic side effects; expensive; time consuming \\[[@B193]--[@B195]\\]\n\n[^1]: Guest Editor: Fengjuan Lyu\n"], ["Introduction\n============\n\nPersonal health information, such as data generated by sensors or data collected by patients themselves through their diaries, contains important information regarding the people's daily lifestyle. Previous studies have shown that clinicians can use these patient data to provide tailored medical services, especially for patients with chronic diseases \\[[@ref1]-[@ref3]\\], and that 60% of the patients are open to providing real-time access to their self-collected health information \\[[@ref4]\\]. The use of self-collected data is especially relevant for patients with diabetes, because they often have to adhere to complex treatment regimes. If, for example, a patient is treated with insulin, the dosage has to be adjusted in concordance with not only the calorie intake, but also other factors such as physical exercise \\[[@ref5]\\] and undercurrent disease \\[[@ref6]\\]. 
Patients with diabetes and physicians have traditionally relied on analog diaries, but as personal computers and smartphones have become commonplace, there has been an explosive increase in the use of digital diaries and wearables \\[[@ref7],[@ref8]\\]. In addition, several research projects and private companies are providing solutions to allow clinicians to consult data collected by the patients themselves \\[[@ref2],[@ref9]\\]. However, none of these solutions are widely used, mainly because they are proprietary and require specific hardware and software to collect and access the data. This makes it difficult to provide fluid integration between such devices and the physicians' existing tools and constitutes an important barrier of acceptance for the introduction of these types of data \\[[@ref10]\\].\n\nThis paper is part of the \"Full Flow of Health Data Between Patients and Health Care Systems\" project, which focuses on integrating self-collected health data into consultations in Norway using diabetes and Fast Healthcare Interoperability Resources (FHIR) as a case.\n\nMajor health care actors such as Epic Systems Corporation and Cerner propose application programming interfaces relying on FHIR standards \\[[@ref11],[@ref12]\\]. Open source projects such as OpenMRS and Open mHealth also provide access to FHIR resources \\[[@ref13],[@ref14]\\], and studies propose to use FHIR to improve the health care sector \\[[@ref15]\\]. Norwegian electronic health records (EHRs) are currently working on implementing FHIR standards in their respective solutions \\[[@ref16]\\], but none of them are ready to manage FHIR resources today, as they are not able to receive and display FHIR data. 
We therefore provided clinicians with a standalone dashboard (ie, view providing key performance indicators) displaying the patients' self-collected health data to be used as an addition to their current EHR.\n\nEven if self-collected data could be seamlessly integrated, user acceptance is not guaranteed. Patients with diabetes can collect large amounts of data. If the data cannot be presented in an efficient way, it cannot be efficiently comprehended, severely diminishing its usefulness \\[[@ref17]-[@ref20]\\]. Many physicians struggle to obtain an overview of constantly expanding EHRs. The introduction of a potentially large amount of new data that the physicians are not used to utilizing must therefore be handled with great care, as even minor ill-considered implementation details can have a huge negative impact \\[[@ref18]-[@ref20]\\]. Optimal presentation of health data depends on the information needed by the clinicians. There is no optimal way of presenting clinical data, because these needs vary a lot \\[[@ref21]-[@ref25]\\].\n\nThis paper presents the design of a dashboard for displaying the self-collected health data from patients with diabetes and describes how the user interface attempts to meet the clinicians' information needs. Furthermore, the paper presents the prestudy assessment of the dashboard by clinicians.\n\nMethods\n=======\n\nPhases\n------\n\nIn the two main phases of the study, we used different methodologies: iterative dashboard design and prestudy assessment ([Figure 1](#figure1){ref-type=\"fig\"}). The iterative design phase supported the conception and implementation of the dashboard, while the prestudy assessment was used to collect the clinicians' experiences with the developed dashboard as well as their recommendations.\n\nBased on previous studies by the authors \\[[@ref26],[@ref27]\\], we created the first prototype of the dashboard to be used as a first input for the iterative design process. 
The information collected from the studies \\[[@ref26],[@ref27]\\] was used to identify the data required during diabetes consultations and to define the requirements for the graphical user interface (GUI) of the dashboard.\n\nIterative Dashboard Design\n--------------------------\n\nThe development of the dashboard followed a three-step iterative process to approach the following primary objectives: (1) identify the needs of both patients and clinicians regarding information with clinical relevance during a consultation, information suitable to be collected by patients, and how to present the information in the GUI in order to improve its usability during consultations; (2) evaluate early prototypes and propose adjustments; and (3) develop prototypes based on the proposed adjustments identified in step two.\n\nTo achieve these objectives, we organized facilitated workshops, supported by open-ended discussions, to approach specific tasks in rapid development cycles.\n\n![The two main phases of the study, with their components and results.](diabetes_v4i3e14002_fig1){#figure1}\n\nFacilitated Workshops and Open-Ended Discussions\n------------------------------------------------\n\nFacilitated workshops are sessions bringing users, stakeholders, and partners together to define and evaluate product requirements \\[[@ref28]\\].\n\nWe organized two facilitated workshops using a participatory design approach \\[[@ref29]\\] involving four of the authors (AGi, AGr, E\u00c5, and AH), four clinicians (two nurses and two doctors who have worked with patients with diabetes), and two patients with diabetes. The clinicians and patients were recruited by our partner---the University Hospital of North Norway. Different methodologies were used during these workshops, namely, brainstorms, idea storms, and go-rounds, to balance creativity and problem-solving tasks and to reduce the pressure on the patients by allowing everyone to speak in turn. 
The facilitated workshops lasted 3 hours each, and participants were invited to use their own experiences to contribute to the workshops' primary objectives. The majority group decision--making technique was employed during these sessions.\n\nIn addition to the facilitated workshops, we organized a total of 11 sessions with open-ended discussion---three focused on mathematical models to use for medical and statistical calculations and involved two computer scientists; four focused on GUI usability, namely, the information to be displayed, and were attended by one computer scientist and one GUI expert; two focused on a first assessment regarding the medical relevance of the information displayed and were joined by a computer scientist and a general practitioner; and two focused on the evaluation of the dashboard prototype against the requirements and involved four of the authors (AGi, AGr, E\u00c5, and AH).\n\nScenarios\n---------\n\nWe used a simulation-type scenario approach to model real-life situations and narratives \\[[@ref30]\\]. The modelling process relied on a taxonomy containing four elements that were used for each scenario. These elements were as follows:\n\n1. Settings: the context and the situation of the scenario\n\n2. Agents: those who participate in the scenario\n\n3. Goals: the functional targets of the scenario\n\n4. Events: the actions taken by the agents during the scenario\n\nThe detailed information concerning the three main scenarios was defined together with the participants during the first facilitated workshop. 
We chose to use a scenario approach because it facilitates the cooperation of the participants during the facilitated workshops, who can see themselves in the situations and evoke their own experiences, and it simplifies the design process of the dashboard by providing concrete and flexible situations \\[[@ref31]\\].\n\nPrototyping\n-----------\n\nThe prototyping phase consisted of implementing the dashboard to support the given scenarios by using computer-generated data that express the data requirements for the scenarios.\n\nThe dashboard was then built to achieve the objectives described in the scenarios. An agile development process \\[[@ref32]\\] was exclusively used for this task, as evolution, changes, and adaptability were necessary, considering the continuous inputs provided by the workshops. The implementation relied on Java Enterprise Edition 8, Java Server Faces 2.2, and Glassfish 5. The developed prototypes were assessed during the workshops and improved during each iteration of the design process.\n\nOnce the authors and participants in the workshops decided that the dashboard was satisfactory to be used in a real situation, we stopped the iterative design process and selected the last prototype for a prestudy assessment by different clinicians.\n\nPrestudy Assessment of the Dashboard by Clinicians\n--------------------------------------------------\n\n### Protocol\n\nThe design of the prestudy assessment was guided by the Standards for Reporting Qualitative Research checklist to enhance the organization and reporting of this study \\[[@ref33]\\]. 
The aim of the prestudy assessment was to evaluate the pertinence of the functionalities presented in the dashboard GUI and its usability prior to a medical trial.\n\nWe used a case study approach, organizing a total of five workshops in health care offices (hospital and general practitioner \\[GP\\] office), each involving one to four clinicians and one or two researchers, for a total of 14 clinicians. The 14 clinicians were recruited through our partner, the University Hospital of North Norway, or by direct contact initiated by us; none participated in the dashboard design and all are currently participating in the medical trial. External factors (eg, time constraints and unavailability of further participants) limited the number of participants we could include.\n\nDuring the workshops, we presented the FullFlow system, which included the last prototype of the dashboard, by using the self-collected health data from one in-house researcher who has type 1 diabetes (an exemption was obtained from the local ethics committee: Ref 2018/719 \\[[@ref34]\\]), hereafter referred to as Research Patient. We extracted these data from the Research Patient's Diabetes Diary to populate the FullFlow system, using the Diabetes Share Live solution to transmit the data in a way similar to that used in a previous study \\[[@ref27]\\]. The use case presented in the workshops was based on the Research Patient's real-life diabetes data (ie, insulin intake, carbohydrate intake, blood glucose values, physical activities, weight, medication, and personal aims) and was similar to one scenario created in the dashboard design process. 
The Research Patient participated in all workshops, where he could explain the different values displayed in the dashboard and answer questions regarding his lifestyle and the recorded values.\n\n### Data Collection\n\nDuring the workshops, we distributed a paper-based questionnaire to the participants after presenting the system and letting the clinicians test it. We then collected the questionnaires at the end of the session. The first and second authors (AGi and E\u00c5) designed a specific questionnaire based on the System Usability Scale \\[[@ref35]\\] and the Computer System Usability Questionnaire \\[[@ref36]\\].\n\nWe decided to use a custom questionnaire, as the assessment did not permit inclusion of important usability factors due to a lack of clinical context such as patient-clinician relationships. Given that the questionnaire was administered to the participants before the study, we wanted to provide open-ended questions to obtain important feedback for this iterative process before starting the medical trial. The questionnaire contained four questions about the system and the role of the user (eg, nurse):\n\n- Q1a: Do you think the system will be useful during consultation? Q1b: Potential comments.\n\n- Q2a: Would you like to have more information delivered by the FullFlow system? Q2b: Potential comments.\n\n- Q3a: Would you like to remove or hide information currently delivered by the FullFlow system? Q3b: Potential comments.\n\n- Q4: Do you have any feedback you would like to offer?\n\n### Qualitative Analysis\n\nThe first author (AGi) performed a qualitative analysis based on three keywords: *expectation*, *usability*, and *functionality*. In our context, we defined *expectation* as a general belief that positive or negative outcomes could occur in clinical settings by using our proposed system. The use of this term was inspired by the work of Bialosky et al \\[[@ref37]\\]. 
We used the six notions provided by V\u00e1zquez-Garc\u00eda et al \\[[@ref38]\\] to define *usability*: knowability (user can understand, learn, and remember how to use the system), operability (capacity of the system to accommodate users with different needs), efficiency (capacity of the system to produce appropriate results), robustness (capacity of the system to resist error), safety (capacity of the system to avoid risk), and satisfaction (capacity of the system to generate interest in users). We used the definition proposed by Salleh et al \\[[@ref39]\\] to describe *functionality*: a set of functions and their specified properties. We then used the feedback to improve the system before starting the medical trial.\n\nResults\n=======\n\nOverview\n--------\n\nFrom previous studies, we identified eight relevant data types for diabetes consultation---blood pressure, calories, carbohydrates, heart rate, blood glucose, insulin, weight, and physical activity ([Figure 2](#figure2){ref-type=\"fig\"} A)---and relevant medical calculations such as insulin-to-carbohydrate (I:C) ratio and basal insulin to bolus insulin ratio ([Figure 2](#figure2){ref-type=\"fig\"} C). As a requirement for the GUI, we identified the need to present the data in different time frames (per hour, per day, per week, and for the complete period \\[[Figure 2](#figure2){ref-type=\"fig\"} B\\]) and the use of a color scale to illustrate data ranges ([Figure 2](#figure2){ref-type=\"fig\"} D).\n\nIterative Dashboard Design\n--------------------------\n\n### Facilitated Workshops and Open-Ended Discussions\n\nThe first prototype was presented to the participants in the first facilitated workshop. Based on this prototype and their own experiences, the participants suggested improvements to the data, functionalities, and GUI. 
The suggested improvements were translated into requirements and implemented in the prototype presented during the second workshop. The improvements suggested during the second workshop were used as requirements during the development of the final prototype. The requirements identified are summarized in [Table 1](#table1){ref-type=\"table\"}.\n\n### Scenarios\n\nWe created three main scenarios ([Table 2](#table2){ref-type=\"table\"}); this was considered a manageable number of scenarios for the workshops and open-ended discussions while still allowing diversification of the situations.\n\n![First prototype of the FullFlow dashboard system.](diabetes_v4i3e14002_fig2){#figure2}\n\n###### \n\nSummary of the requirements defined based on suggestions from the participants in the facilitated workshops and their description.\n\n ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Requirements Description\n ------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n R1: Displaying data collected by patients At least blood glucose, blood pressure, insulin (bolus/basal), medication, carbohydrates, calories, and physical activity. 
Being able to accept new data types (eg, menstruation, ketones, and polypharmacy) would be a plus.\\\n The system shall inform clinicians if the patients register life goals (eg, what they are focusing on in their daily self-management).\n\n R2: Quantify data collected by patients The system will notify which data have been collected by the patients and quantify them.\n\n R3: Displaying data collection period The system will provide clinicians the length of time during which patients collected their data.\n\n R4: Variabilities in the patients' data values The system will be able to present a variability value for all data types to indicate how much these values diverge.\n\n R5: Medical calculations The system will be able to provide medically relevant information (eg, insulin-to-carbohydrate ratio and insulin sensitivity).\n\n R6: Grading data reliability The system will permit clinicians to know immediately if the data collected by the patients are reliable (ie, worth their time consulting the data).\n\n R7: Hiding eA~1c~^a^ Removing eA~1c~ from the graphical user interface.\n\n R8: Reduce complexity of blood glucose ranges The system will use the simplified (3 levels) blood glucose range.\n\n R9: Consulting all self-collected health data at once The system will present all self-collected health data at once in a graph.\n\n R10: Pattern recognition The system will ease identifying patterns in patients' lifestyle per day, per week, and for the whole period (eg, hyperglycemic events each day after dinner).\n\n R11: Bridge to existing data The system shall provide information clinicians can assess by comparing existing data to the self-collected health data.\n\n R12: Overview of the patients' situations The system will be able to inform clinicians about what the patients struggle with, what they manage, etc.\n\n R13: Visual helper The system will provide information about which data are in and out of range.\n 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n\n^a^eA~1c~: estimated hemoglobin A~1c~.\n\n###### \n\nScenarios created to support the user requirements. Settings: the context and situation of the scenario. Agents: actors in the scenario. Goals: targets of the scenario. Events: the actions taken by the agents during the scenario.\n\n ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Taxonomy Scenario 1 Scenario 2 Scenario 3\n ---------- ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Settings Patient has nightly hypoglycemic events. The patient has an appointment with a diabetes nurse to discuss his situation and therefore collected health data for 1 month prior to the appointment. The patient uses finger pricks and an insulin pen. Patient struggles with carbohydrate counting and always ends up in hyperglycemia after meals, despite using a hybrid closed-loop system (continuous glucose monitor and a pump). Patient also reaches hypoglycemic levels after the insulin action (\"yoyo\" effect). Patient has an appointment with a dietitian after having collected 1 week of data. Patient always has high fasting blood glucose levels, despite being on medication and following cooking courses. Patient has a meeting with his general practitioner after collecting 2 weeks of data.\n\n Agents Patient with type 1 diabetes and diabetes nurse Patient with type 1 diabetes and dietitian Patient with type 2 diabetes and general practitioner\n\n Goals The system should show the hypoglycemic events and identify the nightly trends. The system should show the insulin dosages and the carbohydrate intakes to help the nurse identify possible points of action. The system should show the relationship between meal intakes, insulin-on-board levels, and blood glucose levels. 
The system should show the high glucose situations, the calorie intakes that are above the recommended levels, the patient's lack of physical activity, the high blood pressure, and that the patient sometimes forgets to take his medication.\n\n Events Patient registers, on an average, per day:\\ Patient registers, on an average, per day:\\ Patient registers, on an average, per day:\\\n 10 blood glucose values,4 carbohydrate intakes,6 insulin injections (2 basal, 4 bolus), and10 minutes of physical activity.Nurse discusses the patient's hypoglycemic events with him and consults the data using the FullFlow dashboard. 288 blood glucose values,hourly insulin bolus dosage, and5 carbohydrate intakes.Dietitian discusses with the patient his \"yoyo effect\" after meals and consults the self-collected health data using the FullFlow dashboard. 1 blood glucose value,2 medication intakes, and5 calorie intakes.Patient also has:\\\n 2 weight registrations,1 blood pressure registration, and3 physical activity registrations (\\<10 minutes).General practitioner discusses the situation with the patient and uses the FullFlow system to get an overview of his situation.\n 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n\n### Final Prototype\n\nWe provide an example of the dashboard based on the self-collected health data obtained from the Research Patient, which was similar to the use case presented to clinicians in the preassessment study. The proposed dashboard contains six main sections accessible from a menu displayed at the top of the page ([Figure 3](#figure3){ref-type=\"fig\"}):\n\n1. The *Overview* contains information regarding the data reliability, the data collected, the patients' personal goals, and a list of noticeable events and their potential causes. It is the landing page of a FullFlow report.\n\n2. The *Combined Data* displays all the quantifiable data sent by the patients in combination with the calculated information for the whole period in a unique graph.\n\n3. The *Daily Distribution* distributes all quantifiable data per hour in multiple graphs (one graph per data type).\n\n4. The *Daily Evolution* summarizes the data per day in multiple graphs (one graph per data type).\n\n5. The *Time Period* displays the data for the whole period in multiple graphs (one graph per data type).\n\n6. 
The *Data List* lists all the data collected by the patients in a table.\n\n#### Overview Section\n\nThe Overview section provides a summary of all data collected by the patients and the results of the FullFlow analyses ([Figures 4](#figure4){ref-type=\"fig\"} and [5](#figure5){ref-type=\"fig\"}). The objective of this section is to provide an overview of the patients' situation and the medically related events found to be important to discuss or address, without the need to consult the whole data set. The first data displayed are the time period ([Figure 4](#figure4){ref-type=\"fig\"} A), determined by the first and last FHIR artefacts ordered by date. This addresses the requirement R3 ([Table 1](#table1){ref-type=\"table\"}).\n\n![Dashboard menu.](diabetes_v4i3e14002_fig3){#figure3}\n\n![Overview section, part 1. (A) Title and period of time. (B) Data reliability. (C) Summary of the data. (D) List of all the data collected by the patients. (E) Estimated hemoglobin A1c. (F) Blood glucose summary. (G) Time in range and time out of range for blood glucose registrations. (H) Average daily values of data collected by the patients for the period. (I) Latest values for each type of data collected by patients.](diabetes_v4i3e14002_fig4){#figure4}\n\nThe second dataset displayed is related to the reliability of the patients' self-collected health data ([Figure 4](#figure4){ref-type=\"fig\"} B). A knowledge-based module (KBM) grades the reliability of the self-collected health data based on the presence or absence of registered data, potential errors in data values, inconsistencies between data sources, the number of data registrations, and the regularity of the registrations made by the patients. This service addresses the requirement R6 ([Table 1](#table1){ref-type=\"table\"}) by providing clinicians information about the quality and reliability of data at an early stage of consultation. In this example, the system graded the data as reliable. 
The system provides a list of issues if the data are graded as not reliable. We explained and illustrated this system in a previous article \\[[@ref40]\\].\n\nThe next subsection, the Data Summary ([Figure 4](#figure4){ref-type=\"fig\"} C), first contains a table ([Figure 4](#figure4){ref-type=\"fig\"} D) listing all the patients' self-collected health data with important calculations for diabetes patients, such as insulin sensitivity and the insulin-to-carbohydrate (I:C) ratio, if the data collected permit the calculation of these components. These values are displayed side by side with the ratios submitted by the patients, if available, permitting a simple comparison. The table contains the number of registrations and the average number of registrations per day for all types of data collected. The table also provides the average of all the values as well as the pooled SD per data type (called \"average deviation\" for the clinicians; see Discussion). The pooled SD is calculated using the formula:\n\n![](diabetes_v4i3e14002_fig14.jpg)\n\n\\...where *n*~k~ represents the number of registrations for a day and ![](diabetes_v4i3e14002_fig15.jpg) represents the variance for a day. We applied this approach only to the data types for which it is appropriate (eg, not to blood pressure, for which the system considers only the latest registered value per day). The table also contains specific diabetes rules, such as the 100/85 rule for estimating the insulin sensitivity (also called \"correction factor\") \\[[@ref41]\\] or the 400 rule for estimating the insulin-to-carbohydrate ratio \\[[@ref42]\\]. Patients can also provide this information, and in this case, both collected and calculated values are listed one above the other for easy comparison. 
This table addresses requirements R2, R4, and R5 ([Table 1](#table1){ref-type=\"table\"}).\n\nThe next dataset provided is the estimated hemoglobin A~1c~ (eA~1c~) value ([Figure 4](#figure4){ref-type=\"fig\"} E), calculated from the average blood glucose value of all blood glucose registrations, based on the formula proposed by Nathan et al \\[[@ref43]\\]: *eAG*~mmol/L~=1.59\\* *A*~1c~--2.59, where eAG is the estimated average glucose level in mmol/L and A~1c~ the hemoglobin A~1c~ value. The system calculates the eA~1c~ only if there are at least 3 blood glucose registrations a day and 21 blood glucose registrations in total. This system provides two standards for the eA~1c~ value---National Glycohemoglobin Standardization Program (NGSP; %) and International Federation of Clinical Chemistry and Laboratory Medicine (IFCC) (mmol/mol)---considering that Norway replaced NGSP with IFCC in 2018 \\[[@ref44]\\]. To convert NGSP to the IFCC value \\[[@ref44]\\], we use the following formula:\n\n![](diabetes_v4i3e14002_fig16.jpg)\n\nThis service addresses the requirement R11 ([Table 1](#table1){ref-type=\"table\"}) by providing a possible comparison between self-collected health data and laboratory results. However, it also conflicts with the requirement R7 ([Table 1](#table1){ref-type=\"table\"}). Therefore, we decided to hide these values during the medical trial.\n\nThe blood glucose summary ([Figure 4](#figure4){ref-type=\"fig\"} F) displays the average blood glucose value and the pooled SD (same values as in [Table 1](#table1){ref-type=\"table\"}). The blood glucose values per range ([Figure 4](#figure4){ref-type=\"fig\"} G) display the number of registrations and their percentages per range (low, on target, or high), which are defined as per the standards \\[[@ref45],[@ref46]\\]. 
This addresses requirement R8 ([Table 1](#table1){ref-type=\"table\"}).\n\nThe average daily values ([Figure 4](#figure4){ref-type=\"fig\"} H) display the average of all collected data when appropriate (same values as in [Table 1](#table1){ref-type=\"table\"}). The final dataset displayed is the last value for each data type the patients have registered ([Figure 4](#figure4){ref-type=\"fig\"} I).\n\nFullFlow grades each piece of information presented in [Figure 4](#figure4){ref-type=\"fig\"} E-I and provides four background color states: green, orange, red, and white. These colors have different meanings: green indicates that the value is in the recommended range, orange indicates that the value is slightly above or under the recommended range, red indicates that the value is out of range, and white indicates that a value is not graded because of a lack of standards or that the value depends heavily on context. The visual representation is inspired by the work of Sim et al \\[[@ref47]\\], who used a similar grading system, and the work of Diagliati et al \\[[@ref48]\\], who used traffic lights. This grading addresses requirements R13 and R12 ([Table 1](#table1){ref-type=\"table\"}).\n\n![Overview section, part 2. A: List of personal goals defined in the patients' diary. B: Example of a personal goal. C: List of noticeable events based on the collected data. D: List of events detected organized by type. E: Distribution of event types per day and per hour. F: Example of a noticeable event.](diabetes_v4i3e14002_fig5){#figure5}\n\nNext, the overview section contains personal goals ([Figure 5](#figure5){ref-type=\"fig\"} A) defined by the patients with or without clinician involvement. Personal goals can be measurable (eg, keeping your blood glucose level between 4 and 10 mmol/L \\[[Figure 5](#figure5){ref-type=\"fig\"} B\\]) or nonmeasurable (eg, being more proactive). FullFlow displays the progress and a description for measurable goals. 
Displaying personal goals addresses requirement R1 ([Table 1](#table1){ref-type=\"table\"}).\n\nThe next section provides information about noticeable events ([Figure 5](#figure5){ref-type=\"fig\"} C). Noticeable events are important events that clinicians and patients should address to improve the health situation of the patients. FullFlow identifies them using the KBM in combination with the patients' self-collected health data and statistical calculations. FullFlow first summarizes the noticeable events by displaying the number of occurrences ([Figure 5](#figure5){ref-type=\"fig\"} D) and distributing the events by hour of the day and by day of the week, to potentially identify trends ([Figure 5](#figure5){ref-type=\"fig\"} E). Subsequently, FullFlow displays one event at a time, ordered from the most to the least serious, and provides potential causes and explanations for them ([Figure 5](#figure5){ref-type=\"fig\"} F). This section includes other medical conditions related to blood pressure or sleeping pattern in addition to hypoglycemic and hyperglycemic events shown in [Figure 5](#figure5){ref-type=\"fig\"}. We described the KBM in detail in a previous article \\[[@ref40]\\]. Noticeable events address requirement R12 ([Table 1](#table1){ref-type=\"table\"}).\n\n#### Combined Data Section\n\nThe combined data section presents all the quantifiable data available in FullFlow (self-collected health data and calculations), as shown in [Figure 6](#figure6){ref-type=\"fig\"}. This graph is based on the Highstock library \\[[@ref49]\\] and addresses requirement R9 ([Table 1](#table1){ref-type=\"table\"}).\n\nClinicians can change the timeframe by selecting a start and an end date ([Figure 6](#figure6){ref-type=\"fig\"} B), by selecting a predefined time length such as 3 days or 1 week ([Figure 6](#figure6){ref-type=\"fig\"} A), or by sliding, extending, or narrowing the data range selector ([Figure 6](#figure6){ref-type=\"fig\"} D). 
Clicking on a data type in the lowest part of the graph hides or shows the data type in the center of the graph, allowing clinicians to focus on what they would like to analyze ([Figure 6](#figure6){ref-type=\"fig\"} E). The vertical axes are built automatically (either left or right, [Figure 6](#figure6){ref-type=\"fig\"} C and 6C') depending on the data type available. The frequency of measurements or the data type extracted from Logical Observation Identifiers Names and Codes or the Systematized Nomenclature of Medicine Clinical Terms contained in FHIR artefacts defines the data representation in the graph. Series represent data types having at least 20 registrations per day or being of a specific type, such as blood glucose, while bars represent the rest. Areas represent the reference range of the FHIR artefacts (eg, in-range for blood glucose values) linked to a data type. Hovering the mouse over a point shows the exact time and value for all data types registered at that exact time. We used the OpenAPS approach to calculate the insulin on board (IoB) \\[[@ref50]\\] and the work of Dana Lewis \\[[@ref51]\\] to calculate the carbohydrates on board (CoB).\n\n![Combined data. (A) Period selection by predefined time length. (B) Period selection by dates. (C, C\\') Multiple y-axes. (D) Period selection by range selector. (E) List of all data types represented in the graph.](diabetes_v4i3e14002_fig6){#figure6}\n\n#### Daily Distribution Section\n\nThe daily distribution section distributes all the available data in a single day to help clinicians identify daily patterns (requirement R10 in [Table 1](#table1){ref-type=\"table\"}), such as hypoglycemic events during the night or hyperglycemic events during the afternoon. This section provides one graph per data type; the example ([Figure 7](#figure7){ref-type=\"fig\"}) displays all the blood glucose measurements distributed over a single day. In this example, only finger prick registrations are shown. 
In addition to displaying the data, FullFlow calculates a moving average of all the values. FullFlow uses either a simple moving average or a weighted moving average, depending on the data type and how patients have collected them (see Discussion for more details). This type of graph also contains reference ranges when provided.\n\n#### Daily Evolution Section\n\nThe daily evolution section presents the sum, the average, or the latest value per day for the whole period, depending on the data type ([Figure 8](#figure8){ref-type=\"fig\"}). For instance, blood glucose values are averaged per day, insulin amount values are summed per day, and the latest blood pressure value of each day is used. This type of graph also contains reference ranges when provided. Each data type has its own graph.\n\n#### Time Period Section\n\nThe time period section shows all data available for the whole period by using the same approach as the combined data, except that one graph contains only one data type ([Figure 9](#figure9){ref-type=\"fig\"}).\n\n#### Data List Section\n\nThe data list section presents extracted information from all health data self-collected by patients in a list, without the values calculated by FullFlow, as shown in [Figure 10](#figure10){ref-type=\"fig\"}. The section displays the number of registrations made by the patients and shows the date, data type, value, unit, and comment for each entry. 
Clinicians can order the table by clicking on the head of a column (eg, ordering data per data type) or look up specific registrations using the search field (top right in [Figure 10](#figure10){ref-type=\"fig\"}).\n\nThe different sections in this dashboard permit the display of any type of data collected by the patients and address the requirement R1 ([Table 1](#table1){ref-type=\"table\"}).\n\n![Daily distribution of blood glucose values.](diabetes_v4i3e14002_fig7){#figure7}\n\n![Daily evolution of the blood glucose for the whole period.](diabetes_v4i3e14002_fig8){#figure8}\n\n![Time period for blood glucose values.](diabetes_v4i3e14002_fig9){#figure9}\n\n![Data list section.](diabetes_v4i3e14002_fig10){#figure10}\n\nPrestudy Assessment of the Dashboard by Clinicians\n--------------------------------------------------\n\nThis section presents the assessment of the full system (a combination of the Diabetes Diary \\[[@ref52]\\], Diabetes Share Live \\[[@ref27]\\], and FullFlow) by clinicians, following the approach described in the Methods section. As mentioned in the previous section, the graphical interface was presented without the eA~1c~ value displayed in [Figure 4](#figure4){ref-type=\"fig\"}. [Multimedia Appendix 1](#app1){ref-type=\"supplementary-material\"} contains the transcribed answers to the collected questionnaires. 
The following subsections present the results of the analyses of the questionnaires organized using the taxonomy defined in the Methods section (Data Collection subsection) and concerning the FullFlow system only (the Diabetes Diary and Diabetes Share Live are outside the scope of this study).\n\n### Participants\n\nFourteen clinicians participated in the prestudy assessment: nine (64.3%) were GPs, four (28.6%) were diabetes nurses, and one (7.1%) was a dietitian.\n\n### Pertinence of the Functionalities Provided by the FullFlow System\n\nRegarding the relevance of the functionalities provided by the system, the majority of the participants (9/14, 64.3%) considered them relevant and would like to keep the system in the current state, without adding or removing any functionalities, as shown in [Table 3](#table3){ref-type=\"table\"}. Five (35.7%) participants would have liked to add or remove one or more functionalities in the system. Although the majority of the primary health care personnel (GPs) were satisfied with the information available in the system (7/9 or 77.8% would like to keep the system in its current state, while 2/9 or 22.2% would like to alter it), the situation was less clear for the secondary health care personnel (nurses and dietitian), with three (of 5, 60%) clinicians wanting to adjust functionalities and the other two (40%) not wanting to change the system. 
Regarding functionality alterations, five clinicians proposed 11 points to improve the system and offer more pertinent data ([Figure 11](#figure11){ref-type=\"fig\"}).\n\n###### \n\nClinicians' evaluations of potential required adjustments to FullFlow, categorized by the results of the evaluation (to keep or adjust functionalities) and by clinical role (general practitioner, diabetes nurse, and dietician).\n\n Role General practitioner, n Diabetes nurse, n Dietitian, n Total, n (%)\n ------------------------ ------------------------- ------------------- -------------- --------------\n Adjust functionalities 2 2 1 5 (36)\n Keep functionalities 7 2 0 9 (64)\n\n![Sankey diagram of the functionality adjustments proposed by the clinicians. Each color corresponds to a specific type of adjustment. Orange: new service; lilac: new data type; light green: remove functionality; green: add functionality; dark green: proposed functionality adjustment. The numbers represent the number of times an adjustment was mentioned. BG: blood glucose; IoB: insulin on board; CoB: carbohydrates on board; I:C: insulin to carbohydrate ratio.](diabetes_v4i3e14002_fig11){#figure11}\n\nOf the eleven functionality adjustments proposed, nine (of 11, 81.8%) were related to adding new functionalities and the other two (18.2%) were related to removing functionalities. Proposals for adding new functionalities were divided into two subgroups: new services (n=4) and new data types (n=5). New data types would require adding data types not available in the system when they were presented to the clinicians, while adding new services would mean creation of new functionalities using the data currently available in the system. Of the suggested new data types, insulin type (eg, slow or fast acting) was mentioned twice by clinicians, with the suggestion that it be available in both the overview section and the graphs. The other data types suggested were blood pressure, plasma glucose, and lipids. 
Of the suggested new services, clinicians twice expressed the desire to enter goals and notes directly into the Diabetes Diary of the patients through the FullFlow system. Another clinician requested more detailed blood glucose ranges such as high hypoglycemia in the overview section, and a second suggested displaying I:C values by time of the day (eg, fasting, morning, afternoon, and night). Depending on the situations of the patients, these new data types and services could \"help provide more tailored advice\" and \"facilitate cooperation with the patients,\" according to the clinicians. Of the functionalities suggested for removal, one clinician proposed removing IoB and CoB from the graphical interface, suggesting that \"they will not have time to investigate this data.\"\n\n[Figure 12](#figure12){ref-type=\"fig\"} shows the correlation between the suggested adjustments and clinical roles. The data show that adjustment needs were disjointed between the primary and secondary health care personnel: The former group expressed the need to add blood pressure, plasma glucose, and lipids to the functionalities of the FullFlow system (mentioned once each), while the latter group did not need them. The secondary health care personnel group proposed adjusting the services available in FullFlow, while the GPs focused only on new data types.\n\nThe needs of the dietitian and diabetes nurses intersected, with the proposal of writing goals and notes directly in the Diabetes Diary of the patients via the FullFlow system (mentioned once per group). The nurses proposed recording the insulin type (mentioned twice) and a more detailed blood glucose range (mentioned once) in the FullFlow system. 
The dietitian was the only clinician to suggest removing features from the FullFlow system (IoB and CoB) and displaying the I:C values by time of the day.\n\n### Usability of the FullFlow System\n\nOne clinician pointed out the possibility of the system being time consuming during consultation, which could reduce its efficiency. Querying the robustness of the FullFlow system, one participant noted that insulin and carbohydrate intake times should be matched in the Combined Data graph. A bug resulting in movement of registrations on the time axis (x) when hiding or showing data types ([Figure 6](#figure6){ref-type=\"fig\"} E) was corrected, and registrations having the same time were shown close to each other.\n\n![Matrix presenting the correlation between the suggested adjustments and clinical roles (general practitioner, diabetes nurse, and dietitian). The columns represent the clinical roles and use the same color coding as the previous figures (D/orange: dietitian; N/beige: diabetes nurse; GP/grey: general practitioner). The rows represent the adjustments proposed by the clinicians and follow the same categorization and color coding as the previous figure (light green: remove functionality; beige: new service; lilac: new data type). (-) denotes a proposed functionality removal, while (+) denotes a proposed functionality introduction. The dark grey circles represent a suggested adjustment by a specific clinical role and the number of times it was mentioned by that role. The vertical lines represent logical sets, while the horizontal lines denote the intersections of the logical sets, like a Euler diagram. 
BG: blood glucose; IoB: insulin on board; CoB: carbohydrates on board; I:C: insulin-to-carbohydrate ratio.](diabetes_v4i3e14002_fig12){#figure12}\n\n### Expectations and Summary\n\nAll the participants (14/14, 100%) expected that the presented system---a combination of the Diabetes Diary, Diabetes Share Live and FullFlow---would be useful during their daily consultations. They forecast that the system would be good for all patients, but particularly effective if patients enter enough data regularly in their diaries.\n\nThey predicted that three types of patients would be interested in this solution: (1) patients who are interested in technology and self-management; (2) patients concerned about their diabetes and quality of life; and (3) patients living in remote areas, where the usage of the system could support remote consultations and avoid patients travelling several hours for a single face-to-face consultation. One participant mentioned that several patients already use self-management apps, which would ease the introduction of this system.\n\nOverall, the system was very well received by the participants and they were eager to start using it during consultations. However, the participants mentioned that experience using the system will be needed to validate their expectations and clarify the system's usability and functionality.\n\nDiscussion\n==========\n\nPrincipal Results\n-----------------\n\nThis paper presented a dashboard for displaying patients' self-collected health data during consultations, using diabetes as a case example. The graphical interface was implemented using continuous feedback from clinicians and patients to minimize possible future user resistance by providing relevant information to meet clinicians' needs. 
We limited the potential increase in time consumption due to the usage of this solution by proposing information related to the quality of self-collected health data (identifying whether the data are worth consulting), displaying an overview of the situation of a patient, and identifying important medical events without the need to consult the complete data set.\n\nThe prestudy assessment showed that the solution could be effective during consultations, especially if patients live in remote areas or are interested in either mobile technologies or improving their life conditions. The majority of clinicians were satisfied with the current state of the graphical interface, and all clinicians were eager to start using it.\n\nThe prestudy assessment also showed that the needs of primary and secondary health care personnel are disjointed: GPs do not need the same data and services as diabetes nurses or dietitians. However, due to the limitations of the Diabetes Diary (see below), their wishes cannot be fulfilled.\n\nDashboard Functionalities and Graphical User Interface\n------------------------------------------------------\n\nThe information provided by the KBM module, namely, the grading of the self-collected health data ([Figure 4](#figure4){ref-type=\"fig\"} B), the identification of trends ([Figure 5](#figure5){ref-type=\"fig\"} C-E), and the identification of potential causes of medical events ([Figure 5](#figure5){ref-type=\"fig\"} F), addresses two of the main barriers to the acceptance of self-collected health data in consultations, namely, the distrust of this source of data \\[[@ref53]-[@ref56]\\] and a time increase in consultation.\n\nThe calculations presented in the overview table ([Figure 4](#figure4){ref-type=\"fig\"} D) can facilitate diabetes management \\[[@ref57]-[@ref60]\\] for both patients and clinicians. 
We chose to use a table for representing this information, considering that clinicians are accustomed to using tables for visually representing data, which can surpass graphs in certain conditions \\[[@ref61]\\]. We used a pooled standard deviation for illustrating the variability of each data type, considering that diabetes, as a chronic disease, is a day-to-day management disease and that a routine (ie, less variability of medical values) can improve the condition of patients drastically \\[[@ref62],[@ref63]\\]. For instance, a low glucose variability is more important for diabetes patients than having an in-range hemoglobin A~1c~ for preventing complications \\[[@ref64]\\]. Therefore, providing an indication about how much patients are able to stabilize their blood glucose values during each day is important for them. Although previous studies proposed several methods for measuring glucose variability using SD, coefficient of variability, mean amplitude of glycemic excursion, or continuous overall net glycemic action with CGM, there is a lack of consensus on which method should be used \\[[@ref64],[@ref65]\\]. Moreover, these methods have drawbacks when using self-monitoring blood glucose values due to the lack of a sufficient and regular number of measurements. Since our system uses available data either from CGM, self-monitoring of blood glucose, or a combination of the two, we are looking for a generic model that can work for all types of available data from the patient. Because it is quite optimistic to assume that patients self-register data regularly every day, as doing so reminds them that they are sick \\[[@ref66]\\], we used pooled SDs to weight the average of each day's SD. This weighting gives larger groups (days with more registrations) a proportionally greater effect on the overall estimate of the variability \\[[@ref67]\\] and allows us to increase the robustness of statistical calculations. Clinicians agreed to use this approach. 
Another point to discuss is our decision to use the more accessible term \"average deviation\" instead of \"pooled SD.\" We believe that this term will prevent patients and clinicians from being exposed to mathematical concepts in order to understand the value. However, the complete definition, with the formula and explanations of the term, is presented to users if they hover the mouse over the \"average deviation\" term. Moreover, we expect feedback on this taxonomy from the medical trial.\n\nWe decided to use the eA~1c~ functionality, although its use is contested by some authors \\[[@ref68],[@ref69]\\], to allow clinicians to compare the eA~1c~ with the hemoglobin A~1c~ results of the laboratory tests, since previous studies showed that there is a correlation between the hemoglobin A~1c~ and eA~1c~ values \\[[@ref70]\\]. An important deviation between these two values could indicate a poor quality and reliability of the self-collected health data due to, for example, an insufficient number of registrations per day and can therefore be used as one of the indicators of the quality and reliability of the self-collected health data. Today, due to technical restrictions, the FullFlow system cannot integrate EHRs' data and display the hemoglobin A~1c~ value side by side with the eA~1c~. Clinicians can consult the hemoglobin A~1c~ values in their EHRs and use FullFlow for consulting the eA~1c~. In addition, this approach is used by the American Diabetes Association \\[[@ref71]\\] and MySugr \\[[@ref72]\\] and is cited on the NGSP's website \\[[@ref73]\\]. However, we decided to hide the eA~1c~ value, considering that clinical workers were concerned that this value can confuse patients in Norway. Nonetheless, the system will still collect the value, allowing us to compare the calculated values against the laboratory test results or the hemoglobin A~1c~ values reported through questionnaires, to determine how this approach fits real situations. 
The dashboard containing the eA~1c~ may be of interest to clinicians, patients, researchers, and computer scientists.\n\nRegarding the grading of each piece of information ([Figure 4](#figure4){ref-type=\"fig\"} E-I), the system uses different approaches depending on the type of data. For instance, the FullFlow relies on medical standards given by the Norwegian Directorate for Health \\[[@ref74]\\] and international public entities \\[[@ref75]\\] (eg, hemoglobin A~1c~ values) or values we defined during our workshops (eg, grades for the blood glucose in-range values). Some values are not graded, such as the daily amount of insulin used, because each patient follows tailored insulin therapy, depending on physiological conditions such as weight as well as lifestyle factors such as meal times and physical activity \\[[@ref76]\\].\n\nDisplaying the patients' personal goals in the overview section ([Figure 5](#figure5){ref-type=\"fig\"} A) before the noticeable events will help the patients steer the medical consultation toward what they would like to discuss with their clinicians, as some of them are too shy to interrupt the clinicians directly, according to the feedback collected in the workshops.\n\nThe moving average and weighted moving average used by the daily distribution section ([Figure 7](#figure7){ref-type=\"fig\"}) further facilitate the visual detection of patterns by clinicians, which can be useful for improving patients' lifestyle \\[[@ref77],[@ref78]\\]. We are aware of other types of moving averages such as the exponentially weighted moving average \\[[@ref79]\\] or the Hull moving average for reducing lag \\[[@ref80]\\]. However, we decided to use a simple weighted moving average in the first version of the FullFlow. The decision regarding the usage of a weighted or simple moving average relies on the analysis of the FHIR artefacts. 
For instance, a blood glucose value obtained from a finger prick has twice the weight of a blood glucose value measured with a CGM, considering that finger pricks are more accurate than the CGMs, which require calibration \\[[@ref81]\\]. The window size for calculating the moving average is set to five registrations to suppress the sheer power the CGM readings have over the self-monitoring blood glucose measurements (ie, five registrations maximum are used for calculating one value of the weighted moving average). This fact remains true even though the CGMs are becoming more accurate \\[[@ref82]\\] and some do not require calibration at all \\[[@ref83]\\].\n\nComparison with Previous Studies\n--------------------------------\n\nThe dashboard we proposed differs from others such as MySugr \\[[@ref84]\\], the dashboard of Diagliati et al \\[[@ref48]\\], Carelink by Medtronic \\[[@ref85]\\], the clinical decision system by Sim et al \\[[@ref47]\\], the system proposed by Martinez-Millana et al \\[[@ref86]\\], and the platform proposed by Fico et al \\[[@ref87]\\]. The main differences are listed below:\n\n1. FullFlow does not limit the integration of data to specific companies or types of sensors: finger pricks, CGMs, insulin pens or insulin pumps can all be used by the patients.\n\n2. FullFlow analyzes the data and proposes recommendations regarding potential causes of medical situations.\n\n3. FullFlow provides indicators regarding the reliability of the self-collected health data.\n\n4. FullFlow empowers patients by introducing their personal goals in the medical consultation.\n\nLimitations\n-----------\n\nThe first limitation is the size of the sample for the design and prestudy assessment phases, in which 18 clinicians and 2 patients participated. 
Although the sample did not permit involvement of all types of clinical roles to identify their needs and evaluate the graphical interface according to their preferences, it was sufficient for determining that the dashboard is ready to enter a medical trial.\n\nDuring the prestudy assessment, one of the clinicians mentioned that (s)he was afraid that the system could be time consuming. Although the KBM can, in theory, address this issue, as we explained in a previous article \\[[@ref40]\\], we fear this challenge will greatly impact the medical trial due to the technical solutions chosen.\n\nWe know that the chosen patient platform, the Diabetes Diary, is not the optimum app for all diabetes patients, as it lacks important features such as the insulin type, blood pressure, polypharmacy, and integration into glucometers and physical activity trackers for automatic data transmission. These missing features might result in a degradation of the reliability of the data and experience for the patients as well as for the clinicians, who would like to have access to these missing data, as specified in the Prestudy Assessment section. Moreover, the Diabetes Share Live solution platform, which requires many steps to be performed during consultation for viewing the self-collected health data, could degrade the experience of the users. This platform requires eight steps to share the data: (1) patients open the Diabetes Diary, (2) patients wait for the application to give a unique identification code, (3) clinicians open an Internet Navigator, (4) patients give clinicians the unique code, (5) clinicians enter the code on the Webpage, (6) clinicians choose a time period, (7) patients acknowledge the time period given by the clinicians and select the data they want to share, and (8) clinicians consult the FullFlow.\n\nHowever, the FullFlow system itself is not affected by these limitations and can accept data from several applications and several operating systems. 
For example, while the insulin type will not be displayed during the medical trial (the system displays \"Insulin Unknown\"; [Figure 4](#figure4){ref-type=\"fig\"} D and [4](#figure4){ref-type=\"fig\"} H), the FullFlow differentiates types of insulin and treats them differently when such information is available. [Figure 13](#figure13){ref-type=\"fig\"} shows an example of different insulin types for data collected using the MySugr app, where bolus and basal insulin types are treated as separate entities and combined to calculate the IoB by using different profiles \\[[@ref50]\\]. [Multimedia Appendix 2](#app2){ref-type=\"supplementary-material\"} shows an instance of the dashboard populated with other data types and demonstrates that the system is able to display any FHIR data.\n\nNevertheless, the medical trial will still allow us to conduct research on the relevance of the information displayed, its potential impact on medical services, and the relevance of the KBM. Although the approach and business rules of the KBM are trusted by the clinicians who were involved in its creation, the medical trial will measure trust in the system during its usage, which will depend on the situation of the patients and the data collected by them. The system could also be suitable for remote consultation.\n\nThe last limitation concerns the integration of EHR data into FullFlow, which, while planned, is not yet available. Therefore, FullFlow cannot directly show EHR data, such as hemoglobin A~1c~, and clinicians will have to use both systems during consultations. 
However, while not reaching its maximum potential, FullFlow will still permit the study of the integration of self-collected health data into consultations.\n\nFuture Research\n---------------\n\nThe graphical interface can still be improved in different ways: The table in the Data Summary section could contain information related to the in-range values of each data type and be visually graded like the rest of the overview page (green, orange, red, and white). Shortcuts to the combined graph from a noticeable event could be made, with automatic selection of data to display or hide. It may also be possible to see self-collected health values day by day, with the current day values displayed in a large graph at the top of the page and all other days' values listed under this graph as smaller graphs, one per day; we could also add daily computational glucose variability using SDs to the top of the overall graph.\n\nWe believe that the results from the medical trial, in which clinicians use FullFlow in their daily consultations, are necessary to assess what information is useful to add or remove, before changing the graphical interface. Nevertheless, we believe that the proposed dashboard is a viable temporary solution, and ensuring interoperability of the data using standards and terminologies will allow the independence of the EHRs and permit users to display the information in the ways that benefit them most.\n\nThe graphical interface could also be improved by adding dual signaling for visually impaired people. For instance, the data summary table in the overview section could integrate visual cues, such as equals signs or arrows pointing up or down, to indicate whether values are in range or out of range. 
These signs could be added below the values displayed in circles in the overview section or even used as texture.\n\n![Example of data list and combined data with different types of insulin.](diabetes_v4i3e14002_fig13){#figure13}\n\nIn addition, reports do not contain information regarding the patients themselves (eg, names or identity numbers). This is due to the usage of the Diabetes Diary and Diabetes Share Live. It will not affect the medical trial, given that clinicians and patients use the system in real time together and clinicians can export the reports to their EHRs, where the patient will already be selected. Notably, clinicians would like to write goals or notes directly into the patients' apps using the FullFlow system, which is outside the scope of the study at this stage; we would suggest that patients use their mobile apps themselves to directly create the goals defined in collaboration with their clinicians during consultation.\n\nAlthough the system can read and display any data types as long as they are in an FHIR format, it will use only \"registered\" data types for advanced services (eg, blood glucose, insulin, blood pressure, and menstruation), such as grading data reliability or exploring potential causes of medical events. The registered data types are listed in another article \\[[@ref40]\\]. We plan to add new business rules for new data types in the future, such as lipids (as requested by a clinician) or foot temperature for early detection of injuries due to diabetic neuropathy. 
[Multimedia Appendix 2](#app2){ref-type=\"supplementary-material\"} shows an example of the graphical interface containing lipids as an \"unregistered\" data type and six registered data types.\n\nConclusions\n-----------\n\nThe designed dashboard could ease the introduction of self-collected health data during medical consultation by providing relevant information about the situation of the patients, the reliability of the data, and important medical events without the need to consult the data in detail. Moreover, the designed dashboard could be an effective solution for face-to-face and remote consultations.\n\nA medical trial, started in November 2018, will provide medical context and document user experience and medical outcomes through usage logs, interviews, and surveys and will help us adjust and improve the dashboard in terms of its graphical interface and functionalities. The results are expected at the beginning of 2020.\n\nThe work described in this paper is sponsored by the Norwegian Research Council under the project \"Full Flow of Health Data Between Patients and Health Care Systems\" (number 247974/O70) and is part of the PhD program of the first author, AG. 
We would like to thank all participants of the workshops.\n\nConflicts of Interest: None declared.\n\nTranscribed answers to the collected questionnaires.\n\nExample of the graphical interface with different data types.\n\nCGM\n\n: continuous glucose monitor\n\nCoB\n\n: carbohydrates on board\n\neA~1c~\n\n: estimated HbA~1c~\n\nEHR\n\n: electronic health record\n\nFHIR\n\n: Fast Healthcare Interoperability Resources\n\nGP\n\n: general practitioner\n\nGUI\n\n: graphical user interface\n\nI:C\n\n: insulin-to-carbohydrate ratio\n\nIFCC\n\n: International Federation of Clinical Chemistry and Laboratory Medicine\n\nIoB\n\n: insulin on board\n\nKBM\n\n: knowledge-based module\n\nNGSP\n\n: National Glycohemoglobin Standardization Program\n"], ["Commentary {#section1-1535759719835351}\n==========\n\nThe currently highlighted manuscript represents the latest study from a group that has extensively contributed to the field regarding the mechanisms underlying tumor-associated epilepsy, expanding our knowledge regarding the underlying pathological mechanisms ([Figure 1](#fig1-1535759719835351){ref-type=\"fig\"}). They initially discovered that glioma cells release excitotoxic concentrations of glutamate which acts in an autocrine fashion to increase the proliferation of glioma cells and induces excitotoxicity in the peritumoral region, thereby performing dual roles to facilitate and accommodate tumor growth, respectively (for review see reference de Groot et al^[@bibr1-1535759719835351]^). They have pinpointed the source of glutamate release from glioma cells to the system Xc\u2212 transporter (SXC; for review see reference de Groot et al^[@bibr1-1535759719835351]^). 
This group has also demonstrated that GABAergic signaling in the peritumoral region is compromised resulting from a loss of parvalbumin-positive (PV+) interneurons in the peritumoral region, compromised function of remaining fast-spiking interneurons (FSNs), and a shift in the equilibrium potential for chloride in excitatory neurons, due to a decreased expression of the K+/Cl\u2212 cotransporter.^[@bibr2-1535759719835351]^ Collectively, this group has contributed a large body of work demonstrating a 2-hit model of excessive glutamate release and compromised GABAergic inhibition contributing to tumor-associated epilepsy ([Figure 1](#fig1-1535759719835351){ref-type=\"fig\"}).\n\n![Numerous mechanisms whereby glutamate release from glioma cells influences tumor-associated epilepsy. Glutamate release via reversal of the system xc\u2212 transporter 1) increases the proliferation of glioma cells, 2) induces excitotoxicity in the peritumoral region facilitating tumor expansion, and 3) reduces KCC2 expression in principal neurons causing compromised GABAergic inhibition. Glioma cells also release MMPs leading to 4) degradation of perineuronal nets and 5) loss of fast-spiking interneurons.](10.1177_1535759719835351-fig1){#fig1-1535759719835351}\n\nIn the currently highlighted manuscript, the authors demonstrate that glioma cells release matrix metalloproteinases (MMPs), degrading perineuronal nets (PNNs) surrounding FSNs in the peritumoral region, compromising GABAergic inhibition and contributing to tumor-associated epileptiform activity. These findings propose a mechanism underlying the observed loss in FSNs and the compromised function of remaining FSNs in the peritumoral region. 
In addition, previous studies point to a role for extracellular matrix proteins in establishing the chloride gradient required for effective inhibitory GABAergic signaling,^[@bibr3-1535759719835351]^ which suggests that the degradation of PNNs may also underlie the observed changes in chloride homeostasis in the peritumoral region.\n\nA loss of FSNs has been demonstrated previously in numerous models of brain trauma, including tumor models, ischemia, and traumatic brain injury (TBI). In fact, a degradation of PNNs surrounding PV+ interneurons has also been demonstrated following TBI.^[@bibr4-1535759719835351]^ Perineuronal nets are thought to protect FSNs from oxidative stress and excitotoxicity,^[@bibr5-1535759719835351]^ and thus, the degradation of PNNs increases vulnerability to the loss of FSNs. Despite similarities in the neuropathology in these models, the authors of the current study suggest that release of MMPs from glioma cells, not astrocytes, mediates the loss of FSNs in the peritumoral region, which obviously does not translate to the other models of brain injury. This evidence would suggest that despite the similar neuropathology, there are unique mechanisms contributing to the loss of FSNs in these models of brain injury. To our knowledge, the mechanisms underlying the loss of PNNs in other models of brain injury, such as TBI, remain unknown.\n\nAn alternative interpretation of these results is that the degradation of PNNs surrounding PV+ interneurons is a beneficial mechanism to support the increased plasticity necessary for recovery, as has been proposed in the case of ischemia.^[@bibr6-1535759719835351]^ This theory builds on the well-established role for PNNs in neural plasticity (for review see reference Wang et al^[@bibr7-1535759719835351]^). Perineuronal nets play a role in regulating plasticity during development and into adulthood. 
The opening of PNNs is thought to facilitate synaptic plasticity during critical periods, whereas the establishment of PNNs is thought to stabilize existing connections and limit further plasticity. Thus, it is possible that the degradation of PNNs in the peritumoral region may play a role in homeostatic plasticity to help recovery and the restoration of appropriate synaptic transmission following injury. Although this is an attractive theory, data presented in the currently highlighted manuscript directly demonstrate that degradation of PNNs with chondroitinase ABC (ChABC) increases epileptiform activity. Further, ChABC has been shown to increase seizure susceptibility.^[@bibr8-1535759719835351]^ In addition to the role of PNNs in plasticity, they are also thought to confer protection against oxidative stress^[@bibr5-1535759719835351]^ and, therefore, may also play a role in neuroprotection. As such, disruption of PNNs has been observed under pathological conditions and has been implicated in numerous disorders, ranging from Alzheimer disease to addiction (for review see reference Sorg et al^[@bibr9-1535759719835351]^). The currently highlighted manuscript suggests a role for PNNs in the underlying neuropathology of tumor-associated epilepsy, in which degradation of PNNs surrounding FSNs leaves these neurons vulnerable to neurodegeneration and compromised function in those that remain. Combined, these findings demonstrate that the degradation of PNNs contributes to hyperexcitability rather than supporting recovery of the network following injury.\n\nConsistent with the notion that the degradation of PNNs contributes to hyperexcitability, the authors demonstrate compromised function of remaining FSNs associated with changes in the electrophysiological properties of these neurons, including the inability to sustain their characteristic high firing rate. 
Fast-spiking interneurons in the peritumoral cortex exhibit a higher membrane capacitance, a change which the authors attribute to the loss of the surrounding PNN, since it can be mimicked with ChABC and blocked using an MMP inhibitor. The authors suggest that PNNs subserve a previously unknown function, acting in a similar fashion to oligodendrocytes forming a myelin sheath to decrease the membrane capacitance of FSNs, thereby facilitating the high firing rate of these neurons. This is a provocative idea and supports a novel function for PNNs. However, since this finding is not the primary focus of the current study, it is not given the attention that it deserves. Although it may be a bit premature to ascribe this essential function to PNNs, the evidence presented in the currently highlighted manuscript supports this notion and is certainly worthy of further study.\n\nThe currently highlighted manuscript is the latest study contributing to an extensive body of work from the Sontheimer laboratory characterizing mechanistic changes contributing to tumor-associated epilepsy, summarized in [Figure 1](#fig1-1535759719835351){ref-type=\"fig\"}. This line of research has led to the identification of the SXC, which contributes to glutamate release from glioma cells, contributing to tumor expansion and tumor-associated epilepsy (for review see reference Sontheimer et al^[@bibr10-1535759719835351]^). Sulfasalazine, which blocks SXC, has been shown to be effective in preclinical glioma models and is being pursued in clinical trials for the treatment of glioblastoma. 
In addition, the currently highlighted study also points to defective inhibitory signaling in tumor-associated epilepsy due to the degradation of PNNs and suggests that MMPs may be a useful therapeutic target.\n\nBy Jamie Maguire\n"], ["The copyright line for this article was changed on 6 October 2016 after original online publication.\n\nINTRODUCTION {#prot25007-sec-0001}\n============\n\nMost cellular processes are carried out by physically interacting proteins.[1](#prot25007-bib-0001){ref-type=\"ref\"} Characterizing protein interactions and higher order assemblies is therefore a crucial step in gaining an understanding of how cells function.\n\nRegrettably, protein assemblies are still poorly represented in the Protein Databank (PDB).[2](#prot25007-bib-0002){ref-type=\"ref\"} Determining the structures of such assemblies has so far been hampered by the difficulty in obtaining suitable crystals and diffraction data. But this limitation is being circumvented with the advent of new powerful electron microscopy techniques, which now enable the structure determinations of very large macromolecular assemblies at atomic resolutions.[3](#prot25007-bib-0003){ref-type=\"ref\"}\n\nOn the other hand, the repertoire of individual protein 3D structures has been increasingly filled, thanks to large\u2010scale structural genomics projects such as the PSI (/) and others (/). Given a newly sequenced protein, the odds are high that its 3D structure can be readily extrapolated from structures of related proteins deposited in the PDB.[4](#prot25007-bib-0004){ref-type=\"ref\"}, [5](#prot25007-bib-0005){ref-type=\"ref\"} Moreover, thanks to the recent explosion of the number of available protein sequences, it is now becoming possible to model the structures of individual proteins with increasing accuracy from sequence information alone[6](#prot25007-bib-0006){ref-type=\"ref\"}, [7](#prot25007-bib-0007){ref-type=\"ref\"} as will be highlighted in the CASP11 results in this issue. 
Structures from this increasingly rich repertoire may be used as templates or scaffolds in protein design projects that have useful medical applications.[8](#prot25007-bib-0008){ref-type=\"ref\"}, [9](#prot25007-bib-0009){ref-type=\"ref\"} Larger protein assemblies can be modeled by integrating information on individual structures with various other types of data with the help of hybrid modeling techniques.[10](#prot25007-bib-0010){ref-type=\"ref\"}\n\nComputational approaches play a major role in all these endeavors. Of particular importance are methods for deriving accurate structural models of multiprotein assemblies, starting from the atomic coordinates of the individual components, the so\u2010called \"docking\" algorithms, and the associated energetic criteria for singling out stable binding modes.[11](#prot25007-bib-0011){ref-type=\"ref\"}, [12](#prot25007-bib-0012){ref-type=\"ref\"}, [13](#prot25007-bib-0013){ref-type=\"ref\"}\n\nTaking its inspiration from CASP, the community\u2010wide initiative on the Critical Assessment of Predicted Interactions (CAPRI), established in 2001, has been designed to test the performance of docking algorithms (/). Just as CASP has fostered the development of methods for the prediction of protein structures, CAPRI has played an important role in advancing the field of modeling protein assemblies. Initially focusing on protein--protein docking and scoring procedures, CAPRI has expanded its horizon by including targets representing protein\u2010peptide and protein\u2010nucleic acid complexes. 
It has moreover conducted experiments aimed at evaluating the ability of computational methods to estimate binding affinity of protein--protein complexes[14](#prot25007-bib-0014){ref-type=\"ref\"}, [15](#prot25007-bib-0015){ref-type=\"ref\"}, [16](#prot25007-bib-0016){ref-type=\"ref\"} and to predict the positions of water molecules at the interfaces of protein complexes.[17](#prot25007-bib-0017){ref-type=\"ref\"}\n\nConsidering the importance of macromolecular assemblies, and the new opportunities offered by the recent progress in both experimental and computational techniques to probe and model these assemblies, a better integration of the different computational approaches for modeling macromolecular assemblies and their building blocks was called for. Establishing closer ties between the CASP and CAPRI communities appeared as an important step in this direction, inaugurated by running a joint CASP\u2010CAPRI prediction experiment in the summer of 2014. The results of this experiment were summarized at the CASP11 meeting held in Dec 2014 in Cancun Mexico, and are presented in detail in this report.\n\nThe CASP11\u2010CAPRI experiment, representing CAPRI Round 30, comprised 25 targets for which predictions of protein complexes were assessed. These targets represented a subset of the 100 regular CASP11 targets. This subset comprised only \"easy\" CASP targets, those whose 3D structure could be readily modeled using standard homology modeling techniques. Targets that required more sophisticated approaches (*ab\u2010initio* modeling, or homology modeling using very distantly related templates) were not considered, as the CAPRI community had little experience with these approaches. The vast majority of the targets were homo\u2010oligomers. 
CAPRI groups were given the choice of modeling the subunit structures of these complexes themselves, or using models made available by CASP participants, in time for the docking calculations.\n\nOn average, about 25 CAPRI groups and about 7 CASP groups submitted docking predictions for each target. About 12 CAPRI scorer groups per target participated in the CAPRI scoring experiment, where participants are invited to single out correct models from an ensemble of anonymized predicted complexes generated during the docking experiment.\n\nIn total, these groups submitted \\>9500 models that were assessed against the 3D structures of the corresponding targets. The assessment was performed by the CAPRI assessment team, using the standard CAPRI model quality measures.[18](#prot25007-bib-0018){ref-type=\"ref\"}, [19](#prot25007-bib-0019){ref-type=\"ref\"} A major issue for the assessment, and for the Round as a whole, was the uncertainty in the oligomeric state assignments for a significant number of the targets. For many of these the assigned state at the time of the experiment was inferred solely from the crystal contacts by computational methods, which can be unreliable.\n\nIn presenting the CAPRI Round 30 assessment results here, we highlight this issue and the more general challenge of correctly predicting the association modes of weaker complexes of identical subunits, and those of higher order homo\u2010oligomers. In addition, we examine the influence of the accuracy of the modeled subunits on the performance of the docking and scoring predictions, and evaluate the extent to which docking procedures confer an advantage over standard homology modeling methods in predicting homo\u2010oligomer complexes.\n\nTHE TARGETS {#prot25007-sec-0002}\n===========\n\nThe 25 targets of the joint CASP\u2010CAPRI experiment are listed in Table [1](#prot25007-tbl-0001){ref-type=\"table-wrap\"}. 
Of these 23 are homo\u2010oligomers, with 18 declared to be dimers and five to be tetramers, and two heterocomplexes. Hence for the majority of the targets (23) the goal was to model the interface (or interfaces in the case of tetramers) between identical subunits, whose size varied between 44 and 669 residues but was of \u223c250 residues on average. The majority of the targets were obtained from structural genomics consortia. They represented mainly microbial proteins, whose function was often annotated as putative.\n\n###### \n\nThe CAPRI\u2010CASP11 Targets of CAPRI Round 30\n\n Target ID Contributor Quaternary state Residues Buried area (\u00c5^2^) Protein \n ----------- ------------- ------------------ ------------ ---------------------------------------------- --------- ------ -----------------------------------------------------------------------------------------------------------\n T68 T0759 NSGC **1 or 2** **1** 109 860 Plectin 1 and 2 repeats (HR9083A) of the Human Periplakin\n T69 T0764 JCSG 2 **2** 341 2415 Putative esterase (BDI_1566) from Parabacteroides distasonis\n T70 T0765 JCSG **2** **4** 128 2030 Modulator protein MzrA (KPN_03524) from Klebsiella pneumoniae subsp.\n T71 T0768 JCSG **4** **4** 170 2380 Leucine rich repeat protein (BACCAP_00569) from Bacteroides capillosus ATCC 29799\n T72 T0770 JCSG 2 **2** 488 1120 SusD homolog (BT2259) from Bacteroides thetaiotaomicron\n T73 T0772 JCSG **4** **4** 265 5900 Putative glycosyl hydrolase (BDI_3914) from Parabacteroides distasonis\n T74 T0774 JCSG **1** **4** 379 2040 Hypothetical protein (BVU_2522) from Bacteroides vulgatus\n T75 T0776 JCSG 2 **2** 256 1040 Putative GDSL\u2010like lipase (PARMER_00689) from Parabacteroides merdae (ATCC 43184)\n T77 T0780 JCSG 2 **2** 259 1600 Conserved hypothetical protein (SP_1560) from Streptococcus pneumoniae TIGR4\n T78 T0786 Non\u2010SGI **4** **4** 264 4160 Hypothetical protein (BCE0241) from Bacillus cereus\n T79 T0792 Non\u2010SGI **2** 80 680 
OSKAR\u2010N\n T80 T0801 NPPB **2** 2 376 1960 Sugar aminotransferase WecE from Escherichia coli K\u201012\n T81 T0797 Non\u2010SGI 2 2 44 1070 cGMP\u2010dependent protein kinase II leucine zipper\n T0798 2 2 198 Rab11b protein \n T82 T0805 Non\u2010SGI **2** 2 214 3250 Nitro\u2010reductase rv3368\n T84 T0811 NYSGRC **2** 255 1740 Triose phosphate isomerase\n T85 T0813 NYSGRC **2** **2** 307 4620 Cyclohexadienyl dehydrogenase from Sinorhizobium meliloti in complex with NADP\n T86 T0815 NYSGRC **2** **2** 106 470 Putative polyketide cyclase (protein SMa1630) from Sinorhizobium meliloti\n T87 T0819 NYSGRC **2** **2** 373 3430 Histidinol\u2010phosphate aminotransferase from Sinorhizobium meliloti in complex with pyridoxal\u20105\\'\u2010phosphate\n T88 T0825 Non\u2010SGI **2** **2** 205 1350 WRAP\u20105\n T89 T0840 Non\u2010SGI 1 669 870 RON receptor tyrosine kinase subunit\n T0841 1 253 Macrophage stimulating protein subunit (MSP) \n T90 T0843 MCSG 2 **2** 369 2360 Ats13\n T91 T0847 SGC **1** **2** 176 1320 Human Bj\u2010Tsa\u20109\n T92 T0849 MCSG **2** **2** 240 1900 Glutathione S\u2010transferase domain from Haliangium ochraceum DSM 14365\n T93 T0851 MCSG **2** **2** 456 2680 Cals8 from Micromonospora echinospora (P294S mutant)\n T94 T0852 MCSG **2** **2** 414 1190 APC103154\n\nBold numbers under Quaternary State indicate the oligomeric state assignments available at the time of the prediction experiment; 1 (monomer), 2 (dimer), 4 (tetramer); numbers in regular fonts indicate subsequent assignments collected from the PDB entries for the target structures.\n\nNSGC, Northeast Structural Genomics Consortium; JCSG, Joint Center for Structural Genomics; Non\u2010SGI, Non\u2010SGI research Centers and others; NNPB, NatPro PSI:Biology; NYSGRC, New York Structural Genomics Research Center; MCSG, Midwest Center for Structural Genomics; SGC, Structural Genomics Consortium.\n\nSince it is not uncommon for docking approaches to use information on the symmetry of the complex 
to restrain or filter docking poses, predictors needed to be given reliable information on the biologically/functionally relevant oligomeric state of the target complex to be predicted. While self association between proteins is common, with between 50 and 75% of proteins forming dimers in the cell,[20](#prot25007-bib-0020){ref-type=\"ref\"}, [21](#prot25007-bib-0021){ref-type=\"ref\"} this association depends on the binding affinity between the subunits and on their concentration. Information on the oligomeric state is in principle derived using experimental methods such as gel filtration or small\u2010angle X\u2010ray scattering (SAXS),[22](#prot25007-bib-0022){ref-type=\"ref\"} and is usually communicated by the authors upon submission of the atomic coordinates to the PDB. With a majority of the targets being offered by structural genomics consortia before their coordinates were deposited in the PDB, author\u2010assigned oligomeric states were available to predictors only for a subset (\u223c15) of the targets, and those were often tentative. For the remaining targets, the oligomeric state was inferred from the crystal contacts using the PISA software,[23](#prot25007-bib-0023){ref-type=\"ref\"} which, although a widely used standard in the field, may still yield erroneous assignments in a non\u2010negligible fraction of the cases, as will be shown in this analysis. Such incorrect assignments represented a confounding factor in this CAPRI round, but also made it possible to show that docking calculations may help to correct them.\n\nGLOBAL OVERVIEW OF THE PREDICTION EXPERIMENT {#prot25007-sec-0003}\n============================================\n\nAs in typical CAPRI Rounds, CAPRI predictor groups were provided with the amino\u2010acid sequence of the target protein (for homo\u2010oligomers), or proteins (for heterocomplexes), and with some relevant details about the protein, communicated by the structural biologists. 
Using the sequence information, the groups were then invited to model the 3D structure of the protein or proteins, and to derive the atomic structure of the complex. To help with the homology\u2010modeling task, with which CASP participants are usually more experienced than their CAPRI colleagues, 3D models of individual target proteins predicted by CASP participants were made available to CAPRI groups for use in their docking calculations. A good number of CAPRI groups, but not all, took up this offer.\n\nIn addition to submitting 10 models for each target complex, predictors were invited to upload a set of 100 models. Once all the submissions were completed, the uploaded models were shuffled and made available to all groups as part of the CAPRI scoring experiment. The \"scorer\" groups were in turn invited to evaluate the ensemble of uploaded models using the scoring function of their choice, and submit their own 10 best ranking ones. The typical timelines per target were about 3 weeks for the homology modeling and docking predictions, and 3 days for the scoring experiment.\n\nTable [2](#prot25007-tbl-0002){ref-type=\"table-wrap\"} lists for each target the number of groups submitting predictions and the number of models assessed. On average \u223c25 CAPRI groups submitted a total of \u223c230 models per target, and an average of 12 scorer groups submitted a total of \u223c120 models per target. With the exception of three targets, an average of seven groups registered with CASP submitted a total of anywhere between 1 and 33 models for individual targets. 
CASP predictors participated in larger numbers in the prediction of T88 (T0825) and of the heterocomplexes (T89 -- T0840/T0841 and T81 -- T0797/T0798), where the CASP targets were defined as the oligomeric structures.\n\n###### \n\nCAPRI Round 30 Experiment Statistics\n\n ------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Number of groups Number of models \n ----- -------- ------------------------------------------ --- ---------------------------------------------------- ------------------ ---- ---- ----- ------ ----- -----\n T68 T0759 4q28 2 23 10 12 3 221 1000 120 7\n\n T69 T0764 4q34 2 28 10 14 7 266 1000 132 17\n\n T70 T0765 4pwu 2 23 8 13 5 221 710 130 18\n\n T71 T0768 4oju 3 22 9 14 1 214 810 131 1\n\n T72 T0770 4q69 3 25 11 13 4 244 914 130 11\n\n T73 T0772 4qhz 2 23 11 11 7 221 1195 110 16\n\n T74 T0774 4qb7 2 22 11 10 7 202 911 96 11\n\n T75 T0776 4q9a 1 26 12 12 8 253 840 120 21\n\n T76 T0779 *Cancelled -- no structure* \n\n T77 T0780 4qdy 4 24 12 12 6 229 971 120 12\n\n T78 T0786 4qvu 2 24 10 11 5 229 818 110 15\n\n T79 T0792 5a49 3 25 11 12 9 242 900 120 23\n\n T80 T0801 4piw 1 27 10 12 8 264 911 120 27\n\n T81 T0797\\ 4ojk 1 23 9 11 20 218 641 110 64\n T0798 \n\n T82 T0805 [b](#prot25007-note-0005){ref-type=\"fn\"} 1 25 10 12 9 242 911 120 27\n\n T83 T0809 *Cancelled -- article from different group online* \n\n T84 T0811 [b](#prot25007-note-0005){ref-type=\"fn\"} 1 25 10 12 10 241 910 120 28\n\n T85 T0813 4wji 1 25 11 12 8 241 920 120 21\n\n T86 T0815 4u13 2 26 11 12 9 251 1010 119 25\n\n T87 T0819 4wbt 1 24 10 12 9 231 894 120 25\n\n T88 T0825 [b](#prot25007-note-0005){ref-type=\"fn\"} 1 27 10 13 18 261 910 130 62\n\n T89 T0840\\ [b](#prot25007-note-0005){ref-type=\"fn\"} 1 22 9 11 55 211 790 110 243\n T0841 \n\n T90 T0843 4xau 1 23 9 11 9 221 811 110 28\n\n T91 T0847 4urj 1 25 9 11 9 242 798 110 24\n\n T92 T0849 4w66 1 23 9 11 9 225 
789 110 33\n\n T93 T0851 4wb1 1 22 9 11 8 213 697 110 27\n\n T94 T0852 4w9r 1 22 9 12 8 215 783 120 21\n ------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n\nThe number of groups corresponds to registered groups that effectively submitted models for the respective target. The number of models represents submitted models, regardless of quality and includes disqualified models. CAPRI groups are allowed to submit no more than their 10 best models, whereas CASP groups are allowed to submit no more than their 5 best models.\n\nNumber of interfaces assessed.\n\nNot yet released.\n\nTable [2](#prot25007-tbl-0002){ref-type=\"table-wrap\"} also lists the uploader groups and the models that they make available for the scoring experiment (100 models per target per uploader group). As detailed above, the uploaded models are complexes output by the docking calculations carried out by individual participants for a given target. Models, uploaded by the different groups, are anonymized, shuffled, and made available to groups solely interested in testing their scoring functions.\n\nSYNOPSIS OF THE PREDICTION METHODS {#prot25007-sec-0004}\n==================================\n\nRound 30 participants used a wide range of modeling methods and software tools to generate the submitted models. In addition, the approaches used by a given group often differed across targets. Here, we provide only a short synopsis of the main methodological approaches. For a more detailed description of the methods and modeling strategies, readers are referred to the extended Methods Abstracts provided by individual participants (see Supporting Information Table S6).\n\nTemplates, representing known structures of homologs to a given target, stored in the PDB, were used in a number of ways. Most commonly, they were employed to model the 3D structures of individual subunits. 
Some CAPRI participants selected their own templates and used a variety of custom built or well\u2010established algorithms such as Modeller,[24](#prot25007-bib-0024){ref-type=\"ref\"} Swiss\u2010Model,[25](#prot25007-bib-0025){ref-type=\"ref\"} or ROSETTA,[26](#prot25007-bib-0026){ref-type=\"ref\"} to model the subunit structures. Others used the models produced by various servers participating in the CASP11 experiment and made available to CAPRI groups, or servers of other groups (HADDOCK[27](#prot25007-bib-0027){ref-type=\"ref\"}). The quality of the CASP server models was usually first assessed using various criteria and the best quality models were selected for the docking calculations. Some groups selected a single best model for a given target, whereas others used several models (sometimes up to five models). Several groups additionally used loop modeling to adjust the conformation of loop regions, and subjected the subunit models to energy refinement.\n\nThe majority of CAPRI participants used protein docking and scoring methods to generate and rank candidate complexes. Many employed their own docking methods, some of which were designed to handle symmetric assemblies, whereas others relied on well\u2010established docking algorithms such as HEX,[28](#prot25007-bib-0028){ref-type=\"ref\"} ZDock,[29](#prot25007-bib-0029){ref-type=\"ref\"} RosettaDock,[30](#prot25007-bib-0030){ref-type=\"ref\"} as well as on docking programs such as MZDock[31](#prot25007-bib-0031){ref-type=\"ref\"} which apply symmetry constraints.\n\nWhen templates were available for a given target (mostly for homodimers), some participants used the information from these templates (consensus interface residues, contacts, or relative arrangement of subunits) to guide the docking calculations or to select docking solutions. 
Others used the dimeric templates directly to model the target dimer (template\u2010based \"docking\"[32](#prot25007-bib-0032){ref-type=\"ref\"}, [33](#prot25007-bib-0033){ref-type=\"ref\"}, [34](#prot25007-bib-0034){ref-type=\"ref\"}). Fewer than a handful of groups employed template\u2010based modeling alone for all or most of the targets.\n\nTo model tetrameric targets, most groups proceeded in two steps. They used either known dimeric homologs, or docking methods to build the dimer portion of the tetramer, and then ran their docking procedures to generate a dimer\u2010of\u2010dimers, representing the predicted tetramer.\n\nASSESSMENT PROCEDURES AND CRITERIA {#prot25007-sec-0005}\n==================================\n\nThe standard CAPRI assessment protocol {#prot25007-sec-0006}\n--------------------------------------\n\nThe predicted homo and heterocomplexes were assessed by the CAPRI assessment team, using the standard CAPRI assessment protocol, which evaluates the correspondence between the predicted complex and the target structure.[18](#prot25007-bib-0018){ref-type=\"ref\"}, [19](#prot25007-bib-0019){ref-type=\"ref\"}\n\nThis protocol (summarized in Fig. [1](#prot25007-fig-0001){ref-type=\"fig\"}) first defines the set of residues common to all the submitted models and the target, so as to enable the comparison of residue\u2010dependent quantities, such as the root mean square deviation (rmsd) of the models versus the target structure. Models where the sequence identity to the target is too low are not assessed. The threshold is determined on a per\u2010target basis, but is typically set to 70%.\n\n![Schematic illustration of the CAPRI assessment criteria. The following quantities were computed for each target: (1) all the residue\u2010residue contacts between the Receptor (R) and the Ligand (L), and (2) the residues contributing to the interface of each of the components of the complex. 
Interface residues were defined on the basis of their contribution to the interface area, as described in references.[18](#prot25007-bib-0018){ref-type=\"ref\"}, [19](#prot25007-bib-0019){ref-type=\"ref\"} For each submitted model the following quantities were computed: the fractions *f(nat)* of native and *f(non\u2010nat)* of non\u2010native contacts in the predicted interface; the root mean square displacement (rmsd) of the backbone atoms of the ligand (*L\u2010rms*), the mis\u2010orientation angle *\u03b8* ~L~ and the residual displacement *d* ~L~ of the ligand center of mass, after the receptor in the model and experimental structures were optimally superimposed. In addition we computed *I\u2010rms*, the rmsd of the backbone atoms of all interface residues after they have been optimally superimposed. Here the interface residues were defined less stringently on the basis of residue\u2010residue contacts (see Refs. [18](#prot25007-bib-0018){ref-type=\"ref\"}, [19](#prot25007-bib-0019){ref-type=\"ref\"}).](PROT-84-323-g001){#prot25007-fig-0001}\n\nThe set of common residues is used to evaluate the two main rmsd\u2010based quantities used in the assessment: the ligand rmsd (*L\u2010rms*) and the interface rmsd (*I\u2010rms*). *L\u2010rms* is the backbone rmsd over the common set of ligand residues after a structural superposition of the receptor. *I\u2010rms* is the backbone rmsd calculated over the common set of interface residues after a structural superposition of these residues. An interface residue is defined as such when any of its atoms (hydrogens excluded) is found within 10 \u00c5 of any of the atoms of the binding partner.\n\nAn important third quantity whereby models are assessed is *f(nat),* representing the fraction of native contacts in the target that are reproduced in the model. This quantity takes all the protein residues into account. A ligand\u2010receptor contact is defined as any pair of ligand\u2010receptor atoms within 5 \u00c5 distance. 
Atomic contacts below 3 \u00c5 are considered as clashes; predictions with too many clashes are disqualified. The clash threshold varies with the target and is defined as the average number of clashes in the set of predictions plus two standard deviations. The quantities *f(nat), L\u2010rms* and *I\u2010rms* together determine the quality of a predicted model, and based on those three parameters models are ranked into four categories: High quality, medium quality, acceptable quality and incorrect, as summarized in Table [3](#prot25007-tbl-0003){ref-type=\"table-wrap\"}.\n\n###### \n\nSummary of CAPRI Criteria for Ranking Predicted Complexes\n\n Score *f*(nat) L\u2010rms I\u2010rms\n -------- ------------ ---------- ---------------- ----- ---------------\n \\*\\*\\* High \u2265 0.5 \u2264 1.0 OR \u2264 1.0\n \\*\\* Medium \u2265 0.3 \\< 1.0--5.0\\] OR \\< 1.0--2.0\\]\n \\* Acceptable \u2265 0.1 \\< 5.0--10.0\\] OR \\< 2.0--4.0\\]\n Incorrect \\< 0.1 \\> 10.0 AND \\> 4.0\n\nApplying the CAPRI assessment protocol to homo\u2010oligomers {#prot25007-sec-0007}\n--------------------------------------------------------\n\nEvaluating models of homo and heteroprotein complexes against the corresponding target structure is a well\u2010defined problem when the target complex is unambiguously defined, for example, if the target association mode and corresponding interface represents the biologically relevant unit. This is usually, although not always, the case for binary heterocomplexes, but was not the situation encountered in this experiment for the homo\u2010oligomer targets. All except two of the 25 targets for which predictions were evaluated here represent homo\u2010oligomers. 
For about half of these targets the oligomeric state was deemed unreliable, as it was either only inferred computationally from the crystal structure using the PISA software[23](#prot25007-bib-0023){ref-type=\"ref\"} or because the authors\\' assignment and inferred oligomeric states, although available, were inconsistent (Table [1](#prot25007-tbl-0001){ref-type=\"table-wrap\"}). Only about 15 targets had an oligomeric state assigned by the authors at the time of the experiment.\n\nTo address this problem in the assessment, the PISA software was used to generate all the crystal contacts for each target and to compute the corresponding interface areas. The interfaces were then ranked according to size of the interface. In candidate dimer targets, submitted models were usually evaluated against 1 or 2 of the largest interfaces of the target, and acceptable or better models for any or all of these interfaces were tallied. For candidate tetramer targets, the relevant largest interfaces for each target were identified in the crystal structure, and predicted models were evaluated by comparing in turn each pair of interacting subunits in the model to each of the relevant pairs of interacting subunits in the target (Supporting Information Fig. S1), and again the best predicted interfaces were retained for the tally. 
One of the two bona fide heterocomplexes was also evaluated against multiple interfaces.\n\nEvaluating the accuracy of the 3D models of individual subunits {#prot25007-sec-0008}\n---------------------------------------------------------------\n\nSince this experiment was a close collaboration between CAPRI and CASP, the quality of the 3D models of individual subunits in the predicted complexes was assessed by the CASP team using the LGA program,[35](#prot25007-bib-0035){ref-type=\"ref\"} which is the basic tool for model/target comparison in CASP.[36](#prot25007-bib-0036){ref-type=\"ref\"}, [37](#prot25007-bib-0037){ref-type=\"ref\"} The tool can be run in two evaluation modes. In the sequence\u2010dependent mode, the algorithm assumes that each residue in the model corresponds to a residue with the same number in the target, while in the sequence\u2010independent mode this restriction is not applied. The program searches for optimal superimpositions between two structures at different distance cutoffs and returns two main accuracy scores: GDT_TS and LGA_S. The GDT_TS score is calculated in the sequence\u2010dependent mode and represents the average percentage of residues that are in close proximity in two structures optimally superimposed using four selected distance cutoffs (see Ref. [38](#prot25007-bib-0038){ref-type=\"ref\"} for details). The LGA_S score is calculated in both evaluation modes and represents a weighted sum of the auxiliary LCS and GDT scores from the superimpositions built for the full set of distance cutoffs (see Ref. [35](#prot25007-bib-0035){ref-type=\"ref\"} for details). We have run the evaluation in both modes, but since the CAPRI submission format permits different residue numbering, we used the LGA_S score from the sequence\u2010independent analysis as the main measure of the subunit accuracy assessment. This score is expressed on a scale from 0 to 100, with 100 representing a model that perfectly fits the target. 
The rmsd values for subunit models cited throughout the text are those computed by LGA software. We verified that for about 80% of the assessed models the GDT\u2010TS and LGA\u2010S scores differed by \\<15 units, indicating that these models correspond to near identical structural alignments with the corresponding targets, in line with the fact that the majority of the targets of this Round represent proteins that could be readily modeled by homology. Of the remaining 20% with larger differences between the 2 scores, 18% correspond to disqualified models or incorrect complexes and 2% correspond to acceptable (or higher quality) predicted complexes. Their impact on the analysis is therefore negligible.\n\nBuilding target models based on the best available templates {#prot25007-sec-0009}\n------------------------------------------------------------\n\nIn order to better estimate the added value of protein docking procedures and template\u2010based modeling techniques it seemed of interest to build a baseline against which the different approaches could be benchmarked. To this end, the best oligomeric structure template for each target available at the time of the predictions was identified. Based on this template, the target model was built using a standard modeling procedure, and the quality of this model was assessed using the CAPRI evaluation criteria described above.\n\nTo identify the templates, the protein structure database \"PDB70\" containing proteins of mutual sequence identity \u226470% was downloaded from HHsuite.[39](#prot25007-bib-0039){ref-type=\"ref\"} The database was updated twice during the experiment (See Supporting Information Table S5 for the release date of the database used for each target). 
Only homo\u2010complexes were considered for this analysis.\n\nThe best available templates were detected in three different ways and target models were generated from the templates as follows: (1) *Detection based on sequence information alone*: For each target sequence, proteins related to the target were searched for in the protein structure database by HHsearch[40](#prot25007-bib-0040){ref-type=\"ref\"} in the local alignment mode with the Viterbi algorithm.[41](#prot25007-bib-0041){ref-type=\"ref\"} Among the top 100 entries, up to 10 proteins that are in the desired oligomer state were selected as templates. When more than two assembly structures with different interfaces were identified, the best ranking one was selected as template. The target and template sequences were aligned using HHalign[40](#prot25007-bib-0040){ref-type=\"ref\"} in the global alignment mode with the maximum accuracy algorithm. Based on the sequence alignments, oligomer models were built using MODELLER.[42](#prot25007-bib-0042){ref-type=\"ref\"} The model with the lowest MODELLER energy out of 10 models was selected for further analysis. (2) *Detection based on the experimental monomer structure*: Proteins with highest structural similarity to the experimental monomer structure were searched for using TM\u2010align.[43](#prot25007-bib-0043){ref-type=\"ref\"} Among the top 100 entries, up to 10 proteins that are in the desired oligomer state were selected as templates as described above. Based on the target\u2010template alignments output by TM\u2010align, models were built using MODELLER, and the lowest energy model was selected as described above. (3) *Detection based on the experimental oligomer structure*: A procedure similar to those described above was applied, except that the best templates were identified by searching for proteins with the highest structural similarity to the target oligomer structure. 
The search was performed using the multimeric structure alignment tool MM\u2010align.[44](#prot25007-bib-0044){ref-type=\"ref\"} For computational efficiency, MM\u2010align was applied only to the 100 proteins with the highest monomer structure similarity to the target. Models were built using MODELLER based on the alignment output by MM\u2010align.\n\nRESULTS {#prot25007-sec-0010}\n=======\n\nThis section is divided into three parts. The first part presents the prediction results for the 25 individual targets for which the docking and scoring experiments were conducted. In the second part, we present an overview of the results across targets and across predictor and scorer groups, respectively. In the third part, we review the accuracy of the models of individual subunits in the predicted oligomers, and how this accuracy influences the performance of docking procedures.\n\nPrediction results for individual targets {#prot25007-sec-0011}\n-----------------------------------------\n\n### Easy homodimer targets: T69, T75, T80, T82, T84, T85, T87, T90, T91, T92, T93, T94 {#prot25007-sec-0012}\n\nThe 12 targets in this category comprised some of the largest subunits of the entire evaluated target set, with sizes ranging between 176 and 456 residues. Four of the targets were multi\u2010domain proteins (T85, T87, T90, and T93), and one (T82) was an intertwined dimer.\n\nIn the following, we present examples of the performance achieved for this category of targets. Detailed results for all the targets of Round 30 can be found in the Supporting Information Table S2, and on the CAPRI website (URL: /).\n\nAn illustrative example of the average performance obtained for this category of targets is that obtained for target **T69 (T0764)**: a 341\u2010residue putative esterase (BDI_1566) from *Parabacteroides distasonis*. 
The submitted models for this target were evaluated against two interfaces in the crystal structure of this protein, generated by applying the crystallographic symmetry operations listed in the Supporting Information Table S1, and depicted in Figure [2](#prot25007-fig-0002){ref-type=\"fig\"}(a): one large interface (2415 \u00c5^2^) and a smaller interface (622 \u00c5^2^). Good prediction results were obtained only for interface 1. Twenty\u2010eight CAPRI predictor groups submitted a total of 266 models for this homodimer. Of these, 30 were of acceptable quality and 57 were of medium quality. Twelve predictor groups and three docking servers submitted at least one model of acceptable quality or better. Among those, nine groups and one server (CLUSPRO) submitted at least one medium quality model. The best performance (10 medium quality models) was obtained by the groups of Seok, Lee and Guerois, followed closely by the groups of Zou, Shen, and Eisenstein (see Supporting Information Table S2 for the complete ranking).\n\n![Target structures and prediction results for easy dimer targets. **T69 (T0764),** a putative esterase (BDI_1566) from *Parabacteroides distasonis*, PDB code 4Q34. (**a**) Target structure, with highlighted interfaces (1,2). (**b**) Global docking prediction results displaying one subunit in cartoon representation, with the center of mass of the second subunit in the target (red sphere), and in docking solutions submitted by CAPRI predictors (light blue spheres), CAPRI scorers (dark blue spheres), and CASP predictors (yellow spheres). **T80 (T0801)**, a sugar aminotransferase WecE from *Escherichia coli* K\u201012, PDB code 4PIW. (**c**) Target structure. (**d**) Global docking prediction results by different predictor groups (see legend (b) for detail). **T82 (T0805)** Nitroreductase (structures unreleased). (**e**) Target structure. (**f**) Global docking prediction results by different predictor groups. 
**T94 (T0852),** uncharacterized protein Coch_1243 from *Capnocytophaga ochracea* DSM 7271, PDB code 4W9R. (**g**) Target structure. (**h**) Global docking prediction results by different predictor groups.](PROT-84-323-g002){#prot25007-fig-0002}\n\nThe best model for this target, obtained by Guerois, had an *f(nat)* value of 49%, and *L\u2010rms* and *I\u2010rms* values of 2.88 and 2.12 \u00c5, respectively (Supporting Information Table S4).\n\nSix groups registered with CASP submitted a total of 12 models for this target, comprising one acceptable model by the group of Umeyama and one medium quality model by the Baker group. The global landscape of all the predicted models by the different groups is outlined in Figure [2](#prot25007-fig-0002){ref-type=\"fig\"}(b).\n\nAn even better performance was achieved by the CAPRI scoring experiment (Supporting Information Table S2). Of the 14 groups participating in this experiment, 12 submitted at least two models of medium quality. The best performance was achieved by Kihara (10 medium quality models), closely followed by Zou and Grudinin, with eight and five medium quality models, respectively. As already observed in previous CAPRI evaluations, the best performers in the docking calculations did not necessarily perform as well in the scoring experiment, and thus did not always single out even their own best models from the uploaded anonymized set of predicted complexes, highlighting yet again the distinct nature of the docking and scoring procedures.\n\nAn important factor in the successful predictions was the overall good accuracy of the 3D models used by predictors in the docking calculations (see Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"} and CAPRI website for detailed values). The best models had an LGA_S score of \u223c85 (backbone rmsd of \u223c3.9 \u00c5), and only a few models had LGA_S scores lower than 40 (backbone rmsd\u2009\\>\u200910 \u00c5) (values for all models are available on the CAPRI website). 
The accuracy of the 3D models across targets and its influence on the predictions will be discussed in a dedicated section below.\n\nVery good predictions were obtained for **T82 (T0805)**, the nitro\u2010reductase rv3368, a significantly intertwined dimer with unstructured arms reaching out to the neighboring subunit and a subunit interface area of 3250 \u00c5^2^ \\[Fig. [2](#prot25007-fig-0002){ref-type=\"fig\"}(e,f)\\]. The majority of the models of the individual subunits were quite accurate with LGA_S values of 60--85 (backbone rmsd \\<5 \u00c5) (see CAPRI website). As many as 54 medium quality models and 17 acceptable models were submitted by CAPRI participants, 99 models of acceptable quality or better were submitted by CAPRI scorer groups, and 11 acceptable models or better were submitted by three CASP groups (Supporting Information Table S2). The high success rate for both complex predictions and subunit modeling stems from the fact that most predictors made good use of known structures of related homodimers in the PDB in which the intertwining mode was well conserved. These known dimer structures were mainly used as templates for modeling the target dimer (template\u2010based docking).\n\nParticipation, number of submitted models, and performance were very similar in the docking predictions for the other targets in this category (see Supporting Information Tables S2 and S3). The models of individual subunits were also of similar accuracy or higher.\n\nExcellent performance was obtained for targets **T80 (T0801)** and **T93 (T0851)** with \\>100 correct models of which \u223c70 were of medium quality, followed by targets **T90 (T0843)** and **T91 (T0847)**, for which \\>100 correct models, comprising \u223c40 medium quality ones, were submitted. These targets featured subunit sizes of 176--456 residues.\n\n**T80 (T0801)** was the sugar aminotransferase WecE from *E. coli* K\u201012, with 376 residues per subunit. 
Submitted models were evaluated against one interface (1960 \u00c5^2^) between the two subunits of the crystal asymmetric unit \\[Fig. [2](#prot25007-fig-0002){ref-type=\"fig\"}(c)\\]. A total of 27 CAPRI predictor groups submitted 105 models of acceptable quality or better. The majority of these (71 models) were of medium quality. Twelve CAPRI groups participated in the scoring experiment and submitted 120 models, of which about half (51) were of medium quality and 14 were acceptable models. Six CASP participants submitted 11 medium quality models, and two models of acceptable quality. The top ranking CAPRI predictor groups for this target were those of Sali, Guerois, and Eisenstein, who submitted 10 medium quality models each. These three groups were closely followed by the groups of Seok, Zou, Shen and Lee, each of whom predicted at least five medium quality models. Each of the three participating servers, HADDOCK, GRAMM\u2010X, and CLUSPRO, submitted at least one acceptable model. The best performers from among the scorer groups were those of Zou and Huang with 10 medium quality models each, followed by Gray, Kihara and Weng with at least five medium quality models, and by Fernandez\u2010Recio and Bates with four medium quality models. The global landscape of the predictions for this target is shown in Figure [2](#prot25007-fig-0002){ref-type=\"fig\"}(d).\n\nThe subunit models for this target were of very high quality, with the best models featuring an LGA_S score of \u223c95 and a backbone rmsd of 1.3 \u00c5. The quality of the best models for targets T90 and T91, for which a similarly high performance was achieved, was only somewhat lower, with LGA_S values of 70--88 and backbone rmsd of 2.0--5.0 \u00c5.\n\nInterestingly, **T91 (T0847),** the human Bj\u2010Tsa\u20109, was predicted to be a dimer by PISA, but assigned as a monomer by the authors. 
The good docking performance for this target and the fact that the dimer interface (1320 \u00c5^2^) is within the range expected for proteins of this size (176 residues),[45](#prot25007-bib-0045){ref-type=\"ref\"} suggest that this protein forms a dimer.\n\nA somewhat lower performance was achieved for **T92 (T0849)**, the glutathione S\u2010transferase domain from *Haliangium ochraceum*, and for **T94 (T0852)**, an uncharacterized 2\u2010domain protein (putative esterase according to Pfam) Coch_1243 from *Capnocytophaga ochracea*. A total of 98 acceptable models were submitted for T92, of which only 12 were of medium quality, but the models were contributed by a large fraction of the participating groups (17 out of 23). On the other hand, the scorer performance was very good with 68 acceptable models of which almost half (33) were of medium quality. These models were contributed across most scorer groups (10 out of 11). CASP participants achieved a particularly good performance. Of the 23 models submitted by CASP groups, 17 were of acceptable quality or better, and those were contributed by six of the seven participating groups. The accuracy of the subunit models was in general lower, with LGA_S \u223c70 and rmsd \u223c7 \u00c5 for the best models, and LGA_S values of 50--60 for most other models.\n\nIn T94, predicted complexes were assessed only against the largest interface (1190 \u00c5^2^), formed between large domains of the adjacent subunits, as the second largest interface was much smaller (620 \u00c5^2^). In total, 97 homodimer models of acceptable quality only were contributed for this target: 58 models by CAPRI predictors, 37 by CAPRI scorers, and 2 by CASP groups \\[see Supporting Information Table S2, and Fig. [2](#prot25007-fig-0002){ref-type=\"fig\"}(g,h) for a pictorial summary\\]. 
The lower accuracy of the subunit models for this target (LGA_S score \u223c58 and rmsd \\>6 \u00c5, for the best model) may have limited the accuracy of the modeled complexes, without however compromising the task of achieving correct solutions.\n\n### Difficult or problematic homodimer targets: T68, T72, T77, T79, T86, T88 {#prot25007-sec-0013}\n\nThis category comprises six targets, representing particular challenges to docking calculations for reasons inherent to the proteins involved, or targets for which the oligomeric state was probably assigned incorrectly at the time of the experiment.\n\nWith the exception of T72, targets in this category are much smaller proteins than those of the easy dimer targets (Table [1](#prot25007-tbl-0001){ref-type=\"table-wrap\"}). In three of the targets (T68, T79, T86) the largest interface area between subunits in the crystal is small (470--860 \u00c5^2^) and their oligomeric state assignments were often ambiguous. In the following, we comment on the insights gained from the results obtained for several of these targets.\n\nNo acceptable homodimer models were contributed by CAPRI or CASP groups for targets T68, T77 and T88. The main problem with **T68 (T0759)**, the plectin 1 and 2 repeats of the human periplakin, was that the crystal structure contains an artificial N\u2010terminal peptide representing the His\u2010tag (MGHHHHHHS...) that was used for protein purification. The N\u2010terminal segments of neighboring subunits, which contain the artificial peptide, associate to form the largest interface between the subunits in the crystal (1150 \u00c5^2^) \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(a)\\]. Submitted models were assessed against this interface and the second largest interface (860 \u00c5^2^), but not against the two much smaller interfaces (240 and 160 \u00c5^2^).\n\n![Target structures and prediction results for difficult or problematic dimer targets. 
**T68 (T0759)**, Plectin 1 and 2 Repeats of the Human Periplakin, PDB code 4Q28. (**a**) Target structure in cartoon representation, displaying four subunits in the crystal. The His\u2010Tag sequence, highlighted in black, mediates contacts at the largest interface. (**b**) Global docking prediction results displaying one subunit in cartoon representation, with the center of mass of the second subunit in the target (red sphere), and in docking solutions submitted by CAPRI predictors (light blue spheres), CAPRI scorers (dark blue spheres), and CASP predictors (yellow spheres). **T77 (T0780),** conserved hypothetical protein (SP_1560), *Streptococcus pneumoniae* TIGR4, PDB code 4QDY. (**c**) Target structure, highlighting the assessed interface (dashed line). (**d**) Global docking prediction results by different predictor groups (see legend (b) for detail). **T88 (T0825)**, synthetic wrap five protein (structure unreleased). (**e**) Target structure. (**f**) Global docking prediction results by different predictor groups. **T72 (T0770)**, SusD homolog (BT2259) from *Bacteroides thetaiotaomicron* VPI\u20105482, PDB code 4Q69. (**g**) Target structure, highlighting the three assessed interfaces. (**h**) Global docking prediction results for the three interfaces, by different predictor groups. **T79 (T0792)**, OSKAR\u2010N, PDB code 5A49. (**i**) Target structure, highlighting the three assessed interfaces. (**j**) Global docking prediction results for the three interfaces by different predictor groups. **T86 (T0815)** Putative polyketide cyclase (protein SMa1630) from *Sinorhizobium meliloti*, PDB code 4U13. (**k**) Target structure, showing three interfaces. 
(**l**) Global docking prediction results for the two interfaces by different predictor groups (the interface with the yellow monomer was not assessed).](PROT-84-323-g003){#prot25007-fig-0003}\n\nMost predictor groups (from both CASP and CAPRI) carried out docking calculations without the His\u2010tag, which they assumed was irrelevant to dimer formation *in vivo*. They were therefore unable to obtain docking solutions that were sufficiently close to the largest interface of the target \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(b)\\]. Likewise, no acceptable solutions were obtained for the second largest interface, indicating that it too was unlikely to represent a stable homodimer.\n\nThe quality of the subunit models was also lower than for many other targets (the best model had an LGA_S score of \u223c57), as most groups ignored the His\u2010Tag in building the models as well (see Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"} and CAPRI website for details). Considering that the His\u2010Tag containing peptide contributes significantly to the largest subunit interface, the protein is likely a monomer in the absence of the artificial peptide. This is in fact the authors\\' assignment in the corresponding PDB entry (4Q28), and in retrospect this target should not have been considered for the CAPRI docking experiments.\n\nDifferent factors contributed to the failure to produce acceptable docking solutions for **T77 (T0780)**, the conserved hypothetical protein (SP_1560), from *Streptococcus pneumoniae* TIGR4 \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(c,d)\\]. The protein consists of two YbbR\u2010like structural domains (according to Pfam) arranged in a crescent\u2010like shape. The domains adopt rather twisted \u03b2\u2010sheet conformations with extensive stretches of coil, and are connected by a single polypeptide segment, suggesting that the protein displays an appreciable degree of flexibility both within and between the domains. 
Probably as a consequence of this flexibility, the structures of most templates identified by predictor groups (which approximated only one domain), were not close enough to that of the target (Supporting Information Table S5). As a result, the subunit models were generally quite poor, with the best model featuring an LGA_S score of only \u223c40 (rmsd \u223c7 \u00c5). Although the largest interface of the target is of a respectable size (1600 \u00c5^2^) and involves intermolecular contacts between only one of the domains, the docking calculations were unable to identify it. The best docking model was incorrect as it displayed an *L\u2010rms* \u223c19 \u00c5, and an *I\u2010rms* \u223c10 \u00c5 (see Supporting Information Table S4).\n\nA very different issue plagued the docking prediction of **T88 (T0825)**, the wrap5 protein. The information given to predictors was that the protein is a synthetic construct built from five sequence repeats, and is similar to 2YMU (a highly repetitive propeller structure). It was furthermore stated that the polypeptide has been mildly proteolyzed, yielding two slightly different subunits, in which the N\u2010terminus of the first repeat was truncated to different extents, and that therefore the dimer forms in a non\u2010trivial way. Predictors were given the amino acid sequence of the two alternatively truncated polypeptides.\n\nIt turned out that the longer of the two chains, with the nearly intact first repeat, forms the expected 5\u2010blade \u03b2\u2010propeller fold, whereas the chain with the severely truncated first repeat forms only four of the blades, with the remainder of the first repeat forming an \u03b1\u2010helical segment that contacts the first repeat \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(e)\\].\n\nBoth CAPRI and CASP predictor groups were quite successful in building very accurate models for the less truncated subunit (rmsd\u2009\\<\u20090.5 \u00c5, LGA_S \u223c90). 
However, subunit models for the more truncated subunit were much poorer (rmsd 6.5--10 \u00c5), and since the helical region of the shorter subunit contributes significantly to the dimer interface, whose total area is not very large (\u223c1300 \u00c5^2^), no acceptable docking solutions were obtained \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(e,f)\\].\n\nFor the other three targets in this category, T72, T79, and T86, the homodimer prediction performance remained rather poor, with only very few acceptable models submitted. The main issue with **T79 (T0792)**, the OSKAR\u2010N protein, and **T86 (T0815)**, the polyketide cyclase from *Sinorhizobium meliloti*, was likely their very small subunit interface (Table [1](#prot25007-tbl-0001){ref-type=\"table-wrap\"}). T79 was predicted by PISA to be a dimer, but the area of its largest subunit interface is only 680 \u00c5^2^. T86, predicted to be dimeric by both PISA and the authors (as stated in the PDB entry, 4U13), has even smaller subunit interfaces, with the largest one burying no more than 470 \u00c5^2^. In both cases these interfaces are much smaller than the average size required in order to stabilize weak homodimers.[46](#prot25007-bib-0046){ref-type=\"ref\"} It is therefore likely that these two proteins are in fact monomeric at physiological concentrations. Furthermore, T79 and T86 are quite small proteins (80 residues for T79, and 100 residues for T86), and it is not uncommon that proteins of this size cannot form large enough interfaces unless they are intertwined.[47](#prot25007-bib-0047){ref-type=\"ref\"}\n\nThis notwithstanding, a few acceptable homodimer models were contributed for all three assessed interfaces (interfaces 1,2,3) of T79 (Supporting Information Table S2).\n\nAmong predictor groups, 17 acceptable docking solutions (of which five were medium quality models) were obtained for the largest interface (interface 1). 
Twelve acceptable solutions, one of which was of medium quality, were obtained for the second, smaller interface (440 \u00c5^2^), and no acceptable quality solutions were obtained for the third assessed interface (400 \u00c5^2^) \\[see Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(i,j) for an overview of the prediction results\\]. Seven CAPRI predictor groups, one CASP group and one server (GRAMM\u2010X) contributed the correct models for interface 1, and seven CAPRI groups submitted acceptable models for interface 2.\n\nInterestingly, scorers did less well than predictors for interface 1, but better for interface 2, and two scorer groups submitted two acceptable models for interface 3, whereas none were submitted by predictor groups.\n\nOverall, the models for the T79 subunit were quite accurate, with the best model having an LGA_S score of \u223c89 and rmsd \u223c1.9 \u00c5.\n\nNot too surprisingly, the dimer prediction performance for T86 was significantly poorer, with only three acceptable models submitted by CAPRI predictors (Ritchie and Negi) for the largest interface (470 \u00c5^2^). Scorers identified five acceptable models for interface 1 (Fernandez\u2010Recio and Gray), and two acceptable (or better) models for interface 2 (Seok and Kihara). None of the 19 models submitted by the seven CASP groups were correct \\[see Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(k,l) for a pictorial summary\\].\n\nDifferent problems likely led to the weak prediction performance for target **T72 (T0770)**, the SusD homolog (BT2259) from *Bacteroides thetaiotaomicron*. While the largest subunit interface is of near average size (1120 \u00c5^2^), the interface itself is poorly packed and patchy, an indication that it may not represent a specific association. Not too surprisingly, therefore, this led to a poor prediction performance. 
Overall, only three models of acceptable quality were submitted by CAPRI dockers, namely by the HADDOCK and SWARMDOCK servers, and the Guerois group, each contributing one such model. The best of these models (contributed by Guerois) had *f(nat)* \u223c29% and *L\u2010rms* and *I\u2010rms* values of 8.85 and 3.57 \u00c5, respectively. Seven acceptable models were submitted by scorers. Bonvin contributed two models, and the groups of Huang, Grudinin, Gray, Weng and Fernandez\u2010Recio each submitted one model. The best quality models had *f(nat)* \u223c18%, and *L\u2010rms* and *I\u2010rms* values of \u223c7.29 and 4.28 \u00c5, respectively. No acceptable models were submitted by CASP participants. The target structure and the distribution of all the docking solutions are depicted in Figure [3](#prot25007-fig-0003){ref-type=\"fig\"}(g,h).\n\nThe accuracy of the subunit models for T72 was reasonable, with the best models having an LGA_S score of \u223c70 (backbone rmsd \u223c3.8 \u00c5). The three successful CAPRI predictor groups (HADDOCK, SWARMDOCK and Guerois) all had somewhat lower quality subunit models with LGA_S scores in the range of 55--67.\n\n### Targets assigned as tetramers: T70, T71, T73, T74, T78 {#prot25007-sec-0014}\n\nFive targets were assigned as tetramers at the time of the prediction experiment. As described in Assessment Procedure and Criteria, models for tetramer targets were assessed by systematically comparing all the interfaces in each model to all the relevant interfaces in the target, and selecting the best\u2010predicted interfaces. Most predictor groups used a two\u2010step approach to build their models. First they derived the model of the most likely dimer, and then docked the dimers to one another. 
Some groups imposed symmetry restraints as part of the docking procedures, or combined this approach with the two\u2010step procedure.\n\nIn three of the targets (T70, T71, T74) predictors faced the problem that all the pair\u2010wise subunit interfaces were quite small (440--720 \u00c5^2^), making it difficult to identify stable dimers to initiate the assembly procedure.\n\n**T70 (T0765)**, the modulator protein MzrA from *Klebsiella pneumoniae* subspecies, was assigned as a tetramer at the time of the predictions, but is listed as a dimer (predicted by PISA and assigned by the authors) in the PDB entry (4PWU). Only two of its interfaces in the crystal bury an area exceeding 400 \u00c5^2^ \\[Fig. [4](#prot25007-fig-0004){ref-type=\"fig\"}(a)\\]. The assembly built by propagating these two interfaces appears to form an extensive layered arrangement across unit cells in the crystal, rather than a closed tetramer.\n\n![Target structures and prediction results for tetrameric targets. **T70 (T0765)**, Modulator protein MzrA (KPN_03524) from *Klebsiella pneumoniae* subspecies. (**a**) Target structure in cartoon representation, highlighting the two assessed interfaces (dashed lines). (**b**) Global docking prediction results displaying one subunit in cartoon representation, with the center of mass of the second subunit in the target (red spheres), and in docking solutions submitted by CAPRI predictors (light blue spheres), CAPRI scorers (dark blue spheres), and CASP predictors (yellow spheres). **T71 (T0768)** Leucine\u2010rich repeat protein (BACCAP_00569) from *Bacteroides capillosus*, PDB code 4QJU. (**c**) Target structure in cartoon representation, highlighting the two relevant interfaces (interfaces 1 and 3) (dashed lines). (**d**) Global docking prediction results for the assessed interfaces by different predictor groups (monomer color corresponding to (c), that is, the red spheres represent the same, blue, monomer). 
**T73 (T0772),** Putative glycosyl hydrolase, PDB code 4QHZ. (**e**) Target structure in cartoon representation, highlighting the two assessed interfaces (interface 1 and 2) (dashed lines). (**f**) Global docking prediction results for the assessed interfaces by different predictor groups.](PROT-84-323-g004){#prot25007-fig-0004}\n\nInterestingly, acceptable or better models were submitted only for the smaller interface (475 \u00c5^2^) (Supporting Information Table S2). CAPRI predictors submitted 37 acceptable models, of which 27 were of medium quality, and scorers submitted 27 acceptable models (including 21 medium quality ones) \\[Fig. [4](#prot25007-fig-0004){ref-type=\"fig\"}(b)\\]. Indeed no acceptable models were submitted for the largest interface (560 \u00c5^2^), which is assigned as the dimer interface in the PDB entry for this protein.\n\nThe failure to model a higher order oligomer for this target was not due to the quality of the subunit models, as the latter was quite high (see Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"} and CAPRI website), and is probably rooted in the pattern of contacts made by the protein in the crystal, which suggests that this target is likely a weak dimer. Considering that all the acceptable docking models involve a different interface than that assigned in the corresponding PDB entry, it is furthermore possible that the interface identified in these solutions is in fact the correct one. But given the very small size of either interface, the protein could also be monomeric.\n\nA similar situation was encountered with **T74 (T0774)**, a hypothetical protein from *Bacteroides vulgatus*. Here too the target was assigned as a tetramer by PISA at the time of the predictions, but is listed as a monomer by the authors in the PDB entry (4QB7). 
Associating the subunits according to the two largest interfaces (520 and 490 \u00c5^2^) also produced an open\u2010ended assembly rather than a closed tetramer, and this time no acceptable solutions were produced for either interface, strongly suggesting that the protein is monomeric as specified by the authors. It is noteworthy that the subunit models for this target were particularly poor (LGA_S values \u223c40, and rmsd \u223c7 \u00c5), which could also have hampered identifying some of the binding interfaces.\n\n**T71 (T0768)**, the leucine\u2010rich repeat protein from *Bacteroides capillosus*, was a difficult case for other reasons. Subunit contacts in the crystal are mediated through three different interfaces, ranging in size from 470 \u00c5^2^ to 720 \u00c5^2^. A closed tetrameric assembly can be built by combining interfaces 1 and 3, associating the dimer formed by subunits A and B with the equivalent dimer of subunits C and D, as shown in Figure 4(c). Interfaces 1 and 3 were also those for which some acceptable predictions were submitted. One acceptable model was contributed for the largest interface, by GRAMM\u2010X, an automatic server. Eleven acceptable models were submitted for the third interface (470 \u00c5^2^) by four CAPRI predictor groups, and six acceptable models were submitted by four CAPRI scorer groups. All the models submitted by a single CASP group were wrong. No group succeeded in building the tetramer that comprises the correct models for interfaces 1 and 3 at the same time. Some models looked promising, but when superimposing equivalent subunits (in the model vs. 
the target) the neighboring subunit of the model (the one across the incorrectly predicted interface) had its position significantly shifted relative to that in the target, resulting in an incorrect structure of the tetrameric assembly.\n\nThe remaining two targets, **T73 (T0772)**, a putative glycosyl hydrolase from *Parabacteroides distaspnos,* and **T78 (T0786),** a hypothetical protein from *Bacillus cereus*, were genuine tetramers assigned as such by both PISA and the authors. Both targets are proteins of similar size (\u223c260 residues) adopting an assembly with classical D~2~ symmetry, which comprises two interfaces, a sizable one (\\>1000 \u00c5^2^) and a smaller one. But the main bottleneck for both targets was that their larger interface was intertwined. Available templates did not seem to capture the intertwined associations, as witnessed from the overall poorer models derived for the individual subunits. For both targets, the best models had an LGA_S score \u223c50 and a backbone rmsd of \u223c5--10 \u00c5. For T73, a total of only nine acceptable models were submitted by the CAPRI predictor groups of LZERD, Zou and Kihara for the largest interface, and two acceptable models were submitted by the Lee group for the second interface. None of the predicted tetramer models simultaneously captured both interfaces, as illustrated in Figure 4(e,f). For T78, no acceptable solutions were submitted by any of the participating groups, but the subunit models were only marginally more accurate than those of T73.\n\nThe conclusions to be reached from the analysis of these five targets are twofold. One is that the oligomeric state assignment for higher order assemblies such as tetramers is more error prone than that of dimer versus monomers. 
Tetramers often involve smaller interfaces between subunits, especially those formed between individual proteins when two dimers associate, and therefore predictions on the basis of pair\u2010wise crystal contacts such as those by PISA become unreliable. Independent experimental evidence is therefore required to validate the existence of a higher order assembly. The second conclusion to be drawn is that the prediction of higher order assembly by docking procedures remains a challenge. Acceptable models derived for the largest dimer interface are probably not accurate enough to enable the identification of stable association modes between two modeled dimers. This indicates in turn that the propagation of errors is the problem that currently hampers the modeling of higher order assemblies from the structures of their components in the absence of additional experimental information.\n\n### Heterocomplex targets: T81, T89 {#prot25007-sec-0015}\n\n**T81 (T0797/T0798)** and **T89 (T0840/T0841)** were the only two *bona\u2010fide* heterocomplex targets in Round 30. T81 is the complex between the cGMP\u2010dependent protein kinase II leucine zipper (44 residues) and the Rab11b protein (198 residues) (PDB code 4OJK). T89 is the complex between the much larger RON receptor tyrosine kinase subunit (669 residues) and the macrophage stimulating protein subunit (MSP) (253 residues).\n\nThe crystal structure of T81 features two Rab11b proteins binding on opposite sides of the centrally located leucine zipper, in a quasi\u2010symmetric arrangement, which likely represents the stoichiometry of the biological unit \\[Fig. [5](#prot25007-fig-0005){ref-type=\"fig\"}(a)\\]. A total of 3 interfaces were evaluated for this target: Interface 1 (chains C:A, leucine zipper helix 1/one copy of the Rab11b protein), Interface 2 (C:D, leucine zipper helix 1/helix 2), and Interface 3 (equivalent to Interface 1). 
The two Rab11b/zipper helix interfaces were not exactly identical (780 \u00c5^2^ for interface 1 and 630 \u00c5^2^ for interface 3). The interface between the helices of the leucine zipper was somewhat larger (780 \u00c5^2^). Overall, the interface area of a single copy of the Rab11b protein binding to the leucine zipper dimer measures 1070 \u00c5^2^.\n\n![Target structures and prediction results for heterocomplex targets. **T81 (T0797/T0798)**, cGMP\u2010dependent Protein Kinase II Leucine Zipper and Rab11b Protein Complex, PDB code 4OJK. (**a**) Target structure in cartoon representation, highlighting the interface of the leucine zipper dimer (2), and the two equivalent interfaces (1,3), between the zipper dimer and the two Rab11b proteins (dashed lines). (**b**) Global docking prediction results displaying one of the Rab11b subunits in cartoon representation, with the center of mass of the leucine zipper dimer in the target (red sphere), and in docking solutions submitted by CAPRI predictors (light blue spheres), CAPRI scorers (dark blue spheres), and CASP predictors (yellow spheres). **T89 (T0840/T0841)**, complex of the RON receptor tyrosine kinase subunit and the macrophage stimulating protein subunit (MSP) (structure not released). (**c**) Target structure in cartoon representation. (**d**) Global docking prediction results displaying the RON receptor kinase subunit, in cartoon representation, and the center of mass of the MSP protein in the target and in docking solutions submitted by different predictor groups.](PROT-84-323-g005){#prot25007-fig-0005}\n\nConsolidating correct predictions for the equivalent interfaces (Interfaces 1 and 3), the prediction performance for this complex as a whole was disappointing. Only 12 correct models were submitted by the 7 CAPRI predictor groups of Guerois, Seok, Huang, Vajda/Kozakov, SWARMDOCK, CLUSPRO (a server) and Bates. Five of those (submitted by Guerois, Seok and Huang) were of medium quality. 
The performance of CAPRI scorers was better, with 54 correct models, of which 16 were of medium quality. All 11 scorer groups contributed these models, and the best scorer performance was achieved by the group of Bates, followed by those of LZERD, Oliva, Huang, Fernandez\u2010Recio and Seok. The prediction landscape for this target is shown in Figure 5(b).\n\nT89, the RON receptor kinase subunit complex with MSP, was a simpler target, given the clear, binary character of this heterocomplex. But the large size of the receptor subunit, and the relatively small interface it formed with MSP, represented a challenge for the docking calculations. The prediction performance for this complex was quite good overall, with a total of 87 correct models submitted by predictors, representing 41% of all submitted predictor models. Unlike for many other targets of this round, scorers did only marginally better, with 42% correct models. CASP groups were specifically invited to submit models for this target, and 55 groups did, nearly ten times more than for other targets in this round. But their performance was much poorer than that of CAPRI groups. Only 23 models out of the 223 submitted by CASP groups (10%) were correct, and 6 of these were medium accuracy models.\n\nThe best performance among CAPRI predictor groups was by the HADDOCK server, followed by the groups of Vakser, Seok, Guerois, Grudinin, Lee, Huang and Tomii (see Supporting Information Table S2). A pictorial summary of the prediction performance for this target is provided in Figure 5(c,d).\n\nResults across targets and groups {#prot25007-sec-0016}\n---------------------------------\n\n### Across target performance of CAPRI docking predictions {#prot25007-sec-0017}\n\nResults of the docking and scoring predictions for the 25 assessed targets of Round 30, obtained by all groups that submitted models for at least one target, are summarized in Figure 6 and in the Supporting Information Table S3. 
For a full account of the results for this Round the reader is referred to the CAPRI web site (/).\n\nThe results summarized in Figure 6 show clearly that the prediction performance varies significantly for targets in the four different categories. As expected, the performance is significantly better for the 12 dimer targets in the \"easy\" category than for those in the other categories. For 10 of the 12 \"easy\" targets, at least 30% of the submitted models per target are of acceptable quality or better, and for most of these (eight out of 10), at least 20% of the models are of medium quality. The accuracy of the subunit models (top panel, Fig. 6) is rather good for most of these targets. With the exception of T93, for which the quality of the subunit models spans a wide range (LGA_S \u223c40--80), the models of the remaining 11 targets achieve high LGA_S scores with averages of 80 or above.\n\nThe two less well\u2010predicted targets in this category are T92 and T94, probably due to the lower quality of the subunit models (average LGA_S\u2009\\<\u200960) (top panel, Fig. 6).\n\nThe docking prediction performance is quite poor for the six \"difficult or problematic\" dimer targets, where a few acceptable models were submitted for only three of the targets (T72, T79, T86), and no acceptable models were submitted for the remaining three targets. This very poor performance was not rooted in the docking or modeling procedures but rather in the targets themselves. In 4 of the targets in this category (T68, T72, T79, T86) the oligomeric state (dimer in this case), often predicted only by PISA, but sometimes also provided by the authors, was likely incorrectly assigned. In T68, the His\u2010tag used for protein purification and included in the crystallization forms the observed dimer interface, which is therefore most certainly non\u2010native. 
In T72 the main problem was its very poorly packed and patchy interface, suggesting that the dimer might be a crystal artifact, whereas in T79 and T86, all the pair\u2010wise interfaces in the crystal structure were too small for any of them to represent a stable dimer.\n\nThe only genuinely difficult dimer targets were T77 and T88. For T77, the subunits of this flexible 2\u2010domain protein were rather poorly modeled (average LGA_S 30--40), making it difficult to model the \"handshake\" arrangement of the subunits in the dimer \\[Fig. [3](#prot25007-fig-0003){ref-type=\"fig\"}(c,d)\\]. In T88, the synthetic wrap5 protein, most predictor groups failed to meet the challenge of correctly modeling the shorter of the two subunits, in turn leading to incorrect solutions for the heterodimer.\n\nAs already mentioned, a very poor performance was achieved for the five targets assigned as tetramers at the time of the predictions. This is illustrated at the level of the individual interfaces in these targets (Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"}). However, here too the problem was not necessarily rooted in limitations of the docking or modeling procedures. Two of the targets, T70 and T74, seem to have been erroneously assigned as tetramers at the time of the prediction by PISA, as described above. T70 was assigned as a dimer, and T74 as a monomer, by the respective authors in the PDB entry. In agreement with the authors\\' assignment, no acceptable solutions were identified for any of the interfaces in T74. Somewhat surprisingly, the quality of the subunit models for this target was particularly poor as well (average LGA_S \u223c30).\n\n![Pictorial summary of the prediction results per assessed interface of the targets in CAPRI Round 30. The lower panel depicts the fraction of models of acceptable and medium quality respectively, submitted by CAPRI and CASP predictor groups, for the 42 assessed interfaces in all 25 targets (listed along the horizontal axis). 
The digit following the CAPRI target number represents the assessed interface. The symmetry transformations corresponding to the assessed interfaces in each target are listed in the Supporting Information Table S1. The fraction of correct models is shown separately for the four main target categories: Easy dimer targets, difficult (or problematic) dimer targets, tetrameric targets, and heterocomplex targets. The middle panel displays the same data for models submitted for the same interfaces by CAPRI scorer groups. The top panel shows box plots of the LGA_S score values of the subunits in submitted models for the targets listed along the horizontal axis. The LGA_S score is one of the CASP measures of the accuracy of the predicted 3D structure of a protein.[35](#prot25007-bib-0035){ref-type=\"ref\"} The red dots represent the LGA_S score of the subunit structure of the best quality homo or heterocomplex model submitted for each target. The best quality model is defined as the one with the lowest *I\u2010rms* (see **Fig**. [1](#prot25007-fig-0001){ref-type=\"fig\"} for details).](PROT-84-323-g006){#prot25007-fig-0006}\n\nIn T70, the docking calculations were able to identify only the smaller of the two interfaces as forming the dimer interface (Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"}), but this interface differs from the one assigned by the authors. This result leaves open the possibility that this protein may indeed be a weak dimer, in agreement with the authors\\' assignment, albeit a different dimer than the one that they propose. Thus for both of these seemingly erroneously assigned tetramers, the docking calculations actually gave the correct answer, which supports the authors\\' subsequent assignments, which were not made available at the time of the prediction experiment.\n\nFor the other three tetrameric targets, T71, T73 and T78, the poor interface prediction performance reflects the genuine challenges of modeling higher order oligomers. 
In T71 the small size of the individual interfaces was likely the reason for the paucity of acceptable dimer models, and those were moreover not accurate enough to enable the correct modeling of the higher order assembly (dimer of dimers). In T73 and T78, the very few acceptable models for interfaces in the former, and the complete failure to model any of the interfaces in the latter (Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"}), likely stem from the lower accuracy of the corresponding subunit models (average LGA_S \u223c50--60).\n\nThe docking prediction performance was better, but not particularly impressive, for the two heterocomplex targets T81 and T89, which represent the type of targets that the CAPRI community commonly deals with. For T81 only \u223c5% of the submitted models were of acceptable quality or better, whereas for T89 the corresponding model fraction was 40%, similar to that achieved for the easy dimer targets. The poorer performance for T81 can be readily explained by the fact that this target was in fact a heterotetramer, two copies of the Rab11b protein binding to opposite sides of a leucine zipper, which had to be modeled first.\n\nThese results taken together indicate that homology modeling techniques and docking calculations are able to predict rather well the structures of biologically relevant homodimers. In addition, we see that the prediction performance for such targets is on average superior to that obtained for heterocomplexes in previous CAPRI rounds, where on average only about 10--15% of the submitted models are correct for any given target (), compared to 25% obtained for the majority of the genuine dimer targets in this Round, including both easy and difficult homodimers. 
This result is not surprising, as interfaces of homodimers are in general larger and more hydrophobic than those of heterocomplexes,[45](#prot25007-bib-0045){ref-type=\"ref\"} properties which should make them easier to predict.\n\nAnother noteworthy observation is that docking calculations can often help to more reliably assign the protein oligomeric state, especially in cases where available assignments were ambiguous. Such cases were encountered for several of the difficult or problematic targets, and for targets assigned as tetramers. On the other hand, the main challenge in correctly modeling tetramers is to minimize the propagation of errors caused by even small inaccuracies in modeling individual interfaces, which can in turn be exacerbated by inaccurate 3D models of the protein components.\n\n### Across target performance of CAPRI scoring predictions {#prot25007-sec-0018}\n\nAs shown in the middle panel of Figure 6, CAPRI scorer groups achieved overall a better prediction performance than predictor groups. The scoring experiment involves no docking calculations, and only requires singling out correct solutions from among the ensemble of models uploaded by groups participating in the docking predictions. Clearly, such solutions cannot be identified if the ensemble of uploaded models contains only incorrect solutions. Therefore, no correct scoring solutions were submitted by scorers for targets where no acceptable docking solutions were present within the 100 models uploaded by predictor groups for a given target.\n\nHowever, for targets where at least a few correct docking models were obtained by predictors, scorers were often able to identify a good fraction of these models, as well as other models that were not identified amongst the 10 best models by the groups that submitted them (Fig. [6](#prot25007-fig-0006){ref-type=\"fig\"}). 
This was particularly apparent for the easy dimer targets, where scorers often submitted a significantly higher fraction of acceptable\u2010or\u2010better models (\\>50%) than in the docking experiment, where this fraction rarely exceeded 40%. A similar result was achieved for the heterocomplexes, and was particularly impressive for T81, where nearly half of the models submitted by scorers were correct, compared to only 5% for the docking predictions.\n\nThe seemingly superior performance of scorers over dockers has been observed in previous CAPRI assessments[16](#prot25007-bib-0016){ref-type=\"ref\"}, [19](#prot25007-bib-0019){ref-type=\"ref\"} where it was attributed in part to the generally poor ranking of models by predictors. Their highest\u2010ranking models are often not the highest\u2010quality models, and acceptable or better models can often be found lower down the list and amongst the 100 uploaded models. Another reason is the fact that the search space that scorers have to deal with is orders of magnitude smaller (a few thousands of models) than the search space dockers commonly sample (tens of millions of models). This significantly increases the odds of singling out correct solutions in the scoring experiment.\n\nClearly however, there is more to the scorers\\' performance than chance alone, particularly in this CAPRI Round, where the main challenge was to model homo\u2010oligomers. Some groups that have also implemented docking servers had their server perform the docking predictions completely automatically, but carried out the scoring predictions in a manual mode, which still tends to be more robust. In addition, a meta\u2010analysis of the uploaded models, such as clustering similar docking solutions and selecting and refining solutions from the most populated clusters, can also lead to improved performance.\n\nThis notwithstanding, the actual scoring functions used by scorer groups must play a crucial role. 
But this role is currently difficult to quantify in the context of this assessment.\n\n### Performance across CAPRI and CASP predictors, scorers and servers {#prot25007-sec-0019}\n\nThe ranking of CAPRI\u2010CASP11 participants by their prediction performance on the 25 targets of Round 30 is summarized in Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"}. The per\u2010target ranking and performance of participants can be found in the Supporting Information Tables S2 and S3.\n\n###### \n\nParticipant ranking by Target performanceParticipant\n\n Participated targets Performance\n --------------------------------------- ---------------------- ---------------\n **CAPRI Predictor Ranking** \n Seok 25 **15/14\\*\\***\n Huang 25 **16/13\\*\\***\n Guerois 25 **16/12\\*\\***\n Zou 25 **14/11\\*\\***\n Shen 25 **13/11\\*\\***\n Grudinin 24 **11/10\\*\\***\n Weng 25 **13/9\\*\\***\n Vakser 25 **11/9\\*\\***\n Vajda/Kozakov 24 **15/8\\*\\***\n Fernandez\u2010Recio 25 **11/8\\*\\***\n Lee 20 **10/7\\*\\***\n Tomii 20 **8/6\\*\\***\n Sali 12 **6/4\\*\\***\n Negi 25 **7/3\\*\\***\n Eisenstein 6 **3\\*\\***\n Bates 25 **7/2\\*\\***\n Kihara 23 **7/2\\*\\***\n Zhou 25 **4/2\\*\\***\n Tovchigrechko 12 **3/1\\*\\***\n Ritchie 8 **2/1\\*\\***\n Fernandez\u2010Fuentes 14 **1**\n Xiao 11 **1**\n Gong 8 **0**\n Del Carpio 3 **0**\n Wade 2 **0**\n Haliloglu 1 **0**\n **CAPRI SERVER Ranking** \n HADDOCK 25 **16/9\\*\\***\n CLUSPRO 25 **16/8\\*\\***\n SWARMDOCK 25 **11/4\\*\\***\n GRAMM\u2010X 22 **6/1\\*\\***\n LZERD 25 **3**\n DOCK/PIERR 2 **1**\n **CAPRI Scorer Ranking** \n Bonvin 25 **18/14\\*\\***\n Bates 24 **17/13\\*\\***\n Huang, Seok 25 **16/13\\*\\***\n Zou, Kihara 25 **15/12\\*\\***\n Fernandez\u2010Recio 25 **14/12\\*\\***\n Weng 25 **16/11\\*\\***\n Oliva 22 **14/11\\*\\***\n Grudinin 25 **13/10\\*\\***\n Gray 17 **10/7\\*\\***\n LZERD 25 **6\\*\\***\n Lee 5 **3/2\\*\\***\n Sali 1 **0**\n **CASP Predictor and Server Ranking** \n Umeyama 19 **13/8\\*\\***\n ROSETTASERVER 
13 **9/8\\*\\***\n Dunbrack 12 **11/6\\*\\***\n SEOK_SERVER 22 **7/5\\*\\***\n Luethy 8 **5/4\\*\\***\n Nakamura 12 **7/3\\*\\***\n Baker 8 **3\\*\\***\n Wallner 2 **1\\*\\***\n Skwark, Lee, RAPTOR\u2010X_Wang, NNS_Lee 1--4 **1**\n *39 participants not listed* 1--5 **0**\n\nFor each target only the best quality solution is counted; in total 25 targets were assessed. Column 2 indicates the number of targets for which predictions were submitted. In Column 3, the numbers without stars indicate models of acceptable quality or better, and the numbers with \"\\*\\*\" indicate the number of those models that were of medium quality.\n\nThe ranking in Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"} considers only the best quality model submitted by each group for every target. The ranking in the Supporting Information Table S2 takes into account both the total number of acceptable models, and the number of higher quality models (medium quality ones for this Round, as detailed in the section on assessment criteria). When two groups submitted the same number of acceptable models, the one with more high quality models is ranked higher, and when two groups submitted the same number of high quality models, the group with more acceptable models is ranked higher.\n\nOverall, a total of 11 CAPRI predictor groups submitted correct models for at least 10 targets, and medium quality models for at least seven targets. These groups submitted models for at least 20 of the targets. Among those, the highest\u2010ranking groups in this Round are Seok, Huang, and Guerois, with correct models for 15 or 16 targets, and medium quality models for 12--14 of these targets. These are followed by Zou, Shen and Grudinin (correct models for 11--14 targets, and medium quality models for 10 or 11 of those). 
The remaining five highest ranking groups, Weng, Vakser, Vajda/Kozakov, Fernandez\u2010Recio and Lee, achieve correct predictions for 10--15 targets and medium quality predictions for 7--9 of those. It is noteworthy that two of the three top ranking predictor groups (Seok and Guerois), and at least one other group (Vakser) made heavy use of template\u2010based modeling, an indication that this approach can be quite effective.\n\nThe remaining groups listed in Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"} were ranked lower, as they correctly predicted between 1 and 8 targets only, and produced only a few medium quality models for these targets. However, some of these groups submitted predictions for a smaller number of targets. Their performance can therefore not be fairly compared to that of other groups.\n\nOf the 6 CAPRI automatic docking servers ranked in Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"}, HADDOCK and CLUSPRO rank highest, followed by SWARMDOCK and GRAMM\u2010X.\n\nIt is interesting to note that two top ranking CAPRI servers submitted correct predictions for 16 targets, just as many as the top ranking predictor groups. But the latter groups still produce more medium accuracy models (\\>10) than the servers (no more than 9). Thus, as already noted in previous CAPRI assessments, some CAPRI servers perform nearly on par with more manual predictions.\n\nAmong the CASP predictor and server groups listed in Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"}, the groups of Umeyama and Dunbrack rank highest, and both would rank among the best CAPRI predictor groups as their success rate (fraction of correct over submitted models) was also high. Of the servers, ROSETTASERVER and SEOK_SERVER rank highest, with a performance level similar to SWARMDOCK. 
Thirty\u2010nine CASP groups submitting models for 1--5 targets, none of which were correct, are not explicitly listed in the Table.\n\nLastly, judging also by the best model submitted for each target, CAPRI scorers outperform CAPRI predictors, as already mentioned when analyzing the performance across targets. High\u2010ranking scorer groups submitted on average correct models for 1--2 more targets than CAPRI predictors, and the number of medium quality models that groups submit for these targets is also somewhat higher.\n\nOf the 13 scorer groups that submitted an accurate model for at least one target, 11 have correctly predicted at least 10 targets and submitted medium quality models for seven of those.\n\nThe best performing groups are those of Bonvin, Bates, Huang, Seok, Zou and Kihara, followed closely by four other groups that correctly predicted at least 13 targets, and produced medium quality models for at least 10 of these (Table [4](#prot25007-tbl-0004){ref-type=\"table-wrap\"}).\n\nFactors influencing the prediction performance {#prot25007-sec-0020}\n----------------------------------------------\n\nUnlike in previous CAPRI rounds, Round 30 comprised solely targets where both the 3D structure of the protein subunits and their association modes had to be modeled. Deriving the atomic coordinates of the predicted homo\u2010oligomers therefore involved a number of steps, each requiring the use of specialized software and making strategic choices as to how it should be applied.\n\nAs mentioned in Synopsis of the Prediction Methods, the approaches for modeling the subunit structures and generating the oligomer models vary widely amongst predictor groups, and across targets. It is therefore difficult to reliably pinpoint specific factors that contributed to or hampered successful predictions. Nonetheless some general trends can be outlined. 
Even though Round 30 comprised only targets whose subunits could be readily modeled using templates from the PDB, the subunit modeling strategy had an important influence on the final oligomer models. Groups that used several different subunit models for the same target increased their chance of deriving at least an acceptable oligomer model. Such different models were obtained either by using different templates (some groups used as many as five templates for the same target), or by starting from the same template and modifying it by optimizing loop conformations and subjecting it to energy refinements. These optimizations seemed particularly effective when carried out in the context of the oligomers representing the highest\u2010ranking template\u2010based or docking models.\n\nAs already mentioned, information on oligomeric templates in the PDB was another important element contributing to improve the prediction performance. This information was the main ingredient for two of the best performing groups that heavily relied on template\u2010based docking. Other groups that performed well used mainly *ab\u2010initio* docking methods of various origins, but either guided the calculations or filtered the results based on structural information from homologous oligomers.\n\nOther important elements, such as selecting representative members of clusters of docking solutions, and the final scoring functions used to rank models and select those to be submitted, also played a role as already mentioned here and in previous CAPRI reports.[19](#prot25007-bib-0019){ref-type=\"ref\"}\n\nIn the following we examine in more detail the impact of two important elements of this joint CASP\u2010CAPRI experiment. 
We evaluate the influence of the accuracy of individual subunit models on the oligomer prediction performance, and estimate the extent to which procedures that rely on docking methodology and those that employ specialized template\u2010based modeling confer an advantage over straightforward homology modeling.\n\n### Influence of subunit model accuracy {#prot25007-sec-0021}\n\nThe subunit models used to derive the models of the oligomers were either generated by the CAPRI groups themselves (those with more homology modeling expertise), or borrowed from amongst the models submitted by CASP servers, which were made available to CAPRI groups in time for each docking experiment. The subunit structures in models submitted by CAPRI and CASP groups for all 25 targets of Round 30 were assessed using the standard CASP GDT_TS and LGA_S scores, as well as the backbone rmsd of the submitted model versus the target structures. The values of these measures obtained for models submitted by all participants in Round 30 for each target can be found at the CAPRI website together with the assessment results for this Round.\n\nTo gauge the relation between the accuracy of subunit models and the docking prediction performance, the LGA_S scores of subunit models in the predicted complexes for the 25 targets in this Round are plotted in Figure 7 as a function of the *I\u2010rms* value. The LGA_S measure was used because it does not depend on the residue numbering along the chain, which may vary at least in a fraction of the models submitted by CAPRI participants. The *I\u2010rms* measure was used as it represents best the accuracy level of the predicted interface.\n\nEach point in Figure 7 represents one submitted model, and points are colored according to the quality of the predicted complex (incorrect, acceptable and medium quality). 
The plot clearly shows that medium quality predicted complexes (*I\u2010rms* values between 1 and 3 \u00c5) tend to be associated with high accuracy subunit models (LGA_S values \\>80). We also see that predicted complexes of acceptable quality (*I\u2010rms* values of 2--4 \u00c5) are associated with subunit models that span a wide range in accuracy levels (LGA_S between 30 and 90). This range is comparable to the subunit accuracy range associated with incorrect models of complexes (*I\u2010rms* \\>4 \u00c5; see Table [3](#prot25007-tbl-0003){ref-type=\"table-wrap\"} for details on how *I\u2010rms* contributes to ranking CAPRI models). Identical trends are observed when plotting the GDT_TS scores as a function of the *I\u2010rms* values for the fraction of the models with correct residue numbering (Supporting Information Fig. S2).\n\nThat both accurate and inaccurate subunit models are associated with incorrectly modeled complexes is expected. Inaccurate subunit models may indeed prevent the identification of the correct binding mode, and docking calculations may fail to identify the correct binding mode even when the subunit models are sufficiently accurate. It is however noteworthy that complexes classified as incorrect by the CAPRI criteria do not necessarily represent prediction noise, as a recent analysis has shown that residues that contribute to the interaction interfaces are correctly predicted in a significant fraction of these complexes.[48](#prot25007-bib-0048){ref-type=\"ref\"}\n\nSomewhat less expected is the observation (Fig. [7](#prot25007-fig-0007){ref-type=\"fig\"}) that in a significant number of cases, acceptable and to a smaller extent also medium quality docking solutions can be identified even with lower accuracy models of the individual subunits. 
This is an encouraging observation, as it suggests that docking calculations can lead to useful solutions with protein models built by homology, and that these models need not always be of the highest accuracy. What probably matters more for the success of docking predictions is the accuracy with which the binding regions of the individual components of the complex are modeled, rather than the accuracy of the 3D model considered in its entirety.\n\n![Subunit model accuracy and the quality of predicted complexes in CAPRI Round 30. The CASP LGA_S scores of subunit models in the predicted complexes for the 25 targets in this Round (vertical axis) are plotted as a function of the *I\u2010rms* values (horizontal axis). Each point in this Figure represents one submitted model, and points are colored according to the quality of the predicted complex, respectively, incorrect (yellow), acceptable (blue) and medium (green) quality (see Table [1](#prot25007-tbl-0001){ref-type=\"table-wrap\"} and the text for details).](PROT-84-323-g007){#prot25007-fig-0007}\n\n### Round 30 predictions versus standard homology modeling {#prot25007-sec-0022}\n\nTo estimate the extent to which docking methods or template\u2010based modeling procedures confer an advantage over straightforward homology modeling, the accuracy of the submitted oligomer models for each target was compared to the accuracy of the models built using the best oligomer templates for that target available in the PDB at the time of the prediction. Only dimer targets (and templates) were considered, given the uncertainty of the oligomeric state assignments for some of the tetrameric targets.\n\nThree categories of the best dimeric templates were considered (see Assessment Procedures and Criteria): templates identified on the basis of sequence alignments alone, templates identified by structurally aligning the target and template monomers, and templates identified by structurally aligning the target and template oligomers. 
Only the sequence\u2010based template selection reproduces the task performed by predictors, to whom only the target sequence was disclosed at the time of the prediction. The resulting templates thus represent the best templates available to predictors during the prediction Round. Obviously, the structurally most similar templates could not be identified by predictors, but are considered here in order to evaluate the advantage, if any, conferred by such templates over those identified on the basis of sequence alignments.\n\nTable [5](#prot25007-tbl-0005){ref-type=\"table-wrap\"} lists the best templates from each category identified for all dimeric targets of Round 30 and the corresponding template\u2010target TM\u2010scores. These templates represent those with the highest TM\u2010score among the best 10 templates from each category detected for a given target. Not too surprisingly, targets with more similar templates, those featuring high TM\u2010scores (\u22650.7), are the easy targets, whereas difficult targets are those with poorer templates (lower TM\u2010scores). 
Many of the best templates from all three categories were also detected and used by predictor groups (see Supporting Information Table S5), even though these groups only had sequence information to identify them during the prediction round.\n\n###### \n\nBest available templates detected based on sequence (\"Sequence\"), experimental monomer structure (\"Monomer\"), and experimental oligomer structure (\"Oligomer\"). For each category, the best template TM\u2010score is given with the detected template in parentheses.\n\n| Target | Target released | Database released | Sequence | Monomer | Oligomer |\n|---|---|---|---|---|---|\n| T68 | May 01, 2014 | April 24, 2014 | 0.348 (3njd) | 0.370 (3fse) | 0.370 (3fse) |\n| T69 | May 05, 2014 | April 24, 2014 | 0.852 (1qlw) | 0.852 (1qlw) | 0.852 (1qlw) |\n| T70 | May 06, 2014 | April 24, 2014 | 0.639 (2f06) | 0.644 (3c1m) | 0.652 (3tvi) |\n| T71 | May 07, 2014 | April 24, 2014 | 0.509 (2id5) | 0.618 (3jur) | 0.618 (3jur) |\n| T72 | May 08, 2014 | April 24, 2014 | 0.510 (3otn) | 0.510 (3otn) | 0.510 (3otn) |\n| T73 | May 09, 2014 | April 24, 2014 | [a](#prot25007-note-0008){ref-type=\"fn\"} | 0.554 (1hql) | 0.554 (1hql) |\n| T74 | May 12, 2014 | April 24, 2014 | 0.340 (4jrf) | 0.340 (4jrf) | 0.340 (4jrf) |\n| T75 | May 13, 2014 | April 24, 2014 | 0.880 (3rjt) | 0.880 (3rjt) | 0.880 (3rjt) |\n| T77 | May 15, 2014 | April 24, 2014 | 0.393 (2xwx) | 0.375 (4iib) | 0.375 (4iib) |\n| T78 | May 20, 2014 | May 17, 2014 | 0.315 (3c6c) | 0.370 (1o0s) | 0.403 (2f3o) |\n| T79 | May 23, 2014 | May 17, 2014 | 0.440 (2bnl) | 0.469 (2xig) | 0.471 (2w57) |\n| T80 | June 02, 2014 | May 17, 2014 | 0.938 (1mdo) | 0.943 (2fnu) | 0.943 (2fnu) |\n| T82 | June 04, 2014 | May 17, 2014 | 0.846 (4dn2) | 0.846 (4dn2) | 0.846 (4dn2) |\n| T84 | June 09, 2014 | May 17, 2014 | 0.939 (2btm) | 0.941 (1b9b) | 0.941 (1b9b) |\n| T85 | June 10, 2014 | May 17, 2014 | 0.889 (3ggo) | 0.889 (3ggo) | 0.889 (3ggo) |\n| T86 | June 11, 2014 | May 17, 2014 | 0.459 (4h3u) | 0.467 (3gzr) | 0.470 (3hk4) |\n| T87 | June 13, 2014 | May 17, 2014 | 0.922 (3get) | 0.922 (3get) | 0.922 (3get) |\n| T90 | July 03, 2014 | June 06, 2014 | 0.921 (4qgr) | 0.927 (2oga) | 0.927 (2oga) |\n| T91 | July 08, 2014 | June 06, 2014 | 0.750 (4gel) | 0.750 (4gel) | 0.808 (3hsi) |\n| T92 | July 09, 2014 | June 06, 2014 | 0.785 (1tu7) | 0.837 (3h1n) | 0.837 (3h1n) |\n| T93 | July 10, 2014 | June 06, 2014 | 0.896 (4a7p) | 0.896 (4a7p) | 0.896 (4a7p) |\n| T94 | July 11, 2014 | June 06, 2014 | 0.655 (3gff) | 0.655 (3gff) | 0.655 (3gff) |\n\nTM\u2010score of the templates that have the highest TM\u2010score among top 10 selected templates for each target and the PDB IDs of the templates are listed.\n\nNo protein with the desired oligomer state was found among the top 100 HHsearch entries.\n\nThe accuracy levels of the models built using the three categories of best templates for each target and the best models from each of the participating CAPRI predictor groups submitted for the same target are plotted in Figure [8](#prot25007-fig-0008){ref-type=\"fig\"}. The model accuracy is measured by the *I\u2010rms* value, representing the accuracy level of the predicted interface in the complex. Each entry in the Figure represents one model, and for each template category (based on sequence alignments, on structural alignment of the monomers and dimers, respectively), up to 10 best models are shown per target and colored according to the template category.\n\n![Accuracy of Round 30 homodimer models predicted by protein docking methods and template\u2010based modeling versus models derived by standard homology modeling. The *I\u2010rms* values, representing the accuracy level of the predicted interface, are plotted (vertical axis) for different models for each target (listed on the horizontal axis using the CAPRI target identification). Each point represents one model. The best models submitted by individual CAPRI predictor groups are represented by green triangles. The remaining models are those built in this study by standard homology modeling techniques[42](#prot25007-bib-0042){ref-type=\"ref\"} on the basis of homodimer templates from the PDB. Up to 10 best models are shown per target and template category (see text). 
These comprise models based on templates identified using sequence information (black triangles), models based on structural alignments of individual monomers (red lozenges), and models based on structural alignments of the entire dimers (blue triangles). The targets (only dimers) are subdivided into easy and difficult targets (see text). Dashed horizontal lines represent *I\u2010rms* values delimiting models of high, medium, acceptable and lower (incorrect) quality by CAPRI criteria.](PROT-84-323-g008){#prot25007-fig-0008}\n\nInspection of Figure 8 indicates that models submitted by CAPRI predictor groups, a vast majority of which employed docking methods as part of their protocol, tend to be of higher accuracy than the models built from templates. For most of the easy targets, the 10 models submitted by CAPRI groups more consistently display lower *I\u2010rms* values than the models built from the best templates. This is the case not only for models derived from the sequence\u2010based templates but also for the most structurally similar templates of the monomer or dimer categories. Considering only the best models for each target, the performance results are more balanced. For seven out of the 12 easy targets the best models overall were submitted by CAPRI participants, whereas for the remaining five targets the most accurate models were those derived from the structurally most similar template. Overall, however, acceptable or medium quality models were obtained with all the approaches and for nearly all the easy targets.\n\nOn the other hand, it is remarkable that for three of the difficult targets (T72, T79, and T86), the docking procedures were able to produce acceptable models, with one medium quality model for T79, whereas all the template\u2010based models were incorrect.\n\nOverall, these results do confirm that protein docking procedures represent an added value over straightforward template\u2010based modeling. 
One must recall, however, that docking was often combined with template\u2010based restraints and hence cannot in general be qualified as *ab\u2010initio* docking in the context of this experiment. It is also important to note that for two targets, T82 and T85, the highest accuracy models were predicted by the group of Seok, who employed specialized template\u2010based modeling techniques augmented by loop modeling and refinement. But the accuracy of these models was not vastly superior to that of the best docking models.\n\nLastly, not too surprisingly, oligomer models built using the sequence\u2010based best templates were generally less accurate than models built from templates of the two other categories. Interestingly, models derived from the most structurally similar dimer templates were not generally more accurate than those derived from the structurally most similar monomers. This may stem from differences in the structural alignments that were used to detect the templates, which in turn could have affected the performance of the homology modeling procedure (MODELLER).\n\nCONCLUDING REMARKS {#prot25007-sec-0023}\n==================\n\nCAPRI Round 30, for which results were assessed here, was the first CASP\u2010CAPRI experiment, which brought together the community of groups developing methods for protein structure prediction and model refinement, with groups developing methods for predicting the 3D structure of protein assemblies. The 25 targets of this round represented a subset of the targets submitted for the CASP11 prediction season of the summer of 2014. In line with the main focus of CASP, the majority of these targets were single protein chains, forming mostly homodimers, and a few homotetramers. Only two of the targets were heterodimers, similar to the staple targets in previous CAPRI rounds. Unlike in most previous CAPRI rounds, both subunit structures and their association modes had to be modeled for all the targets. 
Since the docking or assembly modeling performance may crucially depend on the accuracy of the models of individual subunits, the targets chosen for this experiment were proteins deemed to be readily modeled using templates from the PDB. Interestingly, templates were used mainly to model the structures of individual subunits, to limit the sampling space of docking solutions, or to filter these solutions. Only a few groups carried out template\u2010based docking for the majority of the targets, and two of those ranked amongst the top performers, indicating that this relatively recent modeling strategy has potential.\n\nAs part of our assessment we established that the accuracy of the models of the individual subunits was an important factor contributing to high accuracy predictions of the corresponding complexes. At the same time we observed that highly accurate models of the protein components are not necessarily required for identifying their association modes with acceptable accuracy.\n\nFurthermore, we provide evidence that protein docking procedures and in some cases also specialized template\u2010based methods generally outperform off\u2010the\u2010shelf template\u2010based prediction of complexes. These findings apply to templates identified on the basis of sequence information alone, as well as to templates structurally more similar to the target. The added value of docking methods was particularly significant for the more difficult targets, where the structures of the identified best templates differed more significantly from the target structure.\n\nThus, the assessment results presented here confirm that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is feasible, especially for stable dimers that feature interface areas of 1000--1500 \u00c5^2^, a size comparable to or larger than that associated with transient heterocomplexes. 
They also confirm that docking procedures can offer a competitive advantage over standard homology modeling techniques, when those are applied without further improvements to model the complex.\n\nOn the other hand, difficulties arise when the subunit interface in the target is similar in size to those associated with crystal contacts.[45](#prot25007-bib-0045){ref-type=\"ref\"} Such cases were associated with a number of targets where the oligomeric state assignment was ambiguous or inaccurate. These ambiguous or inaccurate oligomeric state assignments represented a confounding factor for the docking prediction in this round. The problem arose mainly from the fact that the authors\\' assignments, usually based on independent experimental evidence, were not available to predictors at the time of the prediction experiment. Instead, predictors were provided with tentative assignments, inferred on the basis of computational analysis of the crystal contacts. Quite encouragingly, for most targets with ambiguous assignment, or for which the tentative assignments were later overruled by the authors upon submission to the PDB, the docking predictions were shown to provide useful information, which often confirmed the final assignment or helped resolve ambiguous ones. This occurred for both homodimer and homotetramer targets.\n\nLastly, we find that the docking prediction performance for the genuine homodimer targets was superior to that obtained for heterocomplexes in previous CAPRI rounds, in line with the expectation that, owing to their higher binding affinity (and larger and more hydrophobic interfaces), homodimers are easier to predict than heterodimers. Much poorer prediction performance was, however, achieved for genuine tetrameric targets, where the inaccuracy of the homology\u2010built subunit models and the smaller pair\u2010wise interfaces limited the prediction performance. 
Accurate modeling of higher order assemblies from sequence information is thus an area where progress is needed.\n\nSupporting information\n======================\n\n###### \n\nSupporting Information\n\nWe are most grateful to the PDBe at the European Bioinformatics Institute in Hinxton, UK, for hosting the CAPRI website. Our deepest thanks go to all the structural biologists and to the following structural genomics initiatives: Northeast Structural Genomics Consortium, Joint Center for Structural Genomics, NatPro PSI:Biology, New York Structural Genomics Research Center, Midwest Center for Structural Genomics, Structural Genomics Consortium, for contributing the targets for this joint CASP\u2010CAPRI experiment. MFL acknowledges support from the FRABio FR3688 Research Federation \"Structural & Functional Biochemistry of Biomolecular Assemblies.\"\n"], ["Introduction {#S1}\n============\n\nDue to their direct life cycle, morphological adaptation, and high host specificity, gill monogeneans of fish are commonly studied parasites in the context of coevolution and biogeography of host-parasite systems \\[[@R37], [@R58], [@R60]\\]. The reconstruction of the evolutionary history of parasites and the investigation of their origin are the first steps in coevolutionary studies. However, despite the enormous diversity of both freshwater fish and their monogenean fauna (e.g., \\[[@R1], [@R8], [@R14], [@R46]\\]), coevolutionary studies of fish and their monogenean parasites from the Neotropical Region are still limited.\n\n*Anacanthorus* Mizelle and Price, 1965 is one of the most diverse monogenean genera living on fish in the Neotropical Region. Of the 15 genera parasitizing serrasalmids, *Anacanthorus* currently comprises 75 nominal species, which are distributed on species of Bryconidae (15 species), Characidae (22), and Serrasalmidae (38) \\[[@R9], [@R33], [@R42]\\]. 
However, undescribed species of *Anacanthorus* have also recently been recorded on species of Erythrinidae \\[[@R19], [@R20]\\]. *Anacanthorus* belongs to Anacanthorinae Price, 1967, which is restricted to the Neotropical Region, and at present this group accommodates only *Anacanthorus* and *Anacanthoroides* Kritsky & Thatcher, 1974, the latter being represented by only two species recorded on the Prochilodontidae.\n\nThe freshwater fish of the Serrasalmidae, representing the most common host group for *Anacanthorus*, include piranhas, pacus, and their relatives, and currently comprise 98 valid species distributed throughout South America \\[[@R16]\\]. Several species of this fish group are economically important for commercial fishing and aquaculture, especially in the Amazon region \\[[@R3], [@R25], [@R36]\\]. Many phylogenetic studies based on different molecular markers (e.g., mtDNA control region, 12S and 16S rRNA) have suggested that the Serrasalmidae form three major phylogenetic lineages, i.e., the \"pacu\" lineage (including *Colossoma*, *Mylossoma* and *Piaractus*), the \"*Myleus*-like pacus\" lineage (including *Mylesinus*, *Myleus*, *Ossubtus* and *Tometes*), and the \"true piranhas\" lineage (including *Catoprion*, *Metynnis*, *Pristobrycon*, *Pygocentrus*, *Pygopristis* and *Serrasalmus*) \\[[@R47], [@R48], [@R64]\\]. Serrasalmid fish host an enormous diversity of monogeneans. So far, 92 monogenean species belonging to 15 genera have been recorded on these fish. Most of these records originated from Brazil during the 1990s, when at least 8 genera and 61 species of monogeneans were described from piranhas and their relatives \\[[@R8]\\].\n\nAccording to morphological analyses carried out by Kritsky and Boeger \\[[@R28]\\], the Anacanthorinae seem to represent a monophyletic group within the Dactylogyridae. 
Van Every and Kritsky \\[[@R65]\\] used the morphological characters of the haptoral hooks and reproductive organs to infer phylogenetic relationships between species of *Anacanthorus* from the \"true piranhas\" from the central Amazon. They suggested that this host-parasite system is a suitable model for studying biogeography and coevolution in the neotropics, although there are still many gaps in our knowledge concerning their diversity and phylogeny (i.e., the phylogenetic position of *Anacanthorus* within the Dactylogyridae and interspecific relationships within the genus).\n\nUsing the complete SSU (18S rDNA), M\u00fcller et al. \\[[@R45]\\] performed a study on *Anacanthorus penilabiatus* Boeger, Husak & Martins 1995 \\[[@R6]\\] and *Mymarothecium viatorum* Boeger, Piasecki & Sobecka, 2002 \\[[@R7]\\] (Ancyrocephalinae), both parasites of the pacu *Piaractus mesopotamicus* (Holmberg, 1887), focusing on the phylogenetic position of these monogeneans within the Dactylogyridae. Recently, Gra\u00e7a et al. \\[[@R20]\\] investigated the coevolutionary processes between selected species of *Anacanthorus* and their hosts in southern Brazil, and identified host-parasite cospeciation at the level of host families (Serrasalmidae, Bryconidae and Erythrinidae) and their specific *Anacanthorus* spp.\n\nConsidering the richness of *Anacanthorus* (the highest of all genera parasitizing Characiformes in the neotropics), the high host specificity exhibited by *Anacanthorus* species, and the scarcity of phylogenetic studies focused on these dactylogyrids, the aim of this study was to investigate the phylogenetic position of *Anacanthorus* spp. 
within the Dactylogyridae that infest serrasalmids from two Brazilian river basins based on the analysis of partial 28S rDNA sequences.\n\nMaterials and methods {#S2}\n=====================\n\nSpecimen collection and processing {#S3}\n----------------------------------\n\nFish were caught by local fishermen with gill nets and hooks from the following localities in Brazil: the Miranda River (20\u00b011\u203227\u2033S; 56\u00b030\u203219\u2033W), the Negro River (Mato Grosso do Sul) (19\u00b034\u203240\u2033S; 56\u00b009\u203208\u2033W), the Upper Paran\u00e1 River (20\u00b045\u2032S; 53\u00b016\u2032W), and the Xingu River (3\u00b012\u2032S, 52\u00b012\u2032W) (see [Table 1](#T1){ref-type=\"table\"}). Fish were examined for monogeneans immediately after capture. All experimental handling was carried out in compliance with animal safety and ethics rules issued by the Federal Rural University of Rio de Janeiro (UFRRJ). Gills excised from fish were placed in Petri dishes with tap water and examined for monogeneans using a dissecting microscope. Parasites were placed individually in a drop of water on a slide and the haptor of each specimen was excised from the body and preserved in absolute ethanol for molecular analyses. The rest of the body was mounted in a mixture of glycerine and ammonium picrate (GAP) and kept as a molecular voucher. Additionally, some entire specimens were mounted in GAP and kept as paragenophore specimens (see Astrin et al. \\[[@R4]\\] for terminology). Species determinations were mainly based on the morphology of the male copulatory organ and of the haptoral hooks following the original descriptions by Boeger and Kritsky \\[[@R5]\\], Van Every and Kritsky \\[[@R65]\\], and Boeger et al. \\[[@R6]\\]. After morphological evaluation, specimens fixed in GAP were remounted in Canada balsam according to the procedure described by Ergens \\[[@R13]\\]. 
Voucher specimens were deposited in the Helminthological Collection of the Institute Oswaldo Cruz (CHIOC), Rio de Janeiro, Brazil, under the catalogue numbers 40046 a--b and 40047 and in the Helminthological Collection of the Institute of Parasitology of the Czech Academy of Sciences, (IPCAS), Czech Republic, under the catalogue numbers M-702 -- M-710.\n\nTable 1Species included in the phylogenetic analyses.Parasite speciesHost speciesHost familyLocalityAccession numberDactylogyridea\u2003Dactylogyridae\u2003\u2003*Actinocleidus recurvatus* Mizelle and Donahue, 1944 \\[[@R41]\\]*Lepomis gibbosus* (Linnaeus)CentrarchidaeRiver Dunaj, SR[AJ969951](http://www.ncbi.nlm.nih.gov/BLAST/AJ969951)\u2003\u2003*Aliatrema cribbi* Plaisance & Kritsky, 2004 \\[[@R50]\\][\\*\\*](#TFN2){ref-type=\"table-fn\"}*Chaetodon citrinellus* (Cuvier, 1831)ChaetodontidaeFrench Polynesia[AY820612](http://www.ncbi.nlm.nih.gov/BLAST/AY820612)\u2003\u2003*Ameloblastella chavarriai* (Price, 1938) \\[[@R53]\\]*Rhamdia quelen* (Quoy & Gaimard, 1824)HeptapteridaeLake Catemaco, MX[KP056251](http://www.ncbi.nlm.nih.gov/BLAST/KP056251)\u2003\u2003*Ameloblastella* sp. 16 (from Mendoza-Palmero et al. 
\\[[@R39]\\])*Hypophthalmus edentatus* Spix & Agassiz, 1829HypophtalmidaeRiver Nanay, PE[KP056255](http://www.ncbi.nlm.nih.gov/BLAST/KP056255)\u2003\u2003*Ancyrocephalus paradoxus* Creplin, 1839 \\[[@R10]\\]*Sander lucioperca* (Linnaeus)PercidaeRiver Morava, CR[AJ969952](http://www.ncbi.nlm.nih.gov/BLAST/AJ969952)\u2003\u2003*Ancyrocephalus percae* (Ergens, 1966) \\[[@R12]\\]*Perca fluviatilis* (Linnaeus)PercidaeLake Constance, GE[KF499080](http://www.ncbi.nlm.nih.gov/BLAST/KF499080)\u2003\u2003***Anacanthorus amazonicus* Van Every & Kritsky, 1992 \\[** [@R65] **\\]***Serrasalmus maculatus* Kner, 1858SerrasalmidaeRiver Negro, BR[MH843721](http://www.ncbi.nlm.nih.gov/BLAST/MH843721)\u2003\u2003***Anacanthorus jegui* Van Every & Kritsky, 1992 \\[** [@R65] **\\]***Serrasalmus maculatus* Kner, 1858SerrasalmidaeRiver Negro, BR[MH843720](http://www.ncbi.nlm.nih.gov/BLAST/MH843720)\u2003\u2003***Anacanthorus lepyrophallus* Kritsky, Boeger, and Van Every, 1992 \\[** [@R29] **\\]***Serrasalmus maculatus* Kner, 1858SerrasalmidaeRiver Baia, BR[MH843718](http://www.ncbi.nlm.nih.gov/BLAST/MH843718)\u2003\u2003***Anacanthorus maltai* Boeger & Kritsky, 1988 \\[** [@R5] **\\]***Pygocentrus nattereri* Kner, 1858SerrasalmidaeRiver Miranda, BR[MH843716](http://www.ncbi.nlm.nih.gov/BLAST/MH843716)\u2003\u2003***Anacanthorus paraxaniophallus* Moreira, Carneiro, Ruz & Luque, 2019 \\[** [@R42] **\\]***Serrasalmus marginatus* Valenciennes, 1837SerrasalmidaeRiver Paran\u00e1, BR[MH843717](http://www.ncbi.nlm.nih.gov/BLAST/MH843717)\u2003\u2003***Anacanthorus penilabiatus* Boeger, Husak & Martins, 1995 \\[** [@R6] **\\]***Piaractus mesopotamicus* (Holmberg, 1887)SerrasalmidaeRiver Paran\u00e1, BR[MH843719](http://www.ncbi.nlm.nih.gov/BLAST/MH843719)\u2003\u2003***Anacanthorus rondonensis* Boeger & Kritsky, 1988 \\[** [@R5] **\\]***Pygocentrus nattereri* Kner, 1858SerrasalmidaeRiver Miranda, BR[MH843714](http://www.ncbi.nlm.nih.gov/BLAST/MH843714)\u2003\u2003***Anacanthorus thatcheri* Boeger 
& Kritsky, 1988\\[** [@R5] **\\]***Pygocentrus nattereri* Kner, 1858SerrasalmidaeRiver Miranda, BR[MH843715](http://www.ncbi.nlm.nih.gov/BLAST/MH843715)\u2003\u2003***Anacanthorus* sp. 1***Myleus setiger* M\u00fcller & Troschel, 1844SerrasalmidaeRiver Xingu, BR[MH843722](http://www.ncbi.nlm.nih.gov/BLAST/MH843722)\u2003\u2003*Bravohollisia roseta* Lim, 1995 \\[[@R34]\\]*Pomadasys maculatus* (Bloch, 1793)HaemulidaeGuangdong, CH[DQ537364](http://www.ncbi.nlm.nih.gov/BLAST/DQ537364)\u2003\u2003*Bychowskyella pseudobagri* Akhmerow, 1952 \\[[@R2]\\]*Tachysurus fulvidraco* (Richardson, 1846)BagridaeGuangdong, CH[EF100541](http://www.ncbi.nlm.nih.gov/BLAST/EF100541)\u2003\u2003*Dactylogyrus extensus* Mueller and Van Cleave, 1932 \\[[@R44]\\]*Cyprinus carpio* (Linnaeus)CyprinidaeRiver Morava, CR[AJ969944](http://www.ncbi.nlm.nih.gov/BLAST/AJ969944)\u2003\u2003*Dactylogyrus inversus* (Goto and Kikuchi, 1917) \\[[@R18]\\]*Lateolabrax japonicus* (Cuvier, 1828)LateolabracidaeCH[AY548928](http://www.ncbi.nlm.nih.gov/BLAST/AY548928)\u2003\u2003*Euryhaliotrema perezponcei* Garc\u00eda-Vargas, Fajer-\u00c1vila & Lamothe-Argumedo, 2008 \\[[@R17]\\]*Lutjanus guttatus* (Steindachner, 1869)LutjanidaeBay Cerritos, MX[HQ615996](http://www.ncbi.nlm.nih.gov/BLAST/HQ615996)\u2003\u2003*Euryhaliotrematoides annulocirrus* (Yamaguti, 1968) \\[[@R70]\\][\\*\\*](#TFN2){ref-type=\"table-fn\"}*Chaetodon vagabundus* (Linnaeus)ChaetodontidaeAUT[AY820613](http://www.ncbi.nlm.nih.gov/BLAST/AY820613)\u2003\u2003*Euryhaliotrematoides microphallus* (Yamaguti, 1968)\\[[@R70]\\][\\*\\*](#TFN2){ref-type=\"table-fn\"}*Heniochus chrysostomus* Cuvier, 1831ChaetodontidaePalau[AY820617](http://www.ncbi.nlm.nih.gov/BLAST/AY820617)\u2003\u2003*Haliotrema angelopterum* Plaisance, Bouamer & Morand, 2004 \\[[@R49]\\]*Chaetodon kleinii* Bloch, 1790ChaetodontidaePalau[AY820620](http://www.ncbi.nlm.nih.gov/BLAST/AY820620)\u2003\u2003*Haliotrema aurigae* (Yamaguti, 1968) \\[[@R70]\\]*Chaetodon auriga* Forssk\u00e5l, 
1775ChaetodontidaeAUT[AY820621](http://www.ncbi.nlm.nih.gov/BLAST/AY820621)\u2003\u2003*Haliotrematoides guttati* Garc\u00eda-Vargas, Fajer-\u00c1vila & Lamothe-Argumedo, 2008 \\[[@R17]\\]*Lutjanus guttatus* (Steindachner, 1869)LutjanidaeBay Cerritos, MX[HQ615993](http://www.ncbi.nlm.nih.gov/BLAST/HQ615993)\u2003\u2003*Haliotrematoides spinatus* Kritsky & Mendoza-Franco in Kritsky, Yang & Sun, 2009 \\[[@R32]\\]*Lutjanus guttatus* (Steindachner, 1869)LutjanidaePacific Coast, MX[KC663679](http://www.ncbi.nlm.nih.gov/BLAST/KC663679)\u2003\u2003*Ligictaluridus pricei* (Mueller, 1936) \\[[@R43]\\]*Ameiurus nebulosus* (Lesueur, 1819)IctaluridaeRiver Moldau, CR[AJ969939](http://www.ncbi.nlm.nih.gov/BLAST/AJ969939)*\u2003\u2003Mymarothecium viatorum* Boeger, Piasecki and Sobecka, 2002 \\[[@R7]\\]*Piaractus mesopotamicus* (Holmberg, 1887)SerrasalmidaeRiver Paran\u00e1, BR[MH843723](http://www.ncbi.nlm.nih.gov/BLAST/MH843723)\u2003\u2003*Onchocleidus similis* (Mueller, 1936) \\[[@R43]\\]*Lepomis gibbosus* (Linnaeus)CentrarchidaeRiver Danube, SR[AJ969938](http://www.ncbi.nlm.nih.gov/BLAST/AJ969938)*Onchocleidus* sp.*Lepomis macrochirus* Rafinesque, 1819CentrarchidaeGuangzhou, CH[AY841873](http://www.ncbi.nlm.nih.gov/BLAST/AY841873)\u2003\u2003*Parasciadicleithrum octofasciatum* Mendoza-Palmero, Blasco-Costa, Hern\u00e1ndez-Mena & P\u00e9rez-Ponce de Le\u00f3n, 2017 \\[[@R40]\\]*Rocio octofasciata* (Regan, 1903)CichlidaeUnnamed creek in Ejido Reforma Agraria, MX[KY305885](http://www.ncbi.nlm.nih.gov/BLAST/KY305885)\u2003\u2003*Pseudodactylogyrus anguillae* (Yin & Sproston, 1948) \\[[@R71]\\]*Anguilla anguilla* (Linnaeus)AnguillidaeRiver Dunaj, SR[AJ969950](http://www.ncbi.nlm.nih.gov/BLAST/AJ969950)\u2003\u2003*Pseudodactylogyrus bini* (Kikuchi, 1929) \\[[@R26]\\]*Anguilla anguilla* (Linnaeus)AnguillidaeNeusiedler Lake, AUS[AJ969949](http://www.ncbi.nlm.nih.gov/BLAST/AJ969949)\u2003\u2003*Pseudohaliotrema sphincteroporus* Yamaguti, 1953 \\[[@R69]\\]*Siganus doliatus* 
Gu\u00e9rin-M\u00e9neville, 1829SiganidaeGreen Island, AUT[AF382058](http://www.ncbi.nlm.nih.gov/BLAST/AF382058)\u2003\u2003*Quadriacanthus kobiensis* Ha Ky, 1968 \\[[@R22]\\]*Clarias batrachus* (Linnaeus)ClariidaeGuangzhou, CH[AY841874](http://www.ncbi.nlm.nih.gov/BLAST/AY841874)\u2003\u2003*Sciadicleithrum meekii* Mendoza-Franco, Scholz & Vidal-Mart\u00ednez, 1997 \\[[@R38]\\]*Thorichthys meeki* Brind, 1918CichlidaeUnnamed creek in Ejido Reforma Agraria, MX[KY305889](http://www.ncbi.nlm.nih.gov/BLAST/KY305889)\u2003\u2003*Sciadicleithrum splendidae* Kritsky, Vidal\u2010Mart\u00ednez & Rodr\u00edguez\u2010Canul, 1994 \\[[@R31]\\]*Parachromis friedrichsthalii* (Heckel, 1840)CichlidaeLaguna El Vapor, MX[KY305890](http://www.ncbi.nlm.nih.gov/BLAST/KY305890)\u2003\u2003*Tetrancistrum* sp.*Siganus fuscescens* (Houttuyn, 1782)SiganidaeHeron Island, AUT[AF026114](http://www.ncbi.nlm.nih.gov/BLAST/AF026114)\u2003\u2003*Thaparocleidus asoti* (Yamaguti, 1937) \\[[@R68]\\]*Parasilurus asotus* (Linnaeus)SiluridaeChongqing City, CH[DQ157669](http://www.ncbi.nlm.nih.gov/BLAST/DQ157669)*Thaparocleidus siluri* (Zandt, 1924) \\[[@R73]\\]*Silurus glanis* (Linnaeus)SiluridaeRiver Morava, CR[AJ969940](http://www.ncbi.nlm.nih.gov/BLAST/AJ969940)\u2003\u2003*Unibarra paranoplatensis* Suriano & Incorvaia, 1995 \\[[@R61]\\]*Aguarunichthys torosus* Stewart, 1986PimelodidaeSanta Clara, PE[KP056219](http://www.ncbi.nlm.nih.gov/BLAST/KP056219)\u2003\u2003*Vancleaveus janauacaensis* Kritsky, Thatcher and Boeger, 1986 \\[[@R30]\\]*Pterodoras granulosus* (Valenciennes, 1821)DoradidaeRiver Itaya, PE[KP056240](http://www.ncbi.nlm.nih.gov/BLAST/KP056240)\u2003Pseudomurraytrematidae\u2003\u2003*Pseudomurraytrema* sp.[\\*](#TFN1){ref-type=\"table-fn\"}*Catostomus ardens* Jordan & Gilbert, 1881CatostomidaeSnake River, Idaho[AF382059](http://www.ncbi.nlm.nih.gov/BLAST/AF382059)Tetraonchinea\u2003Anoplodiscidae\u2003\u2003*Anoplodiscus cirrusspiralis* Roubal, Armitage & Rohde, 1983 
\\[[@R56]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Sparus aurata* (Linnaeus)SparidaeSydney, AUT[AF382060](http://www.ncbi.nlm.nih.gov/BLAST/AF382060)\u2003Tetraonchidae\u2003\u2003*Tetraonchus monenteron* (Wagener, 1857) \\[[@R67]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Esox lucius* (Linnaeus)EsocidaeRiver Morava, CR[AJ969953](http://www.ncbi.nlm.nih.gov/BLAST/AJ969953)\u2003Monocotylidea\u2003\u2003*Calicotyle affinis* Scott, 1911 \\[[@R57]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Chimaera monstrosa* (Linnaeus)ChimaeridaeNorway[AF382061](http://www.ncbi.nlm.nih.gov/BLAST/AF382061)\u2003\u2003*Clemacotyle australis* Young, 1967 \\[[@R72]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Aetobatus narinari* (Euphrasen, 1790)MyliobatidaeHeron Island, AUT[AF348350](http://www.ncbi.nlm.nih.gov/BLAST/AF348350)\u2003\u2003*Decacotyle lymmae* Young, 1967 \\[[@R72]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Aetobatus narinari* (Euphrasen, 1790)MyliobatidaeHeron Island, AUT[AF348359](http://www.ncbi.nlm.nih.gov/BLAST/AF348359)\u2003*Dendromonocotyle octodiscus* Hargis, 1955 \\[[@R23]\\][\\*](#TFN1){ref-type=\"table-fn\"}*Dasyatis americana* (Hildebrand & Schroeder, 1928)DasyatidaeGulf of Mexico[AF348352](http://www.ncbi.nlm.nih.gov/BLAST/AF348352)[^1][^2]Species sequenced in this study are shown in bold.Abbreviations: AUS -- Austria, AUT -- Australia, BR -- Brazil, CH -- China, CR -- Czech Republic, GE -- Germany, MX -- Mexico, PE -- Peru, SR -- Slovak Republic.\n\nDNA extraction, amplification, and sequencing {#S4}\n---------------------------------------------\n\nDNA extraction was carried out in 200\u00a0\u03bcl of a 5% suspension of Chelex\u2122 in deionized water containing 2\u00a0\u03bcl proteinase K, followed by incubation at 56\u00a0\u00b0C for 3\u00a0h and boiling at 95\u00a0\u00b0C for 8\u00a0min. The partial 28S rRNA gene region (D1--D3) was amplified using primers C1 and D2 \\[[@R24]\\] or U178 and L1642 \\[[@R35]\\]. 
For the C1 and D2 primers, PCR reactions were performed in a final volume of 15\u00a0\u03bcl containing 1\u00a0\u00d7\u00a0PCR buffer, 1.5\u00a0mM of MgCl~2~, 0.2\u00a0mM of dNTPs, 0.5\u00a0mM of each oligonucleotide primer, 1\u00a0U of Taq DNA polymerase (Fermentas), 6.6\u00a0mg/ml of BSA, and 5\u00a0\u03bcl of genomic DNA, using the following cycling parameters: denaturation at 94\u00a0\u00b0C for 2\u00a0min, followed by 39 cycles of 94\u00a0\u00b0C for 20\u00a0s, annealing at 58\u00a0\u00b0C for 30\u00a0s, and elongation at 72\u00a0\u00b0C for 1\u00a0min 30\u00a0s, with a final elongation at 72\u00a0\u00b0C for 10\u00a0min. For the second pair of primers, PCR reactions were performed in a final volume of 25\u00a0\u03bcl containing 1\u00a0\u00d7\u00a0PCR buffer, 3\u00a0mM of MgCl~2~, 0.2\u00a0mM of dNTP's, 0.5\u00a0mM of each oligonucleotide primer, 1\u00a0U of Platinum Taq DNA polymerase (Invitrogen), 0.4\u00a0mg/ml of BSA, and 2.5\u00a0\u03bcl of genomic DNA, using the cycling profile described in Mendoza-Palmero et al. \\[[@R39]\\]. The PCR products were checked on 1% agarose gel and purified using an ExoSAP-IT kit (Ecoli, Bratislava, Slovakia), following the manufacturer's instructions. Purified products were directly sequenced using PCR primer pair C1--D2 or U178--L1642 and two additional internal primers (1200F and 1200R, see Lockyer et al. \\[[@R35]\\]) with a BigDye Terminator Cycle Sequencing kit (Applied Biosystems, Foster City, CA, USA). Sequencing was performed on an ABI 3130 Genetic Analyzer (Applied Biosystems).\n\nContiguous sequences were assembled in Geneious (Geneious ver. 
9, created by Biomatters) and deposited in the GenBank database under the accession numbers listed in [Table 1](#T1){ref-type=\"table\"}.\n\nPhylogenetic analyses {#S5}\n---------------------\n\nNine species of *Anacanthorus* and *Mymarothecium viatorum* (host species are shown in [Table 1](#T1){ref-type=\"table\"}) were sequenced for the partial 28S rRNA gene and aligned with 35 species belonging to the Dactylogyridea and four species of the Monocotylidea retrieved from GenBank (see [Table 1](#T1){ref-type=\"table\"}). Sequences were aligned with the CLUSTAL W algorithm \\[[@R63]\\] implemented in Geneious. Ambiguously aligned regions were removed from the alignment with GBlocks v. 0.91 \\[[@R62]\\], using less stringent selection. Phylogenetic analyses were performed using species of Monocotylidae, Tetraonchidae, Anoplodiscidae, and Pseudomurraytrematidae as outgroups (see [Table 1](#T1){ref-type=\"table\"} for species). The substitution model TVM\u00a0+\u00a0I\u00a0+\u00a0G (the transversion model including the proportion of invariable sites and a gamma distribution), selected by jModelTest \\[[@R52]\\] using the Bayesian information criterion, was used for Maximum Likelihood (ML) and Bayesian Inference (BI) analyses. The search for the ML tree and bootstrap resampling with 1000 replicates were performed using PHYML \\[[@R21]\\] implemented in Geneious. BI analyses were performed using MrBayes v. 3.2 \\[[@R55]\\], running four Monte Carlo Markov chains for 10^7^ generations, with trees sampled every 10^3^ generations and the first 1000 samples discarded as \"burn in\". To check convergence and to confirm that the effective sample size (ESS) of each parameter was adequate for providing reasonable estimates of the variance in model parameters (i.e., ESS values \\>200), Tracer v. 
1.6 \\[[@R54]\\] was used.\n\nResults {#S6}\n=======\n\nNew partial 28S rDNA sequences were obtained for nine species of *Anacanthorus* (*Anacanthorus amazonicus* Van Every & Kritsky, 1992 \\[[@R65]\\], *Anacanthorus jegui* Van Every & Kritsky, 1992 \\[[@R65]\\], *Anacanthorus lepyrophallus* Kritsky, Boeger, and Van Every, 1992 \\[[@R29]\\], *Anacanthorus maltai* Boeger & Kritsky, 1988 \\[[@R5]\\], *Anacanthorus paraxaniophallus* Moreira, Carneiro, Ruz & Luque, 2019 \\[[@R42]\\], *Anacanthorus penilabiatus* Boeger, Husak & Martins, 1995 \\[[@R6]\\], *Anacanthorus rondonensis* Boeger & Kritsky, 1988 \\[[@R5]\\], *Anacanthorus thatcheri* Boeger & Kritsky, 1988 \\[[@R5]\\] and *Anacanthorus* sp. 1) and *Mymarothecium viatorum*, and varied from 612\u00a0bp to 716\u00a0bp (when using the C1 and D2 primers) and from 1425\u00a0bp to 1434\u00a0bp (when using the U178 and L1642 primers). Specimens identified as *Anacanthorus* sp. 1 represented an undescribed species parasitizing *Myleus setiger*. An unambiguous alignment of all analyzed species of the Dactylogyridea and Monocotylidea spanned 391 positions and included 205 parsimony-informative characters, 227 variable characters, and 164 conserved characters. ML and BI analyses generated phylogenetic trees with similar general topology and the monophyly of *Anacanthorus* was strongly supported by both analyses ([Fig. 1](#F1){ref-type=\"fig\"}). The Anacanthorinae, represented only by *Anacanthorus* spp. in this study, appeared to form a monophyletic group clustering with clade A comprising solely freshwater species of Ancyrocephalinae and the clade of *Ancylodiscoidinae* spp. *Anacanthorus penilabiatus* showed the basal position within the clade of *Anacanthorus* spp. Even though the ML and BI phylogenetic trees displayed the same topology, the status of *Anacanthorus* as a sister group to clade A of freshwater Ancyrocephalinae was only weakly supported by ML analysis. 
*Mymarothecium viatorum*, another host-specific monogenean parasitizing serrasalmids, was positioned within clade A of the Ancyrocephalinae.\n\nFigure 1. Consensus Bayesian topology from the phylogenetic analysis of partial 28S rDNA of 49 species of monogeneans. BI posterior probabilities and ML bootstrap values are shown at the nodes. Posterior probabilities \\<0.7 are not reported. Bootstrap values \\<50 are not reported. Species sequenced in the present study are shown in bold.\n\nClade B, including representatives of both freshwater and marine species of the Ancyrocephalinae, was well supported by BI analysis and weakly supported by ML analysis. The Dactylogyrinae formed a monophyletic sister group to Pseudodactylogyrinae (only weakly supported) and clustered with clade B of Ancyrocephalinae. Within clade B of Ancyrocephalinae, five marine species formed a well-supported group (clade C) on the basis of both analyses.\n\nWithin the Anacanthorinae (i.e., *Anacanthorus*), the phylogenetic relationships among species seemed to reflect the phylogeny of their serrasalmid hosts. *Anacanthorus penilabiatus* from *Piaractus mesopotamicus*, a member of the \"pacu\" clade, occupied the basal position within *Anacanthorus*; the next position was held by *Anacanthorus* sp. 1 from *Myleus setiger* (this position was weakly or moderately supported by BI and ML analyses, respectively), a representative of the \"*Myleus*-like pacus\" clade. Finally, the large group of *Anacanthorus* included two clades of species from hosts representing the \"true piranhas\" lineage, the first one well supported and including *A. lepyrophallus* and *A. amazonicus*, the second one including *A. paraxaniophallus*, *A. jegui*, *A. thatcheri* and *A. maltai*. *A. 
rondonensis* from *Pygocentrus nattereri*, a representative of the \"true piranhas\" lineage, occupied the basal position in this large *Anacanthorus* group.\n\nDiscussion {#S7}\n==========\n\nIn the present study, and for the first time, the phylogenetic position of *Anacanthorus* within the Dactylogyridae was evaluated on the basis of analyses of partial 28S rDNA sequences. Representatives of five subfamilies within the Dactylogyridae, i.e., Ancyrocephalinae, Ancylodiscoidinae, Anacanthorinae, Dactylogyrinae, and Pseudodactylogyrinae, were included in the analyses. Using molecular data, we confirmed the monophyly of the Anacanthorinae (here represented only by *Anacanthorus*), in accordance with previous studies based on morphological characters \\[[@R28], [@R65]\\]. We did not include any member of *Anacanthoroides* in the phylogenetic analyses.\n\nOur results show that the phylogenetic relationships among *Anacanthorus* spp. correspond to those among their serrasalmid hosts. Ort\u00ed et al. \\[[@R47]\\] inferred the first molecular phylogeny of Serrasalmidae using mtDNA (12S and 16S rRNA) markers, and found three major lineages: (i) a clade including the \"pacus\" in the most basal position, followed by (ii) the clade including \"*Myleus*-like pacus\" species, and (iii) a clade including the most diverse group of Serrasalmidae, represented by the \"true piranhas\". They also determined the placement of *Acnodon* as a sister group to the last two lineages and suggested the paraphyly of some genera, i.e., *Myleus*, *Pristobrycon* and *Serrasalmus*. Later, Ort\u00ed et al. \\[[@R48]\\] performed analyses based on complete sequences of the mtDNA control region (D-loop) and on partial sequences of 12S and 16S rRNA, and their findings corroborated the previous division into three main lineages; they also suggested that other serrasalmid genera are not monophyletic. More recently, Thompson et al. 
\\[[@R64]\\] performed a robust phylogenetic analysis based on the sequences of 10 nuclear genes (two exons and eight introns) and the mtDNA control region. Their results agreed with previous studies on the phylogenetic relationships within serrasalmids based on mtDNA and confirmed that there are still many gaps to fill with regard to the taxonomy of this fish group.\n\nOur results may suggest that cospeciation processes played a role between *Anacanthorus* spp. and their serrasalmid hosts (at least at the level of three serrasalmid lineages). Recently, Gra\u00e7a et al. \\[[@R20]\\] suggested that there is cospeciation between *Anacanthorus* and their host lineages representing different families (Serrasalmidae, Bryconidae and Erythrinidae), even though duplications were the most frequent coevolutionary event in the speciation of *Anacanthorus* parasitizing species of the same family. In fact, cospeciation between monogeneans and their hosts was not found to be significant in some extensively studied groups such as *Lamellodiscus* \\[[@R11]\\], *Gyrodactylus* \\[[@R74]\\], *Dactylogyrus* \\[[@R58], [@R59]\\] and *Cichlidogyrus* \\[[@R37]\\].\n\nThe phylogenetic relationships among *Anacanthorus* spp. also seem to reflect the similarity in the morphology of the copulatory complex. Although we did not analyze all species previously morphologically evaluated by Van Every and Kritsky \\[[@R65]\\], their phylogenetic reconstruction using the morphology of copulatory complex is similar to our phylogenetic reconstruction using molecular data (i.e., the species analyzed in both studies exhibited the same phylogenetic relationships). 
However, to effectively investigate the congruence of phylogenies built on molecular and morphological data, the sequencing of a larger dataset of *Anacanthorus* species is necessary in future studies, potentially focusing on mapping the characters of the copulatory complex onto the molecular phylogenetic reconstruction.\n\nAccording to our results, species of *Anacanthorus* formed a clade including the group of freshwater members of the Ancyrocephalinae (clade A) and the group of species of Ancylodiscoidinae. At the same time, we showed that *Mymarothecium viatorum*, an exclusive parasite of the \"pacu\" lineage, was positioned within freshwater Ancyrocephalinae. Using complete 18S rDNA sequences, M\u00fcller et al. \\[[@R45]\\] showed that *M. viatorum* clustered with *A. penilabiatus*. Both ribosomal markers (28S and 18S rDNA) have been widely used to reconstruct the phylogenies of monogeneans, and in many cases they have produced similar topologies (e.g., Plaisance et al. \\[[@R51]\\], Francov\u00e1 et al. \\[[@R15]\\], Verma et al. \\[[@R66]\\]); thus, the finding of M\u00fcller et al. \\[[@R45]\\] is most likely due to the lack of sequences of species closely related to *Anacanthorus* species (i.e., the absence of the representatives of freshwater Ancyrocephalinae).\n\nWe conclude that *Anacanthorus* and their serrasalmid hosts can provide a useful model for studying host-parasite biogeography and coevolution in the neotropics. However, to perform cophylogenetic analyses, future studies focusing on a wider spectrum of host species and their specific *Anacanthorus* spp. are needed. 
Additional sampling of the representatives of other monogenean genera parasitizing serrasalmids will allow us to investigate the phylogenetic relationships among such diverse monogeneans parasitizing the same host group.\n\nWe wish to thank Camila Pantoja, Maria Catarina Moraes and Philippe Alves from the Universidade Federal Rural do Rio de Janeiro, Rio de Janeiro (Brazil) and Tom\u00e1\u0161 Scholz from the Institute of Parasitology, \u010cesk\u00e9 Bud\u011bjovice (Czech Republic) for their help with material collection and parasitological examination. We are also grateful to Emil Jos\u00e9 Hern\u00e1ndez Ruz and J\u00e2nio da Silva Carneiro from the Universidade Federal do Par\u00e1, Altamira (Brazil), and Luiz Eduardo Roland Tavares from the Universidade Federal do Mato Grosso do Sul, Campo Grande (Brazil) for providing the facilities during the field trips. We would also like to thank Dr. Douglas McIntosh from the Department of Animal Parasitology, UFRRJ and Krist\u00fdna Koukalov\u00e1 from the Department of Botany and Zoology of the Faculty of Science of the Masaryk University, Brno (Czech Republic) for their technical support on sample sequencing. JM was funded by a Doctoral fellowship from CAPES (Coordena\u00e7\u00e3o de Aperfei\u00e7oamento de Pessoal de N\u00edvel Superior, Brazil)/PDSE/Process number {88881.134872/2016-01}, and the Conselho Nacional de Desenvolvimento Cientifico e Tecnol\u00f3gico (CNPq) provided grants to JLL (Nos. 474077/2011-0, 304254/2011-8, 402665/2012-0). All molecular analyses as well as personal costs for A\u0160 were funded by the Czech Science Foundation, ECIP project No. P505/12/G112. We kindly thank Matthew Nicholls for English revision of the final draft.\n\n**Cite this article as**: Moreira J, Luque JL & \u0160imkov\u00e1 A. 2019. The phylogenetic position of *Anacanthorus* (Monogenea, Dactylogyridae) parasitizing Brazilian serrasalmids (Characiformes). 
Parasite **26**, 44.\n\n[^1]: Species used as outgroups.\n\n[^2]: *Euryhaliotrematoides* and *Aliatrema* were placed in subjective synonymy with *Euryhaliotrema* \\[[@R27]\\].\n"]]