
The complete genome of Nostoc sp. UHCC 0702 was sequenced to identify the heinamide biosynthetic gene cluster. In parallel, a draft genome of S. hofmannii PCC 7110 was analyzed to identify the scytocyclamide biosynthetic gene cluster. An approximately 100 kb PKS/NRPS pathway was found in both strains (Figure 10). A FAAL enzyme activating hexanoic acid (LxaA or LxaA1) initiates the pathways. The hexanoic acid is elongated to octanoic acid by the following PKS modules and aminated by an aminotransferase to form β-amino octanoic acid (LxaA2, LxaB, LxaE). The biosynthesis then branches into two alternative NRPS pathways that produce 11-residue (LxaCD and LxaC1) or 12-residue (LxaIJKL and LxaI1J1K1) laxaphycins. The NRPS enzymes assemble and cyclize the peptides.
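The branching assembly line described above can be written down as a small data structure, which may help when comparing the two product families. This is only an illustrative sketch: the enzyme names are taken from the text, while the layout, the step descriptions and the function name `pathway_for` are assumptions made for this example, and no strain-specific assignment of enzymes is implied.

```python
# Illustrative model of the branching lxa assembly line.
# Enzyme names come from the text; the structure itself is an assumption.

# Steps shared by both product families, in biosynthetic order.
SHARED_STEPS = [
    ("FAAL activation of hexanoic acid", ["LxaA", "LxaA1"]),
    ("PKS elongation and amination to beta-amino octanoic acid",
     ["LxaA2", "LxaB", "LxaE"]),
]

# Alternative NRPS branches, keyed by the number of residues in the product.
NRPS_BRANCHES = {
    11: ["LxaCD", "LxaC1"],
    12: ["LxaIJKL", "LxaI1J1K1"],
}


def pathway_for(residues):
    """Return the ordered (step, enzymes) list for an 11- or 12-residue product."""
    if residues not in NRPS_BRANCHES:
        raise ValueError("only 11- and 12-residue laxaphycins are described")
    final_step = (
        f"NRPS assembly and cyclization of the {residues}-residue peptide",
        NRPS_BRANCHES[residues],
    )
    return SHARED_STEPS + [final_step]


if __name__ == "__main__":
    for step, enzymes in pathway_for(12):
        print(f"{step}: {', '.join(enzymes)}")
```

Both branches share the same first two steps, so only the final NRPS entry differs between the 11- and 12-residue products.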


Figure 10. Biosynthetic pathways of laxaphycins heinamides and scytocyclamides. A: Organization of the lxa BGC in the Nostoc sp. UHCC 0702 and Scytonema hofmannii PCC 7110 genomes. B: Organization of catalytic domains in Lxa biosynthetic enzymes.


The biosynthesis of laxaphycins is initiated by a FAAL, as in puwainaphycins. The two laxaphycin producer strains also differ in the organization of their FAAL enzymes: S. hofmannii PCC 7110 has a FAAL+ACP enzyme, whereas Nostoc sp. UHCC 0702 has distinct FAAL and ACP enzymes. This is similar to the alternative starters PuwI and PuwC/PuwD reported in puwainaphycin biosynthesis (Mareš et al., 2014; Galica et al., 2017; Mareš et al., 2019). A prediction that laxaphycins are produced through a PKS/NRPS pathway initiated by a FAAL has been made in the literature (Bornancin et al., 2015).

(2S,4R)-4-hydroxy proline (OHPro) is found in positions 4 and 10 of 11-residue laxaphycins and in position 10 of 12-residue laxaphycins. We discovered that the enzyme LxaN hydroxylates Pro in laxaphycins (Figure 11). The activity of the enzyme was demonstrated by heterologous expression of Nostoc sp. UHCC 0702 LxaN in E. coli. OHPro was detected in the E. coli extract by MS, and Marfey’s method identified it specifically as (2S,4R)-4-hydroxy-L-proline. The lxaN gene is present in both investigated strains, Nostoc sp. UHCC 0702 and S. hofmannii PCC 7110, but it is not located directly in the laxaphycin biosynthetic gene cluster in either genome (Figure 10).

Figure 11. LxaN hydroxylating proline to (2S,4R)-4-hydroxy proline.

The biosynthesis of 3-hydroxy-4-methyl proline (OHMePro) has not, to my knowledge, been previously described in cyanobacteria. A biosynthetic route has been reported for the fungal echinocandin analog pneumocandin (Houwaart et al., 2014), but the responsible enzymes have no homologs in the Nostoc sp. UHCC 0702 genome. Methylproline biosynthesis in cyanobacteria has been described for several compounds, such as nostopeptolides, spumigins and nostocyclopeptides (Hoffmann et al., 2003; Luesch et al., 2003; Becker et al., 2004; Fewer et al., 2009; Hibi et al., 2013). In the Nostoc sp. UHCC 0702 genome we found homologs of the three known methylproline biosynthetic enzymes ORF, NosE and NosF of the nostopeptolide BGC (Luesch et al., 2003; Hibi et al., 2013), which we predict to participate in OHMePro production. These homologs were named LxaO, LxaP and LxaQ. The three genes are flanked by those encoding the proline hydroxylase LxaN and the oxygenase LxaR (Figure 10).

We predict that LxaO-P together with LxaR produce the OHMePro product (Figure 12).

In Nostoc sp. UHCC 0702, the heinamide variant B5 contains MePro10. This suggests that some MePro is produced in the cell, and LxaO-Q are the most likely producers of the compound. Amino acid feeding experiments were performed in which different MePro isomers were added to the growth medium of Nostoc sp. UHCC 0702. In cultures with abundant MePro in the growth medium, MePro was incorporated into the laxaphycin structure and variants with MePro10 became dominant (Figure 13). This suggests that free MePro is not hydroxylated to OHMePro in the cell and that incorporated MePro residues are not hydroxylated either. From this, it is concluded that free OHMePro is produced in the cell and that OHMePro production has no MePro intermediate; the hydroxylation occurs at another stage of OHMePro synthesis (Figure 12). OHMePro is recognized by the LxaK12 adenylation domain, whose binding pocket has the amino acid residue sequence “DVQFIAHAAK”. The sequence has 90% similarity to known Pro-recognizing sequences, and we propose that this binding pocket is selective for OHMePro, with some affinity for MePro and Pro.
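A comparison of 10-residue binding pocket sequences such as the one above can be sketched as a position-by-position match count. The snippet below is a minimal illustration, not the prediction tool used in this work. It treats similarity as simple identity, which is an assumption, and the reference string is a hypothetical placeholder that differs from the LxaK signature at one position; it is not a curated Pro-recognizing signature.

```python
def percent_identity(signature_a, signature_b):
    """Percentage of positions at which two equal-length signatures match."""
    if len(signature_a) != len(signature_b):
        raise ValueError("signatures must have the same length")
    if not signature_a:
        raise ValueError("signatures must not be empty")
    matches = sum(a == b for a, b in zip(signature_a, signature_b))
    return 100.0 * matches / len(signature_a)


# Binding pocket sequence quoted in the text.
LXAK_SIGNATURE = "DVQFIAHAAK"

# Hypothetical placeholder reference, NOT a real database entry.
HYPOTHETICAL_PRO_REFERENCE = "DVQFIAHVAK"

if __name__ == "__main__":
    identity = percent_identity(LXAK_SIGNATURE, HYPOTHETICAL_PRO_REFERENCE)
    print(f"Identity to placeholder reference: {identity:.0f} %")
```

With ten positions, a single mismatch corresponds to a 90 % match, so one substituted pocket residue is enough to account for a value of this size.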


Figure 12. Proposed biosynthetic scheme of OHMePro production.

Figure 13. MS signal intensities of 12-residue heinamides grown with supplemented MePros in the growth medium.

Dehydrobutyrine (Dhb) is a non-proteinogenic amino acid present in several SMs. It is the dehydration product of threonine. Dehydroalanine (Dha) is the analogous dehydration product of serine found in SMs. Dehydrobutyrine is found in laxaphycins, microcystins and hassallidins, among others. Products containing Dhb are produced through NRPS pathways or as ribosomally synthesized and post-translationally modified peptides (RiPPs). Mechanisms for the dehydration of Thr to Dhb in RiPPs have been shown, but Dhb formation in NRPSs cannot be explained by these mechanisms. The dehydration of Thr in NRPSs has been speculated to involve the condensation domain following the adenylation of Thr or Ser (Tillett et al., 2000; Moffitt and Neilan, 2004). With the expanding genomic information, a phylogenetic analysis of the known Dhb-related condensation domains was carried out. All these condensation domains were found to be phylogenetically related and to form a distinct clade of modified AA C-domains in NaPDoS (Ziemert et al., 2012).

We included in the analysis the available NRPS BGCs with a Dhb or Dha: nodularin (Jokela et al., 2017), hassallidin (Vestola et al., 2014), puwainaphycin (Mareš et al., 2019), scytocyclamide and heinamide. The microcystin BGC was already included in the bioinformatics tool and, together with the bleomycin BGC, formed the basis of the “modified AA” clade. The analyzed condensation domains grouped with the existing modAA sequences (Figure 14).


Figure 14. Phylogeny of Dhb related C-domains in cyanobacterial NRPS products. A simplified phylogenetic tree based on NaPDoS results. All studied C domains following a dehydrated Thr module clustered in the “modified AA” clade.

The connection between dehydrated amino acids and the modified AA clade of C-domains was recently shown experimentally in the biosynthesis of albopeptide (Wang et al., 2021). Albopeptide is a Val-Dha-Dhb tripeptide produced by Streptomyces albofaciens. The C domains following Dha and Dhb were confirmed to belong to the modAA C-domain class. In an assay with truncated NRPS enzymes, they showed that Ser and Thr are activated by the adenylation domains and that the dehydration occurs following the condensation reaction (Figure 15) (Wang et al., 2021).


Figure 15. NRPS modules incorporating Thr and dehydrating it to Dhb.

In heinamides and scytocyclamides, we found 3-hydroxylated amino acids typical of laxaphycins. These hydroxylated amino acids appear in positions 3, 5 and 8 of 12-residue laxaphycins. Position 3 of every known variant has OHLeu, position 5 has OHLeu or Leu, and position 8 has OHAsn, Asn or OHHse (Table 2). In the laxaphycin gene clusters we found cupin-like domain 8 genes, two in PCC 7110 and three in UHCC 0702, which we predict to hydroxylate the abovementioned amino acids.

The location of these genes in both lxa biosynthetic gene clusters suggests a function in the biosynthetic process. The cupin-like domain 8 class of enzymes has few characterized members, which are known to hydroxylate amino acids, the amino acid residues Asn, Asp, His, Lys and Arg in animal proteins, and RNA (Wilkins et al., 2018). The enzymes of this family are Fe(II)- or Zn(II)- and α-ketoglutarate (α-KG)-dependent oxygenases that act as hydroxylases and demethylases (Markolovic et al., 2016). α-KG-dependent oxygenases are known to 3-hydroxylate amino acids of NRPS product peptides, such as L-Arg of viomycin, L-Asn of daptomycin-like peptides, D-Glu of kutzneride and L-p-aminophenylalanine of chloramphenicol (Yin and Zabriskie, 2004; Strieker et al., 2007; Strieker et al., 2009; Makris et al., 2010). As the adenylation domains of the Lxa NRPS proteins are predicted to recognize non-hydroxylated Leu and Asn with a 100% match to known adenylation domain substrate binding pockets, we predict that the hydroxylation occurs after their incorporation into the peptide chain. An alternative mechanism for the 3-hydroxylation of laxaphycin amino acids has been proposed, in which cytochrome P450 enzymes would perform the hydroxylation (Bornancin et al., 2015). However, no homologs of the cytochrome P450 enzymes in question were found in the laxaphycin producer genomes.

All 12-residue heinamides contain O-carbamoyl-L-homoserine (cHse) in position 4. To my knowledge, O-carbamoylated amino acids have not been reported in laxaphycins or any other SMs. In cyanobacteria, O-carbamoylation is known to occur in cyclodextrin (Entzeroth et al., 1986), carbamidocyclophanes (Bui et al., 2007; Preisitsch et al., 2016), banyaside A (Pluotno and Carmeli, 2005) and saxitoxins (Kellmann et al., 2008). For saxitoxin and carbamidocyclophanes, the carbamoyltransferases SxtI and CabL have been identified. In other organisms there are several O-carbamoyltransferases that usually act on glycosides, such as the tobramycin biosynthetic enzyme TobZ (Kharel et al., 2004) and NolO and NoeI, which act in nodulation factor tailoring (Jabbouri et al., 1998). In the Nostoc sp. UHCC 0702 genome I found three putative carbamoyltransferases but was unable to determine which one acts in heinamide biosynthesis.

4.3. Pseudospumigins (III)

Pseudospumigins are new aeruginosin variants found in Nostoc sp. CENA 543 (Figure 16). The structure is typical of known aeruginosins, with Hpla1, Hty/Hph2 and Argal4, but with Ile/Leu/Val3 instead of the more typical Choi or MePro.

Figure 16. Chemical structure of pseudospumigins A-F with their relative intensities, from Nostoc sp. CENA 543.

The pseudospumigin BGC is similar to known aeruginosin gene clusters, with the highest resemblance to the spu gene cluster, which produces the aeruginosin subclass spumigins (Figure 17). It lacks the tailoring enzymes needed to produce Choi3 or MePro3 and instead has adenylation domain substrate specificity for Ile3. The origin of Hty and Hph is not obvious from the gene cluster. However, Hty and Hph biosynthesis has been shown in cyanobacterial SM synthesis and spumigin production (Koketsu et al., 2013; Lima et al., 2017). The required hph genes were characterized in the Nostoc sp. CENA 543 genome in a subsequent article as part of the anabaenopeptin and namalide biosynthetic pathways (Shishido et al., 2017). Incorporation of Hty and Hph into pseudospumigins is most likely the result of cross talk between these BGCs.

The trypsin inhibitory activity of pseudospumigins was tested with a semi-pure pseudospumigin mixture containing about 80% pseudospumigin A. The pseudospumigin A mixture has a weak trypsin inhibitory activity similar to that of spumigin E and nostosin A (Figure 18). The activity was measured with a fluorometric assay (Kawabata et al., 1988), in which trypsin cleaves the substrate Boc-Gln-Ala-Arg-MCA, releasing fluorescent 4-methylcoumaryl-7-amide. When a trypsin inhibitor is present, the cleavage reaction slows down and the difference in fluorescence can be measured.

The trypsin inhibitory effect of the pseudospumigin A mixture showed a time dependency similar to that of nostosin A and spumigin E, with the inhibition increasing until the end of the 1 h measurement period (Figure 18). An IC50 value of 4.5 µM at the end of the measurement was assigned to the pseudospumigin A mixture.
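The source does not state how the IC50 values were fitted, so the following is only a minimal sketch of one common approach. It assumes that percent inhibition is computed from the rates of fluorescence increase with and without inhibitor, and that IC50 is estimated by log-linear interpolation between the two concentrations bracketing 50 % inhibition. The function names and all numbers in the usage example are hypothetical, not measured data.

```python
import math


def percent_inhibition(rate_with_inhibitor, rate_control):
    """Percent inhibition from fluorescence increase rates (arbitrary units)."""
    if rate_control <= 0:
        raise ValueError("control rate must be positive")
    return 100.0 * (1.0 - rate_with_inhibitor / rate_control)


def ic50_by_interpolation(concentrations, inhibitions):
    """Estimate IC50 by log-linear interpolation between bracketing points.

    concentrations: positive inhibitor concentrations (any consistent unit)
    inhibitions: percent inhibition measured at each concentration
    """
    points = sorted(zip(concentrations, inhibitions))
    for (c_low, i_low), (c_high, i_high) in zip(points, points[1:]):
        if i_low <= 50.0 <= i_high:
            if i_high == i_low:
                return c_low
            fraction = (50.0 - i_low) / (i_high - i_low)
            log_c = math.log10(c_low) + fraction * (
                math.log10(c_high) - math.log10(c_low)
            )
            return 10.0 ** log_c
    raise ValueError("50 % inhibition is not bracketed by the data")


if __name__ == "__main__":
    # Hypothetical dilution series and rates, for illustration only.
    control_rate = 200.0
    series = {1.0: 150.0, 10.0: 50.0}  # concentration -> rate with inhibitor
    inhibitions = [percent_inhibition(r, control_rate) for r in series.values()]
    print(ic50_by_interpolation(list(series.keys()), inhibitions))
```

Because the inhibition reported here increases over the whole measurement period, an estimate of this kind depends on the time point chosen for the rates, which is why the value in the text is tied to the end of the measurement.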


Figure 17. Biosynthetic gene clusters of pseudospumigin in Nostoc sp. CENA 543 and spumigin in Nodularia spumigena CCY9414. The spumigin gene cluster includes the MePro biosynthetic genes, which are not present in the pseudospumigin BGC.

Figure 18. Time-dependent trypsin inhibition (IC50) of pseudospumigin A mixture (Psp A mix), nostosin A (Ns A) and spumigin E (Sp E). Nostosin A and spumigin E data are from Fewer et al. (2013). The uppermost curve compares the IC50 of the pseudospumigin A mixture and spumigin E, showing that the pseudospumigin A mixture has 1/70 of the inhibition strength of spumigin E. In the inset, the IC50 curves of the pseudospumigin A mixture and spumigin E are enlarged to IC50 values <8, showing that the interactions were not in equilibrium after 55 min of reaction time. The figure is an unaltered reproduction of Figure 6 in article III of this thesis, reproduced under a CC-BY licence.

4.4. Nodularin (III)

Nodularin-R was detected in Nostoc sp. CENA 543 biomass, making the strain the first reported free-living nodularin-producing Nostoc. Within the genus Nostoc, nodularin-R has previously been reported from only one strain, which lives in symbiosis with a cycad and produces the compound in low concentration (Gehringer et al., 2012) (Table 5). High amounts of nodularin have also been found in lichens, where the most common cyanobacterial symbionts are members of the genus Nostoc (Kaasalainen et al., 2012). The benthic Nostoc sp. CENA 543 strain produces nodularin at a level corresponding to that of Nodularia strains, with 4.3 mg/g of nodularin in freeze-dried biomass (Table 5). The nod BGC in Nostoc sp. CENA 543 is a standard nodularin gene cluster.

Desmethylnodularin-R was also detected as a minor nodularin variant in Nostoc sp. CENA 543, with a relative intensity of 4%.


Table 5. Nodularin producers and their nodularin yield. Adapted from III.

Collection year    Location, strain                   mg/g      Reference

Laboratory grown strains
                                                      4.2b      (Beattie et al., 2000)
1987               Baltic Sea, Nodularia
1985-1987          Baltic Sea                         0.1–2.4   (Sivonen et al., 1989)

Cycads and lichens
-                  Australia, cycad Nostoc symbiont   0.0025a   (Gehringer et al., 2012)
2009               Lichen thalli                      <10–60    (Kaasalainen et al., 2012)

a - mg NOD/g wet weight.
b - [Har2]NOD.

5. Conclusions

In this work, I describe the biosynthetic gene clusters of Nostoc sp. UHCC 0702, Scytonema hofmannii PCC 7110 and Nostoc sp. CENA 543 that produce heinamides, scytocyclamides and pseudospumigins, together with the chemical structures and bioactivities of these compounds. I also show the production of high levels of nodularin in Nostoc sp. CENA 543. These results present new information on the genetic organization of biosynthetic gene clusters and on how they are spread among cyanobacterial genera.

The evolution of specialized metabolite BGCs and structural diversity proceeds through a variety of mechanisms, among which subcluster mobility and gene sharing have been shown to play a role, in what is termed the brick and mortar model (Medema et al., 2014). BGCs consist of bricks, the modular enzymes forming the backbone of the product, and mortar, enzymes with tailoring and other functions in the biosynthesis. The chemical diversity of NRPS/PKS products would be much lower without the mobility of these elements. In this thesis, I show examples of the mobility of these enzymes leading to structural diversity in the biosynthesis of both laxaphycins and pseudospumigins. In heinamides, OHMePro production seems to be the result of a mobile tailoring element acquired by Nostoc sp. UHCC 0702, as this AA is not seen in other laxaphycins. In addition, OHPro is not a component of laxaphycins in all known producers (Table 1, Table 2) (Luo et al., 2014; Luo et al., 2015), which is most probably explained by the lack of the tailoring enzyme LxaN. Mobile tailoring elements are also in play at amino acid position 3 of aeruginosins. This amino acid can be MePro or Choi, as in most aeruginosins, or Ile/Leu/Val, as now seen in pseudoaeruginosins (Fewer et al., 2009; Ishida et al., 2009; Liu et al., 2015; Hasan-Amer and Carmeli, 2017). The tetrapeptide scaffold remains roughly the same, with different tailoring enzymes supplying diverse substrates to the assembly line and thereby generating structural diversity. In aeruginosins, the amino acid in position 2 can be Phe/Tyr or Hph/Hty, depending on whether the enzymes producing the homoamino acids are present (Fewer et al., 2009; Hasan-Amer and Carmeli, 2017). A mobile tailoring element can affect multiple PKS/NRPS pathways; for example, Hph and Hty appear in both aeruginosins and anabaenopeptins in Nostoc sp. CENA 543 and Sphaerospermopsis torques-reginae ITEP-024 (Lima et al., 2017; Shishido et al., 2017).

Specialized metabolite chemical diversity is derived from genetic variation and has the potential to affect new targets in the producer organism's environment, improving the fitness of the producer and thus expanding the available genetic variation. When these chemicals are investigated in the context of human physiology or pathogenic organisms, new pharmaceutically beneficial targets can be found. This work broadens the knowledge of compounds with potentially beneficial bioactivities, even though they do not fit the mold of conventional drugs. Laxaphycins have a molecular weight of around 1200-1400, whereas typical drugs have a molecular weight under 500 (Atanasov et al., 2021). This, along with other characteristics such as the need for two synergistic partners, makes laxaphycins atypical drug lead candidates. However, current antifungal treatments include echinocandins, important molecules with a molecular weight of around 1100 and several structural similarities to laxaphycins (Wiederhold, 2018). In the literature on natural product drug development, finding a lead compound requires high-throughput screening for candidates that fit the mold of typical drugs (Lewis, 2020; Atanasov et al., 2021), when in fact any single discovered NP has the potential to be of significant clinical importance. Sticking to known practices can inhibit new discoveries, as seen in the drying up of the Waksman platform (Schatz et al., 2005; Lewis, 2020). The weak trypsin inhibitory activity of pseudospumigins is not very promising for drug development. However, describing the structure and biosynthetic origins of pseudospumigins gives us information on gene mobility and may help in future BGC identification. The experimentally proven production of 4-OHPro and the predicted OHMePro biosynthesis offer new enzymatic reactions for the production of these non-proteinogenic amino acids. OHPro and OHMePro are building blocks in the pharmaceutical and cosmetics industries, and new methods for producing them may have an impact if they are applied in industry (Mohapatra et al., 2006; Hara and Kino, 2020).

The nodularin production of Nostoc sp. CENA 543 was surprisingly high, as only Nodularia strains were known to produce high concentrations of the compound (Catherine et al., 2017). As nodularins are typically produced by planktonic Nodularia, a producer with a benthic lifestyle is also an exception (Wood et al., 2020). This result shows that the presence of benthic nodularin producers should be taken into account in potable water monitoring (Wood et al., 2020). Molecular methods can be used in water monitoring and research to look for specific toxin biosynthetic genes. Such data could be combined with taxonomically informative genetic data and should include the genus Nostoc as a potential nodularin producer.

New SMs with different bioactivities are constantly discovered in cyanobacteria and other organisms. At the same time, numerous genomes of these organisms are published, with a considerable number of cryptic BGCs. Every newly described biosynthetic enzyme, pathway and mechanism helps us to make better predictions of the possible products of these cryptic BGCs and brings us forward in applying this information to the expression of engineered BGCs. A specific focus on the bioactivity of the studied compounds will drive the search for solutions against specific targets. In this thesis, one of the targets was pathogenic fungi, and antifungal agents were discovered and described through antifungal screening. This kind of screening-based top-down drug discovery is necessary to bring new motifs and diversity into use in genome mining and rational design approaches. This process is a never-ending puzzle in which, from time to time, a group of pieces reveals a concrete and beneficial innovation. My thesis provides a few new pieces to the puzzle.

6. Acknowledgements

This work was carried out at the Department of Microbiology, Faculty of Agriculture and Forestry, University of Helsinki, Finland. Funding was provided by the Jane and Aatos Erkko Foundation and by The Doctoral School in Environmental, Food and Biological Sciences (YEB) through financial support for doctoral thesis completion. I would like to thank everyone involved in this thesis project. I thank my primary supervisor Kaarina Sivonen for taking me into the group to do my master’s thesis and for giving me the opportunity, and trusting me, to continue in research and pursue the doctoral degree. I thank my co-supervisors David Fewer and Jouni Jokela for their valuable help and contributions to the work. I thank Matti and Lyudmila for their help in the lab; Anu Humisto for her supervision when I first entered the Cyanogroup as an undergraduate student; Tânia Shishido for the time I worked on her projects; Anna Jortikka for our work with laxaphycins for her master’s thesis; and Eliana Veloz for our work screening the culture collection. I also thank my student colleagues Antti, Rafael, Muhammad, Sila, Julia, Maria and Jonna. Thanks to the staff of the microbiology department, Mika Kalsi, Susanna Holopainen and Pekka Oivanen, for making things work.

I would like to thank professor Nadine Ziemert and professor Shmuel Carmeli for the pre-examination of this thesis and professor Nicolas Inguimbert for agreeing to act as my opponent. I thank professor Per Saris for joining the thesis grading committee as the faculty representative.

Thanks to my wife Annika for sharing the home office during the pandemic.

Thanks to everyone whose work I have cited, I am standing on the shoulders of giants.

7. References

Adams, DG, Bergman, B, Nierzwicki-Bauer, SA, Duggan, PS, Rai, AN, and Schüßler, A. 2013. "Cyanobacterial-Plant Symbioses," in The Prokaryotes: Prokaryotic Biology and Symbiotic Associations, eds. E. Rosenberg, E.F. DeLong, S. Lory, E. Stackebrandt & F. Thompson. Berlin, Heidelberg: Springer Berlin Heidelberg, 359-400.

Agha, R, Gross, A, Rohrlack, T, and Wolinska, J. 2018. Adaptation of a Chytrid Parasite to Its Cyanobacterial Host Is Hampered by Host Intraspecific Diversity. Frontiers in microbiology 9, 921-921.

Alvarino, R, Alonso, E, Bornancin, L, Bonnard, I, Inguimbert, N, Banaigs, B, and Botana, LM. 2020. Biological Activities of Cyclic and Acyclic B-Type Laxaphycins in SH-SY5Y Human Neuroblastoma Cells. Marine
