Based on the source XML below, I want to capture the text nodes and <emphasis> elements that do not contain <emphasis bold="yes">NOTE#:</emphasis> and group them with their respective <emphasis bold="yes">NOTE#:</emphasis> element. How do I structure poorly structured content based on following-sibling/preceding-sibling predicates?
Source XML:
<section>
<para>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the first note 1 <emphasis bold="yes">that should only be in the <emphasis italic="yes">first</emphasis> subsection occurance of note one.</emphasis>. This is the second sentence of the first note one. <emphasis italic="yes">Here is some other text</emphasis> that can appear. <emphasis bold="yes">Marvin Gaye is an excellent musician1.</emphasis> Play it for your girlfriend1 <emphasis italic="yes">now1.</emphasis>.
<emphasis bold="yes">NOTE2:</emphasis> This is the text of the first note two2.1 <emphasis italic="yes">The Isley Brothers are also good.2.1</emphasis>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the second note one.1.2 <emphasis italic="yes">My girlfriend loves them1.2</emphasis>
<emphasis bold="yes">NOTE3:</emphasis> This is the text of the first note three3.1.
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the third note one.1.3<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis bold="yes">NOTE3:</emphasis> This is the text of the second note three.3.2<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis bold="yes">NOTE2:</emphasis> This is the text of the second note two.2.2<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</section>
Current XSLT:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="*|@*|text()">
<xsl:copy>
<xsl:apply-templates select="*|@*|text()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="section">
<root>
<xsl:apply-templates select="*|@*|text()"/>
</root>
</xsl:template>
<xsl:template match="para/emphasis[preceding-sibling::emphasis[@bold='yes' and text()='NOTE1:']]"/>
<xsl:template match="para/emphasis[preceding-sibling::emphasis[@bold='yes' and text()='NOTE2:']]"/>
<xsl:template match="para/emphasis[preceding-sibling::emphasis[@bold='yes' and text()='NOTE3:']]"/>
<xsl:template match="para/text()[preceding-sibling::emphasis[@bold='yes' and text()='NOTE1:']]"/>
<xsl:template match="para/text()[preceding-sibling::emphasis[@bold='yes' and text()='NOTE2:']]"/>
<xsl:template match="para/text()[preceding-sibling::emphasis[@bold='yes' and text()='NOTE3:']]"/>
<xsl:template match="para/emphasis[@bold='yes' and text()='NOTE1:' and not(position()=last())]">
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis>
<xsl:copy-of
select="following-sibling::text()[not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE2:']) and not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE3:'])] |following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:'))]"
/>
</para>
</subsection>
</xsl:template>
<xsl:template match="para/emphasis[position()=last() and position() > 1 and @bold='yes' and text()='NOTE1:']">
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis>
<xsl:copy-of
select="following-sibling::text()[position() = 1]|following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:'))]"
/>
</para>
</subsection>
</xsl:template>
<xsl:template match="para/emphasis[@bold='yes' and text()='NOTE2:' and not(position()=last())]">
<subsection>
<para>
<emphasis bold="yes">NOTE2:</emphasis>
<xsl:copy-of
select="following-sibling::text()[not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE1:']) and not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE3:'])] |following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:'))]"
/>
</para>
</subsection>
</xsl:template>
<xsl:template match="para/emphasis[position()=last() and position() > 1 and @bold='yes' and text()='NOTE2:']">
<note>
<para>
<emphasis bold="yes">NOTE2:</emphasis>
<xsl:copy-of
select="following-sibling::text()[position() = 1]|following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:')) ]"
/>
</para>
</note>
</xsl:template>
<xsl:template match="para/emphasis[@bold='yes' and text()='NOTE3:' and not(position()=last())]">
<note>
<para>
<emphasis bold="yes">NOTE3:</emphasis>
<xsl:copy-of
select="following-sibling::text()[not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE2:']) and not(preceding-sibling::emphasis[@bold='yes' and text()='NOTE1:'])] | following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:'))]"
/>
</para>
</note>
</xsl:template>
<xsl:template match="para/emphasis[position()=last() and position() > 1 and @bold='yes' and text()='NOTE3:']">
<note>
<para>
<emphasis bold="yes">NOTE3:</emphasis>
<xsl:copy-of select="following-sibling::text()[position() = 1]|following-sibling::emphasis[not(contains(string(), 'NOTE1:')) and not(contains(string(), 'NOTE2:')) and not(contains(string(), 'NOTE3:')) ]"/>
</para>
</note>
</xsl:template>
</xsl:stylesheet>
Current output:
<root>
<para>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the first note 1 <emphasis bold="yes">that should only be in the <emphasis italic="yes">first</emphasis> subsection occurance of note one.</emphasis>. This is the second sentence of the first note one. <emphasis italic="yes">Here is some other text</emphasis> that can appear. <emphasis bold="yes">Marvin Gaye is an excellent musician1.</emphasis> Play it for your girlfriend1 <emphasis italic="yes">now1.</emphasis>.
<emphasis italic="yes">The Isley Brothers are also good.2.1</emphasis>
<emphasis italic="yes">My girlfriend loves them1.2</emphasis>
<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE2:</emphasis>
<emphasis italic="yes">The Isley Brothers are also good.2.1</emphasis>
<emphasis italic="yes">My girlfriend loves them1.2</emphasis>
<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis>
<emphasis italic="yes">My girlfriend loves them1.2</emphasis>
<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
<note>
<para>
<emphasis bold="yes">NOTE3:</emphasis>
<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</note>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis>
<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
<note>
<para>
<emphasis bold="yes">NOTE3:</emphasis>
<emphasis italic="yes">Steak and potatos3.2</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</note>
<subsection>
<para>
<emphasis bold="yes">NOTE2:</emphasis>
<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
</para>
</root>
Desired output:
<root>
<para>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the first note 1 <emphasis bold="yes">that should only be in the <emphasis italic="yes">first</emphasis> subsection occurance of note one.</emphasis>. This is the second sentence of the first note one. <emphasis italic="yes">Here is some other text</emphasis> that can appear. <emphasis bold="yes">Marvin Gaye is an excellent musician1.</emphasis> Play it for your girlfriend1 <emphasis italic="yes">now1.</emphasis>.
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE2:</emphasis>This is the text of the first note two2.1<emphasis italic="yes">The Isley Brothers are also good.2.1</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the second note one.1.2 <emphasis italic="yes">My girlfriend loves them1.2</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE3:</emphasis> This is the text of the first note three3.1.
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE1:</emphasis> This is the text of the third note one.1.3<emphasis italic="yes">She is going to make me dinner tonight1.3</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE3:</emphasis> This is the text of the second note three.3.2<emphasis italic="yes">Steak and potatos3.2</emphasis>
</para>
</subsection>
<subsection>
<para>
<emphasis bold="yes">NOTE2:</emphasis> This is the text of the second note two.2.2<emphasis italic="yes">And then some wine2.2</emphasis>
</para>
</subsection>
</para>
</root>
YIIIIIKES. Please reformat your question so that your code looks like code. It's the little "{}" button at the top of your editor. – ABach
I added indentation to the XML in your question so that it can be read. I also removed what I believe was a spurious opening 'xsl:template' tag that was followed by an identical one. That is illegal, since templates cannot contain other templates. I also added the closing 'xsl:stylesheet' tag, which was missing. – Borodin
Please explain exactly what is required (and any rules). As it stands, this is not clear. Also, please provide the desired result of the transformation. If possible, replace the current example with a smaller one, and format the XML document and the desired result so that no horizontal scrolling is required. –
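One possible approach, offered as a sketch and not as a tested answer: the current output suggests that each NOTE template copies every following sibling and never stops at the next label, which is why the italic elements repeat in every group. Since the stylesheet already declares version="2.0", positional grouping with xsl:for-each-group and group-starting-with may be simpler than sibling predicates. The sketch below assumes an XSLT 2.0 processor, and it assumes that every label is a bold emphasis whose whole string value is "NOTE", one or more digits, and a colon; the regular expression is my assumption, not something stated in the question. Whitespace in the result may differ slightly from the desired output shown above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

  <!-- Identity template: copies anything not handled by a more specific rule. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rename the outer section to root, as in the desired output. -->
  <xsl:template match="section">
    <root>
      <xsl:apply-templates select="*"/>
    </root>
  </xsl:template>

  <xsl:template match="section/para">
    <para>
      <!-- Start a new group at every label. The label pattern (NOTE + digits + colon)
           is an assumption; adjust the regex if real labels look different. -->
      <xsl:for-each-group select="node()"
          group-starting-with="emphasis[@bold='yes'][matches(., '^NOTE\d+:$')]">
        <xsl:choose>
          <!-- The group begins with a label: wrap the label and everything
               up to (not including) the next label. -->
          <xsl:when test="self::emphasis[@bold='yes'][matches(., '^NOTE\d+:$')]">
            <subsection>
              <para>
                <xsl:copy-of select="current-group()"/>
              </para>
            </subsection>
          </xsl:when>
          <!-- Content before the first label: copy it unwrapped,
               dropping whitespace-only text nodes. -->
          <xsl:otherwise>
            <xsl:copy-of select="current-group()[not(self::text()[not(normalize-space())])]"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </para>
  </xsl:template>
</xsl:stylesheet>
```

Because a group ends where the next label begins, bold elements that are not labels (such as the "Marvin Gaye" sentence) stay inside the note they follow, and no template has to enumerate NOTE1, NOTE2, and NOTE3 separately. If some notes should become note elements instead of subsection, as in the current stylesheet, the xsl:when branch could test the label text and choose the wrapper accordingly.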