2014-05-26 3 views

Ruby REXML XPath.each not traversing nodes

Hello, Stack Overflow. I'm having problems with XML and Ruby.

require 'rexml/document'
include REXML


xmlfile = File.new("sample.xml") 
xmldoc = REXML::Document.new xmlfile 
root = xmldoc.root 

# count = 0 

XPath.each(xmldoc, "//CRDoc/speech/speaking") do |element| 
    # puts element.attributes['name'] 
    # puts element.text 
    File.open(file_name + "_" + element.attributes['name'] + "-" + year + ".xml", 'a+') do |f| 
     f.write("<speaker>" + element.attributes['name'] + "</speaker>") 
     f.write("<speech>" + doc.xpath("//speech/speaking[@name='#{element.attributes['name']}']").text + "</speech>" + "\n") 
     # f.write("<speaker>" + element.attributes['name'] + "</speaker>") 
     # f.write("<speech>" + doc.xpath('//CRDoc/speech/speaking').text + "</speech>" + "\n") 
     #f.wrtie("<speech>" + doc.xpath("//CRDoc/speech/speaking[@name=#{element.attributes['name']}]").text + "</speech>" + "\n") 
    end 
end 

The code above reads in an XML file, which is included in this post. The problem I'm having is that it repeats the same speech over and over, instead of writing each individual speech from the XML file into a new XML file.

<?xml version="1.0" encoding="UTF-8"?> 
<CRDoc> 
    <volume>141</volume> 
    <number>1</number> 
    <weekday>Wednesday</weekday> 
    <month>January</month> 
    <day>4</day> 
    <year>1995</year> 
<chamber>House</chamber> 
<pages>H3</pages> 
<congress>104</congress> 
    <session>1</session> 
    <document_title>(Applause, the Members rising.)</document_title> 
    <title>TRIBUTE TO THE HONORABLE DONNALD K. ANDERSON</title> 
    <recorder>(Mr. BOEHNER asked and was given permission to address the House for 
1 minute.)</recorder> 
<speech><speaker name="Mr. BOEHNER">Mr. BOEHNER</speaker>.<speaking name="Mr. BOEHNER">Mr. Clerk, before we proceed with the nominations for 
Speaker of the House, on behalf of Republican Members of the House, we 
want to thank you for your 35 years of service to this institution, and 
your 35 years of service to the American people. You have done your job 
ably on behalf of all Members on both sides of the aisle.</speaking> 
    <speaking name="Mr. BOEHNER">And to the other officers of the House, who have served the House so 
ably and the American people so ably, we want to thank them as well for 
their service in this House.</speaking> 
    <speaking name="Mr. BOEHNER">Farewell, and best wishes from all of us.</speaking></speech> 
<speech> <speaker name="Mr. FAZIO">Mr. FAZIO</speaker>.<speaking name="Mr. FAZIO">Will the gentleman yield?</speaking></speech> 
    <speech><speaker name="Mr. BOEHNER">Mr. BOEHNER</speaker>.<speaking name="Mr. BOEHNER">I yield to my friend, the gentleman from California [Mr. 
Fazio].</speaking></speech> 
<speech> <speaker name="Mr. FAZIO">Mr. FAZIO</speaker>.<speaking name="Mr. FAZIO">I appreciate my friend yielding.</speaking> 
    <speaking name="Mr. FAZIO">I, too, would like to add a few words of tribute to our friend.</speaking> 
    <speaking name="Mr. FAZIO">When the 103d Congress came to an official close on noon Tuesday, the 
House literally lived on for the next 24 hours in the person of the 
gentleman from Sacramento, CA, the Clerk of the House, Donnald K. 
Anderson. In serving as the first presiding officer for the purpose of 
organizing the 104th Congress, he fulfilled his last ministerial duty 
to this institution. After four successive terms as Clerk and a career 
with the House that began as a Page when Dwight Eisenhower was 
President and Sam Rayburn sat in the Speaker's chair, Donn Anderson now 
leaves a distinguished career of public service.</speaking> 
    <speaking name="Mr. FAZIO">On a personal level for many of us in this Chamber, it was only 
natural for Donn Anderson to have been the thread of continuity from 
one Congress to the next. For over 30 years, Donn has embodied every 
good virtue of this House. He has been its memory, its defender, its 
champion and often its conscience. He understood perhaps better than 
anyone here the meaning of the word ``bipartisanship'' and he lived it 
daily in his work with the Members. In his 8 years as the second 
highest ranking officer of the House, he worked tirelessly to move the 
House into the information age and so greatly benefited our 
constituents, the American people.</speaking> 
    <speaking name="Mr. FAZIO">As chairman of the Subcommittee on Legislative Appropriations, I 
looked forward to our annual ritual of hearings knowing that I could 
always count on the Clerk for the most splendid testimony. Although 
Donn himself admitted to his preference for Victorian manners, there 
was nothing old-fashioned about the direction of his office. He was 
thoroughly modern in his vision for the future of the House, and he 
fought hard to keep us current with the times. Just as Donn could 
explain the artistic nuances of paintings in the Rotunda, he could just 
as easily give you the technical lowdown of cameras in this Chamber and 
on this floor. As the House moves forward today with the institutional 
reforms and the reorganization, we do so with the solid foundation left 
behind by Donn Anderson.</speaking> 
    <speaking name="Mr. FAZIO">Perhaps in parting we can borrow a phrase from our late and great 
Speaker Tip O'Neill. He simply said on so many occasions, ``So long, 
old pal.''</speaking> 
    <speaking name="Mr. FAZIO">Thank you, Donn Anderson.</speaking></speech> 
</CRDoc> 

So if anyone were to run the code above, it would create one XML file for each speaker in the sample. It would then create several speech/speaker nodes containing the SAME speech. I don't understand why REXML isn't traversing the nodes and writing each speech separately, rather than the same speech over and over.

I'm sure there is a better way to write this code, but I'm new to working with XML and XPath.

Thanks!

The expected output in the BOEHNER.xml file would be:

<speech><speaker name="Mr. BOEHNER">Mr. BOEHNER</speaker>.<speaking name="Mr. BOEHNER">Mr. Clerk, before we proceed with the nominations for 
Speaker of the House, on behalf of Republican Members of the House, we 
want to thank you for your 35 years of service to this institution, and 
your 35 years of service to the American people. You have done your job 
ably on behalf of all Members on both sides of the aisle.</speaking> 
    <speaking name="Mr. BOEHNER">And to the other officers of the House, who have served the House so 
ably and the American people so ably, we want to thank them as well for 
their service in this House.</speaking> 
    <speaking name="Mr. BOEHNER">Farewell, and best wishes from all of us.</speaking></speech> 
    <speech><speaker name="Mr. BOEHNER">Mr. BOEHNER</speaker>.<speaking name="Mr. BOEHNER">I yield to my friend, the gentleman from California [Mr. 
Fazio].</speaking></speech> 

As you can see, Mr. Boehner has 4 different speeches in his XML file. These correspond to his 4 different speeches in sample.xml, shown above.

So each speech in sample.xml goes into a new file named after its speaker.

<speaker>Mr. BOEHNER</speaker> 
<speech>Mr. BOEHNERMr. Clerk, before we proceed with the nominations for 
Speaker of the House, on behalf of Republican Members of the House, we 
want to thank you for your 35 years of service to this institution, and 
your 35 years of service to the American people. You have done your job 
ably on behalf of all Members on both sides of the aisle.And to the other officers of the House, who have served the House so 
ably and the American people so ably, we want to thank them as well for 
their service in this House.Farewell, and best wishes from all of us.Mr. BOEHNERI yield to my friend, the gentleman from California [Mr. 
Fazio].</speech> 

The output would be in the format above: each speech made by a speaker would be placed into [speaker name].xml in that format.


Could you show us the output you need?


The two XML files generated from the sample should be: sample.xml_Mr. BOEHNER-1995.xml and sample.xml_Mr. FAZIO-1995.xml. It won't let me add more details here, so I'll edit the question above with more detail. – CodeHard

Answer

I believe the code below will give you what you want:

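# Assumes doc is a Nokogiri::XML document and that file_name and year are set as in the question.
doc.xpath("//speech/speaking/@name").map(&:text).uniq.each do |name|
  File.open(file_name + "_" + name + "-" + year + ".xml", 'a+') do |f|
    doc.xpath('//speech').each do |speech|
      # Relative XPath: only this speech's paragraphs by the current speaker.
      speakings = speech.xpath("speaking[@name='#{name}']")
      next if speakings.empty? # skip speeches made by other speakers
      f.write '<speech>'
      f.write "<speaker name=\"#{name}\">#{name}</speaker>."
      speakings.each do |speaking|
        f.write "<speaking name=\"#{name}\">#{speaking.text}</speaking>"
      end
      f.write '</speech>'
    end
  end
end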

(Note that I'm using only Nokogiri here, not REXML.)

Alternatively, you could use Nokogiri itself to build your XML documents:
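
As for why the original loop repeats the same text: a path that starts with // is evaluated from the document root, so the inner query in the question matches every speaking node with that name on every iteration. A minimal sketch of the difference, using REXML and a shortened, made-up stand-in for sample.xml (the speech text here is hypothetical):

```ruby
require 'rexml/document'

# Shortened stand-in for sample.xml; the text is made up for illustration.
xml = <<~XML
  <CRDoc>
    <speech><speaking name="Mr. BOEHNER">First.</speaking><speaking name="Mr. BOEHNER">Second.</speaking></speech>
    <speech><speaking name="Mr. FAZIO">Will the gentleman yield?</speaking></speech>
    <speech><speaking name="Mr. BOEHNER">I yield.</speaking></speech>
  </CRDoc>
XML
doc = REXML::Document.new(xml)
name = 'Mr. BOEHNER'

# Absolute path: searched from the document root, so it matches every
# <speaking> by this speaker, regardless of which element the loop is on.
absolute = REXML::XPath.match(doc, "//speech/speaking[@name='#{name}']")

# Relative path: searched from a single <speech> element only.
first_speech = REXML::XPath.first(doc, '//speech')
relative = REXML::XPath.match(first_speech, "speaking[@name='#{name}']")

puts "absolute query matched #{absolute.size} nodes"
puts "relative query matched #{relative.size} nodes"
```

The relative form is what the inner loop in the code above relies on: it is evaluated against one speech element at a time.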

doc.xpath("//speech/speaking/@name").map(&:text).uniq.each do |name|
  speaker = Nokogiri::XML('<root/>')
  doc.xpath('//speech').each do |speech|
    speech_node = Nokogiri::XML('<speech/>')
    # Matches both the <speaker> and <speaking> children carrying this name.
    speech.xpath("*[@name='#{name}']").each do |speaking|
      # add_child reparents the node, moving it out of the source document.
      speech_node.root.add_child(speaking)
    end
    speaker.root.add_child(speech_node.root) unless speech_node.root.children.empty?
  end
  File.open(file_name + "_" + name + "-" + year + ".xml", 'a+') do |f|
    f.write speaker.root.children
  end
end
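
If you would rather stay with REXML, as in the question, the same grouping can be sketched without Nokogiri. This is only an illustration under assumptions: the helper name speeches_by_speaker is made up, the sample XML is a shortened stand-in for sample.xml, and the helper returns strings instead of writing files so that it is easy to inspect.

```ruby
require 'rexml/document'

# Hypothetical helper: returns { speaker name => XML string }, with one
# <speech> wrapper per original <speech> that contains that speaker's text.
def speeches_by_speaker(doc)
  grouped = Hash.new { |hash, key| hash[key] = +'' }
  REXML::XPath.each(doc, '//speech') do |speech|
    # Relative path: only the <speaking> children of this <speech>.
    speakings = REXML::XPath.match(speech, 'speaking')
    speakings.group_by { |s| s.attributes['name'] }.each do |name, nodes|
      grouped[name] << '<speech>' << nodes.map(&:to_s).join << "</speech>\n"
    end
  end
  grouped
end

# Shortened stand-in for sample.xml; the text is made up for illustration.
xml = <<~XML
  <CRDoc>
    <speech><speaking name="Mr. BOEHNER">First.</speaking><speaking name="Mr. BOEHNER">Second.</speaking></speech>
    <speech><speaking name="Mr. FAZIO">Will the gentleman yield?</speaking></speech>
    <speech><speaking name="Mr. BOEHNER">I yield.</speaking></speech>
  </CRDoc>
XML

result = speeches_by_speaker(REXML::Document.new(xml))
result.each do |name, body|
  puts "#{name}: #{body.scan('<speech>').size} speech(es)"
end
```

Each value could then be written to its own file, for example with File.write(file_name + "_" + name + "-" + year + ".xml", body), using the file_name and year variables from the question.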

Thank you for taking the time to answer this question. However, the first code block wrote every speech from both speakers into each file. The second code block creates only one file containing all the speeches, but it is very close to working. I'll continue with the second option and hope to get it working. Thanks for all your time! – CodeHard


The problem is in this line: File.open(file_name + "_" + element.attributes['name'] + "-" + year + ".xml", 'a+') do |f|
It cannot find the speaker's name to use when writing the file. – CodeHard


@CodeHard - an unfortunate copy and paste on my part. I have fixed the code (there is no element.attributes, only name).
