
Lesson: Define a Basic Terminology

Matt Zumwalt edited this page May 31, 2013 · 1 revision

This tutorial is known to work with ActiveFedora version 6.0.0.
Please update this wiki to reflect any other versions that have been tested.

Goals

  • Define a simple OM Terminology for XML metadata
  • Create OM Documents based on your Terminology
  • Create and update XML using the Terminology
  • Inspect OM Documents to find out more about a given Term

Explanation

Steps

Step 1: Think about what the XML is going to look like

For this first example we want to model simple, flat XML. Let's say the root node of our XML documents is called fields and we have elements for title and author.

<fields>
  <title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
  <author>Horn, Zoia</author>
</fields>

Note that we do not have any namespaces, attributes on elements, schema declarations, or any other joyful XML features. OM does provide ways to handle these, but it does not require them. We will look at each of those separately in other lessons.

Step 2: Define the Terminology

Now we'll create a file called book_metadata.rb

Paste the following code into that file:

require "om"
class BookMetadata 
  # This include statement adds the behaviors of an OM Document to your class
  include OM::XML::Document

  set_terminology do |t|
    t.root(path: "fields")
    t.title
    t.author
  end

  # This method is called when you create new XML documents from scratch.
  # It must return a Nokogiri::XML::Document. Other than that, you can make your "default" documents look however you want.
  def self.xml_template
    Nokogiri::XML.parse("<fields/>")
  end
end

Step 3: Create an OM Document based on your Terminology

Open an irb (Interactive Ruby) console. Rather than calling irb directly on the command line, use Bundler so that your dependencies are loaded predictably.

bundle console
require "./book_metadata"
newdoc = BookMetadata.new
puts newdoc.to_xml
<?xml version="1.0"?>
<fields/> 

Now you have an empty OM document that was initialized using the BookMetadata.xml_template method you defined.

Step 4: Use the Terminology to modify the XML Document and Render it as XML

Because this Document is a BookMetadata object, you can use the Terminology to set and retrieve the values of the Terms you've defined.

newdoc.author = "Horn, Zoia"
 => "Horn, Zoia" 
newdoc.title = "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know."
 => "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know." 
puts newdoc.to_xml
<?xml version="1.0"?>
<fields>
  <author>Horn, Zoia</author>
  <title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
</fields>

As you can see, calling .to_xml has returned an XML document with the title and author set to the values you provided.

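OM also makes it easy to update these elements. Assigning an array to a Term produces one element per value; in the output below, the second author is appended after the existing elements.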

newdoc.author = ["Horn, Zoia", "Hypatia"]
 => ["Horn, Zoia", "Hypatia"] 
puts newdoc.to_xml
<?xml version="1.0"?>
<fields>
  <author>Horn, Zoia</author>
  <title>ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.</title>
  <author>Hypatia</author>
</fields>
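
The element order in the output above (author, title, author) is consistent with a simple update strategy: overwrite the elements that already match, then append any extra values to the end of the parent. The following is a minimal sketch of that strategy, not OM's actual code. The set_values helper is hypothetical, and the sketch uses REXML (which ships with Ruby) instead of Nokogiri so that it runs without installing any gems.

```ruby
require "rexml/document"

# Hypothetical helper, not part of OM: write `values` into the elements
# matched by `xpath`, reusing existing elements before creating new ones.
def set_values(doc, xpath, name, values)
  existing = REXML::XPath.match(doc, xpath)
  values.each_with_index do |value, i|
    if existing[i]
      existing[i].text = value                 # overwrite in place
    else
      doc.root.add_element(name).text = value  # append to the end of the root
    end
  end
end

doc = REXML::Document.new("<fields><author>Horn, Zoia</author><title>ZOIA!</title></fields>")
set_values(doc, "//author", "author", ["Horn, Zoia", "Hypatia"])

order = doc.root.elements.map { |el| el.name }
puts order.inspect  # => ["author", "title", "author"]
```

If the order of elements matters to you, check the rendered XML after assigning multiple values rather than assuming the new elements sit next to the existing ones.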

Step 5: Access the Underlying Nokogiri Document and Stored XPath Queries

Each OM Document is essentially a wrapper around a Nokogiri Document, and its Terminology is essentially a structure that remembers XPath queries for you. To access the inner Nokogiri Document, call .ng_xml on the OM Document; to get the stored XPath query, call .xpath on any of the Terms.

Since OM simply runs XPath queries against that underlying Nokogiri document, you don't need to do anything to keep the OM Document in sync with the Nokogiri Document. You can use the Nokogiri API to make any changes you want to the Nokogiri Document and the OM Document will reflect those changes.

newdoc.title.xpath
 => "//title" 
newdoc.author.xpath
 => "//author"
newdoc.ng_xml
 => #<Nokogiri::XML::Document:0x80776da4 name="document" children=[#<Nokogiri::XML::Element:0x8090bbb0 name="fields" children=[#<Nokogiri::XML::Element:0x80818e74 name="author" children=[#<Nokogiri::XML::Text:0x80573868 "Horn, Zoia">]>, #<Nokogiri::XML::Element:0x804a22e0 name="title" children=[#<Nokogiri::XML::Text:0x805795c4 "ZOIA! Memoirs of Zoia Horn, Battler for the People's Right to Know.">]>]>]> 
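
To make the wrapper idea concrete, here is a toy sketch of a document class that stores one XPath query per term and runs it on demand. It is an illustration only and does not reflect how OM is implemented. ToyDocument, xpath_for, and values are hypothetical names, and REXML (bundled with Ruby) stands in for Nokogiri so the sketch runs without extra gems.

```ruby
require "rexml/document"

# Hypothetical stand-in for an OM Document: it wraps a parsed XML document
# and remembers one XPath query per term name.
class ToyDocument
  attr_reader :xml  # the wrapped document, playing the role of ng_xml

  def initialize(source, terms)
    @xml   = REXML::Document.new(source)
    @terms = terms  # e.g. { title: "//title" }
  end

  # Return the stored XPath query for a term.
  def xpath_for(term)
    @terms.fetch(term)
  end

  # Run the stored query now and return the text of each matching element.
  def values(term)
    REXML::XPath.match(@xml, xpath_for(term)).map(&:text)
  end
end

doc = ToyDocument.new("<fields><title>ZOIA!</title></fields>",
                      { title: "//title", author: "//author" })
puts doc.values(:author).inspect  # => []

# Change the wrapped document directly, bypassing the wrapper.
doc.xml.root.add_element("author").text = "Horn, Zoia"
puts doc.values(:author).inspect  # => ["Horn, Zoia"]
```

Because the query runs each time a value is read, nothing has to be copied or refreshed after the wrapped document changes.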

Step 6: Retrieve the Nokogiri NodeSet returned by a Term's XPath query

When you access a Term's values, OM runs the Term's XPath query for you and returns the values of the XML nodes that the query matched. If you want the Nokogiri NodeSet itself rather than the values of those nodes, call .nodeset on the Term.

newdoc.author.nodeset
 => [#<Nokogiri::XML::Element:0x80818e74 name="author" children=[#<Nokogiri::XML::Text:0x80573868 "Horn, Zoia">]>] 
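
The difference between nodes and values can be sketched outside OM as well. In the example below, author_nodes and author_values are hypothetical helpers, and REXML (bundled with Ruby) is used in place of Nokogiri so the code runs without extra gems. Treat it as an analogy for .nodeset versus reading the Term, not as OM's API.

```ruby
require "rexml/document"

# Return the matching element objects themselves (the analogue of .nodeset).
def author_nodes(doc)
  REXML::XPath.match(doc, "//author")
end

# Return only the text inside each matching element (the analogue of reading the Term).
def author_values(doc)
  author_nodes(doc).map(&:text)
end

doc = REXML::Document.new(<<~XML)
  <fields>
    <author>Horn, Zoia</author>
    <author>Hypatia</author>
  </fields>
XML

puts author_nodes(doc).map { |node| node.name }.inspect  # => ["author", "author"]
puts author_values(doc).inspect                          # => ["Horn, Zoia", "Hypatia"]

# Holding a node gives you the XML library's full API, including edits.
author_nodes(doc).last.text = "Hypatia of Alexandria"
puts author_values(doc).inspect  # => ["Horn, Zoia", "Hypatia of Alexandria"]
```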

Next Step

Go on to Lesson: Define a Terminology with a nested hierarchy of Terms or return to the Tame your XML with OM page.
