Pushing content into Surfulater from other Programs – Part 2

In Part 1 of this article I showed you how Surfulater’s XML Clipboard Format (SXCF) enables programs to add content to Surfulater. In this second and final part I’ll describe the SXCF in detail.

If you are familiar with XML you will see that SXCF is XML without an XML declaration or DTD. If you are interested in learning about XML, the Web has plenty of information available. This XML Tutorial is a reasonable starting point.

An SXCF record always starts and ends with:

<SULCONTENT [attributes]>
</SULCONTENT>

attributes consists of a series of compulsory and optional XML attributes which tell Surfulater where the content is coming from, and what to do with it. For example:

action="add" source="Clipboard" template="Clipboard"

tells us to “add” a new article, using the “Clipboard” article template, and that the content has come from the “Clipboard”.
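As a rough sketch, here is how a program might build that wrapper in Python using the standard xml.etree.ElementTree module. The helper name make_record is hypothetical; only the SULCONTENT element and the action, source and template attributes come from the format itself.

```python
import xml.etree.ElementTree as ET

def make_record(action, source, template=None):
    """Return an empty SULCONTENT element carrying the given attributes."""
    attrs = {"action": action, "source": source}
    if template is not None:
        attrs["template"] = template
    return ET.Element("SULCONTENT", attrs)

record = make_record("add", "Clipboard", template="Clipboard")

# encoding="unicode" returns a str with no XML declaration, and
# short_empty_elements=False keeps the explicit closing tag.
sxcf = ET.tostring(record, encoding="unicode", short_empty_elements=False)
print(sxcf)
```

This prints the opening and closing SULCONTENT tags on one line, with the three attributes from the example above.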

XML elements inside of <SULCONTENT> </SULCONTENT> provide the actual content to add, for each field in an Article, as well as providing additional content like MetaDescription and MetaKeywords, from a Web page.

The XML elements which specify content for a field in an article use the field’s name as their element name, e.g.

<Title>title text goes here</Title>

In this example the name Title in the XML <Title> element is the name of the article field in which to place the text “title text goes here”.
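A minimal Python sketch of the same idea: each key in a dictionary becomes an element named after the article field. The add_fields helper is hypothetical. Note that ElementTree escapes plain text for you, so this approach only suits fields without HTML markup.

```python
import xml.etree.ElementTree as ET

def add_fields(record, fields):
    """Add one child element per article field, named after the field."""
    for name, text in fields.items():
        ET.SubElement(record, name).text = text
    return record

record = ET.Element("SULCONTENT", {"action": "add", "source": "Clipboard"})
add_fields(record, {"Title": "title text goes here"})

sxcf = ET.tostring(record, encoding="unicode")
print(sxcf)
```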

Article field names are specified in their corresponding HTML Article Template. These field names are typically the same name that you see displayed in Surfulater, but don’t have to be. If you look at an Article Template you will see each field is defined something like this:

<td class="prompts1">Title
</td>
<td class="editbar">
<img src="pencil.gif" idedit="FLD.Title" />
</td>
<td class="normal">
<div class="celldiv" id="FLD.Title" />
</td>

The first Title is what’s displayed in the content window. This isn’t the field name. The actual field name is in the <div id="FLD.Title" .. /> element. It is this name, after the prefix FLD or FCV, that must match the SXCF field name.

Note that I’ve pared this template example down so we can focus on the parts which are important to this discussion.
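If you want to list the field names a template defines, something like the following Python sketch would do it. It assumes the prefix is separated from the name by a dot, as in the FLD.Title example, and that ids are written with straight double quotes. The function name is hypothetical.

```python
import re

# Matches id="FLD.Name" or id="FCV.Name". The \b stops it from
# matching other attributes that merely end in "id".
FIELD_ID = re.compile(r'\bid="(?:FLD|FCV)\.([^"]+)"')

def template_field_names(template_html):
    """Return the field names found in a template, in order, without repeats."""
    names = []
    for name in FIELD_ID.findall(template_html):
        if name not in names:
            names.append(name)
    return names

template = '''<td class="prompts1">Title</td>
<td class="editbar"><img src="pencil.gif" idedit="FLD.Title" /></td>
<td class="normal"><div class="celldiv" id="FLD.Title" /></td>'''

print(template_field_names(template))
```

For the pared-down template above this finds a single field, Title, which is the element name to use in the SXCF record.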

Handling HTML Markup

Field content that includes HTML markup must be treated in a special way, otherwise it will be processed as XML instead of HTML. If you look back at Example 2 in Part 1 you will see that the field content inside the <Text> element starts with <![CDATA[ and ends with ]]>. This enables normal HTML content to be used within the XML. If CDATA weren’t used, the HTML tags would be treated as XML elements and the result would be a big mess.

The best approach is to always use <![CDATA[]]> for field content, even if it doesn’t include any HTML. See XML CDATA for more information.
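Here is a small Python sketch of wrapping field content in CDATA. The cdata and field helpers are hypothetical names. One wrinkle: a CDATA section can’t contain the sequence ]]>, so the sketch splits it across two sections, which is the standard XML workaround. I’m assuming Surfulater’s parser accepts adjacent CDATA sections; the round trip below only checks this against Python’s own parser.

```python
import xml.etree.ElementTree as ET

def cdata(text):
    """Wrap text in a CDATA section, splitting any embedded ']]>'."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

def field(name, content):
    """Return an SXCF field element whose content is CDATA-wrapped."""
    return "<{0}>{1}</{0}>".format(name, cdata(content))

text_field = field("Text", "<p>Some <b>bold</b> text</p>")
print(text_field)

# Round trip: a standard XML parser hands back the HTML untouched.
assert ET.fromstring(text_field).text == "<p>Some <b>bold</b> text</p>"
```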

Attaching files to Articles 

SXCF elements that contain Surfulater field content include the ability to specify files to store in a Surfulater database, in other words attachments. A standard HTML link to a file looks like this:

<a href="file://my_test.doc">

If you include an attach="true" attribute, Surfulater will load the specified file and embed it in the database, e.g.

<a href="file://my_test.doc" attach="true">

This is used in Example 2 in Part 1.
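A program generating field content could produce such links with a helper along these lines. This is only a sketch: attachment_link is a hypothetical name, and the file:// form simply mirrors the example above. The resulting HTML would go inside the field’s CDATA section.

```python
from html import escape

def attachment_link(path, label):
    """Return an HTML link that asks Surfulater to embed the linked file."""
    href = escape("file://" + path, quote=True)
    return '<a href="{0}" attach="true">{1}</a>'.format(href, escape(label))

link = attachment_link("my_test.doc", "My test document")
print(link)
```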

Predefined SXCF Elements 

SXCF implements these predefined XML elements:

<URL></URL>

Attach the specified Web page to the article’s Attachment field. Note that this is only used for attaching Web pages, not files. It is used in conjunction with the action attribute as described below.

<MetaDescription></MetaDescription>

Contains the MetaDescription content from captured Web content. This is not currently used by Surfulater.

<MetaKeywords></MetaKeywords>

Contains the MetaKeywords content from captured Web content. This is not currently used by Surfulater.

<BASE></BASE>

Contains the <BASE href =…> value from the captured Web content. This is used to locate HTML items referenced in the content, in the same way it is used in your Browser.
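To illustrate what a base address does, here is how relative references resolve against one using Python’s standard urllib.parse.urljoin. This shows the general browser rule, not Surfulater’s internal code, and the example.com addresses are made up.

```python
from urllib.parse import urljoin

# Hypothetical base address, as it might appear in a <BASE> element.
base = "http://www.example.com/articles/index.html"

# A relative path resolves against the base's directory.
image_url = urljoin(base, "images/logo.gif")

# A path starting with "/" resolves against the site root.
css_url = urljoin(base, "/css/site.css")

print(image_url)
print(css_url)
```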

The <SULCONTENT [attributes]> in detail.

Name                 Value                Definition

action                                    Determines how to use the specified SXCF information as follows:
action               add                  Add a new record
action               attach               Attach content to current record. Only used to attach Web Pages as of 26 Apr 2006
action               update               Reserved for future use

source                                    Specifies the source of the SXCF information as follows:
source               WebBrowser           Content is being sent from a Web Browser
source               Clipboard            Content is coming from the Clipboard, i.e. from some other program.
source               UserDirect           Content comes from Surfulater itself. This is only used by Surfulater.

source_application                        Provides extra information on the ‘source’ provider.
source_application   InternetExplorer     Content is from IE
source_application   Firefox              Content is from Mozilla Firefox
source_application   MSWord               Content is from Microsoft Word or some other Windows Application

The following attributes are optional.

template             HTML template name   The name of the article template to use for creating this article. If it isn’t specified, the IE template is used. If the specified template doesn’t exist the article won’t be created. (1)
thumbnail            filename             The full path and filename of the thumbnail image to use for the article. The IE template is the only one to include a thumbnail. The other templates can be changed to include a thumbnail as can newly created templates.
deletethumbnail      true | false         If true or not specified the thumbnail image file is deleted once the article is created. false prevents it from being deleted.
emptytextok          true | false         Indicates main text field can be empty. If false or not specified and the text field is empty a notification popup tip will be displayed.
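The table above translates fairly directly into a checker that a sending program could run before pushing a record. The following Python sketch is one way to do it. The names are hypothetical, and treating only action and source as required is my assumption, based on the earlier example that omits source_application.

```python
# Attribute values taken from the table above.
ALLOWED = {
    "action": {"add", "attach", "update"},  # update is reserved for future use
    "source": {"WebBrowser", "Clipboard", "UserDirect"},
    "source_application": {"InternetExplorer", "Firefox", "MSWord"},
    "deletethumbnail": {"true", "false"},
    "emptytextok": {"true", "false"},
}
FREE_FORM = {"template", "thumbnail"}  # any template name or file path
REQUIRED = ("action", "source")        # assumption, see the note above

def check_attributes(attrs):
    """Return a list of problems found in a SULCONTENT attribute dict."""
    problems = []
    for name in REQUIRED:
        if name not in attrs:
            problems.append("missing attribute: " + name)
    for name, value in attrs.items():
        if name in ALLOWED:
            if value not in ALLOWED[name]:
                problems.append("bad value for {}: {}".format(name, value))
        elif name not in FREE_FORM:
            problems.append("unknown attribute: " + name)
    return problems

print(check_attributes({"action": "add", "source": "Clipboard", "template": "Clipboard"}))
print(check_attributes({"action": "replace"}))
```

An empty list means the attributes match the table. This says nothing about whether the named template actually exists, which only Surfulater can tell you.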

 

This should explain everything you need to start pushing content to Surfulater. It may seem a bit overwhelming at first glance, but if you look back at the examples in Part 1 they really are pretty simple. And of course if you need any help post here or in our support forums.

I mentioned a few posts back that parts of Surfulater had taken on a life of their own. Well, SXCF was one such part, and there is still more to be done. For example, Perry is pushing for the SXCF to be able to replace an existing attached file with a newer one, which I agree with. Other capabilities include the ability to append content to a field in an existing article, or replace it completely. SXCF provides a solid foundation to build these and other capabilities on.

6 Replies to “Pushing content into Surfulater from other Programs – Part 2”

  1. Good release, maybe it’s just me but it feels like it starts up quicker. Keep up the good work.

  2. Neville, thanks for this. I was playing when this was undocumented (decompiling Firefox extensions is easier than it ought to be), and now that it is documented (and more powerful), I’ll have to get back onto the case.

  3. Hi Folks, I’ve been wanting to write about the Surfulater XML Clipboard format for a while, pretty much ever since Perry started pushing and prodding for more info. I’m both pleased and relieved that it is now out there and hope it gets put to good use, which I’m sure it will.

    Perry, +1 for world domination. 😉

  4. and embed it in the database eg.

    It is not a database is it?

    It is an xml file.

    I can’t index fields, or create stored procedures?

  5. Hi Bill,
    Whatever gets pushed across using the Surfulater XML Clipboard Format gets embedded in the Knowledge Base. This includes images, file attachments, text and complete Web pages.

    Example 2 back in Part 1 shows a combination of HTML, an image and a local file being added to (embedded in) a Surfulater Knowledge Base.

    Surfulater uses a combination of an XML file and a Database file for each Knowledge Base.

    You could index the XML and soon you’ll be able to index content from Surfulater, published as HTML Web pages. For example using Desktop Search.

    As for stored procedures, no. I’d be interested to know what use you’d see these being put to.
