Improving performance with Multi-threading

A few weeks back a Surfulater customer (let’s call her Mary) was having a problem where Surfulater wouldn’t start. It turned out that if she closed Surfulater and then immediately started it again, it wouldn’t start. While trying to work out what was causing the problem, I asked if I could get a copy of the Knowledge Base Mary was using. Once I got that I was able to reproduce the problem and fix it. This particular database was reasonably large and took a few seconds to save while Surfulater was closing. When you close Surfulater it disappears from the screen straight away; however, it is actually still running, saving your knowledge bases and cleaning up after itself. If you started a new copy of Surfulater before the first one had truly closed, things got messy, causing the new copy to crash. So to end this part of the story, that problem is now fixed.

But the real reason for this post is to tell you about an important improvement in performance that I’m just finishing up. For a while now some users have been asking for an option to display a summary list of articles in the content window when you select a folder in the tree, instead of displaying the full content of all the articles. The reason for this request is that if you have lots of articles in a folder, it can take a little while before they are displayed, and during that time you are locked out of doing anything with Surfulater. This was never an issue that overly affected me, as my articles always came up quickly enough. That all changed though when browsing through Mary’s Knowledge Base (with her permission of course). Surfulater could go off with the pixies for many seconds at a time and all you could do was twiddle your fingers and wait. So why was there such a dramatic difference to what I was personally used to? Well, it is simply because Mary’s Knowledge Base contains lots of very big articles, and mine don’t. By very big I’m talking about articles with 100 to 400 KBytes of text, often containing quite a few largish images. Displaying one of these articles may take, say, one second, which is OK. But when you have a folder with 30 like this, you could be waiting around for up to 30 seconds. If you clicked on the folder by mistake, then well, that’s just too bad.

So where is this taking us? It was clear from seeing Mary’s Knowledge Base that something needed to be done. I could have provided the requested option of simply displaying a list of article titles when a folder was selected, but I’ve never felt particularly comfortable with that solution. This is because the ability to display all articles in a folder, in their entirety, is very important for some people. Also, there is little point displaying a list of articles when you already have that in the tree. So you may as well just display the folder, without any article information.

A better and more ambitious solution was to hive off the process of displaying articles, so that it wouldn’t interfere with Surfulater’s operation. To accomplish this I’m using what is called multi-threading, which in essence enables a program to run multiple tasks (threads) at the same time, instead of sequentially one after the other. This sounds simple enough, but in fact anyone who has worked on the development of a multi-threaded application knows only too well how complex it is. One overly simple analogy is to think about two (or more) people trying to drive the same car at the same time! The most complex aspect of multi-threading is ensuring the threads don’t get in each other’s way. If they do, the program can easily hang, or corrupt files or other resources.
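
To make the idea concrete, here is a minimal Python sketch of handing article display to a background thread. It is not Surfulater’s code; the names FolderRenderer and render_article are made up for illustration, and the "rendering" is just a stand-in for the slow work. The point it shows is that articles are produced one at a time on a worker thread, and selecting another folder cancels whatever is still pending.

```python
import queue
import threading


def render_article(article):
    """Stand-in (hypothetical) for the slow work of laying out one article."""
    return f"<div>{article}</div>"


class FolderRenderer:
    """Renders a folder's articles on a background thread, one at a time."""

    def __init__(self):
        self.results = queue.Queue()
        self._cancel = threading.Event()
        self._thread = None

    def show_folder(self, articles):
        # Selecting another folder abandons whatever is still being rendered.
        self.cancel()
        self.results = queue.Queue()
        self._cancel = threading.Event()
        self._thread = threading.Thread(
            target=self._work,
            args=(list(articles), self.results, self._cancel),
            daemon=True,
        )
        self._thread.start()

    @staticmethod
    def _work(articles, results, cancel):
        for article in articles:
            # The cancel flag is only checked between articles, so a cancel
            # waits for at most the article currently being rendered.
            if cancel.is_set():
                return
            results.put(render_article(article))

    def cancel(self):
        self._cancel.set()
        self.wait()

    def wait(self):
        if self._thread is not None:
            self._thread.join()


renderer = FolderRenderer()
renderer.show_folder(["First article", "Second article"])
# The caller is free to do other work here; a UI would poll renderer.results.
renderer.wait()
```

Each worker gets its own queue and cancel flag, so a stale thread can never write into the results of a newer folder. That is one simple way to keep the two "drivers" out of each other’s way.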

These animations show the improvements I’ve been able to achieve. Both show me holding down the down arrow key to move through the tree. The left-hand image shows the behaviour as it has been to date, without multi-threading. Notice that when the tree selection moves to a folder with a number of large articles, it pauses, waiting for the articles to be displayed. This is the dead time I referred to earlier. The right-hand image shows the new multi-threaded implementation, which you can clearly see is much more responsive. In fact it is faster than this image shows, as I was constrained by the recording software I used to create the animations.

 

(You need to use IE to see the movies. For some reason they don’t appear in Firefox!)

The results are clear. You can now click on a folder with lots of large articles and continue working, without having to wait for all articles to be displayed. There may still be a short pause, depending on the size of the article currently being processed. This is due to constraints we have on what we are able to do in our background thread.

For the folks that have in the past requested an option to not display articles when a folder is selected, I’d like to know if this new multi-threaded implementation removes that need. And of course I welcome everyone else’s comments as always.

The plan is to have this available in the next release, V1.99.0.0, which should see the light of day early next week, if not sooner.

Data Execution Prevention & Rex Winn

Back in Dec 2005 Charles Cowan reported a problem where Surfulater would crash whenever it went to display one of its pop-up tip windows. I’d had no other reports of this and was unable to reproduce it on any of our test PCs. As any software developer will tell you, these are the worst type of problems to have.

Over the course of a few days I created several special builds of Surfulater that I sent to Charles in an effort to isolate the precise cause of the problem. The results from this process made little sense at the time and I was unable to get to the bottom of it. Charles informed me that he was having problems with other programs as well and in the end we agreed it was likely a problem with his PC, and not Surfulater. I did suggest a work-around which enabled him to continue using Surfulater, but I was not particularly satisfied with the outcome.

Step forward to earlier this month, when Rex Winn told me he was having a problem with Surfulater crashing when capturing new articles. Rex and I went back and forth on this and I concluded fairly quickly that it was the same problem Charles had reported. When a new article is created Surfulater pops up a tip window, and there came the same crash. As it fortuitously happened, Rex is also a software developer, and after a lengthy exchange of e-mails he was able to pinpoint the cause.

Surfulater was working fine on Rex’s PC until he did two things. First he changed his operating system from Windows XP to Windows 2003 Server, and second he upgraded the BIOS on his motherboard. At that point Surfulater would run OK until a pop-up tip was displayed. But then Rex needed to re-boot his PC, and after that Surfulater wouldn’t start at all. Like Charles, Rex mentioned he was also having problems with some other programs, and the only way to get these programs to work was to turn off Data Execution Prevention.

I hadn’t heard of Data Execution Prevention, or DEP, so it was time to hit Google. The first thing I found was that DEP wasn’t enabled on my PC, so I quickly rectified that, started Surfulater and it worked perfectly. Next I discovered there are two aspects to DEP, Software DEP and Hardware DEP, and I’d only enabled Software DEP. Hardware DEP is only available on newer PCs, so I didn’t even know if my PC supported it. After some digging around in the BIOS I found the option that enabled Hardware DEP, turned it on, started Windows, then Surfulater, and lo and behold, I was able to reproduce the problem Charles and Rex were having with the pop-up window. Everything else worked fine, and the problem where Rex couldn’t even start Surfulater remained a real mystery.

So now it was time to start digging and find out precisely what was causing DEP to throw its hands up in horror and kill Surfulater. Before I do let me provide some information on DEP from Microsoft.

Data Execution Prevention (DEP) is a set of hardware and software technologies that perform additional checks on memory to help prevent malicious code from running on a system. In Microsoft Windows XP Service Pack 2 (SP2) and Microsoft Windows XP Tablet PC Edition 2005, DEP is enforced by hardware and by software.

The primary benefit of DEP is to help prevent code execution from data pages. Typically, code is not executed from the default heap and the stack. Hardware-enforced DEP detects code that is running from these locations and raises an exception when execution occurs. Software-enforced DEP can help prevent malicious code from taking advantage of exception-handling mechanisms in Windows.

The bottom line here is that DEP will help prevent Trojan and Virus programs from running on your PC, which is an admirable goal.

So back to Surfulater. It turned out that some third party code I use was causing the problem by trying to assemble and execute code in a data section. This is something Windows programmers have been doing forever, but can no longer do, at least on Windows XP SP2 and Windows 2003 SP1 with DEP enabled. The good news was that other folks using this same code were aware of the problem and several solutions had been presented. Kevin Hoffman’s code was a perfect fit, and after a few hours reworking my code, Surfulater and DEP played happily together. I quickly sent a copy off to Rex to try, and he ecstatically reported that his favourite program was working perfectly again.

To quote Rex:

The good news is that it *WORKED* YOU ROCK!!!

The bad news… THERE ISN’T ANY!!! SWEET!!! I’m back into SUL and
loving it!!! WOOOOHOOOO!!!

As I said at the start, these are the worst sorts of problems to track down. Rex’s help was truly invaluable and for that I remain indebted.

A final word of advice for any programmers reading this. Ensure you thoroughly test your code with both Hardware and Software DEP enabled.

And to my Surfulater readers: Version 1.97, Build 0.20 was released today with this updated code in place. Download here.

Pushing content into Surfulater from other Programs – Part 2

In Part 1 of this article I showed you how Surfulater’s XML Clipboard Format (SXCF) enables programs to add content to Surfulater. In this second and final part I’ll describe the SXCF in detail.

For those familiar with XML you will see that SXCF is XML without an XML declaration or DTD. If you are interested in learning about XML, the Web has plenty of information available. This XML Tutorial is a reasonable starting point.

An SXCF record always starts and ends with:

<SULCONTENT [attributes]>
</SULCONTENT>

attributes consists of a series of compulsory and optional XML attributes which tell Surfulater where the content is coming from, and what to do with it. For example:

action="add" source="Clipboard" template="Clipboard"

tells us to “add” a new article, using the “Clipboard” article template, and that the content has come from the “Clipboard”.

XML elements inside of <SULCONTENT> </SULCONTENT> provide the actual content to add, for each field in an Article, as well as providing additional content like MetaDescription and MetaKeywords, from a Web page.

The XML elements which specify content for a field in an article use the field’s name as their element name, e.g.

<Title>title text goes here</Title>

In this example, Title, the name of the XML <Title> element, is also the name of the article field that receives the text “title text goes here”.
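
For anyone generating SXCF from a program, the following Python sketch shows one way to assemble a record from a set of attributes and field values. The helper name build_sxcf is made up for this example and is not part of Surfulater. It escapes plain text so the result stays well-formed XML, and it assumes the field values contain no HTML markup (that case is covered under Handling HTML Markup).

```python
from xml.sax.saxutils import escape, quoteattr


def build_sxcf(fields, **attributes):
    """Return an SXCF record as text (hypothetical helper, not a Surfulater API).

    fields     -- mapping of article field name to plain-text content
    attributes -- the <SULCONTENT> attributes, e.g. action="add"
    """
    attrs = " ".join(
        f"{name}={quoteattr(value)}" for name, value in attributes.items()
    )
    lines = [f"<SULCONTENT {attrs}>"]
    for name, text in fields.items():
        # The element name is the article field name; escape &, < and >.
        lines.append(f" <{name}>{escape(text)}</{name}>")
    lines.append("</SULCONTENT>")
    return "\n".join(lines)


record = build_sxcf(
    {"Title": "Example 1", "Text": "Some text for example one"},
    action="add",
    source="Clipboard",
    template="Clipboard",
)
print(record)
```

The printed text has the same shape as Example 1 in Part 1, and is what a program would place on the Windows Clipboard.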

Article field names are specified in their corresponding HTML Article Template. These field names are typically the same name that you see displayed in Surfulater, but don’t have to be. If you look at an Article Template you will see each field is defined something like this:

<td class="prompts1">Title
</td>
<td class="editbar">
<img src="pencil.gif" idedit="FLD.Title" />
</td>
<td class="normal">
<div class="celldiv" id="FLD.Title" />
</td>

The first Title is what’s displayed in the content window. This isn’t the field name. The actual field name is in the <div id="FLD.Title" .. /> element. It is this name, after the prefix FLD or FCV, that must match the SXCF field name.

Note that I’ve pared this template example down so we can focus on the parts which are important to this discussion.
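
If you want to find out which field names a template accepts, a short script can list them. This Python sketch assumes the id attributes always look like the one shown above, i.e. the prefix FLD or FCV followed by a dot and then the field name. The function name template_field_names is invented for this example.

```python
import re

# Matches id="FLD.Name" or id="FCV.Name". The \b stops it from matching the
# idedit attribute on the pencil image. The dot separator is an assumption
# based on the single template excerpt shown above.
FIELD_ID = re.compile(r'\bid="(?:FLD|FCV)\.(\w+)"')


def template_field_names(template_html):
    """Return the SXCF field names defined in an article template."""
    return FIELD_ID.findall(template_html)


template = '''
<td class="prompts1">Title
</td>
<td class="editbar">
<img src="pencil.gif" idedit="FLD.Title" />
</td>
<td class="normal">
<div class="celldiv" id="FLD.Title" />
</td>
'''
print(template_field_names(template))
```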

Handling HTML Markup

Field content that includes HTML markup must be treated in a special way, otherwise it will be processed as XML instead of HTML. If you look back at Example 2 in Part 1 you will see that the field content inside the <Text> element starts with <![CDATA[ and ends with ]]>. This enables normal HTML content to be used within the XML. If CDATA weren’t used, the HTML tags would be treated as XML elements and the result would be a big mess.

The best approach is to always use <![CDATA[]]> for field content, even if it doesn’t include any HTML. See XML CDATA for more information.
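
A program producing SXCF can do the wrapping with a one-line helper. The Python sketch below (the name cdata is made up) also deals with the one sequence that cannot appear inside a CDATA section, ]]>, by splitting it across two sections, which is the usual XML technique. Whether Surfulater’s own parser accepts such a split section is an assumption I haven’t verified; the check at the end uses Python’s standard XML parser.

```python
import xml.etree.ElementTree as ET


def cdata(text):
    """Wrap text in a CDATA section so HTML markup passes through untouched."""
    # "]]>" would end the section early, so close it after "]]" and
    # reopen a new section for the ">".
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


html = '<FONT face=Arial size=2>Tiny <b>hard drive</b> & more</FONT>'
element = "<Text>" + cdata(html) + "</Text>"

# A standard XML parser hands back the HTML exactly as it went in.
assert ET.fromstring(element).text == html
```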

Attaching files to Articles 

SXCF elements that contain Surfulater field content include the ability to specify files to store in a Surfulater database. In other words attachments. A standard HTML link to a file looks like this:

<a href="file://my_test.doc">

By including an attach="true" attribute, Surfulater will load the specified file and embed it in the database, e.g.

<a href="file://my_test.doc" attach="true">

This is used in Example 2 in Part 1.
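
As a small illustration, this Python sketch builds such a link from a Windows path. The helper name attachment_link is hypothetical. It follows the form used in the examples (file:// followed by the path with forward slashes), and the resulting markup would normally go inside a CDATA section in the field content.

```python
from html import escape


def attachment_link(path, label):
    """Return an HTML link that asks Surfulater to embed the file (sketch)."""
    # Mirror the examples: file:// prefix, forward slashes in the path.
    url = "file://" + path.replace("\\", "/")
    return f'<a href="{escape(url, quote=True)}" attach="true">{escape(label)}</a>'


print(attachment_link(r"d:\saig\bin6\test1.doc", "test1.doc"))
```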

Predefined SXCF Elements 

SXCF implements these predefined XML elements:

<URL></URL>

Attach the specified Web page to the article’s Attachment field. Note that this is only used for attaching Web pages, not files. It is used in conjunction with the action attribute as described below.

<MetaDescription></MetaDescription>

Contains the MetaDescription content from captured Web content. This is not currently used by Surfulater.

<MetaKeywords></MetaKeywords>

Contains the MetaKeywords content from captured Web content. This is not currently used by Surfulater.

<BASE></BASE>

Contains the <BASE href =…> value from the captured Web content. This is used to locate HTML items referenced in the content, in the same way it is used in your Browser.

The <SULCONTENT [attributes]> in detail.

Name                Value               Definition
action                                  Determines how to use the specified SXCF information, as follows:
action              add                 Add a new record.
action              attach              Attach content to the current record. Only used to attach Web pages as of 26 Apr 2006.
action              update              Reserved for future use.

source                                  Specifies the source of the SXCF information, as follows:
source              WebBrowser          Content is being sent from a Web browser.
source              Clipboard           Content is coming from the Clipboard, i.e. from some other program.
source              UserDirect          Content comes from Surfulater itself. This is only used by Surfulater.

source_application                      Provides extra information on the ‘source’ provider.
source_application  InternetExplorer    Content is from IE.
source_application  Firefox             Content is from Mozilla Firefox.
source_application  MSWord              Content is from Microsoft Word or some other Windows application.

The following attributes are optional.
template            HTML template name  The name of the article template to use for creating this article. If it isn’t specified, the IE template is used. If the specified template doesn’t exist, the article won’t be created. (1)
thumbnail           filename            The full path and filename of the thumbnail image to use for the article. The IE template is the only one to include a thumbnail. The other templates can be changed to include a thumbnail, as can newly created templates.
deletethumbnail     true | false        If true or not specified, the thumbnail image file is deleted once the article is created. false prevents it from being deleted.
emptytextok         true | false        Indicates the main text field can be empty. If false or not specified and the text field is empty, a notification pop-up tip will be displayed.
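
The table above translates fairly directly into a check a program could run before putting a record on the Clipboard. This Python sketch is only an illustration: the function name check_attributes is made up, and treating action and source as the compulsory attributes is my reading of the table and of Example 1 (which omits source_application), not something Surfulater is confirmed to enforce.

```python
# Allowed values, taken from the attribute table.
ALLOWED_VALUES = {
    "action": {"add", "attach", "update"},
    "source": {"WebBrowser", "Clipboard", "UserDirect"},
    "source_application": {"InternetExplorer", "Firefox", "MSWord"},
    "deletethumbnail": {"true", "false"},
    "emptytextok": {"true", "false"},
}
# Attributes whose value is a name or path rather than a fixed keyword.
FREE_TEXT = {"template", "thumbnail"}
# Assumption: these two are the compulsory attributes.
REQUIRED = ("action", "source")


def check_attributes(attributes):
    """Return a list of problems found in a <SULCONTENT> attribute mapping."""
    problems = [
        f"missing attribute: {name}" for name in REQUIRED if name not in attributes
    ]
    for name, value in attributes.items():
        if name in FREE_TEXT:
            continue
        allowed = ALLOWED_VALUES.get(name)
        if allowed is None:
            problems.append(f"unknown attribute: {name}")
        elif value not in allowed:
            problems.append(f"{name}: unexpected value {value!r}")
    return problems


print(check_attributes({"action": "add", "source": "Clipboard", "template": "Clipboard"}))
print(check_attributes({"action": "delete", "source": "Clipboard"}))
```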

 

This should explain everything you need to start pushing content to Surfulater. It may seem a bit overwhelming at first glance, but if you look back at the examples in Part 1 they really are pretty simple. And of course if you need any help, post here or in our support forums.

I mentioned a few posts back that parts of Surfulater had taken on a life of their own. Well, SXCF was one such part, and there is still more to be done. For example, Perry is pushing for the SXCF to be able to replace an existing attached file with a newer one, which I agree with. Other capabilities include the ability to append content to a field in an existing article, or replace it completely. SXCF provides a solid foundation to build these and other capabilities on.

Pushing content into Surfulater from other Programs – Part 1

Surfulater provides a simple method to enable content to be added to it from other applications (think MS Word etc.) via the Windows Clipboard. This was developed for our own internal use, and has now been extended and opened up following user discussions on our Support Forums.

This is accomplished through a simple, clearly defined format that enables anyone to add new content to Surfulater in an open, extensible way. As an example, one of our Surfulater users, Perry Mowbray, has used this to write an add-in for Microsoft Word that enables Word users to add new content to Surfulater.

In essence, whenever Surfulater is running it keeps an eye on information that gets placed into the Windows Clipboard, and if it sees something destined for it, it grabs it and creates a nice new article. For this to work, the information placed into the Clipboard needs to follow some simple rules. Let’s start with an example:

<SULCONTENT action="add" source="Clipboard" template="Clipboard">
 <Title>Example 1</Title>
 <Text>Some text for example one</Text>
</SULCONTENT>

This is about as simple as it gets. To try it for yourself, start Surfulater, then right click and choose Save Target As.. on this file and save it as SXCF_Example1.txt. Next, open the file in Windows Notepad, use Edit|Select All and then Edit|Copy to copy it to the Windows Clipboard. Surfulater will pick this up and create the following article:

Example 1 
Now for something more complex:

<SULCONTENT action="add" source="WebBrowser" source_application="InternetExplorer" template="IE" thumbnail="">
 <Title>Mini HDD</Title>
 <Text><![CDATA[<TR><TH width="100%" bgColor=#d5d6d5><FONT face=Arial,Helvetica size=2>Tiny
                hard drive garners Guinness World Record as smallest HDD</FONT></TH></TR>
                <TR><TD width="100%" height=20><FONT face=Arial,Helvetica size=2>Mar. 17, 2004</FONT> --
                <FONT face=Arial,Helvetica size=2><IMG hspace=10 src="http://deviceforge.com/files/misc/toshiba-tinyhd-thm.jpg"
                align=left vspace=5>Toshiba announced this week that Guinness World
                Records has certified its 0.85-inch hard disk drive (HDD) as the smallest
                HDD in the world. Toshiba claims its 0.85-inch HDD, announced in
                January 2004, is the first HDD to deliver multi-gigabyte data
                storage in a sub-one-inch form factor.</FONT>
                <A href="http://deviceforge.com/news/NS8560517030.html">
                <IMG src="http://deviceforge.com/images/readmore.gif" align=right border=0></A>
                </TD></TR>]]>
 </Text>
 <Reference><![CDATA[<A href="http://deviceforge.com/news/200404120922NS2297227508.html">
                http://deviceforge.com/news/200404120922NS2297227508.html</A>]]>
 </Reference>
 <Comments>This is a copy of an article included in the sample Knowledge Base included with Surfulater.
 </Comments>
 <Attachments><![CDATA[I’ve also attached a MS Word Document located on my PC <a href="file://d:/saig/bin6/test1.doc" attach="true">
                test1.doc</a> Neat huh!]]>
 </Attachments>
</SULCONTENT>

This is a copy of an article from the sample Knowledge Base included with Surfulater, along with some extra content: a Comment and an Attachment. Perform the steps outlined above on this file, SXCF_Example2.txt, to see the result for yourself. The Attachment will be missing unless you just so happen to have a file d:\saig\bin6\test1.doc. And this is what it looks like.

Surfulater Article

These examples should clearly show the capabilities on offer with Surfulater’s XML Clipboard Format. I have to say that I think this is pretty neat stuff; not rocket science, but impressive nonetheless.

In Part 2 of this article I’ll explain the SXCF in detail, enabling you to use it in your own applications or in program add-ons.

Note: The examples above and the content of these articles require Surfulater V1.96.0.0 or later.

Surfulater, Under the Hood and Down the Road

I’ve been asked to write about my vision for Surfulater and decided a Blog post would be a good place for this. I’m afraid it is a bit long winded as I want to lay down some background material so readers will know where I am coming from. I’m told vision statements contain lots of motherhood gobbledygook. Excuse me for excluding such fluff and for not being as visionary as some may like.

I’ve been designing, developing and publishing software for over 20 years. For a number of years I worked with a team of programmers on vertical market applications, in a company of which I was a director. For the past 15 years I’ve worked predominantly on my own, on a product named ED for Windows which is a full featured programmer’s editor. ED is a very large and complex application, with a large and diverse user base who place many demands on it. It is a highly configurable application and can be extended via a built-in scripting language. It also supports some 35+ programming languages. Bottom line – a big, complex, powerful application that most people will never fully utilize.

For quite some time I’d been keen to develop other products and I finally made a small start in late 2003. I spend a lot of time on the Internet researching all manner of things. A lot of the time it is to do with programming, but also business, travel and other personal interests. I was very frustrated by the poor tools available to collect and save information that I found while surfing, and needless to say Bookmarks and Favorites just don’t cut it. So the idea for Surfulater was born.

Surfulater Content Style Issues

A few people have mentioned that sometimes content they capture from Web pages into Surfulater doesn’t display using the same styles used on the originating Web page. I want to explain why this occurs.

First let me say that if you capture complete web pages, this isn’t an issue. It only occurs with content displayed in the Surfulater content window, and that’s the key.

All content in Surfulater uses HTML. HTML has default styles which can be overridden by Cascading Style Sheets, commonly referred to as CSS. By styles I’m referring to things such as font size, color, bold, italic etc.

As an example, let’s look at how a paragraph style works. Text inside a paragraph tag <p> will use either a default style dictated by your Web browser, a style embedded in the web page, or a style in an external style sheet. The last two cases override the browser’s default style and cause all paragraph text to be displayed in some other style, a larger bold font for example.

Surfulater enables you to view all of the articles in a folder at once, which is a great feature that many people want and one that sets Surfulater apart from the competition. What this means, however, is that you are most likely viewing HTML captured from a whole range of Web sites, each with its own Cascading Style Sheet. So paragraphs from one Web site may well be displayed with quite different styles to those of their next-door neighbour.

This mix of styles (CSS) from different Web sites basically makes it very difficult, if not impossible for us to use the same styles as the original site, simply because they will all conflict with each other. For this reason we don’t even bother bringing across external CSS information when we capture selected content.
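
To see why the styles collide, here is a toy Python model of what happens when the style sheets from two captured sites are applied to one shared content window. It is a deliberate simplification of the CSS cascade (it ignores specificity and only models "a later rule for the same selector overrides an earlier one, property by property"). The function name and the sample rules are invented for illustration.

```python
def apply_style_sheets(sheets):
    """Combine style sheets in order; later declarations win per property."""
    effective = {}
    for sheet in sheets:
        for selector, declarations in sheet.items():
            effective.setdefault(selector, {}).update(declarations)
    return effective


# Hypothetical paragraph rules from two different Web sites.
site_a = {"p": {"font-size": "12px", "color": "black"}}
site_b = {"p": {"font-size": "18px", "font-weight": "bold"}}

combined = apply_style_sheets([site_a, site_b])
print(combined["p"])
```

With both sheets loaded, every paragraph in the window, including those captured from the first site, ends up large and bold, which is not what either original page looked like.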

The story is quite different when we save complete Web pages though. In that case we are only ever dealing with a single web site and only ever display a single page from the one site at a time. We capture all embedded and external CSS when we capture the web page and use that in turn when we display the page again.

Surfulater does use its own Cascading Style Sheets, so we can offer a good degree of control over the styles used in the content window. At present we have two style sheets. One is used when E-Mailing content from Surfulater (surfulater.css) and the other is used for the Content window and is embedded inside Surfulater. I plan to pull this out and make it an external file in a future release. This will enable styles to be set for headings, paragraphs etc. In fact my example above, which used paragraphs, wasn’t a very good one, as heading tags are the ones that show up worst.

So the bottom line is we can improve on the current situation and will, however it is unlikely we’ll see the exact same styles as on the original web page, unless of course you save and attach the entire page!

Update 26 Nov 2005: Putting one’s thoughts in writing often helps you to see things that were hidden away before. And such is the case here. I can now see a way clear to get the styles to match up correctly. I just need to write the code and see if I’m right.