Lists all of the journal entries for the day.

Thu, 24 Dec 2009

11:12 AM - The problem with software as a service

I've been reading for years about the future of software.  Most people believe software as a service will take off.  In some ways, it already has.  There are a few fatal flaws with this model, though.  It requires trust.  Companies have the resources to absorb losing access to a service they purchased.  It is costly, but they can afford to move to another provider.  If a home user (end user) purchases access to Microsoft Office over the Internet, they expect to be able to access it at any time.  Their child may have a report due the next day.  They may be working from home to finish a report for the CEO on Monday.  If the software is not available due to a network outage, downtime at the provider, or a billing dispute, the home user is out of luck.  Worse yet, Microsoft could stop offering the service in the middle of the year.  They've done this with Encarta Online, MSN, and other services in the past.  And what do you do if the company goes out of business?

The antivirus industry has moved to this model effectively.  You now purchase software with a one-year subscription.  At the end of that subscription, your computer is no longer protected.  It's a great model for them.  Your computer may get infected after the subscription lapses, and then their renewed software or another product will look like it just found a bunch of viruses the previous version could not.  Most people don't move quickly on buying new AV software, and the old software is often hard to remove.

My company has been considering housing our data storage at a third-party location.  My concern is what happens if there is a falling out with the provider.  Is our data at risk?  People talk about cloud computing like it's this amazing service, but in reality it's just web hosting that might be distributed.  You're still counting on one provider to protect your data and make it available to you, just as the home user is expecting MS Office to work.  It's a large risk which involves trust.  Your business's future hangs on this one provider.  What if they go out of business, get sold to a company that doesn't care about the cloud, or have a network failure at a critical point in your business cycle?  These things can happen to a self-hosted solution as well, but there you can do something about it.

I'm not trying to pick on Microsoft.  I've never used Office online, so I don't know what it's like.


11:35 AM - Software Update: A distraction

I have been busy working lately.  With the weather change, I haven't felt much like using my PC.  Today, I booted up Windows Vista to look up a book on my ebook reader.  Five different programs started downloading updates at the same time.  When I started the ebook reader, another download began.

I think Microsoft and Apple should develop a software update framework for their operating systems.  This framework would be usable by all software.  It would allow updates to be downloaded in turn, throttled, and scheduled, and it would offer one place to manage them.  I'm sick of having a bunch of programs running in the background just to check for software updates.  The Java Update Scheduler, RealPlayer, Acrobat, Flash, Steam, antivirus, Xfire, MSN, ... it's just got to stop.

There is also a security advantage to this idea.  One could go into a single application and see that all their software is up to date.  I'd also like an easy way to make the software wait until I reboot to install.  I don't want a nag message every ten minutes or even every four hours.
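
As a rough illustration of the idea, here is a minimal Java sketch of a shared update queue.  Everything in it is hypothetical: the UpdateQueue class, its method names, and the per-run limit are assumptions made for the example, not any real Windows or Mac OS X API.  It only models the bookkeeping (one update at a time, a cap on how many are handled per run, and an option to hold an install until reboot).  The actual downloading is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch of a shared update service; not a real OS API. */
final class UpdateQueue {
    /** One pending update registered by an application (illustrative only). */
    static final class Update {
        final String app;
        final boolean installOnReboot; // true: stage now, install at next reboot

        Update(String app, boolean installOnReboot) {
            this.app = app;
            this.installOnReboot = installOnReboot;
        }
    }

    private final Queue<Update> pending = new ArrayDeque<>();
    private final List<String> staged = new ArrayList<>();    // waiting for a reboot
    private final List<String> installed = new ArrayList<>();

    /** Applications register updates here instead of running their own updater. */
    void register(Update u) {
        pending.add(u);
    }

    /** Handles at most maxPerRun updates, one at a time, in arrival order. */
    int runOnce(int maxPerRun) {
        int done = 0;
        while (done < maxPerRun && !pending.isEmpty()) {
            Update u = pending.poll();
            // A real service would download here, under a bandwidth cap.
            if (u.installOnReboot) {
                staged.add(u.app);
            } else {
                installed.add(u.app);
            }
            done++;
        }
        return done;
    }

    /** Called at reboot: installs everything that was staged, then clears the list. */
    void onReboot() {
        installed.addAll(staged);
        staged.clear();
    }

    int pendingCount() { return pending.size(); }
    List<String> staged() { return new ArrayList<>(staged); }
    List<String> installed() { return new ArrayList<>(installed); }

    public static void main(String[] args) {
        UpdateQueue q = new UpdateQueue();
        q.register(new Update("Java", true));
        q.register(new Update("Flash", false));
        q.runOnce(1); // scheduled run: handle a single update and stop
        System.out.println("staged=" + q.staged() + " pending=" + q.pendingCount());
    }
}
```

The single list of staged and installed names is also what would give the "one place to see everything is up to date" view.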


11:46 AM - Generating and cleaning HTML for PDF generation, Joomla import

At work, I've been writing some software to generate a newsletter.  The software had to produce several formats including PDF, HTML, and text.  Each one used a different template.  The input comes from RSS feeds.

For the PDF generation, we went with a commercial product called PDF Reactor.  It's a Java library that allows one to create PDF documents from HTML and CSS.  It runs on top of iText and does several post-processing tasks.  As part of the process, it uses one of three configurable HTML formatters.  We chose to use the default, which has problems with malformed HTML in some cases.  An open anchor tag causes grief.  Since the input is arbitrary content from the Internet, we needed a way to clean it up.  I set up JTidy to process the input when it's HTML.  That way the HTML is always well-formed going into the system.
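
The "clean it only when it's HTML" step can be sketched roughly like this.  It is not the code from the project: FeedItemCleaner, looksLikeHtml, and the tag-matching regular expression are assumptions for illustration, and the actual JTidy call is represented by a function passed in from outside so the sketch stays free of external libraries.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

/** Hypothetical gate in front of the HTML cleaner; names are illustrative. */
final class FeedItemCleaner {
    // Crude heuristic (an assumption): the text is HTML if it contains
    // something shaped like an opening or closing tag.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z][^>]*>");

    // Stand-in for the real cleaner; in practice this would wrap JTidy.
    private final UnaryOperator<String> tidy;

    FeedItemCleaner(UnaryOperator<String> tidy) {
        this.tidy = tidy;
    }

    static boolean looksLikeHtml(String s) {
        return s != null && TAG.matcher(s).find();
    }

    /** Runs the cleaner on HTML input and passes plain text through untouched. */
    String clean(String input) {
        return looksLikeHtml(input) ? tidy.apply(input) : input;
    }

    public static void main(String[] args) {
        // The lambda only marks what was cleaned; it does not repair anything.
        FeedItemCleaner cleaner = new FeedItemCleaner(s -> "[tidied] " + s);
        System.out.println(cleaner.clean("Plain text summary"));
        System.out.println(cleaner.clean("<a href=\"http://example.com\">open anchor"));
    }
}
```

Keeping the cleaner behind a function makes it easy to swap formatters or to test the gate without the PDF library in the loop.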

PDF Reactor is licensed per CPU core.  They do check the core count.  We tried to run it on an i5, assuming it would simply not take advantage of the extra cores, and it went into evaluation mode.  We had to disable two cores on the CPU to get the system to work.  On FreeBSD, this only requires two lines in /boot/loader.conf.

The odd part of this project was importing from Joomla.  I had to take data from various Joomla installs and import select categories into our new system, then create newsletters from this data after it was cleaned up and categorized.  This meant selecting all the data from Joomla's content table for certain sections and categories.  The sections were constant within each install.  I also had to take the section and category names and tag the articles in the new system with them.  If the names were duplicates, we had some problems.  Our system overloads categories to serve as two different levels, which has caused some complications because Joomla categories are not this flexible.
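
To show where duplicate names bite, here is a small Java sketch of the tagging step.  It is an illustration, not what the import actually did: the JoomlaTags class and the "Section/Category" naming scheme are assumptions, shown as one possible way to keep a category such as "News" distinct when it appears under two sections.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tagging helper for imported articles; names are illustrative. */
final class JoomlaTags {
    /** Naive tags: the bare section and category names, which can collide. */
    static List<String> plainTags(String section, String category) {
        List<String> tags = new ArrayList<>();
        tags.add(section);
        tags.add(category);
        return tags;
    }

    /** Qualified tags: the category is prefixed with its section (an assumed scheme). */
    static List<String> qualifiedTags(String section, String category) {
        List<String> tags = new ArrayList<>();
        tags.add(section);
        tags.add(section + "/" + category);
        return tags;
    }

    public static void main(String[] args) {
        // Both sections have a "News" category, so the plain category tags are identical.
        System.out.println(plainTags("Sports", "News"));
        System.out.println(plainTags("Business", "News"));
        // The qualified form keeps them apart.
        System.out.println(qualifiedTags("Sports", "News"));
        System.out.println(qualifiedTags("Business", "News"));
    }
}
```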

If I had more time, I would have tried to build the HTML-to-PDF generation myself using iText.  This is something I would like for Just Journal, where I currently use iText for PDF and RTF generation.
