Hello list,
In the course of this year we have developed a new website to automatically suggest free software for PDF display. Now that we are preparing to deploy a test version, however, I see a lot of remaining issues, and I deem resolving them extremely problematic.
My concerns are less of a technical nature and more of an organisational one - I don't see us raising the workforce needed to keep the site running in as fancy a form as we had originally planned.
So this is it! We now need to decide how to go on, as we cannot go on the way we did.
Let me start with a synopsis of the development so far:
Léopold and I started laying down the structure of the new site about a year ago, in October 2012. We wanted to base the architecture of the new website on the design of the current site. Even though we quickly developed some reservations about webgen, the current build system, we decided to keep it for the redrafted site. Partly we did this so we could keep the existing styles and translations; partly, I fear, we didn't want to step on anyone's toes by fundamentally altering the site's workings - after all, objectively webgen was doing its job.
We decided to augment the static site with some Perl scripts which would recognise the visitor's web browser by its user agent string and select an appropriate PDF reader for display. The selection was to be based on the availability of the reader on the operating system platform and on its ability to integrate with the web browser. It came as no real surprise that parsing the user agent strings of web browsers is impossible to do gracefully. However, this was more or less part of the requirement specification, and after several ready-made Perl modules proved useless for the task we started implementing our own recognition module, which is more up to date than the solutions available at the time and seems to deliver adequate results so far.
By that time Léo had ended his internship with FSFE and left the project, leaving me solely responsible for the development.
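To illustrate the kind of selection described above, here is a minimal sketch in Python (not the Perl module we actually wrote). The patterns, the reader names and the `recommend` function are all made up for this example; real user agent strings are far messier than these few patterns suggest.

```python
import re
from dataclasses import dataclass
from typing import FrozenSet, List

# Hypothetical patterns. Order matters: an Android user agent also
# contains "Linux", and a Chrome user agent also contains "Safari/".
PLATFORM_PATTERNS = [
    ("android", re.compile(r"Android")),
    ("windows", re.compile(r"Windows NT")),
    ("macos", re.compile(r"Mac OS X")),
    ("gnu-linux", re.compile(r"Linux")),
]
BROWSER_PATTERNS = [
    ("firefox", re.compile(r"Firefox/")),
    ("chrome", re.compile(r"Chrome/")),
    ("safari", re.compile(r"Safari/")),
]


def classify(user_agent: str, patterns) -> str:
    """Return the label of the first matching pattern, or "unknown"."""
    for label, pattern in patterns:
        if pattern.search(user_agent):
            return label
    return "unknown"


@dataclass(frozen=True)
class Reader:
    name: str
    platforms: FrozenSet[str]  # platforms the reader is available on
    browsers: FrozenSet[str]   # browsers the reader integrates with


# Placeholder entries, not real reader data.
READERS = [
    Reader("reader-b", frozenset({"gnu-linux"}), frozenset()),
    Reader("reader-a", frozenset({"windows", "gnu-linux"}), frozenset({"firefox"})),
    Reader("reader-c", frozenset({"windows", "macos"}), frozenset({"chrome"})),
]


def recommend(user_agent: str) -> List[str]:
    """List readers for the visitor's platform, browser-integrated ones first."""
    platform = classify(user_agent, PLATFORM_PATTERNS)
    browser = classify(user_agent, BROWSER_PATTERNS)
    candidates = [r for r in READERS if platform in r.platforms]
    # The sort is stable: readers that integrate with the browser move up.
    candidates.sort(key=lambda r: browser not in r.browsers)
    return [r.name for r in candidates]
```

Even this toy version shows the fragility: the result depends on the order of the patterns, and an iPhone user agent, which contains "like Mac OS X", would be misfiled as macOS here.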
Testing and bug squashing proved tedious. Even weeks after I believed the site to be running smoothly, Hannes discovered situations in which the software wouldn't deliver a page at all. Although I was unable to reproduce the errors, I believe them to be fixed now (not that I didn't believe that before).
It now turns out that Heiki is experiencing problems building the site on the production server. This is again a problem which I can neither reproduce nor fully understand, though I am confident that together we will be able to fix it quickly - at least as long as the webgen versions in Debian remain compatible. While we can probably handle the remaining technical issues at this point, the site currently lacks the input data (not so much the technology) to make a sensible software recommendation. We had a handful of readers registered for testing purposes. The data describing platforms and browsers for those readers is incomplete, however.
Now, this is actually where it gets worse. At some point we were made aware of a reader called PDF.JS - something which seemingly caught some attention over the last month. The suitability of PDF.JS as a standalone offline reader can be disputed, yet many people expect us to recommend it as an online reader anyway. Beyond that, PDF.JS is basically operating system agnostic, which makes it somewhat tricky to even represent in the data structures on which we base recommendations. This alone we could handle.
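One conceivable way to represent an operating system agnostic reader is sketched below in Python. This is only an assumption about how such an entry could look, not a description of our actual data structures; `ReaderEntry`, `available_on` and the sample entries are invented for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ReaderEntry:
    name: str
    # None is used here to mean "not tied to any operating system".
    platforms: Optional[FrozenSet[str]]
    # Browsers the reader integrates with (or runs inside of).
    browsers: FrozenSet[str]

    def available_on(self, platform: str, browser: str) -> bool:
        """Decide availability by browser if the reader has no platform list."""
        if self.platforms is None:
            return browser in self.browsers
        return platform in self.platforms


# Placeholder entries for illustration only.
in_browser = ReaderEntry("in-browser reader", None, frozenset({"firefox"}))
native = ReaderEntry("native reader", frozenset({"windows"}), frozenset())
```

With this shape, `in_browser.available_on("android", "firefox")` is true regardless of the platform, while `native` is still matched by operating system alone.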
But the problems continue: to make the installation of a reader as easy as possible, we usually want to link directly to an installer. But how do we do this on platforms on which it is impractical, uncommon or even impossible to just install software via a web link? This is the case today with most mobile platforms and GNU/Linux distributions. I have included a special case to enable aptURL links, which are, as far as I know, only reliable in Ubuntu, and even there only in Firefox. On most other GNU/Linux distributions it would be nonsensical to link to a software installer at all. On Android, in the default configuration, users will not even be able to install a software package delivered via the web browser, even if the projects bothered to offer a prebuilt package outside the Play Store in the first place.
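The decision described above could be sketched roughly as follows. This is Python, not the code running on the test site; the tables, the example.org URLs, the `install_link` function and the `apt:` link form are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical per-reader install data; URLs and package names are placeholders.
INSTALL_INFO = {
    "reader-a": {
        "windows": {"installer_url": "https://example.org/reader-a-setup.exe"},
        "gnu-linux": {"apt_package": "reader-a"},
    },
}
PROJECT_PAGES = {"reader-a": "https://example.org/reader-a/"}


def install_link(reader: str, platform: str,
                 distribution: Optional[str] = None,
                 browser: Optional[str] = None) -> Tuple[str, str]:
    """Pick the most direct install link that is likely to work."""
    info = INSTALL_INFO.get(reader, {}).get(platform, {})
    if "installer_url" in info:
        return ("installer", info["installer_url"])
    # Special case: only offer an aptURL-style link for Firefox on Ubuntu.
    if "apt_package" in info and distribution == "ubuntu" and browser == "firefox":
        return ("apturl", "apt:" + info["apt_package"])
    # Everywhere else, fall back to the project's own page.
    return ("project-page", PROJECT_PAGES[reader])
```

For example, `install_link("reader-a", "gnu-linux", "debian", "firefox")` falls through to the project page, as does any mobile platform without an entry. Every new platform adds another branch or another column of data that somebody has to maintain.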
In the upcoming months and years we expect UbuntuPhone, FirefoxOS, and possibly SailfishOS and GnomeOS and who knows what else to hit the market on mobile and desktop devices. This will make the situation infinitely worse. Given that we did not even anticipate having to handle the runtime architecture of PDF.JS, I don't dare to guess what new PDF readers we will see on those platforms, let alone how to trigger an installer for them. Who wants to predict what the user agent strings on the new platforms will look like?
As things stand, we lack the workforce to adapt our data sets to the present-day situation. It will take many times as much work to keep up with the changes we expect to see in the near future. Chances are that we will also have to make technical adaptations to handle new concepts in software behaviour that we haven't seen yet.
Apparently we are not the only ones linking to alternative PDF readers. I've seen some websites of German governmental organisations link to the Heise software directory[1] along with PDF downloads. Heise is a publishing house for a number of German computer magazines. Even their list of Free and Non-Free PDF readers, although excellently maintained by a group of full-time journalists, offers only a flat view similar to the one we show on the current version of pdfreaders.org.
[1] http://www.heise.de/download/office/pdf/viewer-50000505011/?f=5s
To date I have spent about 130 hours of FSFE time on the reimplementation of PDFreaders.org, not counting the time Léo and some others invested. With all our dreams and ideas of what the site should ideally look like, this could go on indefinitely. We have to cut our losses and draw a line here.
This means we could trim back our original plans drastically and concentrate on suggesting only a handful of desktop readers on a few common platforms. In particular, we would neglect most Free Operating Systems and mobile platforms. Preferably we would drop the greeter page[2] and use the readers page[3] as the index, so that erroneous suggestions don't carry so much weight.
[2] http://pdfreaders.plutonium.fsfeurope.org/index/index.en.html
[3] http://pdfreaders.plutonium.fsfeurope.org/readers/index.en.html
Alternatively, we can stick with the current version of pdfreaders.org, maybe polish the site up a little, and see to it that the table displays up-to-date readers with working links. If we find even one more setup for which the new pdfreaders test site displays only a blank page, I'm all for this option.
I believe that for either solution we should anticipate another 10 to 20 hours of work, on the premise that even a cut-back version of the new site should spend a month or so in a test deployment, while the static version would need less time for checking by volunteers.