* 'testing' of gitorious.org:statusnet/mainline:
Parse RSS items as activities
Remove hkit and do our own hcard parsing
Work around weird bug with HTML normalization via PHP DOM module; if source had xmlns and xml:lang I ended up with double output, breaking the subsequent parsing. Will have to track this down later and report upstream if not already resolved.
First steps to parsing RSS items as activities. RSS feeds don't seem
to have enough data to make good remote profiles, but this may work
with some "hints".
Parsing hcards for the data we need wasn't hard enough to justify using
hkit. It was dependent on a number of external systems (something to
run tidy), and only could handle XHTML.
We now parse HTML with the PHP dom libraries used elsewhere, and
scrape out our own hcards. Seems to work nicer and faster and most of
all works with Google Buzz profile URLs.
* 'testing' of gitorious.org:statusnet/mainline:
OStatus discover fixes:
Remove xpm support (no one really uses it, and IMAGETYPE_XPM is undefined, causing warnings)
Fix notice warning about unused var -- was renamed during refactoring.
* Subscription::start was sometimes passing users instead of profiles to hooks, which broke OStatus subscription notifications; now normalizing to profiles for processing.
* H-card parsing would trigger a lot of PHP warnings and notices in hKit. Now suppressing warnings and notices for the duration of the call to keep them out of output when display_errors is on.
* H-card parsing would trigger a PHP fatal error if the source page was not well-formed XML and Tidy was not present on the system. Switched normalization to use the PHP DOM module which is always present, as we have no need for Tidy's extra features here.
* Trying to fetch avatars from Google profiles failed and triggered a PHP warning due to the relative URL not being resolved during h-card parsing. Now passing profile page URL into hKit by sneaking a <base> tag in while we normalize the HTML source.
* Profile pages without a "Link" header could trigger PHP notices due to a bad NULL -> array(NULL) conversion in LinkHeader::getLink(). Now checking that there was a return value before converting single return value into array.
Base problem is that our caching-on-insert interferes with relying on column default values; the cached object is missing those fields, so they appear to be empty (null) when the object is retrieved from cache.
Now explicitly setting them when inserting subscriptions, and cleaned up some code that had alternate code paths.
May also have made auto-subscription work for remote OStatus subscribers, but can't test until magic sigs are working again.
We were double-unescaping for <content type="html">, turning <b> escaped chars into literal tags (which then may get removed entirely by the HTML scrubber).