On my test setup, this fixes inbox delivery to 10,000 local recipients from background queuedaemon running with a 32mb memory limit, completes the job within a minute from start.
Pretty much everything in File and File_redirection initial processing needs to be rewritten to be non-awful; this code is very hard to follow and very easy to make huge bugs. A fair amount of the complication is probably obsoleted by the redirection following being built into HTTPClient now.
* Moved notification sending from Notice::saveReplies to distrib queue handler, so it'll pull from the reply set we've saved regardless of how we got it.
* Set up gettext infrastructure for command-line scripts; gets localization mail notifications etc working from background queues.
* Adjusted locale switching: common_switch_locale() works at runtime for bg scripts, forces a message catalog update
Should help with dupes that come in when inbox distrib jobs die and get restarted, etc.
Conflicts:
classes/Inbox.php
Looks like this was implemented on master recently and not copied up to testing. Merging to my version on testing as I've added some doc comments and extracted a couple functions for future ease of use.
to (profile_id, id) instead of (profile_id, created, id).
It's been falling back to PRIMARY instead, which is really
very inefficient for a profile that hasn't posted in a few
months. Even though forcing the index will cause a filesort,
it's usually going to be better. Even for large profiles it
seems much faster than the badly-indexed query.
to (profile_id, id) instead of (profile_id, created, id).
It's been falling back to PRIMARY instead, which is really
very inefficient for a profile that hasn't posted in a few
months. Even though forcing the index will cause a filesort,
it's usually going to be better. Even for large profiles it
seems much faster than the badly-indexed query.
The magic __call() method is used to implement a getter and setter interface, and simply didn't bother to throw an error for things it didn't recognize.
This may expose a number of existing errors where mistyped method names are called and we're not noticing that they're failing.
This bug was hitting a number of places where we had the pattern:
$db->find();
while($dbo->fetch()) {
$x = clone($dbo);
// do anything with $x other than storing it in an array
}
The cloned object's destructor would trigger on the second run through the loop, freeing the database result set -- not really what we wanted.
(Loops that stored the clones into an array were fine, since the clones stay in scope in the array longer than the original does.)
Detaching the database result from the clone lets us work with its data without interfering with the rest of the query.
In the unlikely even that somebody is making clones in the middle of a query, then trying to continue the query with the clone instead of the original object, well they're gonna be broken now.
* Subscription::start was sometimes passing users instead of profiles to hooks, which broke OStatus subscription notifications; now normalizing to profiles for processing.
* H-card parsing would trigger a lot of PHP warnings and notices in hKit. Now suppressing warnings and notices for the duration of the call to keep them out of output when display_errors is on.
* H-card parsing would trigger a PHP fatal error if the source page was not well-formed XML and Tidy was not present on the system. Switched normalization to use the PHP DOM module which is always present, as we have no need for Tidy's extra features here.
* Trying to fetch avatars from Google profiles failed and triggered a PHP warning due to the relative URL not being resolved during h-card parsing. Now passing profile page URL into hKit by sneaking a <base> tag in while we normalize the HTML source.
* Profile pages without a "Link" header could trigger PHP notices due to a bad NULL -> array(NULL) conversion in LinkHeader::getLink(). Now checking that there was a return value before converting single return value into array.
Base problem is that our caching-on-insert interferes with relying on column default values; the cached object is missing those fields, so they appear to be empty (null) when the object is retrieved from cache.
Now explicitly setting them when inserting subscriptions, and cleaned up some code that had alternate code paths.
May also have made auto-subscription work for remote OStatus subscribers, but can't test until magic sigs are working again.
While deletion is in progress, the account is locked with the 'deleted' role, which disables all actions with rights control.
Todo:
* Pretty up the notice on the profile page about the pending delete. Show status?
* Possibly more thorough account disabling, such as disallowing all use for login and access.
* Improve error recovery; worst case is that an account gets left locked in 'deleted' state but the queue jobs have gotten dropped out. This would leave the username in use and any undeleted notices in place.
This bug was hitting a number of places where we had the pattern:
$db->find();
while($dbo->fetch()) {
$x = clone($dbo);
// do anything with $x other than storing it in an array
}
The cloned object's destructor would trigger on the second run through the loop, freeing the database result set -- not really what we wanted.
(Loops that stored the clones into an array were fine, since the clones stay in scope in the array longer than the original does.)
Detaching the database result from the clone lets us work with its data without interfering with the rest of the query.
In the unlikely even that somebody is making clones in the middle of a query, then trying to continue the query with the clone instead of the original object, well they're gonna be broken now.
It's not currently used, and won't be efficient when we update the notice.profile_id_idx index to optimize for our id-based sorting when pulling user post lists for profile pages, feeds etc.
On my test system (without memcache), while testing the LDAP
authentication plugin, when I sign in for the first time, triggering
auto-registration, I get these messages in the output page:
Warning: ksort() expects parameter 1 to be array, null given in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 219
Warning: Invalid argument supplied for foreach() in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 224
Warning: assert() [function.assert]: Assertion failed in /home/jeff/Documents/code/statusnet/classes/Memcached_DataObject.php on line 241
(plus two "Cannot modify header information..." messages as a result of
the above warnings)
This change appears to fix this (although I can't really explain exactly
why).
Also stripping id from foreign HTML messages (could interfere with UI) and disabled failing attachment popup for a.attachment links that don't have a proper id, so you can click through instead of getting an error.
Issues:
* any other links aren't marked and saved
* inconsistent behavior between local and remote attachments (local displays in lightbox, remote doesn't)
* if the enclosure'd object isn't referenced in the content, you won't be offered a link to it in our UI
We only need one author for user feeds: the user themselves. So, show
the user as the activity:subject, and don't repeat the same
activity:actor for every notice unnecessarily.
In a federated system, "@nickname" is insufficient to uniquely
identify a user. However, it's a very convenient idiom. We need to
guess from context who 'nickname' refers to.
Previously, we were using the sender's profile (or what we knew about
them) as the only context. So, we assumed that they'd be mentioning to
someone they followed, or someone who followed them, or someone on
their own server.
Now, we include the notice information for context. We check to see if
the notice is a reply to another notice, and if the author of the
original notice has the nickname 'nickname', then the mention is
probably for them. Alternately, if the original notice mentions someone
with nickname 'nickname', then this notice is probably referring to
_them_.
Doing this kind of context sleuthing means we have to render the
content very late in the notice-saving process.
We add a local_group table to store data about local groups. It has
the unique key for nickname, so /group/<nickname> looks up here.
Updated DB data object classes and data files.
- added rel="ostatus:attention" links for group delivery
- added events for plugins to override group profile/permalink pages
- pulled Notice::saveGroups up to save-time so we can override;
it's relatively cheap and gives us a clean list of target
groups for distrib time even with customized delivery.
- fixed notice::getGroups to return group objects as expected
- added some doc on new parameters to Notice::saveNew
- 'groups' list of group IDs to push to in place of parsing
- messages that come in via PuSH and contain local group targets
are delivered to local group members
- messages that come in via PuSH and contain remote group targets
are delivered to local members of the remote group
Todo:
- handle group posts that only come through Salmon
- handle conflicts in case something comes in both through Salmon and PuSH
- better source verification
- need a cleaner interface to look up groups by URI
- need a way to handle remote groups with conflicting names
Combined the code that finds mentions of other profiles into one place.
common_find_mentions() finds mentions and calls hooks to allow
supplemental syntax for mentions (like OStatus).
common_linkify_mentions() links mentions.
common_linkify_mention() links a mention.
Notice::saveReplies() now uses common_find_mentions() instead of
trying to parse everything again.
The subs_* functions in subs.php have made a lot of assumptions
about users versus profiles. I've refactored the functions to
be methods of the Subscription class instead, and to use Profile
objects throughout.
Some of the checks for blocks or existing subscriptions depended
on users or profiles, so I've moved those methods around a bit.
I've left stubs for the subs_* functions until we get time to replace
them.
- Multiplexing queues into groups and for multiple sites.
- Sharing vs breakout configurable per site and per queue via $config['queue']['breakout']
- Detect how many times a message is redelivered, discard if it's killed too many daemons
- count configurable with $config['queue']['max_retries']
- can dump the items to files in $config['queue']['dead_letter_dir']
Queue daemon memory & resource leak fixes:
- avoid unnecessary reconnections to memcached server (switch persistent connections back in on second initialization, assuming it's child process)
- monkey-patch for leaky .ini loads in DB_DataObject::databaseStructure() - was leaking 200k per active switch
- applied leak fixes to Status_network as well, using intermediate base Safe_DataObject for both it and Memcache_DataObject
Misc queue fixes:
- correct handling of child processes exiting due to signal termination instead of regular exit
- shutdown instead of infinite respawn loop if we're already past the soft memory limit at startup
- Added --all option for xmppdaemon... still opens one xmpp connection per site that has xmpp active
Cache updates:
- add Cache::increment() method with native support for memcached atomic increment
statusnet.links.ini file could not be read anymore due to the entry for nonce containing a comma in its key value.
PHP's parse_ini_file() function no longer allows commas in keys, and rejects the *ENTIRE FILE* if it's present, breaking various automatic joins.
* detection of group feeds is currently a nasty hack based on presence of '/groups/' in URL -- should use some property on the feed?
* listing for the remote group is kinda cruddy; needs to be named more cleanly
* still need to establish per-author profiles (easier once we have the updated Atom code in)
* group delivery probably not right yet
* saving of group messages still triggering some weird behavior
Added support for since_id and max_id on group timeline feeds as a free extra. Enjoy!