gnu-social/plugins/ApiLogger/ApiLoggerPlugin.php

ApiLogger plugin: dumps some information about API hits to aid in researching future HTTP-level cachability improvements.

Data are sent to the 'info' level of logging, like so:

    [lazarus.local:4812.86b23603 GET /mublog/api/statuses/friends_timeline.atom?since_id=1353] STATLOG action:apitimelinefriends method:GET ssl:no query:since_id cookie:no auth:yes ifmatch:no ifmod:no agent:Appcelerator Titanium/1.4.1 (iPhone/4.1; iPhone OS; en_US;)

Fields:
* action: case-normalized name of the action class we're acting on
* method: GET, POST, HEAD, etc
* ssl: Are we on HTTPS? 'yes' or 'no'
* query: Were we sent a query string? 'yes', 'no', or 'since_id' if the only parameter is a since_id
* cookie: Were we sent any cookies? 'yes' or 'no'
* auth: Were we sent an HTTP Authorization header? 'yes' or 'no'
* ifmatch: Were we sent an HTTP If-Match header for an ETag? 'yes' or 'no'
* ifmod: Were we sent an HTTP If-Modified-Since header? 'yes' or 'no'
* agent: User-agent string, to aid in figuring out what these things are

The most shared-cache-friendly requests will be non-SSL GET requests with no or very predictable query parameters, no cookies, and no authorization headers. Private caching (eg within a supporting user-agent) could still be friendly to SSL and auth'd GET requests.

We kind of expect that the most frequent hits from clients will be GETs for a few common timelines, with auth headers, a since_id-only query, and no cookies. These should at least be amenable to returning 304 matches for etags or last-modified headers with private caching, but it's very possible that most clients won't actually think to save and send them. That would leave us expecting to handle a lot of timeline since_id hits that return a valid API response with no notices.

At this point we don't expect to actually see if-match or if-modified-since a lot since most of our API responses are marked as uncacheable; so even if we output them they're not getting sent back to us.

Random subsampling can be enabled by setting the 'frequency' parameter smaller than 1.0:

    addPlugin('ApiLogger', array(
        'frequency' => 0.5 // Record 50% of API hits
    ));
2010-10-28 08:30:11 +09:00
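The field list above lends itself to a small offline analysis step. The following is a minimal sketch, not part of the plugin: `parse_statlog_line` and `is_shared_cache_friendly` are hypothetical helper names, and the "friendly" test is only one reading of the description (non-SSL GET, no query or a since_id-only query, no cookies, no Authorization header).

```php
<?php
// Hypothetical helper (not in the plugin): split one STATLOG log line into
// its named fields. Returns null if the line is not a STATLOG entry.
function parse_statlog_line(string $line): ?array
{
    // The agent field comes last and may contain spaces, so it takes the
    // rest of the line; every other field is a single token.
    $pattern = '/STATLOG action:(?P<action>\S+) method:(?P<method>\S+)'
             . ' ssl:(?P<ssl>\S+) query:(?P<query>\S+) cookie:(?P<cookie>\S+)'
             . ' auth:(?P<auth>\S+) ifmatch:(?P<ifmatch>\S+)'
             . ' ifmod:(?P<ifmod>\S+) agent:(?P<agent>.*)$/';
    if (!preg_match($pattern, $line, $m)) {
        return null;
    }
    // preg_match fills both numeric and named keys; keep the named ones.
    return array_filter($m, 'is_string', ARRAY_FILTER_USE_KEY);
}

// Assumed interpretation of "shared-cache-friendly" from the description.
function is_shared_cache_friendly(array $f): bool
{
    return $f['method'] === 'GET'
        && $f['ssl'] === 'no'
        && in_array($f['query'], array('no', 'since_id'), true)
        && $f['cookie'] === 'no'
        && $f['auth'] === 'no';
}
```

Run over a log file, helpers like these would let you count how many hits fall into the shared-cache-friendly bucket versus the authenticated since_id polling the description expects to dominate.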
<?php
/*
 * StatusNet - the distributed open-source microblogging tool
 * Copyright (C) 2010, StatusNet, Inc.
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Affero General Public License as published by
 * the Free Software Foundation, either version 3 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU Affero General Public License for more details.
 *
 * You should have received a copy of the GNU Affero General Public License
 * along with this program. If not, see <http://www.gnu.org/licenses/>.
 */

/**
 * @package ApiLoggerPlugin
 * @maintainer Brion Vibber <brion@status.net>
 */

if (!defined('STATUSNET')) {
    exit(1);
}

class ApiLoggerPlugin extends Plugin
{
    const PLUGIN_VERSION = '2.0.0';
    // Lower this to do random sampling of API requests rather than all.
    // 0.1 will check about 10% of hits, etc.
    public $frequency = 1.0;

    function onArgsInitialize($args)
    {
        if (isset($args['action'])) {
            $action = strtolower($args['action']);
            if (substr($action, 0, 3) == 'api') {
                if ($this->frequency < 1.0 && $this->frequency > 0.0) {
                    $max = mt_getrandmax();
                    $n = mt_rand() / $max;
                    if ($n > $this->frequency) {
                        return true;
                    }
                }
                $uri = $_SERVER['REQUEST_URI'];
                $method = $_SERVER['REQUEST_METHOD'];
                $ssl = empty($_SERVER['HTTPS']) ? 'no' : 'yes';
                $cookie = empty($_SERVER['HTTP_COOKIE']) ? 'no' : 'yes';
                $etag = empty($_SERVER['HTTP_IF_MATCH']) ? 'no' : 'yes';
                $last = empty($_SERVER['HTTP_IF_MODIFIED_SINCE']) ? 'no' : 'yes';
                $auth = empty($_SERVER['HTTP_AUTHORIZATION']) ? 'no' : 'yes';
                if ($auth == 'no' && function_exists('apache_request_headers')) {
                    // Sometimes Authorization doesn't make it into $_SERVER.
                    // Probably someone thought it was scary.
                    $headers = apache_request_headers();
                    if (isset($headers['Authorization'])) {
                        $auth = 'yes';
                    }
                }
                $agent = empty($_SERVER['HTTP_USER_AGENT']) ? 'no' : $_SERVER['HTTP_USER_AGENT'];
                $query = (strpos($uri, '?') === false) ? 'no' : 'yes';
                if ($query == 'yes') {
                    if (preg_match('/\?since_id=\d+$/', $uri)) {
                        $query = 'since_id';
                    }
                }
                common_log(LOG_INFO, "STATLOG action:$action method:$method ssl:$ssl query:$query cookie:$cookie auth:$auth ifmatch:$etag ifmod:$last agent:$agent");
            }
        }
        return true;
    }

    public function onPluginVersion(array &$versions): bool
    {
        $versions[] = array('name' => 'ApiLogger',
                            'version' => self::PLUGIN_VERSION,
                            'author' => 'Brion Vibber',
                            'homepage' => GNUSOCIAL_ENGINE_REPO_URL . 'tree/master/plugins/ApiLogger',
                            'rawdescription' =>
                            // TRANS: Plugin description.
                            _m('Allows random sampling of API requests.'));
        return true;
    }
}
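
The sampling check and the query classification in `onArgsInitialize` can be tried outside StatusNet with a standalone sketch. The function names `should_log` and `classify_query` are hypothetical and exist only for illustration; `$draw` stands in for the plugin's `mt_rand() / mt_getrandmax()` value so the behaviour is deterministic.

```php
<?php
// Hypothetical mirror of the plugin's sampling check (not part of the plugin).
// $draw plays the role of mt_rand() / mt_getrandmax(), a value in [0, 1].
function should_log(float $frequency, float $draw): bool
{
    // Sampling only applies strictly between 0 and 1, as in the plugin.
    if ($frequency < 1.0 && $frequency > 0.0) {
        return $draw <= $frequency;
    }
    // Any other frequency skips the sampling branch, so the hit is logged.
    return true;
}

// Hypothetical mirror of the plugin's query field: 'no', 'yes', or
// 'since_id' when the URI ends in a lone numeric since_id parameter.
function classify_query(string $uri): string
{
    if (strpos($uri, '?') === false) {
        return 'no';
    }
    return preg_match('/\?since_id=\d+$/', $uri) ? 'since_id' : 'yes';
}
```

One consequence visible in the sketch: because the guard requires a frequency strictly greater than 0.0, setting `frequency` to 0 does not disable logging; every API hit is still recorded.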