Basic User-Agent string parser that includes some basic podcasting apps. This project is intended to help parse/group requests for analytics purposes, not for browser feature detection.
The included agents.lock.yml also includes name/type/os IDs, in case you want to normalize the strings in your database.
Just npm install --save prx-podagent. Or to use outside of node, just grab a database file out of the db/ directory.
const podagent = require('prx-podagent');
const agent = podagent.parse('some-string');
if (agent) {
  console.log('Match:', agent.name, agent.type, agent.os);
} else {
  console.log('Did not match any known agents');
}
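Since this project is aimed at grouping requests for analytics, a minimal sketch of tallying requests per app might look like the following. Note that countByAgent and fakeParse are hypothetical helpers, not part of prx-podagent. The sketch only assumes what the example above shows: a parse function that returns an object with a name, or a falsy value when nothing matches.

```javascript
// Hypothetical helper, not part of prx-podagent: tally user-agent strings by
// the name returned from a parse function shaped like podagent.parse.
function countByAgent(userAgents, parse) {
  const counts = {};
  for (const ua of userAgents) {
    const agent = parse(ua);
    // Assumption: unmatched strings are grouped under a single "Unknown" key.
    const key = agent && agent.name ? agent.name : 'Unknown';
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Stand-in parser for illustration only; in Node you would pass podagent.parse.
function fakeParse(ua) {
  return ua.startsWith('Pandora/') ? { name: 'Pandora' } : null;
}

const sampleCounts = countByAgent(
  ['Pandora/1812.2 Android/5.1.1', 'blah blah blah', 'Pandora/1.0'],
  fakeParse
);
console.log(sampleCounts); // → { Pandora: 2, Unknown: 1 }
```

Passing the parser in as an argument keeps the tallying logic testable without the database installed.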
Or, to do it yourself, use the JSON lock file:
$.getJSON('agents.json', function(db) {
  // compile each regex string once, up front
  db.agents.forEach(function(a) {
    a.regex = new RegExp(a.regex, a.ignorecase ? 'i' : undefined);
  });
  var str = decodeURIComponent('some-agent-string');
  var agent = db.agents.find(function(a) { return a.regex.test(str); });
  if (agent) {
    console.log('matched!', agent);
  } else {
    console.log('no match for:', str);
  }
});
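The same idea can be sketched without jQuery, with the name/type/os IDs resolved to their labels the way the Ruby and PHP examples do. This is only a sketch: compileAgents, matchAgent and the sample data are made up for illustration, and the shape of the lock file (an agents list plus a tags lookup) is assumed from the other examples in this README.

```javascript
// Compile each entry's regex once, without mutating the loaded database.
function compileAgents(db) {
  return db.agents.map(function (a) {
    return Object.assign({}, a, { regex: new RegExp(a.regex, a.ignorecase ? 'i' : '') });
  });
}

// Return a copy of the first matching entry with its name/type/os IDs swapped
// for their labels in db.tags, or null when nothing matches.
function matchAgent(db, compiled, str) {
  const hit = compiled.find(function (a) { return a.regex.test(str); });
  if (!hit) return null;
  const result = Object.assign({}, hit);
  ['name', 'type', 'os'].forEach(function (k) {
    if (result[k] !== undefined) result[k] = db.tags[result[k]];
  });
  return result;
}

// Made-up sample data in the assumed shape; not taken from the real database.
const sampleDb = {
  agents: [{ regex: '^ExamplePlayer/', ignorecase: true, name: '1', type: '2', os: '3' }],
  tags: { '1': 'Example Player', '2': 'Mobile App', '3': 'Android' }
};
const compiled = compileAgents(sampleDb);
console.log(matchAgent(sampleDb, compiled, 'exampleplayer/1.0 (Linux)'));
console.log(matchAgent(sampleDb, compiled, 'blah blah blah')); // → null
```

Copying the matched entry before resolving tags means repeated lookups keep working, since the IDs in the database are never overwritten.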
Or in Ruby:
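require 'yaml'
require 'uri'
DB = YAML.load_file('db/agents.lock.yml')
def match_agent(str)
  # URI.unescape was removed in Ruby 3.0; the default parser still provides unescape
  str = URI::DEFAULT_PARSER.unescape(str)
  # dup the matched entry so the tag lookup does not overwrite the IDs stored in DB
  DB['agents'].find { |a| Regexp.new(a['regex'], a['ignorecase'] ? Regexp::IGNORECASE : nil).match(str) }&.dup&.tap do |match|
    %w(name type os).each { |k| match[k] = DB['tags'][match[k]] }
  end
end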
puts match_agent('Pandora/1812.2 Android/5.1.1 ford (ExoPlayerLib2.8.2)').inspect
# {"regex"=>"/^HardCast.+CFNetwork/", "name"=>"HardCast", "type"=>"Mobile App", "os"=>"iOS"}
puts match_agent('blah blah blah').inspect
# nil
Or in PHP:
<?php
require_once 'spyc.php';
$DB = Spyc::YAMLLoad('db/agents.lock.yml');
function match_agent($str) {
  global $DB;
  // rawurldecode() mirrors decodeURIComponent(); unlike urldecode(), it leaves "+" untouched
  $str = rawurldecode($str);
  $match = NULL;
  foreach ($DB['agents'] as $agent) {
    $pattern = '/' . $agent['regex'] . '/' . ($agent['ignorecase'] ? 'i' : '');
    if (preg_match($pattern, $str)) {
      $match = $agent;
      // swap the name/type/os IDs for their text labels
      foreach (array('name', 'type', 'os') as $key) {
        $match[$key] = $DB['tags'][$match[$key]];
      }
      break;
    }
  }
  return $match;
}
var_dump(match_agent('Pandora/1812.2 Android/5.1.1 ford (ExoPlayerLib2.8.2)'));
?>
Note: it's fairly common to see URI encodings in user agent strings, and whether you see them often depends on your particular server setup. The regexps in this library are intended to be used on the fully URI-decoded User-Agent string, so you should always decodeURIComponent() the value before attempting to match.
I've also seen a mix of encoded/decoded spaces within a single user agent string, where the first space has been encoded to %20 but subsequent ones are not. So URI decoding is probably a good idea anyway.
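One thing to watch for when decoding raw log values: decodeURIComponent() throws a URIError on malformed input, such as a stray "%". A minimal sketch of a defensive wrapper follows. decodeUserAgent is a hypothetical helper, not something this library provides, and falling back to the raw string is just one possible choice.

```javascript
// Hypothetical helper: URI-decode a raw User-Agent value before matching.
function decodeUserAgent(raw) {
  try {
    return decodeURIComponent(raw);
  } catch (err) {
    // decodeURIComponent throws URIError on malformed sequences such as a
    // lone "%"; returning the raw string is an assumption of this sketch.
    return raw;
  }
}

console.log(decodeUserAgent('Example%20Player/1.0 (iPhone; iOS 12.0)'));
// → Example Player/1.0 (iPhone; iOS 12.0)
console.log(decodeUserAgent('100% broken'));
// → 100% broken
```

The decoded value can then be passed to podagent.parse() or to your own matcher.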
Basic tests are located in the /test directory, and can be run with npm test.
Additionally, there is a test/support/testagents.csv file containing some actual production logs. The "coverage" and "omissions" tests use this file to check that the database file accounts for all the major known user agents.
To add a new user agent:
- Edit the db/agents.yml file to include your new regular expression, plus an example user-agent string or two.
- Run npm test (or just mocha test/examples-test.js) to test that your example strings match the regexp.
- Run npm run lock to regenerate the db/agents.lock.yml file. This file normalizes the text tags/labels shared between the various matchers. Check that your change didn't add any unexpected new tags (for example, if you accidentally changed the case of a label).
- Create a pull request to this repo.
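Before running the full suite, it can help to sanity-check a new regular expression against its example strings. The sketch below is in the spirit of the examples test but is not the project's actual test code. findFailingExamples is a hypothetical helper, and the entry shape used here (regex, ignorecase, examples) is an assumption rather than the documented db/agents.yml format.

```javascript
// Hypothetical check: every example string listed on an entry should match
// that entry's regex. Assumed entry shape: { regex, ignorecase, examples }.
function findFailingExamples(entries) {
  const failures = [];
  entries.forEach(function (entry) {
    const re = new RegExp(entry.regex, entry.ignorecase ? 'i' : '');
    (entry.examples || []).forEach(function (example) {
      if (!re.test(example)) failures.push({ regex: entry.regex, example: example });
    });
  });
  return failures;
}

// Made-up entries for illustration; the second example string is a deliberate miss.
const draftEntries = [
  { regex: '^ExamplePlayer/', ignorecase: false, examples: ['ExamplePlayer/1.0', 'exampleplayer/1.0'] }
];
console.log(findFailingExamples(draftEntries));
// → [ { regex: '^ExamplePlayer/', example: 'exampleplayer/1.0' } ]
```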
To release a new version:
- Get or set a GITHUB_TOKEN environment variable (needed for release-it).
- Run npm run release.
- Select whether this is a major, minor, or patch release, according to semantic versioning.
- Select "yes" on publishing to NPM, pushing, and creating a GitHub Release.
To contribute:
- Fork it
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create new Pull Request