This repository has been forked so that YAUAA can be run from a Snowflake UDF. This is required to generate a YAUAA context for Snowplow events from the Fivetran collector.
Changes to the code are necessary because:
- The current Snowplow enricher uses v5.23 of YAUAA, whereas the Snowflake UDF functionality is only available from v6.2.
- YAUAA outputs field names in camel case with a capitalized first letter, whereas the Snowplow enricher makes the first letter lowercase.
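The field-name difference can be illustrated with a minimal sketch. The class `SnowplowFieldNames` and its methods are hypothetical names introduced here for illustration; they are not part of YAUAA or the Snowplow enricher, and the actual change in this fork may be implemented differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of YAUAA or Snowplow.
final class SnowplowFieldNames {
    private SnowplowFieldNames() {}

    // Lowercase only the first character, e.g. "DeviceClass" -> "deviceClass".
    static String toSnowplowKey(String yauaaFieldName) {
        if (yauaaFieldName == null || yauaaFieldName.isEmpty()) {
            return yauaaFieldName;
        }
        return Character.toLowerCase(yauaaFieldName.charAt(0)) + yauaaFieldName.substring(1);
    }

    // Re-key a map of YAUAA field names to values, preserving insertion order.
    static Map<String, String> rekey(Map<String, String> yauaaFields) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : yauaaFields.entrySet()) {
            result.put(toSnowplowKey(entry.getKey()), entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("DeviceClass", "Phone");
        fields.put("AgentNameVersionMajor", "Chrome 53");
        // prints {deviceClass=Phone, agentNameVersionMajor=Chrome 53}
        System.out.println(rekey(fields));
    }
}
```

Only the first character is changed, so the rest of the camel-case name stays intact.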
Note: because installing the dev tools depends on running a Linux system, the development guidelines are not applicable. Instead, dependencies (including YAUAA) have been pinned in the Snowflake UDF sub-project's pom.xml.
To compile the JAR:
mvn install -f /src/pom.xml
mvn package -f /udfs/snowflake/pom.xml
This is a Java library that tries to parse and analyze the useragent string and extract as many relevant attributes as possible.
The full documentation can be found at https://yauaa.basjes.nl
- Correctly classify the elements in the Google Chrome (and Chromium based MS Edge) User-Agent string that are incorrect when the "Freeze User-Agent request header" flag is enabled.
- New UDF for Elasticsearch. I have done limited (local single node) testing. Please report anything you find so I can fix it.
A bit more background about this useragent parser can be found in this blog post I wrote about it: https://techlab.bol.com/making-sense-user-agent-string/
As an example the useragent of my phone:
Mozilla/5.0 (Linux; Android 7.0; Nexus 6 Build/NBD90Z) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.124 Mobile Safari/537.36
is converted into this set of fields:
Field name | Value |
---|---|
Device Class | Phone |
Device Name | Google Nexus 6 |
Device Brand | |
Operating System Class | Mobile |
Operating System Name | Android |
Operating System Version | 7.0 |
Operating System Name Version | Android 7.0 |
Operating System Version Build | NBD90Z |
Layout Engine Class | Browser |
Layout Engine Name | Blink |
Layout Engine Version | 53.0 |
Layout Engine Version Major | 53 |
Layout Engine Name Version | Blink 53.0 |
Layout Engine Name Version Major | Blink 53 |
Agent Class | Browser |
Agent Name | Chrome |
Agent Version | 53.0.2785.124 |
Agent Version Major | 53 |
Agent Name Version | Chrome 53.0.2785.124 |
Agent Name Version Major | Chrome 53 |
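As a reading aid for the table, the sketch below shows how the derived rows (such as Agent Version Major and Agent Name Version) relate to the base values in this particular example. The class `DerivedFields` and its methods are hypothetical and only reproduce the relationship visible in the table; this is not how YAUAA computes these fields internally, and real user agents involve many more edge cases.

```java
// Hypothetical illustration of how the derived rows in the table relate
// to the base values; not YAUAA's actual implementation.
final class DerivedFields {
    private DerivedFields() {}

    // Everything before the first '.', e.g. "53.0.2785.124" -> "53".
    static String majorVersion(String version) {
        int dot = version.indexOf('.');
        return dot < 0 ? version : version.substring(0, dot);
    }

    // Name and version joined by a single space, e.g. "Chrome 53".
    static String nameVersion(String name, String version) {
        return name + " " + version;
    }

    public static void main(String[] args) {
        String agentName = "Chrome";
        String agentVersion = "53.0.2785.124";
        String major = majorVersion(agentVersion);
        System.out.println(major);                                // prints 53
        System.out.println(nameVersion(agentName, agentVersion)); // prints Chrome 53.0.2785.124
        System.out.println(nameVersion(agentName, major));        // prints Chrome 53
    }
}
```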
You can try it online with your own browser here: https://try.yauaa.basjes.nl/.
NOTES
- This runs under a "Free quota" on Google AppEngine. If this quota is exceeded then it will simply become unavailable for that day.
- After a period of inactivity the instance is terminated, so the first page may take 15-30 seconds to load.
- If you really like this then run it on your local systems. It's much faster that way. A ready-to-run Docker image is available that can be used both in local mode and in Kubernetes.
With Docker installed, run
docker pull nielsbasjes/yauaa
docker run -p8080:8080 nielsbasjes/yauaa
and then open
http://localhost:8080/
Stefano Balzarotti is putting a lot of effort into porting Yauaa to run in .NET Standard.
You can track his efforts on GitHub (Yauaa .NET Standard) and download his releases via NuGet.
If this project has business value for you then don't hesitate to support me with a small donation.
Yet Another UserAgent Analyzer
Copyright (C) 2013-2021 Niels Basjes
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.