Analysis of β version (November 2018)
After two years in beta, this page analyzes the code line by line, to assess how to engineer the next rewrite. For each file, it describes what the code does and what it should do in the future. The code analyzed has been tagged as the last beta release.
lib/accesslog.js
- creates an object ({}) to save an interaction with the web page
- appends it to the DB
- we should use the nginx logging system and its ecosystem, which would give us better analytics
- we should anonymize the log at the nginx level
lib/alarm.js
- this experiment was used to record application and system issues and to bring them to the admin's attention
- we should use https://www.npmjs.com/package/winston to manage log messages and to talk with ELK
lib/analysis.js
This file includes three discontinued functions used to research some of the realityCheck functionalities.
- called by /api/v1/user/presence/$consideredPostsNumber/$userId/[d3|json]
- discontinued because userId should not be used anymore in API calls
- the experience here should be thrown away (the timelineToPresence function is an awful data-manipulation hack: it spreads timeline.startTime in chunks of 5 minutes; questions updated on #52)
- it should be replaced with https://github.com/tracking-exposed/facebook/issues/52
- called by /api/v1/user/absolute/$consideredPostsNumber/$userId/[d3|json]
- it is broken: the postInAnalysis core function reads from timeline when it should read from htmls
- it does not work anymore unless the user provides a verified collection of 100% of the posts from a specific page. This can't be addressed in the 1.0 MVP.
- it is broken: the postInAnalysis core function reads from timeline when it should read from htmls
- nothing should be saved from this experiment; it was intended to show that the timeline is not chronological. This can be a nice visualization when the timeline is replayed to the user, but it would be done on the visualization side.
lib/c3convertors.js
This file contains helpers that convert a list of JSON objects into the data format readable by c3. These functions could be implemented on the client side.
- three different functions heavily tuned around https://c3js.org/ format
- they are useful and can be reused
- experiment with the c3 updating mechanism, and with how it scales on mobile and desktop
- express clearer research questions and exploration methods around the graphs
- find how to save the generated SVG to a file
- extend the client-side functions in sections/webscripts/stats.js
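As an illustration of the kind of conversion described above, here is a minimal sketch that turns a list of flat JSON objects into c3's `columns` layout, where each column is an array whose first element is the series name. The function name `toC3Columns` and the sample fields are hypothetical; they are not taken from lib/c3convertors.js.

```javascript
// Hypothetical helper: turn a list of flat objects into c3 "columns".
// Each column has the shape [seriesName, value1, value2, ...].
function toC3Columns(rows, keys) {
  return keys.map((key) => [key].concat(rows.map((row) => {
    // Missing values become null, so every column keeps the same length.
    return row[key] === undefined ? null : row[key];
  })));
}

// Sample data, invented for the example.
const rows = [
  { date: '2018-11-01', posts: 12, photos: 3 },
  { date: '2018-11-02', posts: 7 },
];
const columns = toC3Columns(rows, ['date', 'posts', 'photos']);
// On the client, the result could be passed to c3 as:
// c3.generate({ data: { x: 'date', columns: columns } })
```

Since the function is pure and has no dependency, the same code could run on the server or in the client-side scripts.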
lib/htmlunit.js
- a call to retrieve a post unit (html and metadata), used in the revision page
- the revision page UX should be improved, but the concept of retrieving the full content of an html.id when you know the ID should remain the same
- no change on the backend side; it only gets updated to the new format. If the HTML starts to be stored outside of the DB, retrieving the snippet would become a CPU-intensive operation. If the HTML is compressed, it gets decompressed here (unless this operation is handled transparently by the DB, which I heard is possible, #93)
lib/integration.js
- this API was implemented to export only the text snippets to this experimental codebase: tracking-exposed
- this snippet extracts an anonymized and reduced set of the posts (all the posts containing text)
- this logic should be applied to do the semantic analysis
- this can be used again as the integration tool, but the difficulty is making the next experimental codebase work properly. This is an example of a reduction/anonymization API which extracts data.
- it should be integrated in a wrapper able to seek the proper time window, instead of keeping the last returned result cached
lib/mongo.js
- this is just a wrapper around mongodb using bluebird promises; the abstractions are OK (they could eventually be used on PostgreSQL too), but the wrapper shows some limits in managing exceptions and retries
- it should be refactored with a more coherent pattern
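One possible way to handle exceptions and retries more coherently is a small generic wrapper around any promise-returning operation. This is only a sketch under assumptions: `withRetry` and its `attempts` and `delayMs` options are hypothetical names, and it uses native promises rather than bluebird.

```javascript
// Hypothetical retry wrapper for any promise-returning operation.
// It tries up to `attempts` times, waits `delayMs` between tries,
// and rethrows the last error if every attempt fails.
async function withRetry(operation, { attempts = 3, delayMs = 100 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Do not wait after the final failed attempt.
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

A database call would then be wrapped as `withRetry(() => someQuery())`, where `someQuery` stands for any function of the wrapper that returns a promise.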
lib/onboarding.js
- this library supported a functionality that is now dead
- it should be removed
lib/opendata.js
- metaexp is a function used during the community research. There is a quite flexible API which targets a specific campaign. This API might represent an excessive sharing functionality and should be discontinued ASAP.
- queryContent instead lets you query, by ID, a timeline, an impression, or an html
- getNodeNumbers is a function to count the number of elements
- getNodeNumbers should become part of a cached stats
lib/params.js
- a small helper to pull an argument as a string or an integer: it should be used
- it should be effectively used
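To make the idea concrete, here is a minimal sketch of such a helper. The names `getString` and `getInt` are hypothetical, and `req` is assumed to look like an Express request carrying a `params` object; the real lib/params.js may differ.

```javascript
// Hypothetical helpers in the spirit of lib/params.js.
// `req` is assumed to have the shape { params: { name: 'value', ... } }.
function getString(req, name) {
  const value = req.params[name];
  if (typeof value !== 'string' || value.length === 0) {
    throw new Error(`Missing parameter: ${name}`);
  }
  return value;
}

function getInt(req, name) {
  const value = getString(req, name);
  // Accept only plain decimal digits, so input like "12abc" is rejected.
  if (!/^\d+$/.test(value)) {
    throw new Error(`Parameter ${name} is not an integer: ${value}`);
  }
  return parseInt(value, 10);
}
```

Using these consistently in every API handler keeps validation in one place, instead of repeating ad hoc parsing per route.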
lib/parser.js
- it holds all the functionality used by the beta-version chain of parsers
- it should be removed; no code here deserves to be saved
lib/performa.js
- it was an experiment to measure the performance of mongodb queries
- use the proper mongodb function which logs the queries taking more than 100ms
- use a proper method for node profiling
- check this out https://github.com/uber-node/node-flame
lib/personal.js
- A small part described here https://github.com/tracking-exposed/facebook/wiki/Personal-API-restrictions
- two queries on htmls only, with different time windows (one hardcoded to 0, the other lower than a specific time)
- a complex function to convert to CSV (which includes impressions and htmls)
- dietBaasic, an html query with two groupings at API level
- it should be redesigned using a proper wrapper to query the data and transform it
lib/realitymeter.js
- it just picks some content from the 'reality' table, which is a reduction computed every hour
- it should be discontinued in the 1.0 MVP
lib/reducer1.js
- a function created during the Italian 2018 analysis; it extracts content starting from timelines, then impressions, then html. This complexity is due to the poor database design, and the next version would remove this need.
- interesting: take into account the password protection and the campaign specifications
- interesting: keep the experience of timezone shifting
- requirement: have a parser which extracts groupName and pageName
- it contains the fb reactions mapping; all these functions should run at acquisition time rather than extraction time
- it should be removed entirely
lib/research.js
- distinct is a function that should be preserved for admins
- the stats and data functions are pretty useful and documented here: https://github.com/tracking-exposed/facebook/wiki/Researchers-API-and-how-to-download-the-data
- it should be changed to use the new metadata format; this is not efficient
lib/saveCSV.js
- mixing impressions and htmls, selecting the fields to be saved.
- we should let the field list be configurable by the user
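A user-configurable field list could look like the following sketch, where the caller passes the fields to export. The function name `toCSV` and the sample fields are hypothetical and not taken from lib/saveCSV.js.

```javascript
// Hypothetical sketch: serialize objects to CSV using a caller-supplied field list.
function toCSV(rows, fields) {
  const escape = (value) => {
    if (value === undefined || value === null) return '';
    const text = String(value);
    // Quote values containing commas, quotes or line breaks; double inner quotes.
    return /[",\n\r]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const header = fields.map(escape).join(',');
  const lines = rows.map((row) => fields.map((field) => escape(row[field])).join(','));
  return [header].concat(lines).join('\n');
}

// Only the requested fields are exported; `internalId` is left out.
const csv = toCSV(
  [{ postId: 1, text: 'hello, world', internalId: 'x' }],
  ['postId', 'text']
);
```

Fields missing from a row are written as empty cells, so rows mixing impressions and htmls can share one header.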
lib/selector.js
- it returns the current facebook selector, usable by the client
- it returns the access token for the owner of the publicKey
- the selector component should remain the same
- the authentication should be stronger
lib/staticpages.js
lib/stats.js
- statistics on user usage; should be kept
- statistics on the hourly activity reduction. We should rewrite this, keeping in mind the use cases of botrex, ELK, static pages on stats.
- getEngagemnt should be kept; the stats would change according to the new reduction
lib/tagId.js
- some code in common with research1.js, a reduction+aggregation of impression+html
- the getTimeline function, which is used in lib/research.js, should be merged in view of the new db format
lib/timeutils.js
- high-level functions to create mongodb time window
- a robust pattern to build the mongodb queries should be planned (for example, "from 3 hours ago until now" should return the mongodb syntax { timeDate: { $lt: new Date(xxx), $gt: new Date(yyy) }} ), since we would have many time-dependent APIs
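A minimal sketch of such a pattern, assuming the time window is always expressed as "the last N hours" and using MongoDB's standard `$lt` and `$gt` comparison operators. The function name `lastHours` and its options are hypothetical; the `timeDate` field name comes from the example above.

```javascript
// Hypothetical helper: build a MongoDB filter for "from `hoursAgo` hours ago until now".
// `field` defaults to `timeDate`; `now` can be injected to make the result testable.
function lastHours(hoursAgo, { field = 'timeDate', now = new Date() } = {}) {
  const start = new Date(now.getTime() - hoursAgo * 60 * 60 * 1000);
  return { [field]: { $lt: now, $gt: start } };
}

// Example with a fixed reference time, invented for illustration.
const filter = lastHours(3, { now: new Date('2018-11-30T12:00:00Z') });
// filter can be passed as the query argument of a find() on a collection.
```

Keeping the reference time as a parameter means every time-dependent API can share the same builder and still be tested deterministically.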
lib/utils.js and lib/various.js
- they are a collection of libraries still useful in the system
- everything remains until it becomes obsolete
lib/version1.js
- this is a technical file containing the list of implemented APIs; it was part of the beta version pattern. There is no need to keep it at the moment.