literature survey + master thesis: G-Rank learn-to-rank #5313
Note that the tragedy of the commons is even present in blockchain systems (as a consequence of full state replication). |
New Literature Survey: Autonomous (decentralized?) trading bot:
|
True autonomous AI has full freedom to act, own and profit. This literature survey is focused on ...
Links
|
Next step: ~10 sentences or so, narrowing the scope down to 3-4 directions to go in regarding the survey and prototype. (Next week) |
This is the recent breakthrough in reinforcement learning algorithms (e.g. the Asynchronous Proximal Policy Optimization (APPO) algorithm). The entire thesis goal/survey prototype could be to get this into the superapp AI bot (https://github.com/Tribler/trustchain-superapp/blob/master/README.md#ai-trading-bot), get a Tweakers post, deploy to a few hundred phones, measure, plot in the thesis, and show that it scales to millions. Each smartphone has a fixed 5-ish GByte slice of a huge dataset of historical market data, making each bot unique. This creates a very concrete focus: apply this ML technique to autonomous trading bots. Specifically, a smarter bot trading with this recent work on our decentral market; see here. Fancy results. Done. (Only fundamentally re-work this technique by distributing the shared memory and decentralising the learner to offer unbounded scalability. Simple. /sarcasm) Key research question: can we map the “Sample Factory” architecture (a high-throughput training system optimized for a single-machine setting) to the autonomous trading bot setting?
|
Hey Johan,
This looks interesting at first glance. I will start reading up on this before our meeting tomorrow. I just got back from a super spontaneous short holiday with some friends, so I have been away for the last 5 days. I did have some great discussions with my friends on this vacation (we all did our data science Bachelor together), so I think I'm narrowing down the focus. Reinforcement learning is obviously the Mecca of "proper" AI, as the theoretical ceiling on its potential is astronomical compared to (un)supervised learning, but it is of course far more complex than simpler techniques. I will share more thoughts tomorrow.
Current focus: adversarial training for trading bots, GANs, generative modelling, autonomous/online learning, q-policies, independent modelling, etc.
See you then,
Andrew
…On Sat, Jul 18, 2020 at 11:03 AM Johan Pouwelse wrote:
[image] https://user-images.githubusercontent.com/325224/87848851-1c3bc800-c8e4-11ea-8954-fe7a3521a423.png
Website: https://sites.google.com/view/sample-factory
Article: https://arxiv.org/pdf/2006.11751
Press: https://spectrum.ieee.org/tech-talk/artificial-intelligence/machine-learning/powerful-ai-can-now-be-trained-on-a-single-computer
Code: https://github.com/alex-petrenko/sample-factory
Demo: <https://github.com/alex-petrenko/sample-factory/blob/master/gifs/battle.gif?raw=true> <https://github.com/alex-petrenko/sample-factory/blob/master/gifs/duel.gif?raw=true>
|
Can you find more in this: |
5 × 5 ECTS courses left, plus the literature survey. Proposed direction:
|
Interested in quantification of activities to help inform existing financial/investment funds regarding climate change, ecological destruction, financial regulation, Pigouvian taxes and climate regulation, etc. - "making the world a better place through data?" "What would a Climate DAO look like?" Papers and literature to follow. Long-term TODO:
|
|
|
@awrgold : 'i like making trading strategies' |
ToDo: compile the superapp from source. Understand Euro Token, the liquidity pool, Taproot, and keep track of documentation/grey_literature. With a bit more background info and status understanding we can define the exact topic in 3-4 sentences. |
I hope you don't mind me using this issue as a personal bulletin board for some thoughts. Q4 just started, and I'm doing some reading today to get back up to speed. The last 12 months were a blur; I'm trying not just to get back on track but also to make sure that the topics I choose will keep me motivated for a longer period of time. I'm (re)reading a lot of the Tribler/Trustchain documentation and development history, catching up on what's already been attempted, completed, shelved, and abandoned in the past. Particularly this page. As such, here are some topics I think are useful and/or interesting. Topics of particular interest:
|
Discussion of thesis direction
|
Ultimate goal: graduate by end of Q3 2022 |
Final possible deliverable: report co-authored with Authority Financial Markets... |
Week 5 of this quarter currently. Making progress with investigation of current DAO work. ToDo:
|
Production-DAO needs a service. Smart contracts need to scale somehow. A possible idea is renting out servers and compute in an open market using Bitcoin. See prior working code and the [Plebnet report]. (An alternative is a simple BTC/ETH trading bot.) Scientific focus of the thesis: governance?!? |
Possible methodology: don't build fantasies about what a DAO could look like. Others are doing that for us 😆. Do a hands-on production DAO without corner-cutting, bottom-up and driven by a single actual service. Minimal Viable DAO: a must-have service in the machine economy of the future. Server infrastructure (e.g. Bitcoin VPS). Reversed cloud: not owned by a corporation, not owned by humans, but DAO-controlled (https://akash.network/). Another must-have alternative is a global key-value store: https://blog.sia.tech/skydb-a-mutable-database-for-the-decentralized-web-7170beeaa985, a dependency that itself requires servers.
Happening: https://www.coindesk.com/yield-guild-games-dao-funding-round-delphi-scalar |
X-mas day gift: slowdown issue! (memory or something else) Well, this is then a forced moment to clean your code. Using DAS6 would not fix your bug at this late stage. Remember me pushing for straight 45° graphs? This is why: simple, understandable, and debuggable. My other advice: start confirming which parts of the gossip work. No exponential growth. Keep it simple, you're an AI expert; our natural world is gossip load-balancing debugging. Node discovery graph, for each node in the system: how many incoming messages and how many unique other nodes discovered. Usually you see patterns with hundreds of lines (hundreds of nodes). |
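A minimal sketch of such a node discovery graph, assuming the simulator can export a per-node history of (incoming messages, unique discovered nodes) per round; the `history` structure and field layout here are hypothetical, not the actual simulator code:

```python
import matplotlib.pyplot as plt

def plot_node_discovery(history):
    """history: hypothetical dict node_id -> list of (incoming_msgs, unique_peers) per round.
    One line per node, so a healthy run shows hundreds of comparable curves."""
    fig, (ax_msgs, ax_peers) = plt.subplots(1, 2, figsize=(10, 4))
    for node_id, samples in history.items():
        rounds = range(len(samples))
        ax_msgs.plot(rounds, [s[0] for s in samples], linewidth=0.5)
        ax_peers.plot(rounds, [s[1] for s in samples], linewidth=0.5)
    ax_msgs.set(xlabel="round", ylabel="incoming messages")
    ax_peers.set(xlabel="round", ylabel="unique discovered nodes")
    fig.tight_layout()
    fig.savefig("node_discovery.png")
```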
As an extra Christmas gift my PC decided to forcibly update windows and
restarted as my simulation was approaching 70% finished, so that is great.
|
Some interesting results, when the normalization constant I don't know what the behavior is like over 10000 queries since my PC restarted, but the above is what I had from a smaller simulation. Then, if |
However, I'm not sure what you think of the color scheme above. I'm using Blue to indicate nodes that have not been infected by clicklog entries from malicious nodes, Red to indicate spam/sybil nodes, and Green to indicate non-malicious nodes that have entries from malicious nodes in their local clicklogs. I wonder if this is the most clear way to visualize this metric... |
Alright, so beyond a few optimizations I had a ton of fun learning about string interning in Python. I was doing a whole lot of string comparisons when checking if clicklog items contained the query string, and was doing it inefficiently. Hopefully this leads to a speedup - had to refactor a lot of code. |
good to hear! |
Of course - problem is, I'm basically only waiting for results to discuss at this point, so this was the major bottleneck for me. |
@synctext How does this sound as a new first paragraph of Problem Description: "Security within the domain of decentralized machine learning remains an unsolved problem. There exist numerous additional constraints in decentralized networks that traditional machine learning models need not be concerned with. Trustless, anonymous networks are rife with malevolent usership, and the task of identity verification in such networks also remains unsolved (TODO: SOURCES). Adding an additional layer of complexity, many p2p networks are built upon open-source software, affording any would-be adversary direct insight into potential attack vectors. As such, machine learning models engineered for public p2p networks require exceptional attention to detail across all facets of their design. These constraints disqualify any supervised models from the outset, as they violate the trustless nature of p2p networks: either the engineers of such supervised models must be trusted to train and validate the model, or the network participants must provide training data themselves, thereby introducing a critical vulnerability into the training process. Creating a search engine for a p2p domain that requires no training, yet converges towards optimal performance as if an error rate were being minimized in a supervised model, would constitute a major development in p2p applications." |
FWIW I learned a lot about Python object creation and destruction in the past week or so. Also, string interning is kind of a big deal when you're dealing with huge numbers of string comparisons. Who'da thunk?? I have a method where I find matches in clicklog entries by comparing strings, and I was using […]. Sped things up considerably, by a factor of about 30-50x. Comparisons are definitely O(1) now. |
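For context, a minimal sketch of the string-interning trick being described, assuming clicklog entries are plain dicts with a 'query' field; the names are illustrative, not the actual simulator code:

```python
import sys

# Interned strings with equal contents share a single object, so equality
# can be decided with a pointer (identity) check instead of comparing
# characters one by one.
q1 = sys.intern("punk classical jazz")
q2 = sys.intern("punk classical " + "jazz")   # built at runtime, then interned
assert q1 is q2                                # O(1) identity comparison

def find_matches(clicklog_entries, query):
    """Return entries whose interned 'query' field is the given query.
    `clicklog_entries` is a hypothetical list of dicts with interned queries."""
    query = sys.intern(query)
    return [entry for entry in clicklog_entries if entry["query"] is query]
```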
Interesting preliminary push-pull results. Small sim where nodes "pull" clicklog updates from 1st or 2nd degree neighbors, but spam nodes "push" to whatever nodes they're aware of. I was thinking of having 50% of the nodes accept "pushed" gossip while still pulling, and the other 50% will only pull. Could illustrate the difference, since it does seem that most get infected eventually. |
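For clarity, a minimal sketch of what one round of this push/pull mix could look like in a toy simulator; all node fields used here (`malicious`, `accepts_push`, `known_peers`, `merge_clicklog`) are illustrative assumptions, not the actual code:

```python
import random

def gossip_round(nodes):
    """One toy gossip round: benign nodes pull a clicklog update from one
    known (1st/2nd degree) neighbour; malicious nodes push their poisoned
    clicklog to every peer they know, but only nodes configured with
    accepts_push actually store a pushed update."""
    for node in nodes:
        if node.malicious:
            for peer in node.known_peers:            # push: unsolicited spread
                if peer.accepts_push:
                    peer.merge_clicklog(node.clicklog)
        elif node.known_peers:
            neighbour = random.choice(list(node.known_peers))
            node.merge_clicklog(neighbour.clicklog)  # pull: node asks a neighbour
```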
There is something I can't quite wrap my head around: The performance plateaus pre-attack, and then continues on a downward trajectory post-attack. This happens in every adversarial simulation, to some degree. I am not taking adversarial nodes' query rankings into consideration when measuring a node's ranking performance, so introducing attackers does not directly influence the score - only the gossip of poisoned clicklog entries over time. The only thing I can think of is that adversarial nodes will gossip their infected clicklogs which will eventually propagate throughout the network, and over time those poisoned clicklog entries will affect ranking scores. As you can see, post-attack there is also a plateau that occurs until a node gets lucky enough to receive an update via gossip that positively influences their performance. |
@synctext I know it's 11:59 on the doomsday clock, but I've been having some very intriguing results with the push-pull architecture such that I feel like it's actually worth integrating as a core function of G-Rank, at least within the context of the simulations. Nodes that are accepting of pushed gossip messages are converging faster towards optimality pre-attack, and then very quickly start performing much worse, whereas those that are pull-only nodes are converging at a slower rate but are largely unaffected by malicious gossip. A major reason why I'm thinking this is admittedly due to the fact that I'm dealing with some serious slowdowns in simulations that I cannot explain - it's not a memory issue actually as I've been doing quite a bit of profiling and have done a lot of optimizing. It just has a really hard time calculating similarities for a large number of nodes, which of course become aware of each other quickly during spam attacks. I'm still trying to figure this out. A preliminary side experiment I've been running is the 50% push, 50% pull experiment with a few modifications:
I've also introduced some statistical noise into the rankings to avoid the plateau seen above: there's a 50% chance per query for 2 randomly selected items in the list to swap places in the ranking. This does seem to help a bit with improving ranking over time. Push-pull simulations are really fast. I can do 100k queries in the same time as it takes the other non-push/pull simulations to do 5k queries. Extensive profiling has not shown me why this is happening. I'll have my laptop on my ski trip but I won't be working that much during the trip, just so I can check in on the simulations and maybe do some writing in the car. Let me know what you think? |
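A minimal sketch of the ranking-noise step described above (names are illustrative):

```python
import random

def add_ranking_noise(ranking, swap_probability=0.5):
    """With probability `swap_probability` per query, swap two randomly
    chosen result items so a ranking can escape the plateau."""
    if len(ranking) >= 2 and random.random() < swap_probability:
        i, j = random.sample(range(len(ranking)), 2)
        ranking[i], ranking[j] = ranking[j], ranking[i]
    return ranking
```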
quick remarks:
|
Edited for a few necessary clarifications:
I would assume, then, that the Push vs. Pull Experiment is not a comparison across all attack types - instead, we choose one attack (probably Targeted Sybil) and perform a Pull-based experiment, and then compare the results against the regular Targeted Sybil attack, which is under a Push scheme. Currently, the writing is as follows:
As such, the fact that only malicious nodes push 2 messages whereas benign nodes push only 1 is somewhat confusing to me, as it adds extra dimensions to the experiment and also means that malicious nodes have undermined the source code to modify the gossip scheme, which in itself is a pretty major attack vector. I feel like only one of the following should be true:
Does this make sense?
Do we want to change this for all attacks? Right now they're joining at 25% and begin attacking immediately.
Do we change this Persistent Sybil attack scenario into an Epic Attack scenario? Or is "Epic Attack" a new kind of attack that we're also introducing? |
The latest figure also shows that after 10 'rounds' of attacks the whole network is polluted and essentially destroyed #5313 (comment)
Always the smart, secure option: pull. In just one experiment subsection you can elaborate on the push architecture and its consequences.
that sounds like a good storyline
In any distributed system each node can deviate in a Byzantine manner from the protocol. So this experiment (could) explore the rate control that is done in push versus pull. With a push architecture an attacker can send messages twice as fast. It is very understandable that a deliberately simple simulator only supports fixed message speeds. Feel free to explain that within your text and 'hack' it by letting honest nodes send empty messages with 50% probability (a minimal sketch of this hack follows after these remarks). "For reasons of keeping our code simple, reliable, and correct we use a single global message speed for both attackers and honest peers. Attackers use each message to attack, but honest peers obtain a lower effective messaging speed. With 50% probability they send an empty message."
Yes! Except, that pull should be the default. Push has bad security, you can easily flood networks with 1 GBit per second of spam.
All figures. My intuition says that your results will be easier to interpret and understand. Avoid two things happening at once. But again, it's your thesis to write. I'm just paid to help and advise :-)
Yes, that sounds like a more interesting experiment. 100 honest, 300 attackers or so. |
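As referenced above, a minimal sketch of the proposed 'empty message' hack under a single global message speed; the node interface is hypothetical:

```python
import random

def next_message(node):
    """Single global message speed for everyone: attackers use every slot
    to attack, honest peers waste the slot (send an empty message) with
    50% probability, halving their effective messaging rate."""
    if node.malicious:
        return node.make_attack_message()
    if random.random() < 0.5:
        return None                      # empty message: slot deliberately unused
    return node.make_gossip_message()
```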
|
Reminder, though, that purely pull-based gossip means that nodes will never learn about the network beyond the nodes they are already aware of, so we'd need some kind of propagation method. I think this is why I was doing the 50% push, 50% pull simulation - it allowed nodes to become more aware of the network, at the risk of being more susceptible to attack. Just so that we're 100% clear: we're using pull-based gossip as the default setting for ALL simulations (except the push vs. pull comparison experiment), which means that I'll need to re-write the gossip section of the thesis. Correct? @synctext Also, I realized that with pull-based gossip, the attackers will perform their attack but then need to lie in wait until someone requests an update from them, which may dramatically slow them down. If we have attackers push, then all adversarial simulations involve a push architecture. How do we handle this? |
|
Found another problem with my data from the experiments, will have to run them again because they were truncating values for some reason. Will upload graphics ASAP. Update: Yet another bug found, only nodes that received gossip REQUESTS were updating during gossip, and not the recipient of the gossip itself, and therefore could not discover newly bootstrapped nodes, so attacks never happened. Fixed, running again. |
Ah, I forgot a question that I think is somewhat important: We are doing bootstrap at 25%, attack at 75% - however, this leaves little time to see the longer-term effects of the attack. What if we did bootstrap at 25%, attack at 50%? |
Baseline simulation does plateau relatively quickly, but some nodes start to move downwards over time. Initial guess is that this is either:
Not that I will prioritize this, but the baseline sim is super fast so I can do a sim 10x as long (or increase the number of request messages) to see what the culprit is. More updates coming after fixing the issue mentioned here: #5313 (comment) |
PUSH: Targeted Sybil Attack with Push gossip scheme. Within 10 rounds of the attack, the entire network converges and plateaus. I'm trying to think of a better way to visualize the infected nodes alongside the sybil nodes, but I don't want to offset them so you can see that the green infected nodes are behind the sybil nodes. I also don't want the sybil nodes obscured by the infected nodes. I guess we can just describe in the figure text that sybil nodes overlap the infected nodes. |
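One possible way to keep the sybil and infected populations both visible without offsetting them would be to play with transparency and draw order; a minimal matplotlib sketch, with hypothetical score dictionaries as input:

```python
import matplotlib.pyplot as plt

def plot_targeted_sybil(clean, infected, sybil):
    """clean/infected/sybil: hypothetical dicts node_id -> per-query score.
    Transparency plus draw order keeps the green infected curves visible
    even where red sybil curves overlap them, without offsetting either group."""
    fig, ax = plt.subplots()
    for scores in clean.values():
        ax.plot(scores, color="tab:blue", alpha=0.4, linewidth=0.8, zorder=1)
    for scores in infected.values():
        ax.plot(scores, color="tab:green", alpha=0.6, linewidth=1.5, zorder=2)
    for scores in sybil.values():
        ax.plot(scores, color="tab:red", alpha=0.5, linewidth=0.8, zorder=3)
    ax.set(xlabel="query", ylabel="ranking performance")
    fig.savefig("targeted_sybil_push.png")
```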
Literature Review and comments
|
Ars Technica: Massive Yandex code leak reveals Russian search engine’s ranking factors. Material for future students? |
🚀 GRADUATED 🚀 |
The above master thesis topic is also available as a BSc final thesis project for 5 students in Q4.

Create WEB3 search engine from scratch - De-Google The Internet
Numerous initiatives have tried to compete with the success of Google; none succeeded. Using the latest scientific insights you will design a fully decentralised alternative to the Google search engine.

Literature and background
First, read the Wikipedia entry on decentralised search engines. Your key literature for this task is the Delft paper which currently defines the state-of-the-art: G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network. A more introductory blog post covers learn-to-rank and dataset details. Early work from 2005 by Delft provides a simple and realistic experimental approach to the problem of media discovery and recommendation; you are required to understand the basic algorithm of semantic clustering (e.g. taste buddies). A paper from 2012 proposes a model where mobile phones use gossip learning to compute a linear model, without revealing local models and without storing the full data set. Another classic attempt from 2003 onwards is the decentralised YaCy search engine with a web crawler, complex hashing, and a reverse word index.

Problem Description
PageRank is the defining centralised algorithm of Google. Understand existing algorithms and architectures for decentralised search engines. Understand the state-of-the-art algorithm within this area: G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network. Contribute to the first hands-on proof-of-principle implementation of G-Rank.

Five research questions
See the initial placeholder implementation for keyword search already present. The following 5 sub-questions will be assigned to a single student each:
1. Dataset engineering. How to design and implement a user model which issues queries and selects one entry from the presented search engine results. Desired outcome: one search that your model issues will be for the unique words "Punk classical Jazz". This deliberately strange query must select one of the matching musical pieces marked with all three of these tags. Required scientific literature for reading: "A user browsing model to predict search engine click data from past observations". For the experimental part, use the existing scrape facility to duplicate the Creative Commons music from Pandacd. Critical input for the learn-to-rank algorithms is the popularity of each song or artist. Enhance your model with an existing dataset of 1.4 million artists along with their popularity estimates.
2. Centroids-based clustering. Design and implement semantic clustering of the dataset. Based on the metadata, tags, and popularity you enhance the dataset and user model with Euclidean, Minkowski, Manhattan, or other distance-measuring mechanisms (a small distance-metric sketch follows below). Required background reading. Based on this work you will generate representative taste profiles (e.g. ClickLogs).
3. Decentralised taste exchange. How to design and implement an overlay network to disseminate taste. Your task is to create a detailed design of the ClickLog exchanger within the G-Rank algorithm. As a starting point within the literature, read "Random Walks on the Click Graph".
4. Accurate learn-to-rank. How to design and implement unsupervised learn-to-rank heuristics with decent response time, at the cost of minor precision and recall. Your results need to appear within 2 seconds, therefore a reasonable amount of computation is allowed. Background literature: "Real time search on the web: Queries, topics, and economic value".
5. Real-time learn-to-rank. How to design and implement unsupervised learn-to-rank heuristics with fast response time, at the cost of significant precision and recall. Your results need to appear within 100 ms; therefore very little computation can be performed and pre-calculated indexing techniques must be used. Background literature: "Real time search on the web: Queries, topics, and economic value".

Note: this is a challenging assignment that requires thorough understanding of specific scientific literature and the ability to engineer algorithms. Recommended for Honor Students only. |
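For the centroids-based clustering sub-question, a minimal sketch of the distance machinery, assuming taste profiles are plain tag-count vectors (purely illustrative, not part of the assignment code):

```python
import numpy as np

def minkowski(a, b, p=2):
    """Minkowski distance between two tag-count vectors;
    p=1 gives Manhattan, p=2 gives Euclidean."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))

def assign_to_centroids(profiles, centroids, p=2):
    """Assign each taste profile to the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda c: minkowski(profile, centroids[c], p))
            for profile in profiles]
```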
State of the art by Google: https://ai.googleblog.com/2023/06/retrieval-augmented-visual-language-pre.html "A naïve solution for encoding a memory value is to keep the whole sequence of tokens for each knowledge item. Then, the model could fuse the input query and the top-k retrieved memory values by concatenating all their tokens together and feeding them into a transformer encoder-decoder pipeline. This approach has two issues: (1) storing hundreds of millions of knowledge items in memory is impractical if each memory value consists of hundreds of tokens and (2) the transformer encoder has a quadratic complexity with respect to the total number of tokens times k for self-attention. Therefore, we propose to use the Perceiver architecture to encode and compress knowledge items." https://www.deepmind.com/publications/perceiver-general-perception-with-iterative-attention |
Direction changed, txt will be updated soon.
Old stuff: