JSON/Database Design #2
User Database Design
For the user database, most of what we currently do can simply be migrated over; the basic design hasn't changed much. For Eos, we need to store information about:
pyCrest
Crest data is special because any of it can be dynamically updated, and it might be user data, it might be market data, it might be static data, or it might be something else entirely. Do we want to simply use pyCrest to build/update JSON files, and then read the JSON files directly with Flask?
This is certainly an idea. However, SQL gives us the benefits of SQL queries (while SQLAlchemy gives us the benefits of an ORM). These are very powerful, and without them we would have to load the data into memory and loop over it to find the information we are looking for. This compounds the more related data you load for a record, whereas with a SQL ORM it's quick and easy. I would much prefer to stick with a SQL database and not use plain old JSON as the main data store.
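To make that concrete, here's roughly the difference I mean, as a minimal sketch (the file, table, and column names are hypothetical, not our actual schema):

```python
# Sketch only: hypothetical file, table, and column names.
import json

# Plain JSON: load everything into memory and scan it in Python.
with open("invtypes.json") as f:
    types = json.load(f)
frigates = [t for t in types if t.get("groupID") == 25]

# SQLAlchemy ORM: let the database do the filtering.
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class InvType(Base):
    __tablename__ = "invtypes"
    typeID = Column(Integer, primary_key=True)
    groupID = Column(Integer)
    typeName = Column(String)

engine = create_engine("sqlite:///staticdata.db")
Session = sessionmaker(bind=engine)
session = Session()
frigates = session.query(InvType).filter_by(groupID=25).all()
```

The win gets bigger once you start pulling in related records (attributes, effects, and so on) instead of looping over several JSON files by hand.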
My thought was this: we don't ship the JSON files, but instead simply ship the cache. Not only that, but I believe (could be mistaken, haven't looked in a while) that EOS can also read data from a SQLite database (and can probably be finagled to read from a MySQL database as well), so for the purposes of pyfa it would be JSON dump of data > SQL database. This is shipped, and upon first launch of the application Eos takes the time to cache the data from the SQLite database.

Another thing to note: while I have been adamant about not shipping the SDE with pyfa (and still am, considering it contains data we just don't need), we ought to consider at least being compatible with the SDE schema. I don't yet know the implications of this moving forward (with respect to updates / handling what is essentially data that isn't explicitly shipped with pyfa), but it could allow third parties that use the pyfa server to simply point it at their current SDE implementation.

But then again, that would introduce a disconnect: if they upgrade the SDE but not pyfa, pyfa is broken; if they upgrade pyfa but not the SDE, pyfa is broken. So scratch that idea, I think we ought to explicitly point to pyfa-supported data only. This opinion is supported by the fact that we may be using CREST data that the SDE simply doesn't support, so having our own schema that we control is in our best interests (and it would also alleviate versioning issues).
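Circling back to the "JSON dump of data > SQL database" step, a minimal sketch of what I mean, assuming a simple dump format (the file name, table, and columns are made up):

```python
# Sketch: turn a dumped JSON file into the SQLite database we ship.
# File name, table, and columns are hypothetical.
import json
import sqlite3

with open("invtypes.json") as f:
    rows = json.load(f)  # assume a list of dicts: typeID, groupID, typeName

conn = sqlite3.connect("evedata.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS invtypes ("
    "typeID INTEGER PRIMARY KEY, groupID INTEGER, typeName TEXT)"
)
conn.executemany(
    "INSERT OR REPLACE INTO invtypes (typeID, groupID, typeName) VALUES (?, ?, ?)",
    [(r["typeID"], r["groupID"], r["typeName"]) for r in rows],
)
conn.commit()
conn.close()
```

This would run once at build time; the resulting database is what gets shipped, and Eos builds its cache from it on first launch.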
I was also just thinking - since we're going to (eventually) support remote connections, we need to discuss the way data is transferred, and which data is transferred. With a web application, it's pretty simple: the page is static, and it will get ALL information (both user and eve data) from the server. But, branching out to a desktop application, there are two scenarios:
This is something to think about, but it doesn't need to be solved immediately. :)
Eos won't like this. Every time Eos runs, it checks to see if the JSONs match the cache. If they don't, it updates the cache. I moved the JSON files (didn't change them or anything, just their location), and it triggered a rebuild of the cache. I then tested moving the files and leaving the cache, and it bombs out straight off. So we're going to need JSON or possibly SQLite (and JSON is more web-app friendly than SQLite is, I think).
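For anyone following along, the check is conceptually just a fingerprint comparison, something like this (an illustration of the general pattern only, not Eos's actual code; the paths and cache layout are made up):

```python
# Illustration only: a generic "do the source JSONs still match the cache?" check.
# Not Eos's actual implementation; paths and cache layout are hypothetical.
import hashlib
import json
import os

def fingerprint(json_dir):
    """Hash the contents of every JSON file in the directory."""
    digest = hashlib.sha256()
    for name in sorted(os.listdir(json_dir)):
        if name.endswith(".json"):
            with open(os.path.join(json_dir, name), "rb") as f:
                digest.update(f.read())
    return digest.hexdigest()

def cache_is_stale(json_dir, cache_path):
    """True if the cache is missing or was built from different JSON data."""
    if not os.path.exists(cache_path):
        return True
    with open(cache_path) as f:
        cache_meta = json.load(f)
    return cache_meta.get("fingerprint") != fingerprint(json_dir)
```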
The SQLite connection seems to be hard-coded in; it's not using SQLAlchemy or anything similar. So switching it over to support MySQL, MSSQL, PostgreSQL, DB2... is not quite as easy as it would be if SQLAlchemy (or similar) were used. The reading from JSON/SQLite is only used to build the cache anyway. Also, Eos will only read from the cache (@regner actually already looked into this, since he'd prefer it not use the cache for web deployments). It's not going to be easy to rip that out (and probably not worth the time).

We should change how we pull data in to create the cache, I think, because web deployments probably won't want to use JSONs. I mean, they can, but it'd be better to use SQL. But that's a future thing. I think that Eos shouldn't be pulling from SQL directly anyway; if we can point it back to Flask and let Flask handle it through SQLAlchemy (or JSON files), then we have a single point for our data source rather than multiple points. But right now we're stuck with what we have, I think. It's just not worth fiddling with right now, since it'll be fully functional without changing it.
Sorry for quoting a wall of text, but you've got a lot going on here. We have different use cases in play, and I think you're lumping them all in together.

For Eos, we're stuck with whatever structure Kadesh set up, which seems to just be what gets dumped by your little tool. That doesn't exactly match what's in the SDE, but that doesn't really matter, because Eos is going to convert it to a cache anyway (and does a massive amount of cutting, which is why we go from over 100 MB of JSONs to 172 KB... that number seems WAY too small, but that's another topic). Now, I did run a quick test: grabbed some item data off Crest and dropped it into a JSON file. Deleted the cache, and Eos recreated it. So it seems that Eos does support Crest-style data, but obviously that needs to be fully tested.

For pyCrest, most (if not all) of the data that we will pull down will be in a format similar or identical to the SDE. @regner took a look, and the majority of the data Eos uses can be pulled from Crest. So assuming that we want to use Crest data (and there's no real reason not to), we should match that format wherever possible. If we go all in on supporting Crest, and matching that format rather than rolling our own, then it doesn't take an entire rebuild of Pyfa just to get new data. That's a huge load off the developers, and in a hit-by-a-bus scenario Pyfa will keep functioning rather than going the way of EFT (which is over 6 months out of date now). You don't want that, now do you? :)

Okay, so far we have:
So we're down to the UI. First off, mostly we just need the invtype table/JSON. Everything else is pretty much just linking things together, and is pure IDs, and can be whatever structure we want. And going back to my earlier statement:
We already have at least 3 people lined up who are wanting a RESTful API (is Flask done yet @regner ?). They don't care about the UI, and probably don't even want it included in the package (but that's another topic). They aren't going to want to be limited to waiting on Pyfa to release an update, just so they can run it as a service. They're going to want to be able to update it on demand, and I think it's a huge mistake to work against what is potentially a very big user base. (Think of how many users Pyfa will have if squizz implements it on zkill!)
Not necessarily. Just like we now actually have separation between Eos and the GUI, we should have separation between our code and the data. We should validate and clean our inputs and outputs (especially since we're adding an HTTP API, we don't want any Little Bobby Tables here) and should throw exceptions on failure to prevent GIGO. Even if we didn't do all that (which, again, I think we should), the vast majority of the time (like 90%+) there's nothing in the data that will break something. Most of the time it'll be functionality that's not implemented (net new things) rather than modified old functionality that breaks things.

To use an example here, take the Citadel patch. Assume for the sake of argument that we simply treated Citadels like ships, so they were visible as soon as the Crest endpoints updated. Joe User submits an issue: "When I try to open a Citadel to fit it, Pyfa throws an error that there was a problem." Nothing else is broken, and Joe User can use Pyfa just fine. Now let's fast-forward a few months, and the mining barge rebalance happens. Squizz triggers an update of the database/JSON. All the mining barge stats are updated, and everything works. Unfortunately the new PANIC module doesn't do anything (you can equip it, and it has stats), because CCP added new functionality. But he doesn't have to wait for a new release of Pyfa to get new data.

TL;DR: For the backend service we should use a data format that matches Crest. If the data is not available in Crest but is in the SDE, we should match the SDE. If it's not available in either, then we do the best we can. For the UI it doesn't matter; we're in control of it, so we can use whatever we want.
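By "validate and clean our inputs" I mean something as simple as this at the API boundary (the field names and shapes are invented for the example, not our actual fit format):

```python
# Sketch: reject malformed input at the API boundary instead of letting it
# propagate into Eos. Field names and shapes are invented for illustration.

class ValidationError(Exception):
    """Raised when a submitted fit payload is malformed."""

def validate_fit_payload(payload):
    if not isinstance(payload, dict):
        raise ValidationError("fit payload must be a JSON object")
    ship_id = payload.get("shipTypeID")
    if not isinstance(ship_id, int) or ship_id <= 0:
        raise ValidationError("shipTypeID must be a positive integer")
    modules = payload.get("modules", [])
    if not isinstance(modules, list) or not all(isinstance(m, int) for m in modules):
        raise ValidationError("modules must be a list of integer type IDs")
    return {"shipTypeID": ship_id, "modules": modules}
```

Anything that fails gets a clean error back to the caller instead of garbage going into the calculations.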
Which data are we talking about here? :)
Why does it have to be different? Setting aside the genuine web-app scenario (most of those users will be ditching our UI and just connecting to the service directly), we have basically 2 platforms and potentially 2 scenarios. For our platforms, we want to support desktop (Windows/Linux/Mac) and mobile (Android/iOS). For our scenarios, ideally we should support local (service running locally) and remote. Our UI should be completely separate from the backend service. For local use on Windows/Linux/Mac/Android this is trivial: the backend is run as a service, and we launch the GUI, which connects to the service via RESTful HTTP. (iOS I'm not sure of.) Now, it'd be really slick if the UI could connect to the backend service running on another computer. So I install Pyfa on my desktop, then on my Android phone plug in my computer name/IP, and away I go. But we're still just going to connect via a RESTful HTTP connection. The only thing different is that it points to 192.168.0.201 instead of 127.0.0.1.
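In other words, the UI client only needs a base URL, something like this (the endpoint path and port are placeholders):

```python
# Sketch: the UI talks to the backend the same way locally or over the LAN.
# The endpoint path and port are placeholders, not real routes.
import requests

class PyfaClient:
    def __init__(self, host="127.0.0.1", port=5000):
        self.base_url = f"http://{host}:{port}"

    def get_fit(self, fit_id):
        resp = requests.get(f"{self.base_url}/api/fits/{fit_id}", timeout=10)
        resp.raise_for_status()
        return resp.json()

local = PyfaClient()                        # service running on this machine
remote = PyfaClient(host="192.168.0.201")   # desktop elsewhere on the LAN
```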
On a modern network the amount of data we're transferring is trivial. If we were talking about gigs/s, or megs/s, or even just hundreds of KB a second I'd agree with you, but even if we transferred all of the raw JSON files we're talking about a MASSIVE 100 megs (seriously though, 100 megs is not large). Add in the user data, and we're probably up to 101 megs (yeah, the user database is tiny). If we're transferring what Eos stores in its cache, then we're talking about well under a meg of data (maybe a full meg if we dump the entire user DB). Realistically we will have a tiny amount of data to transfer, so doing it live over the wire is no big deal. Additionally, we will also need to cache it locally in Angular/React anyway, so we can properly display the trees and whatnot. So it's pretty simple: when the app launches, request the data from Flask, cache it locally, and away you go. It might add a few milliseconds to the first load time, but it's not going to be much. As long as we don't have users on 56k modems, we'll be fine. :) Okay, that's a large enough wall of text.
I think a lot of this has been cleared up in Slack discussions, namely the fact that it's inadvisable to have EOS loop back and talk to our API endpoint to get the data it requires.
This is most certainly broken. EOS was designed specifically for the output defined by Phobos, which is how the data is laid out in the EVE cache. CREST certainly has the same fields, but EOS by no means supports CREST. I'd rather it remain that way, TBH. EOS needs one thing: data. That data currently comes from JSON or SQLite (which can be branched out to MySQL later on). When I say CREST dumper, I mean we should have a dumper that produces the SQLite database that is loaded into EOS, rather than have EOS get data directly from CREST. This gives us the benefit of only having to do one dump to compile the EVE data source, which EOS then uses to compile its cache.
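To spell out what I mean by "dumper": a standalone build step that pulls the data down and writes the SQLite database EOS loads, roughly like this (the URL, response shape, and schema are placeholders, not real CREST paths):

```python
# Sketch of a standalone dumper: fetch type data from an HTTP source and write
# the SQLite database that EOS reads to build its cache.
# The URL, response shape, and schema are placeholders.
import sqlite3
import requests

def dump_types(endpoint, db_path):
    rows = requests.get(endpoint, timeout=30).json()["items"]
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invtypes ("
        "typeID INTEGER PRIMARY KEY, groupID INTEGER, typeName TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO invtypes (typeID, groupID, typeName) VALUES (?, ?, ?)",
        [(r["id"], r["groupID"], r["name"]) for r in rows],
    )
    conn.commit()
    conn.close()

if __name__ == "__main__":
    dump_types("https://example.invalid/types", "evedata.db")
```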
Uh, no? We also need categories, market groups, attributes, effects... pretty much everything in the eve database is needed by the UI in one form or another.
My priority is what pyfa itself needs as an application. However, if a third party wants to host a pyfa server (not Eos), then I think it's best to develop our own schema that works best for us and pyfa the application, rather than try to support the SDE as well, considering our data may be coming from multiple sources (SDE, CREST, any new data source down the road).
It's just an efficiency thing, certainly not something we need to be concerned about at the moment.
I'm not sure I follow you. The EOS cache will not help in any user interaction. It's literally a bunch of integers compiled in a way to make lookups fast and calculations faster.
This was mostly me thinking of Flask more like Django, where it's a full-fledged framework instead of simply an HTTP endpoint. In that context it doesn't make sense to have Eos talk to Flask for data, but I'm still of the opinion that it's beneficial to us, for many reasons, to have centralized management of our data connections. But we can look at that as it comes up.
The eos-http package is a very simplified version of (or possibly a precursor to) what we need, which our UI will also use. Since our UI will just be doing RESTful queries to pull data (assuming we go with Angular/React/similar), it's all just pure HTTP API communication. Or am I misunderstanding how you wanted to design it? I see developers using this in a few potential ways.
If our backend service can point to any data source, as well as being integrated with Crest (what you keep calling the "crest dumper," which sounds like something that involves toothpaste and a porta-potty), then it doesn't really matter what format the data comes in; we'll translate it to whatever we need, and all of that is transparent to the developer using our framework. They feed a fit into our API, black-box magic happens, and they get an output with stats. I just want the setup to be as simple as possible: for us as developers, for users of the whole Pyfa product (UI and back end), and for third-party developers like Squizz/Tamber.
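As a sketch of that black-box surface (the route, payload shape, and stat names here are invented; this isn't the real eos-http API):

```python
# Sketch of the "feed a fit in, get stats out" surface. The route, payload
# shape, and stat names are invented for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/api/fits/stats", methods=["POST"])
def fit_stats():
    fit = request.get_json(force=True)
    # This is where the real service would hand the fit to Eos and let the
    # black-box magic happen; here we just echo back placeholder numbers.
    return jsonify({
        "shipTypeID": fit.get("shipTypeID"),
        "ehp": 0,
        "dps": 0,
        "capStable": False,
    })

if __name__ == "__main__":
    app.run(port=5000)
```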
No problem with this
No problem with this, but I don't consider this to be a pyfa service since it would only rely on Eos, not on pyfa-specific things. A pyfa service, to me, would be something that requires our server and our setup (database schema). For third parties, this is simply an EOS service that they have an endpoint to. I'm being nitpicky with the terminology :P
This would basically be used on alliance websites so that they can self-host their own fitting thing. I'm not sure where this discussion is going since it seems that we are in agreement on most things. This is still confusing though:
Again, my thought for the flow of data is this:
That would be done during the build and deploy process. This allows us to have a SQL database with the data that can be used by EOS to create its cache, and also used by the framework to support the other functions (serving the user-type data and whatnot).
Also, EOS does support multiple database versions. This is what the SourceManager is for (I think that's what it's called). It would allow us to support, say, the latest patch, but also other EVE data versions (so you can see how a fit has changed with respect to a patch). Osmium has this. I think we should too, but definitely not out of the gate. That would require a much bigger discussion on how to handle it. Just wanted it noted.
Just opening a thread on the JSON/Database design for feedback, comments, and ideas.
This is just in regards to the back end, so setting the UI aside for the moment.
Eos doesn't use a database; it grabs all its data from JSON files and creates a cache file out of them.
Specifically, these are the ones it cares about:
Since we are already going to be including JSON files for Eos (I'm assuming we're not going to spend the time to rewrite that), what else can we leverage those JSON files for?
Do we want to just use JSONs for the back end, and skip using SQL for static data entirely?
(This would certainly make updating the data much easier than trying to update an existing SQL database.)
Right now for the back end we have the following modules:
Eos
pyCrest
SQLAlchemy
Flask
Since Flask can read the JSONs directly (and load them into memory), I'm not sure what duplicating that data into SQL would buy us.
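For example, the static-data reads could be as simple as this (file name and route are placeholders):

```python
# Sketch: Flask loads the static JSON once at startup and serves lookups
# straight from memory. File name and route are placeholders.
import json
from flask import Flask, abort, jsonify

app = Flask(__name__)

with open("invtypes.json") as f:
    TYPES = {t["typeID"]: t for t in json.load(f)}

@app.route("/api/types/<int:type_id>")
def get_type(type_id):
    item = TYPES.get(type_id)
    if item is None:
        abort(404)
    return jsonify(item)

if __name__ == "__main__":
    app.run(port=5000)
```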