-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OK-757: create XML in a streaming way #130
base: master
Are you sure you want to change the base?
Conversation
The point of this change is to preserve memory: the old ElasticClient code loaded all toteutukset into the memory at the same time. Changes publishedToteutukset to a Stream instead of a Future-wrapped Stream; simpler, and easier to make guarantees about when a computation is launched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hyvältä näyttää, muutamia huomioita.
@@ -23,7 +24,7 @@ object ElasticQueries { | |||
} | |||
} | |||
|
|||
object ElasticClient { | |||
object ElasticClient extends Logging { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ei liity tähän PR:ään kun näin lokitus on ilmeisesti toteutettu, mutta ihmetyttää tämä periyttäminen komposition sijasta.
Stream.concat(hits.map(_ \ "_source").toStream, rest)} | ||
} | ||
}} | ||
def listPublished(after: Option[String]): Stream[JValue] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Näyttää itse asiassa tosi selkeältä vaikka noi JValue operaattorioverloadit vähän ahdistaa. Oliko niin että toi #::: operaattori oli lazy mutta concat ei. Siinä tapauksessa laittaisin kommentin ettei joku refaktoroija riko tuota.
Täydellisessä maailmassa olisi myös testi jolla varmistetaan että koodi suoriutuu nyt sellaisesta mistä se ei aikaisemmin suoriutunut. Toisaalta ymmärrän kyllä että hemmetinmoisen datapläjäyksen generoiminen dynaamisesti ES:ään testissä tarkoittaa ainakin pitkää kestoa.
List("1.2.246.562.17.00000000000000000001", "1.2.246.562.17.00000000000000000002")) | ||
}}, 60.second) | ||
val result = ElasticClient.listPublished(None) | ||
assert(result.toArray.length == 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tykkään itse laittaa matalalla kynnyksellä assertteihin kommentteja jotka selittää miksi maailma makaa näin, tyyliin: "testidatassa yhteensä viisi koulutusta joista kaksi julkaistu".
)) | ||
}} | ||
val result = Await.result(future, 60.minute) | ||
lazy val elasticUrl = EuropassConfiguration.config.getString("europass-publisher.elasticsearch.url") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pitäisikö olla jokin tapa kaapata mahdollinen räjähdys lokille?
toteutusXmlStream.foreach{toteutus => { | ||
dest.write(toteutus.toString) | ||
writtenToteutusCount += 1 | ||
if (writtenToteutusCount % 200 == 10) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Miksi 10 eikä 0?
The point of this change is to preserve memory: the old ElasticClient code loaded all toteutukset into the memory at the same time.
Changes publishedToteutukset to a Stream instead of a Future-wrapped Stream; simpler, and easier to make guarantees about when a computation is launched.