jmgimeno · June 26, 2014 17:26
diff --git a/euroclojure2014.org b/euroclojure2014.org
@@ -436,9 +436,9 @@
    - *you should embed /deeply/ into clojure*
 ** links
    - http://twitter.com/otfrom
-   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.core
-   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.demo
-   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.bruce
+   - http://cljsfiddle.net/fiddle/thattommyhall.geomlab.core
+   - http://cljsfiddle.net/fiddle/thattommyhall.geomlab.demo
+   - http://cljsfiddle.net/fiddle/thattommyhall.geomlab.bruce
    - http://www.complexityexplorer.org/
    - http://cljsfiddle.net/fiddle/thattommyhall.ants.core
    - http://ccl.northwestern.edu/tortoise/2013-10-25/Ants.html
@@ -817,10 +817,462 @@
    - in general, if you leave it running over time and let the
      evidence build, it should be fine in the long run
    - but that is definitely a flaw
-* Tommi ?, Schema and Swagger to improve your web APIs
+* Tommi Reiman, Schema and Swagger to improve your web APIs
 ** super simple web api in clojure
+   - just using compojure
+   - "sausage" as example data
+   - ~PUT /foo/sausage/:id~
+   - example:
+     - in Java: immutable value object
+     - in Scala: case class
+     - in Clojure:
+       - free-form map?
+       - constructor fn with bunch of validation?
+       - prismatic/schema!
 ** prismatic schema
+   - define structure of sausage
+   - then call ~s/validate~ to validate
+   - schema can define functions
+
+#+begin_src clojure
+  (s/defn get-sausage :- (s/maybe Sausage) [id :- Long]
+    (@sausages id))
+
+  (s/defn ^:always-validate get-sausage2 :- Sausage [id :- Long]
+    (@sausages id))
+#+end_src
+
+*** schema coercion
+
+#+begin_src clojure
+  (defmodel Pizza {:id Long
+                   :name String
+                   :price Double
+                   :hot Boolean
+                   (s/optional-key :description) String
+                   :toppings #{(s/enum :cheese :olives :ham :pepperoni :habanero)}})
+#+end_src
+    - allows slurping JSON data, but imposing extra types
+    - eg above we can slurp toppings from a JSON array into a Clojure
+      set rather than a vector
+
+*** double schema
+
+    - loose schema for first input
+      - ~(def Customer {...})~
+    - tighter schema for validated input
+      - ~(def ValidCustomer (merge Customer {...}))~
+
+*** schema selectors
+
+    - accept but remove unrecognised params with ~select-schema~
+
+*** generative schema
+
+    - generate random orders for test data
+    - davegolland/generative-schema.clj
+
+*** contribs
+
+    - sfx/schema-contrib
+    - cddr/integrity
 ** swagger
+   - a specification for describing, producing, consuming, visualising RESTful web services
+   - https://helloreverb.com/developers/swagger
+   - existing adapters
+   - clojure options:
+     - octohipster
+     - swag
+     - ring-swagger
+       - compojure-api
+       - fnhouse-swagger
+   - endpoint definitions in JSON
+   - data models as a JSON Schema
+   - swagger UI
+     - visualises the API
+   - code gen
+     - no clojure support yet (anyone?)
+   - swagger-socket
+     - run it all on top of websockets
 ** ring-swagger
+   - https://github.com/metosin/ring-swagger
+   - JSON-Schema has some dates
+     - but prismatic/schema will never support dates, as it's more
+       generic
+   - higher level abstractions on top of swagger, but nothing for the
+     web developer
 ** compojure-api
+   - an extendable web api lib on top of compojure
+   - macros & middleware with good defaults
+   - schema-based models & coercion
+   - ~GET*~ macro to define input and output schemas
 ** fnhouse-swagger
+   - prismatic/fnhouse
+     - launched at clojure/west
+   - ~defnk~ with metadata → annotated handler
+   - fnhouse-swagger
+     - metosin/fnhouse-swagger
+** summary
+   - schema is an awesome tool
+   - describe, validate, coerce your data
+   - building on top of ring-swagger
+     - compojure-api → declarative web apis
+     - fnhouse-swagger → meta-data done right
+     - or do your own!
+   - kekkonen.io
+     - CQRS-lib
+* Renzo Borgatti, The Compiler, the Runtime and other interesting beasts from the clojure codebase
+  - http://twitter.com/reborg
+** an amazing growth:
+   - mar 2006: first commit
+   - oct 2006: 30k loc (7 month old)
+   - oct 2007: clojure announced!
+   - oct 2008: invited to Lisp50 to celebrate 50 years of lisp
+   - May 2009: 1.0 + book!
+   - now: almost 90k loc
+** initial milestones
+   - apr 06: lisp2java sources
+   - may 06: boot.clj appears
+   - may 06: STM first cut
+   - june 06: first persistent data structure
+   - sep 06: java2java sources
+   - aug 07: java2bytecode started
+   - right after: almost all the rest: refs, lockingtx
+** drew on lots of sources of knowledge
+   - collection of papers
+** high-level view:
+   - ~(def lister (fn [& args] args))~
+   - read → analyse → emit/compile → compile
+   - although the lines between the stages get blurred at times
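+
+   A minimal sketch of these stages from the REPL, using only
+   ~clojure.core~. ~read-string~ and ~eval~ are the public entry points;
+   the analyse and emit steps happen inside ~eval~ and are not visible
+   here. The name ~lister-var~ is introduced just for this example.
+
+#+begin_src clojure
+  ;; read: text -> data structure (a list holding symbols and a nested list)
+  (def form (read-string "(def lister (fn [& args] args))"))
+
+  (list? form)   ; the reader produced a list
+  (first form)   ; the symbol def
+
+  ;; analyse + emit + eval: evaluating a def form returns the Var
+  (def lister-var (eval form))
+
+  ;; Vars are invokable, so the compiled fn can be called through the Var
+  (lister-var 1 2 3)
+#+end_src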
+** reader
+   - takes stream, returns data structures
+   - PersistentList, Symbol, etc
+** analyser
+   - input: data structure
+   - output: exprs
+     - DefExpr
+       - Var
+       - FnExpr
+         - Sym
+         - PersistentList
+           - FnMethod
+             - LocalBinding(Sym("args")),
+             - BodyExpr
+               - PersistentVector
+               - LocalBindingExpr
+** Emission
+   - bytecode generation for Exprs
+   - prerequisite for evaluation
+   - emit() method in Expr interface
+   - Notable exception: called over ??
+** Evaluation
+   - transform Exprs into their "usable form"
+   - eg
+     - new object
+     - a var
+     - namespace
+   - FnExpr is just getCompiledClass().newInstance
+** Compilation
+   - Usually coordination for emit
+   - Compiler.compile namespace -> file
+   - ...
+** Emit
+   - input: Exprs
+   - output: bytecode
+** monsters!
+*** RT
+    - this is how the RT class gets initialised: the first time it gets
+      referenced:
+#+begin_src java
+  final static private Var REQUIRE = RT.var("clojure.core", "require");
+#+end_src
+    - simply referring to it here causes the static initializers to run
+    - RT has a *lot* of behaviour in static initializers
+      - inside it is the ~doInit();~ call
+        - which loads all of ~clojure.core~
+      - all just from referring to RT in some otherwise unrelated class!
+*** Compiler
+    - inner classes for each Expr type
+*** LispReader
+    - inner classes for each token you might encounter
+    - ~<clinit>~
+      - sets up reader macros
+        - ~macros~ and ~dispatchMacros~ (latter for ~#{~ ~#(~ ~#_~ ~#^~ etc)
+*** analyze()
+    - not a class, but a family of methods
+      - ~analyzeSeq~
+      - ~new ConstantExpr~
+      - ~MapExpr.parse~
+    - FnExpr.parse
+      - invokes the compiling phase during parsing phase
+*** emission
+    - ASM lib used to generate bytecode
+    - FnExpr.emitMethods()
+      - generate a method for each of the arities of the function
+*** other beasts
+    - LockingTransaction and Ref
+** DynamicClassLoader
+   - ~clojure.lang.DynamicClassLoader.findClass(String)~
+     - ~RT.classForName()~
+     - ~Compiler$HostExpr.maybeClass()~
+   - Class.forName() goes up the hierarchy of classloaders and asks
+     each what they know
+     - an instance of DynamicClassLoader is created for each namespace
+       - and also for each form
+     - (this is true for the bootstrap phase; not always true eg in
+       AOT (ahead-of-time) compilation)
+   - supporting dynamicity
+     - in defineClass:
+       - ~classCache.put(name, new SoftReference(c,rq));~
+     - in findClass:
+       - ~Reference<Class> cr = classCache.get(name);~
+     - SoftReferences are used to save PermGen, since if we redef a
+       var we don't want it to keep consuming PermGen
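+
+   A sketch of that soft-reference caching pattern, written with
+   Clojure interop. The names ~class-cache~, ~cache-put!~ and
+   ~cache-get~ are made up for illustration; this is not the
+   DynamicClassLoader source.
+
+#+begin_src clojure
+  (import '(java.lang.ref ReferenceQueue SoftReference)
+          '(java.util.concurrent ConcurrentHashMap))
+
+  (def ^ConcurrentHashMap class-cache (ConcurrentHashMap.))
+  (def ^ReferenceQueue rq (ReferenceQueue.))
+
+  (defn cache-put! [name value]
+    ;; the map holds the value only softly, so the GC may reclaim it
+    (.put class-cache name (SoftReference. value rq)))
+
+  (defn cache-get [name]
+    ;; nil when the name was never cached or the referent was collected
+    (when-let [^SoftReference r (.get class-cache name)]
+      (.get r)))
+#+end_src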
+** Bonus: clojure was initially implemented in lisp
+   - roughly 1600 loc to implement read, analyse, compile, eval
+   - although emitting Java code, not bytecode
+   - was also generating C♯
+** Q: some things in bytecode can't be expressed in java
+   - is there anything which clojure generates which can't be
+     decompiled back to Java?
+     - I'm pretty sure yes, but not sure exactly what
+     - Rich:
+       - locals-clearing
+       - constructs which use goto (which exists in bytecode but not
+         Java)
+* Rich Hickey, the insides of core.async channels
+** aside: here's what clojure looks like in a good IDE
+   - (ie IntelliJ)
+   - yes, Compiler.java is massive
+     - but if your IDE has a structure editor, you can navigate them
+       all easily
+     - it's all in one file because I don't want 300 files
+** aside2: the classloader has a cache in a branch
+   - fast-load branch
+** warning! implementation details ahead
+   - subject to change!
+   - informational only
+** the problems
+   - single channel implementation
+     - for use from both dedicated threads and go threads
+       - simultaneously, on same channel
+   - alt and atomicity
+     - Java CSP libraries often didn't support alt well
+     - it's tricky to do atomically
+   - multi-reader/multi-writer
+   - concurrency
+     - construct deals with the ick of threads and mutexes
+   - (this talk: focus on JVM impl; JS version has fewer of these
+     issues)
+** API
+   - ~>!~ ~>!!~ ~put!~ ~alt!~ → channel → ~<!~ ~<!!~ ~take!~ ~alt!~
+   - it's not an RPC mechanism, it's just a conveyor belt
+** SPI (service provider interface)
+   - ~>!~ ~>!!~ ~put!~ ~alt!~ → ~impl/put! [val handler]~ → channel →
+     ~impl/take! [handler]~ → ~<!~ ~<!!~ ~take!~ ~alt!~
+** anatomy
+   - channel has:
+     - pending puts (fifo)
+     - a buffer (optional) in the middle
+       - contains data
+     - pending takes (fifo)
+     - flag indicating if channel is closed
+   - fifos implemented as linked queues
+   - important to distinguish queues of operations from buffer of data
+** invariants
+   - never pending puts and takes simultaneously
+   - never takes and anything in buffer
+   - never puts and room in buffer
+   - take! and put! use channel mutex
+   - no global mutex
+     - or even multi-channel mutex
+** put! scenarios
+   1. one or more waiting take! operations
+      - gets paired up, takes handler gets completed
+   2. stuff in the buffer, but with room in buffer
+      - puts its stuff in the buffer, succeeds and immediately
+        completes
+   3. buffer full (or no buffer)
+      - enter puts queue, block
+        - results in backpressure
+   4. full buffer, but windowed
+      - sliding buffer: latest information takes priority, drop head
+        of buffer (oldest item in fifo), put! completes immediately
+        and enters buffer
+      - dropping buffer: drop put! on floor, but completes immediately
+      - could have more sophisticated policies in future
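+
+   The two windowed policies in scenario 4 can be modelled on plain
+   vectors. This is a toy model of the buffer contents only (no
+   handlers, no locking); ~sliding-put~ and ~dropping-put~ are
+   hypothetical names, not core.async functions.
+
+#+begin_src clojure
+  (defn sliding-put [capacity buf x]
+    ;; always accept x; if that overflows, drop the oldest item
+    (let [buf (conj buf x)]
+      (if (> (count buf) capacity)
+        (subvec buf 1)
+        buf)))
+
+  (defn dropping-put [capacity buf x]
+    ;; accept x only while there is room; otherwise drop it on the floor
+    (if (< (count buf) capacity)
+      (conj buf x)
+      buf))
+
+  (reduce (partial sliding-put 3) [] [1 2 3 4 5])   ; => [3 4 5]
+  (reduce (partial dropping-put 3) [] [1 2 3 4 5])  ; => [1 2 3]
+#+end_src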
+** take! scenarios
+   1. nothing in buffer
+      - enqueued
+   2. buffer has stuff, but no puts waiting
+      - get data, immediately complete
+   3. buffer full (or no buffer), puts pending
+      - get something (either head of buffer or get paired with first
+        put!)
+      - first waiting put! completes (either enters buffer or hands
+        directly to take!)
+** close! scenario
+   - all pending takes complete with nil (closed)
+   - subsequent puts complete with nil (already closed) (relatively
+     new)
+   - subsequent takes consume ordinarily until empty
+     - any pending puts complete with true
+     - takes then complete with nil
+** queue limits
+   - puts and takes queues are not unbounded either
+   - 1024 pending ops limit
+     - somewhat arbitrary, might change
+     - will throw if exceeded
+       - if you're seeing this, it's an architecture smell
+     - most likely if you use put! on the edge of your system
+** alt(s!!)
+   - attempts more than one op
+   - on more than one channel
+   - without global mutex
+   - nor multi-channel locks
+   - exactly one op can succeed
+*** implications
+    - registration of handlers is *not* atomic
+    - completion might occur before registrations are finished, or any
+      time thereafter
+    - completion of one alternative must 'disable' the others
+      atomically
+    - cleanup
+** handlers
+   - wrapper around a callback
+     - callbacks are icky, so we want to hide them
+   - SPI
+     - active?
+     - commit → callback-fn
+     - lock-id → unique-id
+     - ~java.util.concurrent.locks.Lock~: lock, unlock
+** take/put handlers
+   - simple wrapper on callback
+   - lock is no-op
+   - lock-id is 0
+   - active? always true
+   - commit → the callback
+** alt handlers
+   - each op handler wraps its own callback, but delegates rest to
+     shared "flag" handler
+   - flag handler has lock
+     - a boolean active? flag that starts true and makes one-time
+       atomic transition
+   - commit transitions shared flag and returns callback
+     - must be called under lock
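+
+   A sketch of the one-time transition of the shared flag. It swaps in
+   an atom with ~compare-and-set!~ in place of the lock described
+   above, so it shows the "exactly one alternative wins" behaviour
+   rather than the actual mechanism. ~alt-flag~ and ~commit!~ are
+   hypothetical names.
+
+#+begin_src clojure
+  (defn alt-flag []
+    ;; shared by all alternatives of one alt; starts active
+    (atom true))
+
+  (defn active? [flag] @flag)
+
+  (defn commit! [flag callback]
+    ;; only the first caller flips true -> false and gets its callback back;
+    ;; every later caller gets nil
+    (when (compare-and-set! flag true false)
+      callback))
+#+end_src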
+** alt concurrency
+   - no global or multi-channel locking
+   - but channel does multi-handler locking
+     - some ops commit both a put and a take
+   - lock-ids used to ensure consistent lock acquisition order
+     - (avoids deadlock)
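+
+   A sketch of acquiring several handler locks in a consistent order.
+   Handlers are modelled as maps with a ~:lock-id~ and a
+   ~ReentrantLock~; that representation and the names ~handler~ and
+   ~with-ordered-locks~ are assumptions for illustration only.
+
+#+begin_src clojure
+  (import 'java.util.concurrent.locks.ReentrantLock)
+
+  (defn handler [lock-id]
+    {:lock-id lock-id :lock (ReentrantLock.)})
+
+  (defn with-ordered-locks [handlers f]
+    ;; every caller locks in ascending lock-id order, so two callers can
+    ;; never each hold a lock the other is waiting for
+    (let [ordered (sort-by :lock-id handlers)]
+      (doseq [h ordered]
+        (.lock ^ReentrantLock (:lock h)))
+      (try
+        (f)
+        (finally
+          ;; release in the reverse order
+          (doseq [h (reverse ordered)]
+            (.unlock ^ReentrantLock (:lock h)))))))
+#+end_src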
+** alt cleanup
+   - "disabled" handlers will still be in queues
+   - channel ops purge
+** SPI revisited
+   - handler callback only invoked on async completion
+     - only 2 scenarios
+   - when not "parked", op happens immediately
+     - /callback is not used/
+     - /non-nil return value is op return/
+   - only time ops park
+     - put! when it gets blocked on full buffer
+     - take! when it gets blocked on empty buffer
+   - only time ops complete asynchronously
+     - take! with pending puts
+     - put! with pending takes
+** wiring !/!!
+   - blocking ops (!!)
+     - create promise
+     - callback delivers
+     - only deref promise on nil return from op
+       - non-nil indicates immediate success (and so callback never
+         gets called)
+   - parking go ops (!)
+     - IOC state machine code is callback
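+
+   The blocking wiring can be sketched with a promise. Here ~op~ is a
+   stand-in for a channel operation: it takes a callback and returns a
+   non-nil value when it completes immediately, or nil when it parks.
+   ~blocking-op~ is a hypothetical name, not part of core.async.
+
+#+begin_src clojure
+  (defn blocking-op [op]
+    (let [p   (promise)
+          ret (op (fn [v] (deliver p v)))]
+      (if (nil? ret)
+        @p     ; parked: block until the callback delivers
+        ret))) ; immediate success: the callback is never called
+
+  ;; completes immediately
+  (blocking-op (fn [_cb] :immediate))
+
+  ;; parks, then completes from another thread
+  (blocking-op (fn [cb]
+                 (.start (Thread. #(cb :later)))
+                 nil))
+#+end_src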
+** summary
+   - you don't need to know any of this
+   - but understanding the "machine" can help you make good decisions
+** Q: why use alt! for putting? what's rationale?
+   - taking multiple channels is like a select(2)
+   - when you have consumers of different capabilities
+     - I want to try to write to everyone, but whenever the first one
+       is ready, I give it to them
+     - Q: what's the difference between that and having four consumers
+       on a single channel?
+       - you might have a priority metric, or a cost metric
+       - though yes sometimes you can achieve same result two
+         different ways
+** Q: why is global or multi-channel mutex not good enough?
+   - well it would be easy! :)
+   - a global mutex could make registration atomic
+   - you'd have to make disabling other alts atomic
+   - you'd have to make rendezvous atomic
+   - you could have two unrelated sets of channel operations, why
+     should they contend?
+   - people hate global locks
+   - ruled out by my aesthetic sense :)
+** Q: David Nolen had an example of 10000 go blocks updating a textarea, did he hit the 1024 limit?
+   - no I don't think so, but not sure exactly
+** Q: are buffer & queue sizes useful metrics to monitor?
+   - that would be great, and making them monitorable is on the TODO
+     list
+** Q: other possible extensions?
+   - buffer policies
+     - you might have logic about priority
+   - core.async has proven its utility and it's become important
+     - ~go~ macro is a great PoC of what you can do with a macro with
+       several kLoC behind it
+       - has its own subcompiler inside it
+       - kind of implements a subset of clojure
+     - maybe build async support into the compiler?
+       - move locals from the stack to fields on the method object
+       - I don't need the stack anymore
+       - I can be paused and resumed on another thread
+       - declare a fn as async
+       - comply with this SPI
+       - could build other things like generators & yield
+     - the pride moment of "look you can do this with a macro" is now
+       dominated by the desire to make this performant and more solid
+   - Q: continuations? how do they differ?
+     - continuations are more general
+     - this won't use continuation-passing-style
+     - it's related
+     - it won't be like call/cc
+     - it won't be first-class
+     - you won't be able to resume it more than once
+     - for a specific set of use-cases
+     - Oleg gave a talk arguing that generators alone are enough for
+       things people think need a lot more
+** Q: is there something planned for dynamic binding and the ~go~ macro?
+   - there are fns which allow you to do the conveyance
+     - don't know if ~go~ allows all of them to work
+** Q: channels on the network?
+   - it's easy to have something you call a channel and put over a wire
+   - pretty hard to have all the semantics of these channels over the
+     wire
+   - already have queues and all sorts of interfaces to do similar
+     things
+   - atomic alt! over more than one wire not going to happen
+   - maybe semantics for ports
+   - or limitations on alt!
+   - the wire has its own semantics, this is the key thing here
+     - failure, queueing, delays
+   - really easy to just take something from the wire and call put!
+** Q: is there a typical way to monitor a go block?
+   - what kind of monitoring?
+   - /see that it's still working, still alive?/
+   - if the channels were monitorable, you could see if things were
+     producing/consuming properly
+** Q: what other options did you consider & reject in the design of core.async
+   - something other than CSP?
+   - the generators stuff
+   - continuations
+   - I liked what golang did
+     - they made a good choice
+     - there's a java csp lib that impls the same kinds of ops
+     - it's difficult to get the semantics correct
+   - wanted ~alts!~ to be a regular fn, not syntax
+     - which feels like an enhancement over go
+   - what we're putting on these channels is immutable
+     - which gives extra robustness
diff --git a/euroclojure2014.org b/euroclojure2014.org
@@ -455,3 +455,372 @@
     - otherwise this is all a recent exploration
     - errors in cljsfiddle are not reported well
       - again problematic for day zero
+* Mathieu Gauthron, JVM-breakglass
+  - using a clojure REPL to troubleshoot live java/JVM processes
+  - http://slides-euroclojure2014.matlux.net
+  - when you see fire, you break glass
+  - when your jvm process is on fire, you use JVM-breakglass
+** troubleshooting a java application
+   - debugger
+     - only powerful when you can narrow down the problem to a series
+       of breakpoints
+     - when the problem is a race condition, attaching a debugger
+       changes the nature of the problem you're studying
+   - log/print statements
+     - you need to plan before compilation
+     - when the problem is in production, it might be too late
+   - jmx
+     - again, you need to plan for it in advance
+   - ad-hoc interactive mechanism
+** what is jvm-breakglass
+   - open source
+   - integrates with any jvm process
+   - console onto a jvm process
+** main features
+   - interactive prompt
+   - see inside private members
+   - call arbitrary methods
+   - create new object instances
+   - create new classes
+   - monitor object state
+   - no need to use clojure to develop the app
+** how does it work?
+   - jvm-breakglass runs inside the JVM and starts an nrepl server
+   - you can then connect using an nrepl client (eg lein)
+** how to use it?
+   - add it to your maven dependencies
+   - add an entry point (as a ~<bean>~ or in java code)
+   - connect with ~lein repl :connect localhost:1112~
+** demo (enterprise application)
+   - tomcat JVM
+   - employee/dept data structure
+   - report generation
+   - java/spring mvc webapp
+   - jvm-breakglass
+   - spring data
+     - in XML, naturally
+*** homepage
+    - oh no! one of the reports isn't working?
+    - "list employees in london" is empty
+      - but we know that employee Mick Jagger lives in london
+      - what's going on?
+*** breakglass to the rescue
+    - view environment:
+      - current directory, System/getProperties
+      - view conf directory
+    - list all loaded Spring beans
+    - introspect object private members
+      - ~bean~ builtin fn
+      - ~to-tree~ to do so recursively
+    - view methods or fields for a given object
+    - redefine a class
+      - in this case, ~(proxy [Address] ["1 Mayfair", "SW1", "London"]
+        (getCity [] "London"))~ to define the new version, overriding
+        a method
+      - ~(.setAddress (:Mick employees) address)~ to inject it into
+        the live data
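+
+   The two core builtins used here, ~bean~ and ~proxy~, can be tried
+   against a JDK class. ~java.util.Date~ is only a stand-in for the
+   demo's ~Address~ class. Note that ~bean~ on its own exposes public
+   getters only; reaching private members is what jvm-breakglass adds
+   on top.
+
+#+begin_src clojure
+  ;; bean: a read-only map view of an object's public getters
+  (def d (java.util.Date. 0))
+  (:time (bean d))   ; the value of (.getTime d)
+
+  ;; proxy: an instance of an anonymous subclass, with one method overridden
+  (def patched
+    (proxy [java.util.Date] [0]
+      (getTime [] 42)))
+
+  (.getTime patched) ; the overridden method is used
+#+end_src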
+** take a step back
+   - remember what it's like to be a java programmer?
+   - working with jmx beans and suchlike to try to understand why
+     production is down
+   - this stuff looks like magic
+** Q: how do you convince production people to put nrepl server in place?
+   - short answer: impossible
+   - that's not how you present it
+   - either you do it sneakily (that's bad), and only pull the trump
+     card when the team is desperate
+   - or you convince the team that it would be useful in the UAT
+     environment, and "of course it's never going to be used in
+     production"
+** Q: have you considered a high-level switch that would prevent you mutating anything in the host application?
+   - don't know how you'd be able to do that
+   - have been thinking about it
+   - maybe using clojail
+   - kind of defeats the point
+** Q: have you tested this with a scala app?
+   - haven't tried
+   - I've reverse-engineered the java bytecode, and it's readable
+   - as long as you know how it compiles, it seems reasonable
+** Q: you were using methods like get-obj and passing string name. how does breakglass know which object to get?
+   - eg if you have multiple instances of Department, how does it know *which* department?
+     - in Spring it's a Spring bean which is named
+     - if you're not using Spring, what's your entry point?
+       - when you create your NreplServer to enable jvm-breakglass,
+         you can add your entry points there
+       - ~new NreplServer(port).put("department", myObject);~
+       - static methods & fields can be used too
+* Gary Crawford, Using Clojure for Sentiment Analysis of the Twittersphere
+  - leiningen versus the ants, carl stephenson
+  - leiningen versus apache ant?
+  - clojure versus java?
+  - FP versus OO?
+** stratified medicine
+   - determine the best treatment for someone based on their genetic
+     makeup to manage their chronic disease
+** sentiment analysis
+   - Paper: "Twitter mood predicts the stock market"
+     - predicted Dow Jones average through monitoring tweets
+   - people who suffer chronic disease tend to be immunocompromised
+     - what would normally be a minor illness can prove fatal
+   - can we use twitter to predict spread of disease?
+** so we tried
+   - score tweets for flu symptoms
+   - the data science wasn't very difficult
+     - but scaling it was
+   - 30 million geo-tagged tweets sent from UK
+   - couldn't scale, even with
+     - HDFS/hadoop
+     - mongo/aggregation
+     - mongo/mapreduce
+     - postgres
+** how can we do fast, real-time analytics of social media?
+   - application: how do people feel about Scotland's independence
+     referendum?
+   - data increases in value as we analyse it
+     - tweets
+     - analytically prepared data
+     - analysis
+     - insight
+     - predictions
+   - the raw data isn't what you care about
+   - don't store the raw tweets, only store the analytically prepared
+     data
+   - stored in redis using ptaoussanis/carmine
+     - it has great support for bitmaps
+** example
+   - ~(car/setbit sentiment tweet-id 1)~
+   - ~(car/bitcount "SCOTLAND")~ -- tells me how many tweets have
+     mentioned Scotland
+   - how many people in england are happy?
+#+begin_src clojure
+  (wcar*
+   (car/bitop "AND" "ENGLAND&JOVIALITY" "ENGLAND" "JOVIALITY")
+   (car/expire "ENGLAND&JOVIALITY" 10) ;; don't keep the data longer than 10 seconds
+   (car/bitcount "ENGLAND&JOVIALITY"))
+#+end_src
+
+   - further: "how many people in Scotland are tired or grumpy?"
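+
+   The same bitmap idea can be tried without Redis, using
+   ~java.util.BitSet~ as a local stand-in: one bit per tweet id, one
+   bitmap per label. The helper names ~bitmap~ and ~bit-op~ and the
+   sample ids are invented for this sketch.
+
+#+begin_src clojure
+  (import 'java.util.BitSet)
+
+  (defn bitmap [tweet-ids]
+    ;; set one bit per tweet id
+    (let [b (BitSet.)]
+      (doseq [id tweet-ids]
+        (.set b (int id)))
+      b))
+
+  (defn bit-op [op ^BitSet a ^BitSet b]
+    ;; like BITOP: combine into a fresh bitmap, leaving the inputs untouched
+    (let [^BitSet out (.clone a)]
+      (case op
+        :and (.and out b)
+        :or  (.or out b))
+      out))
+
+  (def scotland (bitmap [1 2 3 4]))
+  (def tired    (bitmap [2 9]))
+  (def grumpy   (bitmap [4 7]))
+
+  ;; "how many people in Scotland are tired or grumpy?"
+  (.cardinality ^BitSet (bit-op :and scotland (bit-op :or tired grumpy)))
+  ;; => 2
+#+end_src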
+** getting the data in
+   - adamwynne/twitter-api
+   - you can specify you only want tweets from a certain geographical
+     locality with a bounding box
+     - but this is literally a rectangle
+     - need it around Europe
+   - LMAX-Exchange/disruptor to communicate
+     - journaling
+     - syncing
+   - business logic
+*** what sentiment?
+    - this is hard!
+    - "I'm loving #EuroClojure! :D"
+    - Positive Affect: enthusiastic, active, alert
+    - Negative Affect: subjective distress
+    - actually two separate dimensions, not opposites
+    - Watson et al, 1988
+    - PANAS
+    - then PANAS-x
+    - then PANAS-t
+      - accounts for bias on social media
+      - outlines sanitisation
+      - validate against 10 real events
+**** sanitisation
+     - https://github.com/dakrone/clojure-opennlp
+     - get rid of spam
+     - account for text speak
+     - account for emoticons and emoji
+     - word stemming (or lemmatisation)
+     - part of speech tagging
+*** where? reverse geocoding
+    - don't want to rely on external services
+    - don't want heavy IO
+    - don't want round trips to database
+    - accuracy not too much of a concern
+      - we already lose accuracy in interpreting the sentiment of the
+        tweet
+    - convert a map of the uk to colours:
+      - look up geocode coords in map
+      - check colour → get country code
+    - problem: the world is a sphere
+      - projecting a sphere onto a rectangle
+    - prior art in d3.js
+    - use JavaFX to exploit it
+*** when?
+    - there's a lot of seconds in a day
+    - and even more seconds in a year
+    - really not interested in seconds anyway
+    - want to group tweets by minute
+    - and also group by hour
+    - and also group by day, and month, and year
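+
+   Grouping by fixed-length units is integer division on the
+   timestamp. A minimal sketch, assuming timestamps are epoch
+   milliseconds; ~bucket-ms~ and ~bucket~ are hypothetical names.
+   Months and years vary in length, so they would need calendar
+   arithmetic instead and are left out here.
+
+#+begin_src clojure
+  (def bucket-ms
+    {:minute 60000
+     :hour   3600000
+     :day    86400000})
+
+  (defn bucket [unit epoch-ms]
+    ;; all timestamps inside the same unit map to the same bucket number
+    (quot epoch-ms (bucket-ms unit)))
+
+  (bucket :minute 60000)    ; => 1
+  (bucket :minute 119999)   ; => 1, same minute as above
+  (bucket :hour 7200000)    ; => 2
+#+end_src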
+** why?
+   - why are we doing this?
+   - online social media are surveillance
+   - the line between public and private is becoming blurred
+   - if we don't need data, we shouldn't collect it
+     - in this example:
+       - we're never more granular than country
+       - we're never more granular than overall sentiment
+       - we're never more granular than minute
+     - hopefully this is enough to prevent anyone being identified
+   - Datensparsamkeit
+** Q: have you used Storm for this?
+   - no
+** Q: any preliminary results on the Scotland referendum analysis?
+   - I've had more luck with tech than data science
+** Q: which way should we vote?
+   - haha
+** Q: how do you verify your results?
+   - it's very crude at the moment
+* Paul Ingles, Multi-armed Bandit Optimisation in Clojure
+  - @pingles
+** problem statement
+   - product optimisation cycles are long, complex, and inefficient
+   - the multi-armed bandit model shows lots of things we're getting
+     wrong
+   - eg: online newspapers
+     - fundamentally human-led, editorially-led
+   - people behave irrationally
+   - Dan Ariely & Daniel Kahneman
+   - (@philandstuff suggestion: Stuart Sutherland, Irrationality)
+   - economist subscription options
+     1. online $59
+     2. print $125
+     3. print & online $125
+     - the ridiculousness of option 2 makes option 3 seem more
+       reasonable
+   - need machines to optimise at scale; but need humans to provide
+     stuff only they can
+   - running RCTs to optimise sites
+     - doing so on a continuing basis
+     - measuring big effects works with small numbers of participants
+     - but measuring small effects requires ever larger numbers
+     - to the extent that you can only run ~12 experiments a year
+     - which is not really good enough
+** Bandit strategies can help
+   - a product for procrastinators by a procrastinator
+   - Product: Notflix!
+     - video website
+     - http://notflix.herokuapp.com/
+     - shows 3 different videos
+     - show good videos at top of page, and less good at bottom
+     - show best possible thumbnail for each video
+   - optimising with multi-armed bandits
+     - optimising order and thumbnails
+** multi-armed bandit problem
+   - slot machine = one-armed bandit
+   - problem: you have a bunch of money you want to "invest" in a
+     casino
+     - you have a number of different machines to play
+     - each machine has a different probability of reward
+     - you don't know what that probability is up front
+   - need to balance "exploration" and "exploitation"
+     - ie learning about the world vs using that knowledge to maximise
+       income
+     - analogy: trying new foods out vs sticking to what you like
+** bandit model
+   - number of *arms* {1, 2, ..., /K/ }
+   - number of *trials*: 1, 2, ..., /T/
+   - *rewards*: {0,1}
+   - /K/-headlines
+     - options of different text
+   - /K/-buttons
+     - options of button text, colour, etc
+   - /K/-pages
+     - whole page redesigns
+   - explore this space with notflix
+
+** bandit strategy
+
+#+begin_src clojure
+  ;; choose which arm to pull
+  (defn select-arm [arms]
+    ...)
+
+  ;; update arm with feedback
+  (defn pulled [arm]
+    ...)
+  (defn reward [arm x]
+    ...)
+
+  (defrecord Arm [name pulls value])
+#+end_src
+
+*** ε-greedy
+    - "hello world" algorithm
+    - generally exploit
+    - ε (epsilon) is the rate of exploration
+    - eg if ε = 0.1, your strategy is:
+      - with probability 10%, try a random arm with equal
+        probability
+      - with probability 90%, try the best arm based on current
+        knowledge
+    - if ε = 0, always exploit; if ε = 1, always explore
+    - example with bernoulli-bandit
+#+begin_src clojure
+  (bernoulli-bandit {:arm1 0.1 :arm2 0.1 :arm3 0.1 :arm4 0.1 :arm5 0.9})
+#+end_src
+
+    - with ε=0.2, you converge faster on the best arm
+    - but with ε=0.1, you exploit it more once you find it
+    - once you've found the best arm, you should be able to double down
+      - ie explore more at the beginning (when you have least
+        knowledge) and less at the end
+      - lots of extensions to ε-greedy to factor things like this in
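+
+    A minimal ε-greedy sketch filling in the strategy skeleton above.
+    It assumes ~:value~ holds the running mean reward of an arm and
+    that ~pulled~ is called before ~reward~ for each pull; passing
+    ~epsilon~ as a first argument is a choice made here, and none of
+    this is the code from the talk.
+
+#+begin_src clojure
+  (defrecord Arm [name pulls value])
+
+  (defn select-arm [epsilon arms]
+    (if (< (rand) epsilon)
+      (rand-nth arms)                 ; explore: any arm, equal probability
+      (apply max-key :value arms)))   ; exploit: best arm seen so far
+
+  (defn pulled [arm]
+    (update-in arm [:pulls] inc))
+
+  (defn reward [arm x]
+    ;; incremental mean: move value towards x by 1/pulls
+    (let [{:keys [pulls value]} arm]
+      (assoc arm :value (+ value (/ (- x value) (double pulls))))))
+#+end_src
+
+    With ~epsilon~ 0 this always exploits; with 1 it always explores.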
+
+*** Thompson sampling
+
+    - Arm model
+      - Θ_k: Arm k's hidden true probability of reward (in range
+        [0,1])
+      - can build a distribution for Θ_k based on current knowledge
+      - small number of pulls means wide distribution; large number
+        means narrow distribution
+      - captures uncertainty in value of Θ_k
+    - each iteration, take a random sample from each distribution,
+      take the largest sample
+      - algorithm naturally balances exploration/exploitation
+        trade-off
+      - the more it learns, the narrower the distributions get, and so
+        the more likely it is to choose an arm with a higher expected
+        value
+    - incanter example
+    - Thompson-sampling example with same Bernoulli-bandit from above
+      - compared with ε-greedy, explores much more much earlier, and
+        exploits much more later on
+      - considered optimal convergence
+    - we can use it to rank things (not just select)
+      - take a sample from each arm distribution, then order arms by
+        that value
+      - in notflix, can use for ordering the videos we show
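+
+    A sketch of Thompson sampling for 0/1 rewards using only
+    ~clojure.core~. It assumes each arm is a map with ~:wins~ and
+    ~:losses~ counts (not the ~Arm~ record above) and a uniform prior,
+    so the distribution for Θ_k is Beta(wins+1, losses+1). The sampler
+    relies on the fact that the a-th smallest of a+b-1 uniform draws
+    is Beta(a, b) for integer a and b; a statistics library would be
+    the usual choice in practice.
+
+#+begin_src clojure
+  (defn sample-beta [a b]
+    ;; a-th order statistic of (a + b - 1) uniform samples
+    (nth (sort (repeatedly (+ a b -1) rand)) (dec a)))
+
+  (defn sample-arm [{:keys [wins losses]}]
+    (sample-beta (inc wins) (inc losses)))
+
+  (defn rank-arms [arms]
+    ;; draw exactly one sample per arm, then order by that sample
+    (->> arms
+         (map (fn [arm] [(sample-arm arm) arm]))
+         (sort-by first >)
+         (map second)))
+
+  (defn select-arm [arms]
+    (first (rank-arms arms)))
+#+end_src
+
+    Sampling once per arm before sorting matters: a comparator that
+    re-samples on every comparison would not give a consistent order.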
+** applied to notflix
+   - video rank bandit
+   - for each video, a thumbnail bandit
+   - at the end, the best video should be at the top
+     - and each video should show the best thumbnail
+** results
+   - videos, worst to best
+     - "hero of the coconut pain"
+     - "100 Danes eat 1000 chillies"
+     - "3 year-old with a portal gun"
+   - thumbnail bandit data
+   - "we built a fictional but /amazing/ product"
+** links
+   - [bandit/bandit-core "0.2.1-SNAPSHOT"]
+   - https://github.com/pingles/bandit
+** Q: this model assumes bandits have the same probability through time
+   - can it readapt?
+   - Thompson sampling does adapt
+     - it won't change back as quickly
+** Q: isn't there an interaction between the two bandits?
+   - if the thumbnail is crappy, they might not click the video
+   - made an assumption about this
+   - in general, if you leave it running over time and let the
+     evidence build, it should be fine in the long run
+   - but that is definitely a flaw
+* Tommi ?, Schema and Swagger to improve your web APIs
+** super simple web api in clojure
+** prismatic schema
+** swagger
+** ring-swagger
+** compojure-api
+** fnhouse-swagger
diff --git a/euroclojure2014.org b/euroclojure2014.org
@@ -0,0 +1,457 @@
+#+TITLE: EuroClojure 2014, Krakow
+
+* Fergal Byrne, Clortex: Machine Intelligence based on Jeff Hawkins' HTM Theory
+  - @fergbyrne
+  - HTM = Hierarchical Temporal Memory
+** big data
+   - big data is like teenage sex
+     - no one knows how to do it
+     - everyone thinks everyone else is doing it
+     - so everyone claims to be doing it
+     - (Dan Ariely)
+** machine learning is important
+   - people don't trust other people
+     - they have their own agendas
+   - so they place too much trust in machines
+** asimov's take
+   - we gain knowledge faster than we gain wisdom
+     - applies to human knowledge
+     - applies to data: gathering data is easy, drawing conclusions is
+       not
+** a problem in neuroscience
+   - rate of papers published is growing exponentially
+   - 2013: 1 every 32 minutes
+   - 2014 so far: 1 every 17 minutes
+** can AI learn from neuroscience?
+** Jeff Hawkins' goals in HTM
+   - Study the neocortex and establish its principles
+   - open sourced NuPIC in 2013
+** neocortex
+   - the wrinkly part at the surface of the brain
+     - grey matter: processing
+     - white matter: wiring
+   - about 2mm thick, 10cm^2 in area
+   - 30-50MM neurons
+   - 1G connections
+   - *hierarchical*
+   - *uniform*
+     - ie all looks physically the same
+     - all regions have the same algorithm
+** 6 key principles
+*** on-line learning from streaming data
+    - up to 10 million senses feed the brain
+    - we don't (can't) store this data
+    - we build models from live data
+    - models constantly updated
+*** hierarchy of regions
+    - sensory data enters at the bottom
+    - models are built in every region
+    - things change more slowly as you go up
+    - hierarchy enables sequences of sequences
+      - seq of waves
+      - seq of phonemes
+      - seq of words
+      - seq of sentences
+    - hierarchy works upwards and downwards
+*** sequence memory
+    - all sensory data involves time
+    - sequence memory allows predictions
+    - structure in data elaborated over time
+    - sequences can be c
+*** sparse distributed representations
+    - in each region, many neurons, few active
+    - SDRs represent spatial patterns
+    - fault-tolerant, semantic ops, high-capacity
+    - key to understanding & building intelligent systems
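+
+    A rough sketch of the "fault-tolerant, semantic ops" point, using
+    plain Clojure sets. This is an illustration, not clortex or NuPIC
+    code: an SDR is modelled as the set of active bit indices, and the
+    sizes (2048 bits, 40 active) and all function names are
+    assumptions.
+
+#+begin_src clojure
+  (require '[clojure.set :as set])
+
+  (defn random-sdr
+    "A set of `active` distinct indices out of n possible bits."
+    [n active]
+    (set (take active (shuffle (range n)))))
+
+  (defn overlap
+    "Number of active bits two SDRs share; a simple similarity measure."
+    [a b]
+    (count (set/intersection a b)))
+
+  (defn drop-bits
+    "Simulate a fault by removing k randomly chosen active bits."
+    [sdr k]
+    (set (drop k (shuffle (vec sdr)))))
+
+  (def a (random-sdr 2048 40))
+  (def noisy-a (drop-bits a 10))
+
+  ;; The damaged copy still shares all 30 of its remaining bits with
+  ;; the original, whereas two unrelated random SDRs of this sparsity
+  ;; typically share very few.
+  (overlap a noisy-a)
+#+end_src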
+*** all regions are both sensory and motor
+    - behaviour provides context for sensory data
+    - structure in model navigated via behaviour
+*** attention
+    - use attention to manage the neocortex
+    - planning and previsualisation
+    - whole subhierarchies can be switched on and off
+** layers of neocortex
+   - from molecular upwards
+   - around 5 or 6
+*** neurons
+    - distal dendrites detect coincidence of incoming activity from
+      neighbouring cells
+    - you don't just see what you're seeing now, you predict what
+      you're going to see next
+    - (reality is much more complicated, but this algorithm is
+      sufficient to explain a lot)
+** clortex
+*** background: numenta's nupic
+     - in dev since 2005
+     - partially implements HTM/CLA
+     - python/c++
+     - open source
+**** strengths
+     - skilled dev team
+     - eat their own dog food (grok uses nupic)
+     - operates on subset of HTM/CLA principles
+     - tunable using swarming on your data
+     - works well on streaming scalar data (eg machine-generated)
+     - great community -- http://numenta.org
+**** limitations
+     - codebase has evolved as theory has developed
+     - difficult/scary to rewrite for flexibility
+     - OO with large, coupled, classes (~1500 LoC per class)
+     - need to swarm to find parameters, no real-time control
+     - not easy to extend beyond streaming scalar use case
+*** clortex requirements
+    - directly analogous to HTM/CLA theory
+    - transparently understandable source code
+      - a neuroscientist should be able to read & review code
+    - directly observable data
+    - sufficiently performant
+    - useful metrics
+    - appropriate platform
+      - portability
+      - scalability
+*** architectural simplicity
+    - first role: be useful!
+    - best software is that which is not needed at all
+    - human comprehension is king
+      - if people can't understand your code, your code is not
+        finished
+      - unit tests are not sufficient in themselves
+    - machine sympathy is queen
+    - software is a process of R&D
+    - software development is challenging & intellectual
+      - more science than engineering
+        - engineering: you have a good model already, you just have to
+          plug in the particular parameters
+        - science: there are a bunch of unknowns which you have to
+          learn & understand
+*** #1: Just use data!
+    - maps, vectors, sets
+    - all done in a one-page datomic schema
+*** #2: Clojure & its ecosystem
+    - clojure data not domain objects
+*** #3: russ miles' life preserver
+    - everything either "core" or "integration"
+    - core: a datomic database for the neocortex
+    - core: each "patch" of neurons is a graph
+    - integration: algorithms, encoders, classifiers, SDRs
+*** key clj libs & tools
+    - datomic (+adi)
+    - quil/processing
+    - incanter
+    - lein-midje-doc for literate documentation
+    - hoplon-reveal-js for presentations
+    - lighttable
+** review
+   - Big Data isn't just a Machine Intelligence problem
+   - HTM is exciting
+** links
+   - http://numenta.org
+   - http://inbits.com
+   - https://github.com/fergalbyrne/clortex
+   - writing a leanpub book
+* Logan Campbell, Clojure at a Post Office
+** history:
+   - was at clojure user group
+   - a guy turns up and says he's hiring a team of clojure developers
+   - he was at Australia Post
+     - a million lines of Java worked on by a team in India
+     - wanted to bring it back in-house
+** project: digital mailbox
+   - big companies spend a lot of money sending out bills & junk mail
+   - product to seamlessly replace that workflow
+   - switch from physical mail to cheaper model
+   - consumer can sign up to receive water bill online
+   - I was brought on as the "clojure expert"
+     - (I'd been playing with it for a couple of years)
+   - drama:
+     - the people they could hire:
+       - really experienced java devs
+       - keen on FP
+     - they said as they were hiring "you might be doing clojure or
+       you might be doing scala"
+     - first few people were scala fans
+     - scala v clojure battles
+       - "we need static typing"
+       - "we need OO for domain modelling"
+       - "clojure is slow" (?)
+       - "what framework do you use?"
+   - "we need static typing? okay, we'll use core.typed"
+   - domain modelling:
+     - when people are used to domain modelling in OO, telling them to
+       just use maps feels like a cop-out
+     - records + protocols kind of feel like classes
+     - it wasn't until I showed them code I'd written and compared it
+       with their code that they realized that you can just use maps
+   - online scala course
+     - we did it as a team
+     - I also did the exercises in clojure
+     - did one exercise three different ways in clojure
+       - conditional
+       - match
+       - stream processing
+     - showed them my solutions
+       - they already understood the problems because they'd solved
+         them themselves
+   - clojure performance was a surprise, because I'd come from ruby (!)
+     - clojure is /fast/
+     - there was an underlying feeling that "we need scala for
+       performance"
+   - I'm a consultant, so was happy for the team to make the language
+     decisions
+     - "if you're keen on scala, let's find out a way to pitch it to
+       management"
+   - web stack: kept hearing "async async async"
+     - felt like premature optimization
+     - but still we used http-kit
+       - benchmark started to allay fears that clojure was slow
+** feature: make a payment on a bill
+   - not necessarily a full payment
+
+:     POST /bills/:bill-id/payments
+:     Session: user-id
+:     Post Data: amount
+
+   - GET credit card token for user
+     - POST request to payment gateway
+   - GET how much left to be paid
+   - if payment succeeds: display amount remaining
+   - if payment fails: display error
+** candidate solutions
+   - synchronous promises
+   - promise monad
+   - lamina
+   - etc etc
+** solution 0: synchronous
+   - http-kit's requests return a promise
+     - just @deref the promise (blocks the thread)
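+
+   A sketch of the synchronous shape, assuming nothing about the real
+   code base. ~fake-request~ is a hypothetical stand-in for an
+   http-kit call: like those calls, it returns a promise, but the
+   delays, payloads, bill total and status codes are invented for
+   illustration.
+
+#+begin_src clojure
+  (defn fake-request
+    "Hypothetical stand-in for an HTTP call: returns a promise that is
+    delivered on another thread after ms milliseconds."
+    [ms value]
+    (let [p (promise)]
+      (future (Thread/sleep ms) (deliver p value))
+      p))
+
+  (defn pay-bill-sync
+    "Each @ blocks the calling thread until that step has finished."
+    [user-id bill-id amount]
+    (let [token   @(fake-request 10 {:token (str "tok-" user-id)})
+          payment @(fake-request 10 {:status :ok
+                                     :token  (:token token)
+                                     :amount amount})
+          ;; assumed bill total of 100, purely for the example
+          balance @(fake-request 10 {:bill-id   bill-id
+                                     :remaining (- 100 amount)})]
+      (if (= :ok (:status payment))
+        {:status 200 :body (str "remaining: " (:remaining balance))}
+        {:status 502 :body "payment failed"})))
+#+end_src
+
+   The code reads top to bottom, at the cost of parking one thread per
+   in-flight request.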
+** solution 1.1: promise monad
+   - ~do~ is aware of promises
+     - doesn't block thread, but waits for promise to be executed
+       before continuing
+     - felt natural way to write with promises
+     - but incorrect: too much waiting, no concurrency
+** solution 1.2: promise monad let/do
+   - ~let~ to define promises
+     - ~do~ to pseudo-block on them
+     - introduces correctness but reduces readability
+** solution 1.3: let/do/do
+   - okay, let's step away from monads
+** solution 2: (?)
+** solution 3: raw promises
+   - ~when~ to explicitly wait for a particular promise
+** solution 4: raw callbacks
+   - not viable
+   - would have just written a hacky little promise library
+** solution 5: core.async:
+   - great! same shape as synchronous code, but correct concurrency
+** solution 6: lamina
+   - didn't feel totally suited to the situation
+** solution 7: meltdown (LMAX disruptor based)
+   - not appropriate
+** solution 8: pulsar promises
+   - looks exactly the same as the synchronous code, except for one
+     character
+   - pulsar rearranges your code at the bytecode level
+     - uses JVM agents (normally used for tracing/debugging)
+   - pass a fn to one of pulsar's functions
+     - turns synchronous code to async code
+** solution 9: pulsar actors
+   - not appropriate
+** winners
+   - 0: synchronous
+   - 5: core.async
+   - 8: pulsar
+** scala solution, for comparison
+   - scala futures (basically promises)
+   - all monadic
+   - I don't understand it entirely
+   - concise
+   - battle of the benchmarks, fastest first
+     - pulsar-async
+     - pulsar-sync
+     - core-async
+     - raw-callback
+     - scala-play-future (significantly less than others)
+** CQRS (command-query responsibility segregation)
+   - want fast reads
+   - reduce number of queries
+   - don't want to have to update write code every time we add a new
+     reader
+** structure
+   - service A → cassandra → service B
+   - custom triggers in cassandra in clojure (just drop in the .jar!)
+     - publish to rabbitmq
+     - notify index maintainer
+     - write index to cassandra
+     - service B reads from cassandra
+** cassandra triggers
+   - can just throw the clojure jar in there
+   - everything is byte buffers
+     - you need to know the type of all the fields out-of-band
+     - not self-describing data at all
+** microservices
+   - I thought we would have a user service and a provider service and
+     a mail service
+     - but this gets tricky when you want data about users and providers
+   - you need to split things much more fine grained
+   - user service →
+     - authentication
+     - multi-factor auth
+     - authorization
+     - user profile
+     - password reset
+       - does it belong in user profile?
+       - there's a bit of workflow here
+         - send out email
+         - get user to click link
+         - enough to warrant its own service
+   - drama: needed to talk to systems team to deploy
+     - I did things badly
+     - I didn't get anything into production in my 6 months there
+     - systems team: we need monitoring and config and stuff
+       - if we'd had something early on which had gone through these
+         barriers, we would have had much less stress
+       - benchmarks end petty arguments
+** Q&A
+*** can you share some experience with monitoring & resilience?
+    - appdynamics
+    - classnames are expected to be java-style class names
+      - clojure ones are close enough
+    - clj-metrics to expose more high-level metrics
+      - requests/second from ring
+      - number of bills paid
+      - appdynamics could pick it up from jmx
+    - nomad for configuration
+*** with http-kit+core.async, what happens when server dies and there's loads of threads?
+    - bottleneck was amount of memory
+    - when server runs out, it slows down a lot
+    - way to get around that is to monitor resources on your machine
+      and ideally have autoscaling
+*** were the scala guys finally writing clojure in the end?
+    - we have one person still hardcore for scala, but sees the merits
+      of clojure
+    - a few who did the online scala courses are clojure folks now
+    - people who come from the java world of static typing feel they
+      need that
+    - but now they've written code that actually works, they're more
+      comfortable with that now
+* Tom Hall, Escaping DSL Hell by having parens all the way down
+  - @thattommyhall
+** DSLs
+   - languages made for specific purposes
+     - config mgmt
+     - science
+     - learning
+   - distinction between:
+     - internal DSLs: embedded in another language
+     - external DSLs: implemented in another language
+** problems with puppet
+   - zen of python:
+     - namespaces are a honking great idea, let's do more of them!
+*** puppet namespaces
+    - Exec['install'] in two different modules will result in a
+      naming collision
+    - fail :(
+    - end up with Exec['tom::install'] but this is a hack
+*** iteration
+    - file type lets you pass in an array
+    - nagios_host doesn't
+    - iteration is responsibility of type, not language
+      - as far as I know
+*** but you need to know ruby anyway
+    - if you want to extend puppet, you need ruby
+    - if you need to know ruby, why do we bother with the puppet DSL
+      in the first place?
+*** experimental features: lambdas and iteration
+    - any language where lambdas arrive late is not a good language
+** ansible
+   - just YAML
+     - oh wait, I might want to iterate
+     - oh wait, I've got embedded jinja templates in my YAML strings
+       - what's the scope of names in my templates?
+** if you give people a "language" they will expect loops
+   - maybe lambdas
+   - probably namespaces
+   - this has been done before
+** chef gets it right
+   - it's embedded in ruby
+   - you get iteration and namespaces from ruby
+** teaching people to program
+   - if you design a language:
+     - you need a parser, which is hard
+     - you need an interpreter/compiler, which is hard
+   - if you embed it, you get that stuff for free
+** geomlab
+   - minimal language for teaching
+   - talks about pictures
+   - intro to FP
+   - gets you into recursion early on
+   - ~man $ woman~ - "next to"
+   - ~man & man~ - "on top of"
+   - ~(man $ woman) $ tree~ = ~man $ (woman $ tree)~
+   - ~man $ (woman & tree)~ -- scales nicely to get a nice aspect ratio
+   - learn about operator precedence
+   - de morgan's laws
+     - although not always held, due to scale
+   - define functions
+:    define manrow(n) = manrow(n-1) $ man when n>1
+:                     | manrow(1) = man
+   - builds up to an escher tiling
+   - but once you've done that, where do we go?
+     - only exists in this sim
+     - if you want to extend it, you need java
+     - "I'm really excited about FP now, but I've got nowhere to go"
+** what if we did it in clojurescript?
+   - let's use 'beside and 'below instead of $ and &
+   - ~(below man woman)~
+   - ~(beside tree star)~
+   - http://cljsfiddle.net/fiddle/thattommyhall.geomlab.demo
+   - let's say I want to change ~man~ -- what does it mean?
+     - it's implemented in the same sort of language
+     - I can see there's a url in there where I fetch an image from
+       the internet
+     - I know recursion, because I learned that from the geomlab
+       exercises
+     - I can extend the language itself
+** science languages
+   - R
+   - wolfram alpha
+   - maple
+   - matlab
+   - these things just aren't very good languages, even if they are
+     good at their domain
+** another problem with DSLs
+   - netlogo
+     - http://ccl.northwestern.edu/tortoise/2013-10-25/Ants.html
+   - If you're based on applets, and Oracle drops applet support, you
+     find you need to port your whole language to a new platform (in
+     this case javascript)
+   - again, reimplement in clojurescript?
+     - anyone interested in hacking on this with me?
+** conclusion
+   - you probably don't need to make a new language
+   - if you do it will probably be rubbish
+     - at least for a while
+   - think about power and reach
+   - *you should embed /deeply/ into clojure*
+** links
+   - http://twitter.com/otfrom
+   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.core
+   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.demo
+   - http://cljsfiddle.net/fiddle/thattommyhall.ceomlab.bruce
+   - http://www.complexityexplorer.org/
+   - http://cljsfiddle.net/fiddle/thattommyhall.ants.core
+   - http://ccl.northwestern.edu/tortoise/2013-10-25/Ants.html
+** Q&A
+*** what makes a good first language?
+    - clojure needs a better day 0 story
+    - at some coder dojos where I've taught kids, some don't even know
+      about files and folders
+      - so if you say "open a terminal, cd into a directory" you've
+        lost them
+        - and it's not their fault
+*** have you had any kids look at your examples here?
+    - I've done the geomlab example
+    - otherwise this is all a recent exploration
+    - errors in cljsfiddle are not reported well
+      - again problematic for day zero