Welcome everyone.
Thank you so much for joining us today here
at Neo4j for our 3.0 product webinar.
My name is Corie Brickman, community programs
manager here at Neo, and I'm joined today
by William Lyon, one of our fabulous developer
relations engineers.
We're very excited to bring you today's presentation.
I will let you know that today's presentation
is being recorded, so you will receive that
via email.
We will also be taking questions live at the
end of today's presentation, so feel free
to submit those at any point during the webinar
into the questions box on your go-to webinar
control panel.
With that, I will go ahead and hand it over
to Will.
Great.
Thanks Corie.
Good morning everyone, and thanks for joining
today.
I'm going to be talking about Neo4j 3.0, which
is our most recent release, we're very excited
about.
3.0 was released last week during an announcement
at GraphConnect Europe in London.
So my name is Will.
As Corie said, I'm on the developer relations
team at Neo4j.
If you have any questions or follow-up, feel
free to send me an email or reach out on Twitter.
My details are on the screen there.
So let's just jump right in here.
So if there were a title for the Neo4j 3.0
release, it would be A New Foundation.
And the reason for that title is that every
piece of Neo4j, from the storage engine to
Cypher, the graph query language,
to the drivers, the surface that we
use to interact with Neo4j, the configuration,
the documentation - this has all been
updated for this release.
And so there are really three main themes
here that I'm going to talk about today.
First is how developers can develop bigger
and faster graph applications with 3.0, how
we can develop applications faster and easier,
and how we can deploy Neo4j anywhere easily.
If you think about this another way, we're
going to be talking about scale and performance,
developer productivity, and operability.
So let's jump in to scale and performance.
There's a very important concept when we're
talking about graph databases.
This is index-free adjacency.
This is essentially what makes a graph database
a graph database.
So with index-free adjacency, we're able to
traverse from one node to any node that it's
connected to without doing an index look-up.
So every node has a direct pointer to every
node that it's connected to with a relationship.
This gives us native graph processing, allows
us to do local graph traversals with constant
time performance, so that as the size of the
graph grows and grows, the performance of
a local graph traversal is not impacted.
So when we're talking about performance, this
is the key property that gives us native graph
processing.
And this has been very important, and has
been built into Neo4j since the beginning.
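To make that concrete, here's a toy sketch in Python - an illustration of the idea only, not Neo4j's actual storage engine: each node keeps direct references to its neighbors, so following a relationship never touches a global index.

```python
# Toy illustration of index-free adjacency (not Neo4j's real storage format):
# each node holds direct references to its neighbors, so following a
# relationship is a constant-time pointer hop, independent of graph size.

class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct pointers to connected nodes

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.connect(bob)
bob.connect(carol)

# Traversing Alice -> Bob -> Carol follows pointers only; no index lookups,
# so the cost of a local traversal doesn't grow with the size of the graph.
second_hop = [n.name for friend in alice.neighbors for n in friend.neighbors]
print(second_hop)  # includes "Carol" (and "Alice" via the back-edge)
```

An index lookup, by contrast, would cost at least O(log n) per hop as the graph grows - that's the difference this property buys you.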
Now, scale, on the other hand, is where we're typically
talking about dealing with large, enormous
graphs, and about high availability, so
things like Neo4j Enterprise with high availability
and server clustering.
There was a recent article from a consulting
group that was looking at putting together
a proof of concept architecture for a social
network, dealing with one billion members.
So this is a very large graph.
I think the data they were using was about
one billion nodes, and somewhere between seven
and eight billion relationships.
And then they tested this on several different
architectures, several different databases,
and Neo4j was determined to be the best choice,
and very performant.
And this was using the previous 2.3 release.
So Neo4j has always been extremely scalable,
and able to handle graphs up to the tens of
billions of nodes and relationships.
However, there's always been this hard limit
on the size of the graph that we're able to
deal with, right around the tens of billions.
However, with Neo4j 3.0 we are upping the
limits, essentially removing the limits of
the size of the graph that we're able to store
in Neo4j.
So there is now no longer an explicit limit
on the size of the graph that we can store.
This makes Neo4j extremely scalable.
We do this using something called dynamic
pointer compression.
So this applies to nodes, relationships, and
properties.
So we're able to preserve index-free adjacency,
so we still have that native graph processing
performance on graphs of essentially unlimited
size.
And there's a smart algorithm that optimizes
the storage there, so we're able to store
these large graphs very efficiently.
So that was an overview of scale and performance
improvements in 3.0.
Let's talk about developer productivity.
So when we're talking about developer productivity,
we're talking about building graph applications
quickly and easily.
Let's take a look at the developer surface
historically for Neo4j.
So Neo4j started as an embedded Java API.
So if you wanted to interact with Neo4j, this
required writing Java code to do that.
Later, in version 1.0 of Neo4j,
we introduced a REST API, so at that point
interacting with the graph involved sending
REST requests over HTTP, and dealing with
JSON to interact with the graph.
In 2.0 we introduced Cypher, the query language
for graphs.
So rather than using REST, you interacted
and queried the graph using Cypher, although
typically sent over HTTP as well.
This is now changing with Neo4j 3.0.
We're introducing Bolt, which is a new binary
protocol for Neo4j.
Along with official language drivers for using
Bolt and Cypher for Neo4j.
So what is Bolt?
Bolt is a binary protocol.
So this is replacing HTTP for interacting
with Neo4j using Cypher.
I should say HTTP will still be there, but
Bolt will be the preferred method of sending
Cypher queries to Neo4j.
Benefits of Bolt are that we can avoid the
overhead of HTTP.
We can compress the data being sent back and
forth, so it's much more efficient, and security
is on by default.
Along with Bolt, we're introducing official
language drivers in JavaScript, Java, .NET,
and Python.
So these official language drivers implement
the Bolt protocol, and provide a way to interact
with Neo4j in a way that is idiomatic to their
language, while still providing some consistency
across drivers.
And you'll see what I mean by that with a
few examples here.
One thing that's important to note with these
official language drivers, and with Bolt,
is this now gives us an authoritative mapping
to the native type system, and this is uniform
across all drivers.
Another thing to note here, is that these
drivers are meant to be built on top of, so
pluggable into other frameworks.
So for example, Spring Data Neo4j uses a Bolt
based Java driver.
So this is a great way to integrate in to
frameworks going forward.
Let's look at some simple code samples using
the official drivers.
So here's the Javascript version.
Each driver has a concept of a driver, of
a session, which is your connection to Neo4j,
and of a result, which is the result
of running a Cypher statement using the session.
So simple example in Javascript.
In Python, this looks very similar.
In Java, we have something similar,
so we still have the concept of a driver, a session,
and a result.
And the .net version looks very similar to
Java.
We'll see some examples of this again going
forward.
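As a sketch, that pattern in the Python driver looks something like this. The URI and credentials are placeholders, and the 3.0-era neo4j.v1 package is imported lazily so the snippet reads without a live server:

```python
# Sketch of the driver -> session -> result pattern shared by the official
# drivers, using the Python driver's 3.0-era API (the neo4j.v1 package).
# The bolt URI, username, and password below are placeholders.

QUERY = "MATCH (p:Person) RETURN p.name AS name LIMIT 5"

def fetch_names(uri="bolt://localhost", user="neo4j", password="secret"):
    # Imported inside the function so the sketch is readable (and importable)
    # without a Neo4j server or the driver package installed.
    from neo4j.v1 import GraphDatabase, basic_auth  # pip install neo4j-driver
    driver = GraphDatabase.driver(uri, auth=basic_auth(user, password))
    session = driver.session()        # your connection to Neo4j
    result = session.run(QUERY)       # a stream of records from Cypher
    names = [record["name"] for record in result]
    session.close()
    return names
```

The JavaScript, Java, and .NET drivers follow the same three concepts, each in that language's idiom.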
The other piece of developer productivity
that we're excited to talk about with Neo4j
3.0, is the concept of Java stored procedures.
This is new in 3.0.
These are pieces of Java code that are deployed
to the Neo4j server, and callable from Cypher.
So this essentially allows us to extend the
functionality of Cypher with some custom code.
We can do things like make network requests,
if we want to be able to connect to web APIs.
If you want to connect to another database,
or if you want to use something in native
Java, you can write a Java stored procedure,
and deploy that to the database.
We'll see some examples of this later on in
the webinar also.
And the last piece of developer productivity
that I want to talk about is a new feature
called Neo4j Browser Sync.
This is a companion cloud service for the
Neo4j browser.
You've always been able to save your favorite
queries or save a Neo4j GRASS file - that's
a graph style sheet, a style that defines
how your graph should look in the Neo4j browser.
However, those have always been saved locally
and within your web browser.
With Neo4j Browser Sync you can sign in with
GitHub, Google, or Twitter, and your favorite
queries and style sheets are synced to the
cloud.
So then when you move to other computers,
and other browser sessions, you still have
those queries that you've saved and the style
sheets that you've saved.
This is great.
You don't have to save your queries in text
files, and move them around anymore.
They'll always be there in your Neo4j browser.
So let's talk about some changes in operability
with Neo4j 3.0.
So it's really important that we're able to
deploy Neo4j locally, that we can deploy it
to the cloud, and that we can containerize it
using Docker.
And to align Neo4j with some of these more
modern deployment techniques, we've updated
the file structure, the configuration, and
log structure as well to make deploying across
these different methods much more straightforward,
and in line with what you would expect.
So basically we've unified some of the configuration
files, and the log files to enable this.
We've also updated the official Docker image,
so we introduced an official Docker image
last year.
And of course with 3.0 we've updated that,
so it's even easier to use.
Later on in the webinar, we'll go through
some of these changes to the configuration
and log structure.
So just to highlight some of the big features
that we talked about with Neo4j 3.0.
We have a new storage engine with no limits,
so that we can store graphs of any size in
Neo4j.
Bolt, the new binary protocol for efficiently
connecting to Neo4j, along with new language
drivers that implement the binary protocol
and provide an idiomatic but consistent way
of interacting with Neo4j in your favorite
language.
As well as updates to the file configuration
and log structures to support modern deployments.
So that was a high-level overview of the awesome
new features in Neo4j 3.0.
For the next part of this webinar, we're going
to dive in and take a look at how we can take
advantage of some of these features and talk
about some other things that are included
as well.
So first of all let's take a look at the release
notes here.
[silence] And I just want to point out that
if you scroll through this there is a ton
of new stuff introduced in Neo4j 3.0.
So I can't talk about all these things, but
if you want to see everything that we've updated
with this version, be sure to check out the
change log.
One thing that, as is noted here, is that going
forward all versions of 3.0 will require Java
8.
Just something to note there.
So one thing that I mentioned just briefly
was the documentation.
So with Neo4j 3.0 we've completely updated
the documentation.
So previously we had what was known as the
Neo4j manual.
And that contained information about developing
applications with Neo4j, and information about
deploying Neo4j, server tuning, things like
that.
Now we've broken that manual up into two pieces.
So we now have the Neo4j developers manual,
and we have versions of this for each of the
four official language drivers that we have,
with code samples for each.
And we also have the Neo4j operations manual.
So the developers manual is useful for developers
building applications with Neo4j.
While the operations manual is intended for
folks that are deploying Neo4j, that are interested
in things like server tuning, things like
that.
So if you're looking for the Neo4j manual,
you'll now see that it's in these two pieces.
You can also find documentation specific to
each of the different drivers, and this is
in the format that you would expect for that
language.
So here's the documentation for the Python
driver, and you can find all of those at Neo4j.com/docs.
So as I mentioned, the configuration has been
unified in Neo4j 3.0, as well as the log file.
Let's just take a brief look at that.
So here I have a version of Neo4j 3.0, and
you'll see we have this conf directory.
And if you look in here, we have basically--
Neo4j.conf is our main config file.
Previously we had neo4j.properties and
neo4j-server.properties; all of those configuration
options have been moved into this one unified file.
If you look at some of these options here,
one thing that's interesting to note is we
have a new option for active database.
So we can actually specify the name of the
database we want to mount when starting Neo4j.
This allows us to have multiple stores of
Neo4j, and just choose which one we want to
mount on start up.
We can specify the import directory that we
want to use, if we are using Load CSV.
This is an important configuration if we're
upgrading from a previous version of Neo4j.
So as I mentioned the file store is completely
different with 3.0.
So if we're upgrading from a version of 2.X,
we need to specifically set this flag to allow
the file store to be upgraded.
We also have server tuning in this file, so
if you need to explicitly set the size of
the page cache, we can do that here, as well
as other network configurations.
So you can see that all of the configuration
options have been moved into Neo4j.conf.
That makes deployment much easier.
If we are specifying configuration options,
we just need to deal with one file.
We also have this logs directory; neo4j.log
is the file that we are interested in for
debugging Neo4j issues.
Great.
So that's just an overview of some of the
configuration changes.
Let's talk more about Java stored procedures.
Personally, I think this is one of the more
powerful features in Neo4j 3.0.
This really changes the way that we can interact
with other databases and frameworks, so I
am super excited for this feature.
There are several built-in procedures.
So we can do things like CALL db.labels, and
that will give us all of the labels in the
database.
So that's helpful for inspecting the current
state of the database.
But as I said, Java stored procedures are
user defined procedures.
So anyone can write their own procedure, deploy
that to Neo4j, and use that with Cypher.
And to help you get started writing your
own procedures, we have a template project
on GitHub that has instructions for
building and deploying.
If we look at the source code here, you can
see that basically with the procedure there
is an instance of GraphDatabaseService that
is injected into the procedure, which you
can then use within the procedure body.
So, very cool.
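As a sketch of the call syntax from Cypher: db.labels is a real built-in, while example.findPath below is a made-up name standing in for a user-defined procedure you'd deploy as a JAR.

```python
# Cypher procedure-call syntax, shown as Python strings. db.labels() is a
# real built-in procedure; example.findPath is hypothetical, standing in
# for any user-defined procedure deployed to the server.

LIST_LABELS = "CALL db.labels()"

# A procedure call can YIELD columns that feed the rest of the query:
LABELS_SORTED = """
CALL db.labels() YIELD label
RETURN label
ORDER BY label
"""

# A user-defined procedure is called exactly the same way:
CUSTOM = "CALL example.findPath('Alice', 'Bob') YIELD path RETURN path"

for q in (LIST_LABELS, LABELS_SORTED, CUSTOM):
    assert q.strip().startswith("CALL")
```

So from Cypher's point of view, built-in and user-defined procedures are indistinguishable; deploying the JAR is all it takes.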
In addition to building your own procedure,
you could also take advantage of a library
of useful procedures called APOC procedures.
This project was started by my colleague Michael
Hunger, who has written over a hundred of
these now, along with help from other internal
folks and members of the community.
This is really powerful.
There are lots and lots of things in here.
Things like inspecting the meta graph, and
then getting a visual representation of the
data that you have.
So if you need to inspect the graph-- here
we're looking at the movie graph, and you
can see we have person nodes, we have movie
nodes.
And a person can have acted in, written,
or produced a movie.
So things like that.
There are lots of things for loading data
from web APIs, or refactoring the graph.
Lots and lots of useful things.
I really encourage you to look at the APOC
Procedures Project.
I want to take one closer look at a specific
procedure in APOC, and that is loading data
from a relational database.
So this is the ER diagram for Northwind, which
is sort of the canonical relational database
example.
This is dealing with orders, products, customers
that have placed these orders, employees that
have fulfilled them and whatnot.
You can see all the tables we're talking with
here.
Let's do a quick demo here and let's take
a subset of Northwind.
So these four tables - orders, products,
the order details join table, and customers
- and let's see how we would import that into
Neo4j as a graph.
Well if we were to do that, a graph model
might look something like this.
So we have customer nodes that have placed
an order, and that order contains some products.
Let's see how we can import a piece of Northwind
into Neo4j using the new procedures.
So if we jump back to the command line here,
and we go to the plugins directory.
So I've installed the APOC procedures; to do
that we just need to build that JAR, and move
it into the plugins directory.
And then I've also downloaded the MySQL connector
JAR, which you can download from MySQL, because
we're going to be using this to connect to
MySQL to import Northwind.
So that's how we install procedures: basically
just copy over the JAR and restart Neo4j.
So the first thing we are going to do is register
the JDBC driver using one of these procedures.
This is the syntax for calling a procedure:
CALL apoc.load.driver, and we're specifying
the package for the JDBC driver that we want
to use with MySQL.
So we do that.
That's now registered.
I should mention I have MySQL running locally
here.
So here we go.
Here's Northwind.
Here are different tables that we're dealing
with, so we just have that running locally.
So now we want to connect to MySQL and bring
some of that data in.
So to do that we'll say CALL apoc.load.jdbc.
We'll specify the JDBC connection string,
so MySQL running locally.
We're interested in the Northwind database,
user root.
I don't have a password set.
And then we're specifying that we want to
bring in every row of the products table.
Alternatively, the second parameter here could
be a SQL statement.
So if we wanted do some incremental updates,
[?] some of the results, whatever, this could
be a SQL statement.
So now we'll yield the results.
So each row from the products table will now
be passed on to the rest of the Cypher statement.
So we want to create product nodes, and set
the properties from that row.
So we'll run this, and we've created 77 product
nodes.
We'll do almost the exact same thing for
orders, specifying the orders table: added
830 order nodes.
So now let's look at the order details join
table.
So we just need to do a match on the product
ID, and a match on the order ID for that row,
and then we'll create this contained relationship
to say this order contains this product.
We'll run that; added 2,000 contained relationships.
We'll load all of the customers.
And then for each order, we need to just find
which customer placed that order and create
this placed relationship.
So we run that.
And let's see what we have here.
So let's just pick 25 products at random,
and let's start expanding out on those.
And so we can see, for this product, we can
see all the orders that contain that product.
We can expand out to see the other products
in that order, who placed the order, we can
expand out from there to see other orders
they've placed and whatnot.
So now we have the Northwind graph that we've
imported into Neo4j just using Cypher.
I think that's pretty cool.
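Here's a hedged reconstruction of those import statements, collected as Python strings for reference. The connection string, table names, and property keys are assumptions about the Northwind schema, not verbatim from the demo.

```python
# Reconstruction (not verbatim) of the APOC-based import statements.
# The JDBC URL, table names, and property keys (ProductID, OrderID) are
# assumptions about the local MySQL Northwind schema.

REGISTER_DRIVER = "CALL apoc.load.driver('com.mysql.jdbc.Driver')"

JDBC_URL = "jdbc:mysql://localhost:3306/northwind?user=root"

# Each row of the products table becomes a Product node:
LOAD_PRODUCTS = """
CALL apoc.load.jdbc('%s', 'products') YIELD row
CREATE (p:Product) SET p = row
""" % JDBC_URL

# The join table becomes CONTAINS relationships between existing nodes:
LOAD_ORDER_DETAILS = """
CALL apoc.load.jdbc('%s', 'order_details') YIELD row
MATCH (p:Product {ProductID: row.ProductID})
MATCH (o:Order {OrderID: row.OrderID})
CREATE (o)-[:CONTAINS]->(p)
""" % JDBC_URL
```

The second parameter to apoc.load.jdbc can also be a full SQL statement instead of a table name, which is what enables incremental or filtered imports.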
One thing I also wanted to point out while
we're in the new Neo4j browser, is we now
have a zoom feature so we can visualize very
large graphs in the browser.
That's pretty useful as well.
So that's great.
We've imported Northwind into Neo4j.
Let's do something graphy with that.
One thing we know is Neo4j, and graphs in
general, are really good for generating recommendations.
So these type of collaborative filtering queries.
So if you know that I've purchased some products,
what are some other products I might be interested
in?
Well, now that we have Northwind as a graph,
we can use Cypher now, to generate some product
recommendations.
Here's a simple collaborative filtering product
recommendation query in Cypher.
We'll just pick Roland Mandel here, someone
I picked at random from a customer table.
We'll find his customer node.
We'll look for orders that Roland has placed,
and we'll look at what products are contained
in those orders.
Then we'll look for other orders that have
the products that Roland has purchased, and
we'll look who are the customers that have
purchased those products.
This is basically, find customers that have
purchased the same thing as Roland.
Then we'll traverse out from those other customers
to find their orders, and other products that
they've purchased.
Who's purchased the same things as Roland?
What else have they purchased?
Let's recommend those to Roland because he
might be interested in those products as well.
We'll order those product recommendations
by a count of the number of paths that
we found for that product.
So if we have lots of customers that have
purchased something in common with Roland,
we'll rate those higher.
If we run that and we immediately get back
some product recommendations for Roland.
The first is this Gorgonzola Telino.
I don't know what that is, maybe some sort
of pasta.
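The collaborative-filtering logic itself can be sketched in plain Python on toy data (the customers and products here are made up): find customers who bought what Roland bought, then count what else they bought.

```python
# Plain-Python sketch of the collaborative-filtering traversal:
# who bought the same things as the target customer, and what else
# did they buy? Toy data, not the real Northwind tables.

from collections import Counter

orders = {                      # customer -> set of purchased products
    "Roland": {"cheese", "wine"},
    "Ana":    {"cheese", "bread", "olives"},
    "Bo":     {"wine", "bread"},
    "Cara":   {"cheese", "bread"},
}

target = "Roland"
mine = orders[target]
recs = Counter()
for customer, bought in orders.items():
    if customer == target or not (bought & mine):
        continue                      # no shared purchase, skip this customer
    for product in bought - mine:     # things they bought that Roland hasn't
        recs[product] += 1

print(recs.most_common())  # bread ranks first: three co-purchasers bought it
```

That's exactly what the Cypher query does, except Cypher expresses the whole traversal declaratively and the database walks the relationships natively.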
Anyway, that's how we can do product recommendations
with Cypher.
We were able to import a piece of Northwind
into Neo4j, and then immediately start doing
recommendation queries on it.
So that's pretty cool.
I think that really demonstrates the usefulness
of Java stored procedures.
There's a lot more you can do, from graph
algorithms, things like that.
So really excited for that feature.
And these are the queries that we just saw,
just in the slides for reference.
Let's talk about some of the updates to Cypher.
There are quite a few here.
I'm just going to mention a few.
The first is an update that enables Cypher
to use a cost-based optimizer for writes.
So we introduced a cost-based optimizer previously;
however, we were only able to use that for
read queries.
We weren't able to take advantage of this
for writes.
So now Cypher is smart enough
to inspect the current state of the database
to optimize the query plan for that.
We'll see some more detail of that in a moment
here.
There's also been some updates for fast index
population with parallel indexes.
And of course using Bolt with Cypher now really
increases the volume of data that you can
move back and forth.
Regarding text search, so some keywords were
introduced in a previous version of Cypher
- that's STARTS WITH, ENDS WITH, and CONTAINS
- so doing string comparison operations.
Previously, only STARTS WITH was able to take
advantage of the index, however now with Neo4j
3.0 both ENDS WITH and CONTAINS will use an
index.
So this is really important for full-text search
in Neo4j.
There have also been some improvements with global
aggregation queries.
So for example, if I want to count all the
nodes with a certain label.
Previously, the query would actually touch
all those nodes and count them, but now we
are able to use statistics on global aggregation,
so essentially this does a look-up, doesn't
need to count all of the nodes.
We also have value joins where we can find
nodes that share the same value for a property,
even where no relationship exists.
So this will really increase performance of
Cypher as well.
Let's look at an example of the cost planner
for writes.
Here we're looking up products and categories,
where the category ID matches and then creating
this contains relationship.
So similar to what we were just doing with
the Northwind set.
Using the rule planner, this would create
a Cartesian product, and this resulted in
millions of database hits and took about seven
seconds to run.
Using the cost planner for writes, we now
take advantage of an index look-up, and this
is much, much, more performant.
We're only doing 50,000 database hits, and
it takes half the time to run those.
So where we're really going to see performance
increase using this cost planner is on Load
CSV.
So if you ever use Load CSV to import data
from CSV files into Neo4j, you should see
some performance improvements there while
taking advantage of this cost planner.
Another improvement to Cypher in Neo4j 3.0
is that predicates are now pulled into the
shortest path function.
So let's look at an example.
So if I have two location nodes, one for Berlin,
one for Dresden, and I wanted to find the
shortest path along all roads from Berlin
to Dresden, I can use the shortestPath function.
Previously though, I was not able to specify
predicates.
So I couldn't say, "Find me the shortest path
where all the roads are open, and the speed
limit is above a certain amount, and the distance
is less than a certain amount."
However, now with Neo4j 3.0, I can specify
those predicates, and they're pulled into
shortest path operation.
So now we can say, "Find the shortest path
from Berlin to Dresden, where none of the
roads are closed, none of the roads' speed
is less than 30, the total distance is
less than 300 kilometers, and there are not more
than 50 roads."
So these predicates are pulled in the shortest
path.
This is awesome for anyone who's tried to
do these type of routing queries in Cypher
previously.
So again another super awesome enhancement
to Cypher.
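In plain Python, pulling the predicates into the search looks something like this (the road data is made up): edges failing the predicate are never expanded, instead of enumerating paths and filtering them afterwards.

```python
# Plain-Python analogue of pulling predicates into a shortest-path search:
# an edge that fails the predicate is never expanded, rather than whole
# paths being filtered after the fact. Road data below is invented.

from collections import deque

roads = {  # city -> list of (neighbor, road attributes)
    "Berlin":  [("Leipzig", {"open": True,  "speed": 120, "km": 190}),
                ("Cottbus", {"open": False, "speed": 100, "km": 125})],
    "Leipzig": [("Dresden", {"open": True,  "speed": 100, "km": 115})],
    "Cottbus": [("Dresden", {"open": True,  "speed": 100, "km": 120})],
}

def ok(attrs):
    # The predicate: road must be open and fast enough.
    return attrs["open"] and attrs["speed"] >= 30

def shortest_path(start, goal):
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt, attrs in roads.get(path[-1], []):
            if ok(attrs) and nxt not in seen:  # predicate applied in-search
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no path satisfies the predicates

print(shortest_path("Berlin", "Dresden"))  # via Leipzig; Cottbus road closed
```

Applying predicates during the search, rather than after, is what makes these routing queries tractable on large road networks.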
There are also some spatial functions that
have been introduced into Neo4j 3.0, specifically
distance and point.
So if we have a node that has a latitude and
longitude property, we can use the point function
in Cypher to coerce that to a point type,
and we can then compute the distance between
two points using the distance function.
So again, this is super useful for things
like routing problems, and sort of spatial
problems that we're working with.
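For geographic points, distance computes great-circle distance; a plain-Python approximation using the haversine formula looks like this (the Berlin and Dresden coordinates are approximate):

```python
# Plain-Python approximation of what distance(point(a), point(b)) computes
# for geographic points: great-circle distance via the haversine formula.

from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    r = 6371.0  # mean Earth radius in kilometers
    p1, p2 = radians(lat1), radians(lat2)
    dp, dl = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dp / 2) ** 2 + cos(p1) * cos(p2) * sin(dl / 2) ** 2
    return 2 * r * asin(sqrt(a))

# Berlin to Dresden, roughly 165 km as the crow flies:
print(round(haversine_km(52.52, 13.405, 51.05, 13.737)))
```

Having this built into Cypher means a routing query can filter or order by distance without round-tripping coordinates out to application code.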
So we talked about the official language drivers,
and we saw some code snippets.
So let's look at an example of how we can
use the Python driver using some actual data
here.
If you want to follow along, I'm just going
to go through an IPython Notebook.
The code is on GitHub here.
First I'm just going to delete everything.
Delete our Northwind graph here.
And let's make sure we don't have anything
in the database.
So nothing in the database.
Let's jump over to our IPython Notebook.
So the first thing that we need to do is install
the Neo4j driver.
One thing that's great about these official
language drivers is that they are available
through whatever package manager you would
typically use to get a package in that language.
For example, the Python driver is available
on PyPI via pip, the Python package manager.
For JavaScript, the driver's available on
npm or Bower.
So we just need to do a pip install neo4j-driver.
Of course, I already have that installed,
so we're good to go there.
And then we just need to import the packages
here.
I'm going to restart the Kernel here.
There's an issue.
There we go.
So pip install neo4j-driver tells us the
requirement is already satisfied; I've already
installed that.
We're now going to import GraphDatabase and
basic_auth from v1 of the neo4j package.
We're also importing requests, because we're
going to load some data as JSON.
So what data are we going to deal with here?
Well, we're going to load the OSCON graph,
or the conference graph.
So OSCON is the open source conference taking
place in a couple of weeks.
Lots of folks from Neo4j will be there.
So we wanted to take a look at the conference
schedule, see how we can model that as a graph,
and just interesting queries on the conference
schedule for OSCON.
So this is the data model for the graph we
want to build here: basically speakers, the
talks they're presenting, the tracks, and the
organizations that the speakers are affiliated with.
This data is available as JSON.
So we'll just use the Python requests library
to load that.
I have a copy locally, so let's see what this
looks like.
We just have one JSON object.
We have an array of events.
These are the talks.
We have an array of speakers, and we have
an array of venues, which is essentially the
room where the talk is being held.
So this is the data that we're pulling in.
We look at the keys for that object, and see
the events, speakers, venues as you'd expect.
So next I've just defined some Cypher queries
for inserting that into Neo4j.
So we iterate through all of the events, create
the talk nodes, and then create relationships
for the room, the speaker, categories, and
so on.
Next we need to instantiate the driver.
We're specifying that we want to use bolt://localhost.
That's just my Neo4j instance that's running
locally.
And then we need to specify the authentication.
So we're using basic auth.
The username is Neo4j, and I have a super
secure password here.
So once we've instantiated the driver, we'll
go ahead and instantiate the session, which
is our connection to Neo4j.
So now we can run a query here.
We'll say session.run and then our insert
events query.
So that is this up here, and as you can see,
we have a parameter in here.
So this event is a query parameter, so we're
going to be passing in the JSON data that
we loaded previously from the web.
The syntax to do that is to specify parameters,
and then specify the object that we're passing
in as a parameter.
So in this case, events is the key, and the
value is the events array that we pulled in
from the web.
We run that, and we see that we get back a statement
result object.
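The parameter-passing pattern can be sketched like this. The query text and labels are illustrative, using the 3.0-era {param} placeholder syntax:

```python
# The parameter-passing pattern: the Cypher text references {events}, and
# the Python value is supplied as a parameters map when running the
# statement. Query and labels are illustrative; 3.0-era {param} syntax
# (later Cypher versions use $events instead).

INSERT_EVENTS = """
UNWIND {events} AS event
CREATE (t:Talk {title: event.title})
"""

events = [{"title": "Intro to Apache Spark"},
          {"title": "Graphs at Scale"}]

def insert_events(session, events):
    # session comes from driver.session(); run() pairs the query text with
    # a map of parameters keyed by the names used in the Cypher string.
    return session.run(INSERT_EVENTS, parameters={"events": events})
```

Passing data as parameters, rather than concatenating it into the query string, lets the server cache the query plan and avoids injection problems.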
Let's do the same to insert the speakers,
and to insert the venues.
You can see for each of those, we're getting
a statement result object.
Let's jump into Neo4j and see what we actually
inserted here.
So just grabbing 25 nodes at random, we can
see here's a talk.
Intro to Apache Spark.
That's part of the data track presented by
Ted.
We expand out and see that Ted is affiliated
with [?] Vera.
Expand out on the data track node, and we
see all of the other talks that belong to
the data track.
So that's cool.
Let's now query that and view the results
in Python.
The question we want to answer is: what organizations
have talks in The New Stuff track with the
most speakers?
So every speaker has some affiliation - the
company, or the organization that they work
with.
So we want to start at this track, The New
Stuff, The Hot New Stuff in Open Source.
We want to look at all of the talks that are
parts of this track, we want to see who's
presenting them, and then we want to see what
organization is that person affiliated with.
Group by organization, and return the organization
and a count of the number of talks that organization
has in The New Stuff track.
So we'll run this with session.run; we're
not passing any parameters, we're just passing
the query string.
And then we can iterate through our results.
So for each record in the results, print out
the organization and number.
We can see here, Microsoft has four speakers
in The New Stuff Track.
Secret Lab has four, and so on.
So that was just a simple example using the
new Neo4j Python driver.
Again, this code is online, if you want to
look at that.
So one thing that I think is really exciting
with our new drivers is the way that they
enable integrations with other technologies.
So I mentioned Spring Data Neo4j, which is
built on top of the Bolt Java driver.
But we can also do things like build this
Spark connector that my friend Michael Hunger
has been working on.
So this uses the Java Bolt driver to pull
data out of Neo4j very efficiently, push
that into Spark, do some Spark graph operations,
and then write that back to Neo4j, powered
by the Java driver with Bolt.
We can also do integrations with visualization
frameworks.
So a project I've been working on connects
vis.js, which is a JavaScript visualization
framework, and uses the Bolt JavaScript driver
for the client.
So there's both a Node.js version of the Neo4j
JavaScript driver for JavaScript on the server,
and then there's a version for the client,
for the browser, that uses WebSockets to connect
to Neo4j.
So oftentimes you'll find you want to embed
some graph visualization into your application;
the Neo4j browser is great, but it's not very
easy to embed those visualizations.
So projects like this allow you to connect
directly to Neo4j from the browser to build
these type of visualizations.
So here's an example of that.
This is connecting directly to a Neo4j instance
and pulling the data out using the JavaScript
driver.
This network is the Game of Thrones network,
and we did some centrality and clustering
on this so we can see who the most influential
characters in Game of Thrones are.
If you're curious, Tyrion is the most central,
followed closely by Jon Snow.
And we did some clustering here, so we can
see the communities around each of those characters.
So pretty cool.
I just want to talk a little bit about upgrading
to Neo4j 3.0 from previous versions.
As I mentioned previously, Neo4j 3.0 requires
Java 8, and there's a completely new store
format with Neo4j 3.0, so be sure to set that
configuration to upgrade the store.
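As a sketch, that store-upgrade setting in the 3.0 neo4j.conf looks like this; the setting name here is my recollection of the 3.0 configuration, so check the upgrade guide for the exact name in your version:

```
# neo4j.conf -- permit Neo4j 3.0 to migrate an older store format on startup
dbms.allow_format_migration=true
```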
The Lucene indexes need to be rebuilt.
Neo4j 3.0 upgraded to a new version of Lucene,
so that's why the indexes need to be rebuilt.
We talked about the new directory structure
and the configuration format, and there is a
configuration migration tool.
So you can pass in your current configuration
files and this little tool will generate the
3.0 configuration files, so you don't have
to go through that manually.
And all of that information is available in
this 3.0 upgrade guide.
So if you're doing that, be sure to read through
this guide.
So Neo4j 3.0 sounds great.
How do I get it?
How do I get started?
Well, you have a few options here.
Neo4j.com/download.
You can download Neo4j locally.
There's also Neo4j Sandbox, so you can
just spin that up in the browser and
get started right away.
There's an official Docker image for Neo4j.
So we can get Neo4j through Docker, and there
are also Debian Linux packages for installing
Neo4j as well.
So lots of options there.
If you're looking for more resources, like code
snippets, information about getting started,
information about some of the integrations
I talked about, Neo4j.com/developer is a great
place to start.
We have code snippets using the drivers in
different languages for some different use
cases.
Lots of guides and tutorials there.
And also connect with us on the Neo4j Slack
community channel.
And Stack Overflow is a great place for asking
questions.
I spend a lot of my day on Stack Overflow
and Slack chatting with the folks from the
community.
And with that, that is all I have.
I think we have a little bit of time left
for questions.
So with that we'll move on to the Q&A session.
Thank you so much, Will.
That was an excellent presentation, and I
know we're all very excited about these updates.
We do have a ton of questions, so I will let
everyone know, we'll take questions for about
five minutes, and then we will go ahead and
answer your questions directly via email.
So if you do have any other questions, feel
free to go ahead and keep submitting those,
and know that we'll reach out to you in the
next day or so via email.
With that Will, I'll go ahead and ask you
a few on here that are popping up.
The first one is about limits in the Enterprise
version versus the Community version.
Can you speak to that a little bit?
Yeah, sure.
So as I mentioned, we've removed the explicit
limits on the size of the graph that you can
store in Neo4j.
This is an Enterprise Edition only feature.
The store format that takes advantage of dynamic
pointer compression to enable this is only
available in the Enterprise version of Neo4j.
The community version does not support this,
but again, the community version can still
support graphs in the tens of billions of
entities size.
So Community version can still store very
large graphs.
Thank you.
We do have quite a few questions about Bolt.
Can you speak to how fast Bolt is compared
to the previous version, and also how Bolt
will work with high availability deployments,
and whether or not there's a recommended load-balancer
to help facilitate the data transfer to Bolt?
Yeah.
Sure, I can speak to some of that.
In terms of how much faster is Bolt than HTTP,
that's a great question.
I'm not sure.
I haven't seen the official benchmarks
on that.
I think we'll be pushing some of that information
later on as we're able to put it together.
So sorry, I don't have the specific numbers
there.
The other piece of that question was regarding
HA deployments using Bolt and a load balancer.
If you take a look in the operations manual,
there's some great information on that.
So I'll just point this out.
Because as I mentioned, there's a new operations
manual that's separate from the developer's
manual.
And we have a whole section in here on HA
deployments, so certainly be sure to refer
to that information for your specific use
cases.
Thank you very much.
Our next question is regarding cloud support
for Neo 3.0.
[inaudible] cloud support?
Sure.
As I mentioned, a lot of the operability changes
in Neo4j 3.0 are around configuration and
logging structure to enable modern deployments.
So things that are more consistent with some
of the modern deployment tools that we're
used to, things like Docker [inaudible] the
cloud.
These will make those much more straightforward.
We do have a few hosting partners that provide
Neo4j as a service.
We also have an AWS CloudFormation recipe
that takes advantage of the Debian packages
for Neo4j.
That lets you really easily spin up an AWS
instance running Neo4j.
Then there's Neo4j Browser Sync, which we refer
to as a sort of first cloud offering from Neo4j.
It allows you to sync your saved queries and
style sheets across browsers and across instances.
So certainly if you're interested in Neo4j
in the cloud, many of our customers are
using AWS and Azure, along with some of our
hosting partners, to make that possible.
Great.
Thank you so much.
We are right up against the hour.
So we'll thank everyone so much for joining
us today.
If you do have any further questions, go ahead
and submit those in the questions box of
your [inaudible], and we will send you a personal
email response.
We will be sending out the slides from this
presentation as well as the recording.
You will receive those via email shortly.
Again, I want to say thank you to everyone
so much for joining us today.
And thank you Will for that excellent presentation.
And we look forward to hearing how you are
enjoying Neo 3.0.
Thank you so much.
