- Learning Neo4j 3.x (Second Edition)
- Jérôme Baton, Rik Van Bruggen
- 2021-07-08 09:37:46
Granulate nodes
The graph modeling pattern that we discuss in this section is called the granulate pattern. In graph database modeling, we tend to build much more fine-grained data models than we would be used to in a relational model.
In a relational model, we use a process called database normalization to come up with the granularity of our model. Wikipedia defines this process as follows:
The reality of this process is that we will create smaller and smaller table structures until we reach the third normal form. This is a convention that the IT industry seems to have agreed on: a database is considered to have been normalized as soon as it achieves the third normal form. Visit http://en.wikipedia.org/wiki/Database_normalization#Normal_forms for more details.
As we discussed before, this model can be quite expensive, as it effectively introduces the need for join tables and join operations at query time. Database administrators tend to denormalize the data for this very reason, which introduces data duplication, another very tricky problem to manage.
In graph database modeling, however, normalization is much cheaper for the simple reason that these infamous join operations are much easier to perform. This is why we see a clear tendency in graph models to create thin nodes and relationships, that is, nodes and relationships with few properties on them. These nodes and relationships are very granular and have been granulated.
Related to this pattern is a typical question that we ask ourselves in every modeling session: should we keep this as a property, or should the property become its own node? For example, should we model the alcohol percentage of a beer as a property on a beer brand? The following diagram shows the model with the alcohol percentage as a property:

The alternative would be to split the alcohol percentage off as a different kind of node.
The following diagram illustrates this:
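The two alternatives can also be sketched as plain data. The snippet below is a minimal illustration in Python, not Neo4j code: the labels `BeerBrand` and `AlcoholPercentage`, the relationship type `HAS_ALCOHOL_PERCENTAGE`, and the brand names and values are all assumptions made up for this example.

```python
# Option 1: the alcohol percentage is a property on the beer brand node.
# Names and values are hypothetical sample data.
property_model = {
    "nodes": {
        "brand_a": {"label": "BeerBrand", "name": "Brand A", "alcohol_percentage": 6.5},
        "brand_b": {"label": "BeerBrand", "name": "Brand B", "alcohol_percentage": 8.0},
        "brand_c": {"label": "BeerBrand", "name": "Brand C", "alcohol_percentage": 6.5},
    },
    "relationships": [],
}

# Option 2: the alcohol percentage is split off into its own kind of node,
# and each brand points to the (shared) percentage node.
granulated_model = {
    "nodes": {
        "brand_a": {"label": "BeerBrand", "name": "Brand A"},
        "brand_b": {"label": "BeerBrand", "name": "Brand B"},
        "brand_c": {"label": "BeerBrand", "name": "Brand C"},
        "abv_6_5": {"label": "AlcoholPercentage", "value": 6.5},
        "abv_8_0": {"label": "AlcoholPercentage", "value": 8.0},
    },
    "relationships": [
        ("brand_a", "HAS_ALCOHOL_PERCENTAGE", "abv_6_5"),
        ("brand_b", "HAS_ALCOHOL_PERCENTAGE", "abv_8_0"),
        ("brand_c", "HAS_ALCOHOL_PERCENTAGE", "abv_6_5"),
    ],
}


def max_properties_per_node(model):
    """Return the largest number of properties on any node (the label is not counted)."""
    return max(len(props) - 1 for props in model["nodes"].values())
```

In the second option every node carries a single property, which is what the text means by thin nodes; the cost is two extra nodes and three relationships.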

Which one of these models is right? I would say both and neither. The real fundamental thing here is that we should be looking at our queries to determine which version is appropriate. In general, I would present the following arguments:
- If we don't need to evaluate the alcohol percentage during the course of a graph traversal, we are probably better off keeping it as a property of the end node of the traversal. After all, we keep our model a bit simpler when doing this, and everyone appreciates simplicity.
- If we need to evaluate the alcohol percentage of a particular (set of) beer brands during the course of our graph traversal, then splitting it off into its own node category is probably a good idea. Traversing through a node is often easier and faster than evaluating properties for each and every path.
As we will see in the next paragraph, many people actually take this approach a step further by working with in-graph indexes.
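The two arguments above can be illustrated with a small in-memory sketch. It is written in plain Python rather than against Neo4j, so it mirrors the shape of the reasoning, not the database's actual execution or performance. The brand names, values, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample data: brand name -> alcohol percentage.
brands = {"Brand A": 6.5, "Brand B": 8.0, "Brand C": 6.5}


def same_strength_by_property(brands, start):
    """Property model: read the start brand's value, then evaluate the
    property of every other brand to find matches."""
    target = brands[start]
    return sorted(name for name, abv in brands.items()
                  if name != start and abv == target)


def build_percentage_nodes(brands):
    """Granulated model: one shared node per distinct percentage.
    Returns (brand -> percentage node, percentage node -> set of brands)."""
    brand_to_pct = dict(brands)
    pct_to_brands = defaultdict(set)
    for name, abv in brands.items():
        pct_to_brands[abv].add(name)
    return brand_to_pct, pct_to_brands


def same_strength_by_traversal(brand_to_pct, pct_to_brands, start):
    """Granulated model: hop from the brand to its percentage node,
    then hop back out to the other brands attached to that node."""
    pct_node = brand_to_pct[start]
    return sorted(pct_to_brands[pct_node] - {start})


brand_to_pct, pct_to_brands = build_percentage_nodes(brands)
print(same_strength_by_property(brands, "Brand A"))
print(same_strength_by_traversal(brand_to_pct, pct_to_brands, "Brand A"))
```

Both functions return the same brands. The difference is that the first inspects a property on every candidate, while the second only follows the relationships of one shared node, which is the situation the second argument describes.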