Getting Started with Gremlin

1. Introduction to Gremlin

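Gremlin is the graph traversal language of the Apache TinkerPop framework. It is a functional data-flow language that lets users express complex traversals or queries over a property graph concisely. Every Gremlin traversal consists of a sequence of steps (possibly nested), and each step performs an atomic operation on the data stream.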

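The Gremlin language has three basic kinds of operation:

map-step: transforms the objects in the data stream;
filter-step: filters the objects in the data stream;
sideEffect-step: performs a computation over the data stream, such as gathering statistics, while passing the objects through unchanged.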
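
The three kinds of step can be pictured with plain Python generators. This is only an analogy, not Gremlin itself: the helper names map_step, filter_step and side_effect_step are hypothetical and do not belong to the Gremlin or TinkerPop API. The sample data reuses the names and ages of the person vertices created later on this page.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def map_step(stream: Iterable[T], fn: Callable[[T], U]) -> Iterator[U]:
    """Transform every object in the stream (analogous to a map-step)."""
    for item in stream:
        yield fn(item)


def filter_step(stream: Iterable[T], keep: Callable[[T], bool]) -> Iterator[T]:
    """Pass through only the objects that satisfy the predicate."""
    for item in stream:
        if keep(item):
            yield item


def side_effect_step(stream: Iterable[T], effect: Callable[[T], None]) -> Iterator[T]:
    """Run a computation for each object and pass the object through unchanged."""
    for item in stream:
        effect(item)
        yield item


# (name, age) pairs standing in for person vertices.
people = [("marko", 29), ("vadas", 27), ("josh", 32), ("peter", 29)]

seen = []  # filled in by the side effect
older_than_28 = filter_step(people, lambda p: p[1] > 28)
recorded = side_effect_step(older_than_28, seen.append)
names = list(map_step(recorded, lambda p: p[0]))
```

Because the steps are chained lazily, each object flows through the filter, the side effect and the map one at a time, which is roughly how traversers move through a Gremlin traversal.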

Core concepts of the TinkerPop 3 model

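Graph: maintains the collection of vertices and edges and provides access to underlying database features such as transactions.
Element: maintains a collection of properties and a string label that indicates the kind of element.
Vertex: extends Element; maintains its sets of incoming and outgoing edges.
Edge: extends Element; references the outgoing (source) and incoming (target) vertices it connects.
Property: a key-value pair.
VertexProperty: a property of a vertex. It holds a key-value pair and can also carry its own collection of properties. It extends Element as well, so it must have its own id and label.
Cardinality: one of single, list or set; specifies whether the value of a vertex property is a single value, a list or a set.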
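
The relationships between these concepts can be sketched as Python dataclasses. This is a simplified illustration under assumptions, not the TinkerPop interfaces themselves: the field names (out_edges, in_edges, out_vertex, in_vertex, meta) and the ids used below are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class Cardinality(Enum):
    SINGLE = "single"  # one value per property key
    LIST = "list"      # several values, duplicates allowed
    SET = "set"        # several values, no duplicates


@dataclass
class Element:
    id: Any
    label: str


@dataclass
class Property:
    key: str
    value: Any


@dataclass
class VertexProperty(Element):
    # A vertex property is itself an element, so it has an id and a label
    # and may carry further properties of its own (here: meta).
    key: str = ""
    value: Any = None
    meta: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Vertex(Element):
    properties: Dict[str, List[VertexProperty]] = field(default_factory=dict)
    out_edges: List["Edge"] = field(default_factory=list)
    in_edges: List["Edge"] = field(default_factory=list)


@dataclass
class Edge(Element):
    out_vertex: Any = None  # source vertex
    in_vertex: Any = None   # target vertex
    properties: Dict[str, Property] = field(default_factory=dict)


# Hypothetical ids; marko knows vadas.
marko = Vertex(id=1, label="person")
vadas = Vertex(id=2, label="person")
knows = Edge(id="1-2", label="knows", out_vertex=marko, in_vertex=vadas)
marko.out_edges.append(knows)
vadas.in_edges.append(knows)
marko.properties["name"] = [
    VertexProperty(id="1-name", label="name", key="name", value="marko")
]
```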

2. Gremlin Query Examples

First, a few core graph concepts:

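Schema: a description language; here it means the full set of properties and types, including the properties of edges and vertices and the labels of edges and vertices;
Property type (PropertyKey): the property types that edges and vertices can use;
Vertex type (VertexLabel): the type of a vertex, for example User or Car;
Edge type (EdgeLabel): the type of an edge, for example know or use;
Vertex: a vertex of the graph, representing one node in the graph;
Edge: an edge of the graph, connecting two nodes; edges are either directed or undirected.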

Create property types

graph.schema().propertyKey("name").asText().ifNotExist().create()
graph.schema().propertyKey("age").asInt().ifNotExist().create()
graph.schema().propertyKey("city").asText().ifNotExist().create()
graph.schema().propertyKey("lang").asText().ifNotExist().create()
graph.schema().propertyKey("date").asText().ifNotExist().create()
graph.schema().propertyKey("price").asInt().ifNotExist().create()

Create vertex types

person = graph.schema().vertexLabel("person").properties("name", "age", "city").primaryKeys("name").ifNotExist().create()
software = graph.schema().vertexLabel("software").properties("name", "lang", "price").primaryKeys("name").ifNotExist().create()

Create edge types

knows = graph.schema().edgeLabel("knows").sourceLabel("person").targetLabel("person").properties("date").ifNotExist().create()
created = graph.schema().edgeLabel("created").sourceLabel("person").targetLabel("software").properties("date", "city").ifNotExist().create()

Create vertices and edges

marko = graph.addVertex(T.label, "person", "name", "marko", "age", 29, "city", "Beijing")
vadas = graph.addVertex(T.label, "person", "name", "vadas", "age", 27, "city", "Hongkong")
lop = graph.addVertex(T.label, "software", "name", "lop", "lang", "java", "price", 328)
josh = graph.addVertex(T.label, "person", "name", "josh", "age", 32, "city", "Beijing")
ripple = graph.addVertex(T.label, "software", "name", "ripple", "lang", "java", "price", 199)
peter = graph.addVertex(T.label, "person","name", "peter", "age", 29, "city", "Shanghai")

marko.addEdge("knows", vadas, "date", "20160110")
marko.addEdge("knows", josh, "date", "20130220")
marko.addEdge("created", lop, "date", "20171210", "city", "Shanghai")
josh.addEdge("created", ripple, "date", "20151010", "city", "Beijing")
josh.addEdge("created", lop, "date", "20171210", "city", "Beijing")
peter.addEdge("created", lop, "date", "20171210", "city", "Beijing")

Display the graph

g.V() // Use graph to create data and g to query it; g is simply graph.traversal()

Query vertices

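g.V().limit(5) // Query all vertices but return at most 5; the range(x, y) step can be used instead to return only the vertices within that range.
g.V().hasLabel('person') // Query the vertices whose label is 'person'.
g.V('11') // Query the vertex whose id is '11'.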
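
As a rough illustration of how limit and range cut down a result stream, the sketch below models both with itertools.islice. It assumes the TinkerPop convention that range(low, high) includes the element at position low and excludes the one at position high; the vertex ids are hypothetical.

```python
from itertools import islice

# Hypothetical vertex ids in the order the traversal would emit them.
vertex_ids = ["v0", "v1", "v2", "v3", "v4", "v5", "v6"]


def limit(stream, n):
    """Keep at most the first n objects, like limit(n)."""
    return islice(stream, n)


def range_(stream, low, high):
    """Keep objects at positions low (inclusive) to high (exclusive), like range(low, high)."""
    return islice(stream, low, high)


first_five = list(limit(vertex_ids, 5))
middle = list(range_(vertex_ids, 2, 4))
```

Under this model limit(n) behaves like range(0, n), so the two steps differ only in where the window starts.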

Query edges

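g.E() // Query all edges. Not recommended: with a large number of edges this query is impractical, so a filter condition or a limit on the number of results is usually added.
g.E('55-81-5') // Query the edge whose id is '55-81-5'.
g.E().hasLabel('knows') // Query the edges whose label is 'knows'.
g.V('46').outE('knows') // Query all outgoing edges labeled 'knows' of the vertex whose id is '46'.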

Query properties

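g.V().limit(3).valueMap() // Query all properties of the vertices (property keys may be passed as arguments to return only those properties; all properties of one vertex form a single result row).
g.V().limit(1).label() // Query the label of the vertex.
g.V().limit(10).values('name') // Query the name property of the vertices (with no argument, all properties are returned; each property of a vertex is a separate result row that contains only the value, not the key).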
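
The difference in result shape between valueMap() and values() can be sketched in Python. The helpers value_map and values below are hypothetical stand-ins operating on plain dictionaries. The sketch assumes the default TinkerPop behavior in which valueMap() wraps each value in a list, because a vertex property may hold several values.

```python
# Two of the person vertices from the example graph, as plain dictionaries.
vertices = [
    {"name": "marko", "age": 29, "city": "Beijing"},
    {"name": "vadas", "age": 27, "city": "Hongkong"},
]


def value_map(vertex, *keys):
    """One map per vertex: key -> list of values (all keys when none are given)."""
    selected = keys or vertex.keys()
    return {k: [vertex[k]] for k in selected if k in vertex}


def values(vertex, *keys):
    """Bare values only, one entry per property (all keys when none are given)."""
    selected = keys or vertex.keys()
    return [vertex[k] for k in selected if k in vertex]


# valueMap(): one row per vertex, keys included.
rows_value_map = [value_map(v) for v in vertices]

# values('name'): one row per property value, keys dropped.
rows_values = [val for v in vertices for val in values(v, "name")]
```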

Delete a vertex

g.V('600').drop() // Delete the vertex whose ID is 600.

Delete an edge

g.E('501-502-0').drop() // Delete the edge whose ID is '501-502-0'.

Query second-degree friends and the number of common friends

// Query first-degree friends
g.V('1500771').out()
// Query second-degree friends
g.V('1500771').out().out().dedup().not(hasId('1500771'))
// Query the number of common friends
g.V('1500771').out().out().hasId('2165197').path().simplePath().count()
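
To make the logic of these three queries concrete, the following Python sketch replays them on a small adjacency map. The graph, the ids "A" to "E" and the helper names are hypothetical. The sketch mirrors the queries as written, so the second-degree result excludes only the start vertex, and the common-friend count is the number of two-hop paths from the start vertex to the target that do not repeat a vertex.

```python
from typing import Dict, List

# Hypothetical directed friend edges: vertex id -> ids reached by out().
out_edges: Dict[str, List[str]] = {
    "A": ["B", "C"],
    "B": ["D", "A"],
    "C": ["D", "E"],
    "D": [],
    "E": [],
}


def out(ids: List[str]) -> List[str]:
    """One hop along outgoing edges, keeping duplicates as out() does."""
    return [target for vid in ids for target in out_edges.get(vid, [])]


def dedup(ids: List[str]) -> List[str]:
    """Remove duplicates while keeping first-seen order, like dedup()."""
    return list(dict.fromkeys(ids))


start, target = "A", "D"

# g.V(start).out()
first_degree = out([start])

# g.V(start).out().out().dedup().not(hasId(start))
second_degree = [v for v in dedup(out(first_degree)) if v != start]

# Count two-hop paths start -> mid -> target whose vertices are all distinct.
common_friends = sum(
    1
    for mid in out_edges[start]
    for end in out_edges[mid]
    if end == target and len({start, mid, end}) == 3
)
```

In this sample graph both B and C lead from A to D, so each of them contributes one path to the count.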

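Gremlin also provides syntax for querying, traversal, filtering, paths, iteration, transformation, ordering, logic, statistics, branching and more; see http://tang.love/2018/11/15/gremlin_traversal_language/ for reference.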

References:
http://tang.love/2018/11/15/gremlin_traversal_language/
https://hugegraph.github.io/hugegraph-doc/quickstart/hugegraph-studio.html
