SQLAlchemy Study Notes (3): Building Relationships in the ORM

ryan4yin, filed under Tech

2019-05-21, about 3125 words, estimated reading time 7 minutes


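Personal notes; correctness is not guaranteed.

I. Building Relationships: `ForeignKey` and `relationship`

The key to building relationships is understanding how these two functions are used. ForeignKey was already covered in "SQL Expression Language - Constraints in Table Definitions"; what matters most there is how its ondelete and onupdate parameters work.

II. `relationship`

The relationship function is used in the ORM to build associations between tables. Unlike ForeignKey, the relationship it defines is not part of the table definition; it is computed dynamically. An attribute defined with it is roughly analogous to a view in SQL.

The function is a bit hard to use, partly because a few of its parameters are not easy to understand, and partly because it has so many parameters that it can be intimidating. Below we get familiar with it step by step by using relationship in the one-to-many, many-to-one, one-to-one, and many-to-many scenarios.

First, the setup: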

from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

1. One-to-Many

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)

    # Child already holds the ForeignKey to Parent, so nothing else needs to be specified here.
    children = relationship("Child")  # the collection of children, comparable to a view

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

A Parent can have multiple Child rows. With relationship, we can get them directly through parent.children and skip the tedious query statements.
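To make this concrete, here is a minimal sketch of the one-to-many mapping in use. It assumes SQLAlchemy 1.4 or later (for `sqlalchemy.orm.declarative_base` and `Session` as a context manager) and an in-memory SQLite database; the variable names are only for illustration.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")  # in-memory database (assumption)
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)  # the default save-update cascade adds both children too
    session.commit()

    loaded = session.query(Parent).one()
    parent_id = loaded.id
    # No explicit query on Child is needed: the attribute loads the rows.
    child_count = len(loaded.children)
    child_parent_ids = [child.parent_id for child in loaded.children]
```

Note that parent_id on each Child is filled in by the ORM at flush time; the code never sets it by hand.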

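1.1 Back-references

1.1.1 `backref` and `back_populates`

What if we need the parent object of a child? Can we simply access child.parent?

To support this, SQLAlchemy provides two parameters: backref and back_populates.

The two parameters have exactly the same effect. The difference is that with backref you only declare children on the Parent class, and Child.parent is created dynamically.

With back_populates, the relationship must be declared explicitly on both classes, which is more verbose (but arguably clearer).

First, the backref version: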

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                            backref="parent")  # backref dynamically creates a parent attribute on Child that points back to this class

# The Child class does not need to change

Now the back_populates version:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")  # back_populates

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # This side must be declared as well; it cannot be omitted!
    parent = relationship("Parent", back_populates="children")  # parent is a scalar attribute, not a collection!

NOTE: the two relationship declarations need no further hints. SQLAlchemy works out by itself that parent.children is a collection and child.parent is a scalar attribute.
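A small sketch of what the back-reference buys you in practice: the two sides are kept in sync in Python as soon as one of them changes, before anything is written to a database. The example assumes SQLAlchemy 1.4 or later and needs no engine at all.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")


parent = Parent()

# Appending to the collection side also sets the scalar side.
first = Child()
parent.children.append(first)
synced_from_collection = first.parent is parent

# Setting the scalar side also appends to the collection side.
second = Child()
second.parent = parent
synced_from_scalar = second in parent.children
```

The backref variant behaves the same way; only the place where the second attribute is declared differs.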

1.1.2 Arguments for the back-reference: `sqlalchemy.orm.backref(name, **kwargs)`

With back_populates, it is easy to pass arguments to each of the two relationship calls:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent",
                            lazy='dynamic')  # set lazy for the collection side

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    # 'dynamic' is only valid for collections; the scalar many-to-one side
    # needs a different loading strategy, e.g. 'joined'.
    parent = relationship("Parent", back_populates="children",
                          lazy='joined')  # set lazy for the scalar side

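With backref, however, there is only one relationship call and Child.parent is created implicitly. How do we pass arguments for that attribute?

The answer is the backref() function: use it in place of the plain string value of the backref parameter: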

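from sqlalchemy.orm import backref

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    # Use the backref() function to pass arguments for the Child.parent attribute.
    # Child.parent is a scalar, so 'dynamic' is not valid there; 'joined' is used instead.
    children = relationship("Child",
                            backref=backref("parent", lazy='joined'))

# The Child class does not need to change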

The arguments of backref() are passed on to relationship(), so the two accept exactly the same arguments.
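As a rough illustration of passing different loader options to the two sides, the sketch below uses lazy="dynamic" on the collection and lazy="joined" on the implicitly created Child.parent. As far as I know, "dynamic" is only accepted on collection-valued relationships, which is why the scalar side uses another strategy. The age column and the sample data are made up for the example, and SQLAlchemy 1.4 or later with in-memory SQLite is assumed.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, backref, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship(
        "Child",
        lazy="dynamic",                            # collection side: returns a query
        backref=backref("parent", lazy="joined"),  # scalar side: eager JOIN
    )


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    age = Column(Integer)  # hypothetical column, only here to have something to filter on
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent()
    session.add(parent)
    session.add_all([Child(age=age, parent=parent) for age in (3, 8, 12)])
    session.commit()

    # With lazy="dynamic", parent.children is a query object, so it can be
    # filtered and ordered before any rows are loaded.
    older = parent.children.filter(Child.age > 5).order_by(Child.age).all()
    older_ages = [child.age for child in older]
    backref_points_home = older[0].parent is parent
```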

2. Many-to-One

A many-to-one is similar to a one-to-many relationship. The difference is that this relationship is looked at from the “many” side.
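This section has no code of its own, so here is a minimal sketch following the many-to-one pattern as I understand it from the SQLAlchemy documentation: the foreign key sits on the class that declares the relationship, and the attribute is a scalar. Several Parent rows may point at the same Child. SQLAlchemy 1.4 or later and in-memory SQLite are assumed.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child_id = Column(Integer, ForeignKey("child.id"))  # FK lives on the "many" side
    child = relationship("Child")  # scalar: each Parent refers to one Child


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    shared = Child()
    p1, p2 = Parent(child=shared), Parent(child=shared)
    session.add_all([p1, p2])  # the shared Child is added via cascade
    session.commit()

    shared_id = shared.id
    fk_values = (p1.child_id, p2.child_id)
    is_scalar = p1.child is shared
```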

3. One-to-One

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child = relationship("Child",
                         uselist=False,   # do not use a collection! This is the key point
                         back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # On the class that holds the ForeignKey, this attribute is a scalar by default, so uselist=False is not needed
    parent = relationship("Parent", back_populates="child")
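A short sketch of how the one-to-one pair behaves in Python. It assumes SQLAlchemy 1.4 or later and needs no database. The unique=True on parent_id is my own addition: as far as I understand, uselist=False only changes the Python-side attribute and does not by itself add a database constraint.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child = relationship("Child", uselist=False, back_populates="parent")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    # unique=True is an assumption added for this sketch, so the database
    # also refuses a second child row for the same parent.
    parent_id = Column(Integer, ForeignKey("parent.id"), unique=True)
    parent = relationship("Parent", back_populates="child")


parent = Parent()
first = Child()
parent.child = first  # a plain attribute assignment, not an append
first_linked = first.parent is parent

# Assigning a different child replaces the old one on both sides.
second = Child()
parent.child = second
second_linked = second.parent is parent
first_unlinked = first.parent is None
```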

4. Many-to-Many

# Many-to-many requires an association table!
association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),  # by convention, the left side is the parent
    Column('right_id', Integer, ForeignKey('right.id'))  # and the right side is the child
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table)  # the dedicated secondary parameter names the association table

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

To add a back-reference here, you can likewise use backref or back_populates.
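Here is a hedged sketch of the many-to-many mapping with a back-reference added through back_populates. The table names parent, child and parent_child, as well as the composite primary key on the association table, are choices made for this example rather than anything taken from the listing above. SQLAlchemy 1.4 or later and in-memory SQLite are assumed.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Association table: one row per (parent, child) link.
parent_child = Table(
    "parent_child",
    Base.metadata,
    Column("parent_id", Integer, ForeignKey("parent.id"), primary_key=True),
    Column("child_id", Integer, ForeignKey("child.id"), primary_key=True),
)


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=parent_child, back_populates="parents")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parents = relationship("Parent", secondary=parent_child, back_populates="children")


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    p1, p2 = Parent(), Parent()
    c1, c2 = Child(), Child()
    p1.children = [c1, c2]
    p2.children.append(c1)
    session.add_all([p1, p2])
    session.commit()

    # The ORM wrote the association rows; no code touched parent_child directly.
    link_rows = session.query(parent_child).count()
    c1_parent_count = len(c1.parents)
    c2_parent_count = len(c2.parents)
```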

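4.1 user2user

What if both sides of a many-to-many relationship are users, that is, the same table? How should it be declared?

Take "following" and "followers" as an example: you are a user, your followers are users, and the accounts you follow are users too.

In this case both keys of the association table point at user, so SQLAlchemy cannot tell which column belongs to which side. It has to be specified by hand, using the primaryjoin and secondaryjoin parameters.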

# This snippet uses Flask-SQLAlchemy: `db` is its SQLAlchemy() instance.
# UserMixin is assumed to come from Flask-Login.
from flask_login import UserMixin
from flask_sqlalchemy import SQLAlchemy

db = SQLAlchemy()

# Association table: the user on the left is following the user on the right
followers = db.Table('followers',
    db.Column('follower_id', db.Integer, db.ForeignKey('user.id')),  # left side
    db.Column('followed_id', db.Integer, db.ForeignKey('user.id'))  # right side, the user being followed
)

class User(UserMixin, db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), index=True, unique=True, nullable=False)
    email = db.Column(db.String(120), index=True, unique=True, nullable=False)
    password_hash = db.Column(db.String(128), nullable=False)

    # The users I follow
    followed = db.relationship(
        'User',
        secondary=followers,  # the many-to-many association table
        primaryjoin=(followers.c.follower_id == id),  # left side: joins this user (as follower) to the association table
        secondaryjoin=(followers.c.followed_id == id),  # right side: joins the association table to the followed users
        lazy='dynamic',  # return a query instead of a list, so filter_by and similar can be chained
        backref=db.backref('followers', lazy='dynamic'))  # followers is also returned as a query

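The confusing part here is that primaryjoin and secondaryjoin are easy to mix up.

primaryjoin: (in many-to-many) the join condition between the parent table, i.e. the class that declares the relationship, and the association table. By default it is derived from the foreign keys alone.
secondaryjoin: (in many-to-many) the join condition between the association table and the child (target) table. Likewise, by default only the foreign keys are considered.

For User.followed, this means: follower_id finds the association rows where I am the follower, and followed_id then leads to the users I follow. The followers back-reference uses the same two conditions in the opposite order.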
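The same self-referential pattern can be sketched in plain SQLAlchemy, without Flask, to check which direction each attribute points. This version drops lazy='dynamic' so the attributes are ordinary lists, and the sample users are invented. SQLAlchemy 1.4 or later and in-memory SQLite are assumed.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

# Each row means: follower_id follows followed_id.
followers = Table(
    "followers",
    Base.metadata,
    Column("follower_id", Integer, ForeignKey("user.id"), primary_key=True),
    Column("followed_id", Integer, ForeignKey("user.id"), primary_key=True),
)


class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    username = Column(String(64), nullable=False)

    followed = relationship(
        "User",
        secondary=followers,
        primaryjoin=(followers.c.follower_id == id),    # this user -> association table
        secondaryjoin=(followers.c.followed_id == id),  # association table -> target user
        backref="followers",
    )


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    alice = User(username="alice")
    bob = User(username="bob")
    carol = User(username="carol")
    alice.followed.append(bob)
    carol.followed.append(bob)
    session.add_all([alice, bob, carol])
    session.commit()

    alice_followed = [user.username for user in alice.followed]
    bob_followers = sorted(user.username for user in bob.followers)
    bob_followed = [user.username for user in bob.followed]
```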

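III. ORM-level “delete” cascade vs. FOREIGN KEY-level “ON DELETE” cascade

An earlier note covered cascading in Table definitions: ON DELETE and ON UPDATE, which can be set to CASCADE through the parameters of ForeignKey.

But relationship in SQLAlchemy also has a cascade parameter that controls the SQL the ORM emits, and on top of that a passive_deletes parameter that affects how deletes cascade.

With this many cascades around, I was honestly confused. How do these three actually differ?

The ON DELETE and ON UPDATE of a foreign key constraint do overlap a lot with the ORM-level cascade in functionality. But there are also many differences:

A database-level ON DELETE cascade is configured on the many-to-one side of an association: the foreign key is defined on the "many" side, and that is also where the ON DELETE rule goes. At the ORM level the direction is reversed. SQLAlchemy handles deletion of the "many" side from the "one" side, which means the ORM cascade is configured on the one-to-many relationship.

A database-level foreign key without ON DELETE is often used to prevent a parent row from being deleted, since that would leave its child rows behind as unreachable garbage. To get this behavior on a one-to-many mapping, SQLAlchemy's default behavior of setting the foreign key to NULL can be caught in one of two ways:
1. The simplest and most common way is to define the foreign key column as NOT NULL. An attempt to set the column to NULL then fails with a NOT NULL constraint exception.
2. The other, more specialized way is to set the passive_deletes flag to the string "all". This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, and the parent row is DELETEd without touching the child rows. This is what lets the database-level ON DELETE rule, or other triggers, fire.

A database-level ON DELETE cascade is more efficient than an ORM-level cascade. The database can chain a series of cascading operations across many relationships at once.

SQLAlchemy does not need to be that sophisticated, because combining the passive_deletes option with properly configured foreign key constraints gives smooth integration with the database's own ON DELETE functionality.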

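Approach 1: Cascade at the ORM level

The cascade parameter of relationship determines which operations on a parent object are cascaded to its related child objects. Its value is a string of comma-separated options:

save-update: one of the defaults. When an object is added (corresponding to an INSERT or UPDATE in SQL), all objects related to it are added as well.
merge: one of the defaults. When an object is merged (similar to a dict's update: replace what exists, merge in what does not), all objects related to it are merged as well.
expunge: when an object is expunged, its related objects are expunged too. This only removes them from the session; nothing is actually deleted from the database.
delete: when a parent row is deleted, the rows related to it are deleted as well.
delete-orphan: when a child object is de-associated from its parent, the child (now an orphan) is deleted. (Honestly, I still don't fully get this one.)
refresh-expire: rarely used; it propagates Session.expire() from a parent to its related objects.
all: selects every option except delete-orphan. (This is why "all, delete-orphan" is so common: that is the real "all".)

The default is "save-update, merge".

This is only a brief summary; for details on these options see SQLAlchemy - Cascades.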
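Since delete-orphan is the least obvious option in the list above, a small sketch may help. It sets cascade="all, delete-orphan" and shows two separate effects: removing a child from the collection deletes that child, and deleting the parent deletes the remaining children. SQLAlchemy 1.4 or later and in-memory SQLite are assumed.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all, delete-orphan")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    parent = Parent(children=[Child(), Child()])
    session.add(parent)
    session.commit()

    # delete-orphan: detaching a child from its parent deletes the child row,
    # instead of just setting its parent_id to NULL.
    orphan = parent.children[0]
    parent.children.remove(orphan)
    session.commit()
    count_after_remove = session.query(Child).count()

    # delete: deleting the parent deletes the children that are still attached.
    session.delete(parent)
    session.commit()
    count_after_delete = session.query(Child).count()
```

Without delete-orphan, the removed child would stay in the table with parent_id set to NULL.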

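Approach 2: Cascade at the database level

Set the ondelete and onupdate parameters of ForeignKey to CASCADE to get cascading at the database level.
Add the keyword argument passive_deletes="all" to relationship. This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, so the ORM DELETEs the parent row and leaves the child rows alone. (passive_deletes=True is the milder variant: it only skips loading children that are not already in memory.)

A DELETE then triggers the database's ON DELETE rule, which cascades the deletion to the child rows.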
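The two steps can be sketched as follows. The example assumes SQLAlchemy 1.4 or later and in-memory SQLite; because SQLite only enforces foreign keys when asked to, the sketch switches them on with a PRAGMA in a connect-event listener, which is specific to this setup and not part of the approach itself.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # The ORM leaves the child rows alone and lets the database handle them.
    children = relationship("Child", passive_deletes="all")


class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id", ondelete="CASCADE"))


engine = create_engine("sqlite://")


@event.listens_for(engine, "connect")
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite-only step: foreign key enforcement is off by default.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Parent(children=[Child(), Child()]))
    session.commit()

with Session(engine) as session:
    parent = session.query(Parent).one()
    session.delete(parent)  # the ORM emits a DELETE for the parent row only
    session.commit()
    remaining_children = session.query(Child).count()
```

If I read the SQLAlchemy documentation correctly, its usual recipe for this combines cascade="all, delete" with passive_deletes=True, so that children already loaded in the session are handled by the ORM and the rest by the database.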
