jeudi 21 janvier 2021

SQLAlchemy: Traverse all related tables in a reflected database to generate sample data for testing

Background, Objective

Given an existing database, generate sample data for testing which:

  1. Should consist of actual rows,
  2. Should allow queries across related tables using joins across any depth of hierarchy (e.g. parent/child/grandchild... etc)

Current approach

I have the database reflected, and am using SLQAlchemy's inspect to find all related tables. Iterating this method n times should give me a traversal of n-depth levels of hierarchy:

from sqlalchemy import create_engine, select, MetaData, Table, func, inspect
from sqlalchemy.orm import Session
from sqlalchemy.ext.automap import automap_base
import pymysql

metadata = MetaData()
db_connection_str = f'mysql+pymysql://{db_user}:{db_password}@localhost/{db_name}'
Base = automap_base()
Base.prepare(db_connection, reflect=False)

Event = Base.classes.event # Here, 'Event' is the root/main table from where the traversal begins
num_rows = 10 # Sample
stmt = select(*inspect(Event).persist_selectable.columns).limit(num_rows)
with Session(db_connection) as session:
    resultlist = session.execute(
         stmt
     ).all()


'''
Store the Event table query result somehow
'''
'''
Following section is a first-level traversal. This needs to be iterated to get nth-level traversals
'''

sample_event_id_list = [row[0] for row in resultlist] # 1st column is ID, is there a better way that involves key mapping rather than integer indexing?
sample_event_id_list

for rel in event_inspector.relationships:
    object_class = rel.mapper.class_ 
    stmt  = select(*inspect(object_class).persist_selectable.columns).\
            join(Event).\
            where(Event.id.in_(sample_event_id_list))
    '''
    Store the Event table query result somehow
    '''

Question

Is there a better, more comprehensive way of generating the sample data?

I've looked at Factory Boy, which integrates with SQLAlchemy and also can generate row samples from actual data. But it's not clear whether it can traverse all related tables in a database.

Aucun commentaire:

Enregistrer un commentaire