Sunday, November 12, 2023

On SQLAlchemy

I've been using SQLAlchemy and reading about it for a few months now, and I don't get it. I don't mean I don't get SQLAlchemy, I've learned how to use it. But I don't get why people use it, and are actually enthousiastic about it. 

I'm just writing this article because I can't seem to find any articles on the internet that are critical towards it, and I feel there should be one.

To explain why I don't like SQLAlchemy I will just go through all the advantages and disadvantages that are mentioned and respond to them.

Let's abbreviate SQLAlchemy to Alchemy and explain what Alchemy should do. It is an ORM (Object-Relation-Mapper, so it converts database records to Python objects, and the other way around), and it has a query generation language that allows you to write queries using method chaining. Alchemy changed it's syntax in version 2.

Is there an alternative to Alchemy? Yes you can use psycopg to perform plain SQL queries and manually write simple functions that create objects from database records.

I read SQLAlchemy 2 In Practice which is a good book, but the author doesn't even ask the question why we should use Alchemy, and implies that it is the only way to work with database. That got me thinking. 

Grinberg actually wrote the book because he created a tutorial on Flask and found that most questions were asked about Alchemy:

Many of the questions are from developers that need to know how to do something a little more complex than what I present in the article, but are unable to figure it out how to do it on their own.

Advantages

Easy to set up

If you know what to do, creating a few simple models to set up the ORM, is indeed easy to do, and creating ORM functionality manually also takes some time. So setting up the models and generating database tables is easy enough. It gets more tricky when you need to set up a many-to-many table, especially when it has extra columns. 

Alchemy gives abstraction to the backend database. So, an average developer does not have to worry about SQL statements.

This is not only not true, it's deceptive. You need to know SQL, and you need to know the the syntax of Alchemy as well. If you know how to do something in SQL and want to do it in Alchemy, you have to find out how to do it, and its syntax has many quirks and idiosyncrasies.

The transition to other databases becomes easier.

This is true, if you have used SQLite and want to switch to Postgres, and you haven't used any SQLite specific constructs, you can switch without changing any code. But this is not a big advantage. Most of the syntax is standard SQL anyway. And how often does it occur that you change one database for another? Finally, changing one database for another and expecting this has no side effects is naive. In fact it may bring quite substantial problems with it.

Readability and Maintainability: SQLAlchemy code tends to be more readable and easier to maintain than raw SQL queries. It leverages Python classes and objects to represent database tables and rows, making your code intuitive and self-explanatory.

It is true that adding a join to a query requires less code. Except from that, plain SQL is just as readable and just as easy to maintain.

Security: SQLAlchemy includes built-in protection against SQL injection. It automatically escapes user input, effectively thwarting malicious SQL code execution.

Every modern database library includes protection against SQL injection.

Queries are optimized and may work better than SQL if you wrote it yourself unless you are an SQL veteran.

This is also deceptive. Alchemy is slower by default, simply because it forms a wrapper around SQL. Furthermore, since Alchemy has a different conceptual model than SQL, it is not clear what SQL queries are actually performed. Only by learning the ins and outs of Alchemy, and tweaking many of the model declarations, will you be able to ensure that Alchemy performs efficient queries. Of course there will be some edge cases in which Alchemy queries are actually more efficient than plain SQL queries, but only the ones written by someone with little experience in SQL.

On StackOverflow, there are, at the time of writing, 16,000 questions with SQLAlchemy tag. Therefore, you should not be concerned about unsolvable bugs when using this tool.

Quite a relief!

Disadvantages

There could be instances where Alchemy might become inefficient. Therefore, knowing SQL is always desired.

When performance matters you will need to learn Alchemy from A to Z. Alchemy still allows you to use plain SQL queries, so this may be a way to escape.

Performance overhead: SQLAlchemy introduces a slight performance overhead compared to raw SQL queries. While this overhead is generally inconsequential for most applications, it may become a concern for high-performance scenarios.

True

Learning Curve: SQLAlchemy has a learning curve, particularly for developers new to ORMs. It takes some time to become proficient in SQLAlchemy’s API and concepts.

Believe me, it takes a lot of time. Alchemy has many fixes for problems it created itself. It is a completely different mental model you need to acquire and you need to switch between this model and the plain SQL model.

Additional disadvantages

These disadvantages come from me.

Alchemy diffuses domain model code with storage code.

It's good practice to separate domain model code with database storage code. It makes you aware of database queries and allows you to focus on them. Alchemy combines the two. It may create a database request when you just get an attribute. This is not a problem for a simple operation. But when you operate on thousands of objects, you will need to know exactly what Alchemy does internally, and make tweaks to the model to ensure that it performs the right kind of queries.

Asynchronous access

When developing asynchronous code for Alchemy, you need to make changes in a number of places to prevent Alchemy from making implicit database calls. Reading about that in Grinberg's book positively scared me.

Additional advantages

Alembic

To be fair, Alchemy comes with a tool that is actually quite nice, and which is not often mentioned in the advantages I read. It's Alembic, the database migration tool that generates queries based on the changes to the model you make. This allows you to keep your model up to date and have Alembic take care of migrations.

Conclusion

As this is a critical article, I recommend against using SQLAlchemy. Writing plain SQL queries is just easier, and gives you more control. Creating simple ORM functionality and writing migration scripts doesn't need to be complicated either.

References

https://www.analyticsvidhya.com/blog/2022/07/a-brief-introduction-to-sql-alchemy/

https://python.plainenglish.io/sqlalchemy-vs-raw-sql-queries-in-python-a-comparative-example-b7a838ebef82

https://trantriducs.medium.com/orm-and-sqlalchemy-the-magic-wand-in-database-management-a2d02877a57a


No comments:

Post a Comment

On SQLAlchemy

I've been using SQLAlchemy and reading about it for a few months now, and I don't get it. I don't mean I don't get SQLAlchem...