fredag 28 januari 2011

Why I store images as database blobs

A lot of sources tells you that putting images in the db is simply not a good thing to do.

First, I list advantages, then I meet the critics.

Advantages of storing images in the db
There will be no additional error source, the data will always be consistent
I have not found any performance hits (yet)
In php, image data and strings are equivalent, and you store strings in the database, right?
You can use the same database from all hosts in a web cluster environment.
You dont need to invent filenames.
You dont need to give write permissions to certain directories.
Your website will be a clone of the contents in the version manager.
You dont need to delete images afterwards.
You write less code to accomplish the task.
You can backup by only dumping the db.

Sometimes you need to temporarily manage an image, for instance when a user creates a user account and also are uploading an avatar. Using the "input" tag with the type "file" for that can make the account creation a multi step process. Then it can be handy to store the image in the session. The advantages of storing images in the session are almost the same as for storing them in the db.

Critics against storing images in the db
Here are some critics against putting images in the db.
  • Harder to migrate to a different database platform.

    I have never needed to do that, so if I had taken this into account, I would have done something I dont need. If I some day need to convert a db, I will not see it as a problem however. Writing conversion scripts is something I can do blindfolded.

  • Better backup to have the images as files

    I backup my db automatically twice a day. If I felt that I must have a backup consisting of image files, I would write a script that sends the files over sftp to the host where I keep backups (which I have in another town btw). The script that sent the files would extract the images from the db and send them over sftp without letting them land on the fs in between. The only place where I needed to store the images would be the backup host. After some weeks, the backed up data would be so old that I can remove it, using a cronjob on the backup server.

  • Accessing the images over ftp or using the web browser is easier than accessing them in the db

    I have never felt the need to access the files outside its application, so if I had taken this into account, I would have done something I dont need to. The day I need to "access" them, I write a simple script that will list them on a webpage. But not before.

  • Data length can be limited to say 2G.


    I dont use images that big. Actually, my mediumblobs have a max size of 16 MB.

  • In some sql servers, when a table has blob column, a page is reserved for every row, whether or not that column has any data

    Thats possible. As long as I cannot notice a performance disadvantage however. It doesnt bother me a lot.

  • better performance storing images in the fs than having a script that pulls the image from the db, stores it as a temporary file and then "streaming that to the user".


    No, thats the wrong way. Instead, you create an img tag like this: <img src="echo_image.php?id="<?=>">. The script echo_image.php fetches the image data from the db by using the id given to it in $_GET. Then it just echoes the blob. It will not land on the fs but be outputted directly to the browser. No need to maintain the original filename if the user will only see the image directly in the browser without storing them on his/hers computer. So dont use the Content-Disposition header if you dont need to. And do not store the original filename in the db.

  • An OLE header can be added in some cases.


    Has not happened to me, but I do not use access very often. If I get stuck in such a problem, I can modify the method and maybe put images on the fs.

Golden rule
Dont make benchmarks if you dont have to. Concentrate on first create clean code and a clean database solution. If it is not slow and does its job, you are done!

Having a blob in a separate table
Some people sais that they prefer to store the blob data in a table itself and then refer to it. That would be good for performance. A lot of persons agree that its a good idea to at least put a blob as the last column in a table.

Having certain tables that just stores blobs is not clean if you ask me. It adds tables to your database and also relations. It makes your database ugly instead of clean. You dont have certain tables for large strings, do you? As long as you cannot notice the difference, put the blob as a column in the table itself. Dont make entities of blobs.

This summarizes my personal opinion and I respect people with other views.

/Anders

Inga kommentarer:

Skicka en kommentar